This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: kprobe fault handling


On Thu, 2006-02-09 at 13:35 -0800, Jim Keniston wrote:

> > +		/*
> > +		 * In case the user-specified fault handler returned zero,
> > +		 * try to fix up.
> > +		 */
> > +
> > +		if (fixup_exception(regs))
> > +			return 1;
> 
> I think it's OK to call fixup_exception() here, but I believe it's
> redundant.  I understood Suparna to say
> (http://sourceware.org/ml/systemtap/2006-q1/msg00423.html) that if we
> return 0, do_page_fault() will call fixup_exception() instead of trying
> to bring in the missing page (since it's a kernel instruction -- in a
> handler -- that faulted).  Her explanation made sense to me.

But experimentally, things don't work that way.
I see lots of these:

Feb  2 23:15:53 monkey2 kernel: Debug: sleeping function called from invalid context at mm/page_alloc.c:618
Feb  2 23:15:53 monkey2 kernel: in_atomic():0[expected: 0], irqs_disabled():1
Feb  2 23:15:53 monkey2 kernel:  [<c011df50>] __might_sleep+0x7d/0x89
Feb  2 23:15:53 monkey2 kernel:  [<c014b802>] __alloc_pages+0x3a/0x2f7
Feb  2 23:15:53 monkey2 kernel:  [<c0157a48>] do_no_page+0x55/0x3bf
Feb  2 23:15:53 monkey2 kernel:  [<c011a19e>] pte_alloc_one+0x18/0x49
Feb  2 23:15:53 monkey2 kernel:  [<c015553d>] pte_alloc_map+0x66/0x12d
Feb  2 23:15:53 monkey2 kernel:  [<c0157f6d>] handle_mm_fault+0xb0/0x1fd
Feb  2 23:15:53 monkey2 kernel:  [<c011a8ed>] do_page_fault+0x1ac/0x4dc
Feb  2 23:15:53 monkey2 kernel:  [<c02ab2c1>] sock_aio_write+0x106/0x113
Feb  2 23:15:53 monkey2 kernel:  [<c0119263>] kprobe_exceptions_notify+0xc6/0x123
Feb  2 23:15:53 monkey2 kernel:  [<c011a741>] do_page_fault+0x0/0x4dc
Feb  2 23:15:53 monkey2 kernel:  [<c030fa4f>] error_code+0x2f/0x38
Feb  2 23:15:53 monkey2 kernel:  [<c01e6028>] __copy_from_user_ll+0x30/0x48
Feb  2 23:15:53 monkey2 kernel:  [<e0b9dac7>] _stp_copy_from_user+0x2d/0x4f [copy]
Feb  2 23:15:53 monkey2 kernel:  [<c0168211>] sys_read+0x0/0x62
Feb  2 23:15:53 monkey2 kernel:  [<e0b9dc40>] inst_sys_read+0x15/0x45 [copy]
Feb  2 23:15:53 monkey2 kernel:  [<c0119020>] kprobe_handler+0x1f0/0x230
Feb  2 23:15:53 monkey2 kernel:  [<c01191ce>] kprobe_exceptions_notify+0x31/0x123
Feb  2 23:15:53 monkey2 kernel:  [<c0130c59>] notifier_call_chain+0x17/0x2e
Feb  2 23:15:53 monkey2 kernel:  [<c01076f7>] do_int3+0x3d/0xcf
Feb  2 23:15:53 monkey2 kernel:  [<c0143260>] audit_syscall_entry+0x124/0x13d
Feb  2 23:15:53 monkey2 kernel:  [<c030fbaf>] int3+0x1f/0x30
Feb  2 23:15:53 monkey2 kernel:  [<c0168212>] sys_read+0x1/0x62
Feb  2 23:15:53 monkey2 kernel:  [<c030f8cb>] syscall_call+0x7/0xb
Feb  2 23:15:53 monkey2 kernel:  [<c030007b>] xfrm_policy_gc_kill+0x39/0x68

The above only happens on non-SMP machines.  On SMP, I usually get
crashes.  Adding the fixup_exception() call got rid of both the
messages and the crashes for me.

That's as far as I have investigated. 
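
For illustration, the control flow that the patch adds can be modeled in
user space roughly as follows.  This is only a sketch under stated
assumptions: every fake_* name, the one-entry table, and the addresses
are hypothetical stand-ins, not the kernel's real types or code.  It
shows the ordering only: the user-specified fault handler runs first,
and if it returns zero, an exception-table style fixup is tried before
the fault is handed back to the normal page-fault path.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct pt_regs; only the field we need. */
struct fake_regs {
	unsigned long ip;	/* address of the faulting instruction */
};

/* Hypothetical stand-in for an exception table entry. */
struct fake_extable_entry {
	unsigned long insn;	/* instruction that is allowed to fault */
	unsigned long fixup;	/* where to resume if it does fault */
};

/* Made-up table contents, for illustration only. */
static const struct fake_extable_entry fake_extable[] = {
	{ 0x1000, 0x2000 },
};

/*
 * Model of a fixup lookup: if the faulting address has a table entry,
 * redirect execution to the fixup address and report success (1).
 */
static int fake_fixup_exception(struct fake_regs *regs)
{
	size_t i;

	for (i = 0; i < sizeof(fake_extable) / sizeof(fake_extable[0]); i++) {
		if (fake_extable[i].insn == regs->ip) {
			regs->ip = fake_extable[i].fixup;
			return 1;
		}
	}
	return 0;
}

typedef int (*fake_fault_handler_t)(struct fake_regs *regs);

/*
 * Model of the fault path with the extra fixup step.
 * Returns 1 if the fault was handled here, so the caller must not go
 * on to the page-fault code; returns 0 to let the caller continue.
 */
static int fake_kprobe_fault(struct fake_regs *regs,
			     fake_fault_handler_t user_handler)
{
	/* Give the user-specified fault handler the first chance. */
	if (user_handler && user_handler(regs))
		return 1;

	/* The handler returned zero (or is absent): try to fix up. */
	if (fake_fixup_exception(regs))
		return 1;

	return 0;
}
```

In this model, a fault at an address listed in the table is resolved
without ever reaching the page-fault code, which is the path that
sleeps in the trace above.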

