This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: kprobe fault handling






systemtap-owner@sourceware.org wrote on 08/02/2006 04:38:31:

>
> Recalling the way this was supposed to work ... AFAICR
>
> There are 2 main situations where one could land up into
> the page fault handler during kprobe processing:
>
> (1) We have completed the probe handler and are single-stepping the
probed
>     instruction. The original instruction being single-stepped causes a
>     page fault. This would typically happen if that instruction issues a
>     memory access that is paged out.
>

Yes you are correct, the code does do this, but its not supposed to. The
pagefault handler was intended solely to protect the system from the
probe-handler, not to catch faults in the original instruction.

I think what's happened is that two needs have become confused.  Going back
to the original dprobes implementation:

The pagefault callback was there to allow the probe-handler to continue
even if it touched inaccessible memory.
The dprobes interpreter always hooked faults and protected the system from
the probe handler.
On single-stepping the original instruction any exception was percolated up
to the systems, however the log buffer would be discarded by the dprobes
event handler unless logonfault had been specified. This was implemented to
support tracing and to suppress multiple trace entries for tracepoint that
would temporarily fault, then subsequently retry execution successfully.

When we removed the dprobes interpreter to a device drive and created a
minimal kernel API set (krpobes) to support the device driver, we needed to
support this capability through the kprobes APIs. So what appears to have
happened is that the single-step fault and the handler fault have become
handler through the same mechanism. So the question is: is this
functionally correct?

They reason why I doubt that it is correct is that any one using the
kprobes APIs is going to need a very thorough understanding of the
motivation behind this interface to use it correctly. I would propose
separating out the notification of single-stepping (original instruction
execution) exceptions from self-generated probe-handder exceptions.

These are some possible options:

1)
leave the page-fault handler to be a called routine that's invoked instead
of the post-handler when single-step generates an exception.
create a setjump, longjump mechanism to allow the probe handler to catch
any self-generated exception but always protect the rest of the system from
any exception that is not intercepted.

2)

leave the page-fault handler as it - a called routine, but don't invoke it
for single-step exceptions.
allow the page-fault handler to deal with creating or simulating a
setjump/longjump back to the original pre or post handler code.
pass a switch to the post-handler to indicate whether the single-stepping
resulted in a exception.

3) a combination of 1 & 2, (which I think I prefer)
dispense with the page-fault handler routine.
create a setjump/long mechanism for the pre- and post-handler routines to
use if they wish.
always shield any unhandled exception that has been generated by the pre-
and post- handlers from the rest of the system
indicate via a switch to the post-handler whether the singe-stepping
resulted in an exception (other than a debug exception of course :-))


Finally in response to how are things broken:

Well I've just conducted an experiment on kprobes where I deliberately
generate a pagefault in the pre-handler and it appears that my pae-fault
handler is never called, be we do seem to be protecting the system from may
pagefault exception.


> (2) We are executing the probe handler, and an instruction in the probe
>     handler causes a page fault. This could, for example, happen if the
>     probe handler attempts to access a user-space address by issuing
>     copy_from_user or get_user etc, and that address is currently
>     paged out or invalid.
>
> The check KPROBE_HIT_SS enables us to distinguish between these two
cases.
>
> If set, it indicates that (1) must have occured. In which case, we want
> normal page fault handling to continue to that the data accessed by the
> instruction is paged in, and instruction execution continue after
> fault handling as would happen in normal execution. However, since
> this can cause a sleep, other probe hits etc, we turn off kprobes
> related state (hence the resume execution) before letting it continue.
Once
> the data is paged in, the page fault handler would resume execution from
> the original probed address and thus trap again on the breakpoint
instruction.
> So we encounter the probe handler a second time, and this time
single-stepping
> would succeed since the address accessed by the instruction is already
> in memory, and finally we would get a single-step trap and the post
handler
> would be executed.
>
> In KPROBE_HIT_SS is not set, we enter case (2). Now since we are (or at
> least were) running without interrupts, the default behaviour was to
> soft-fail the access, i.e. cause the copy_from_user/get_user call in
> the probe handler to fail with -EFAULT. How does this happen ? The
> in_atomic() check or its equivalent in do_page_fault succeeds and hence
> control jumps to bad_area_nosemaphore where it finds that the access
> happened in kernel mode and hence moves to no_context, where
> fixup_exception() takes care of the rest.
> More sophisticated or customised behaviour could be achieved by providing
> a ->fault_handler(), we could play some neat tricks (like Richard
mentioned)
> perform fixups etc.
>
> I hope that makes it a little clearer.
>
> Could you now help me understand what is broken now, if at all ?
>
> Regards
> Suparna
>
> On Tue, Feb 07, 2006 at 11:49:22AM -0800, Jim Keniston wrote:
> > On Tue, 2006-02-07 at 09:51, Martin Hunt wrote:
> > > On Tue, 2006-02-07 at 09:31 -0800, Jim Keniston wrote:
> > > [...]
> > > > > > And that's it. kprobe_fault_handler returns 0.  No call to
> > > > > > fixup_exceptions()!  So do_page_fault() will have to do
> the fixups, but
> > > > > > first it will print nasty might_sleep warnings and maybe
> actually sleep!
> > > >
> > > > You have the right idea of running fixup_exceptions() in the fault
> > > > handler, to prevent do_page_fault() from attempting its
demand-paging
> > > > stuff.  I just think it's OK to have the user's fault handler,
rather
> > > > than kprobe_fault_handler, run fixup_exceptions.
> > >
> > > I think if the user's fault handler does not handle the page fault,
> > > kprobes should attempt to do so. Returning without handling the page
> > > fault could leave the system in a bad state. So your proposal would
mean
> > > it is necessary to specify a fault handler for any kprobe that might
> > > attempt user-space access or otherwise trigger a page fault,
otherwise
> > > the system could crash.
> >
> > I just had a long chat with Richard Moore about this whole topic.  I
> > agree with you on this, and I think Richard would, too.
> >
> > So unless there's a user-specified handler and that handler specifies
> > (by returning 1) that it has handled the exception,
> > kprobe_fault_handler() should run fixup_exception(), right?
> >
> > Now I'm looking, later in that function, at the code (on i386) where we
> > handle an exception while single-stepping.  I don't think
> > resume_execution() is the right thing to do here.  We haven't
> > successfully executed the probed instruction, and the eip still points
> > at that instruction, right?  I think we're just hosed at this point.
> > Comments?
> >
> > >
> > > Any more importantly, fixup_exception() and all the exception table
code
> > > are currently not exported so cannot be called from a module.
> >
> > It's hard to argue with that. :-)
> >
> > >
> > > Martin
> >
> > Jim
> >
> >


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]