This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: resuming after stop at syscall_entry
- From: David Smith <dsmith at redhat dot com>
- To: Roland McGrath <roland at redhat dot com>
- Cc: utrace-devel at redhat dot com, Systemtap List <systemtap at sources dot redhat dot com>
- Date: Tue, 28 Apr 2009 10:17:58 -0500
- Subject: Re: resuming after stop at syscall_entry
- References: <20090418042722.5B584FC35F@magilla.sf.frob.com> <49EF81AA.3060306@redhat.com> <20090425005837.44FB8FC262@magilla.sf.frob.com>
Roland McGrath wrote:
>> This processing makes sense I think. It is a bit complicated of course,
>> but not unnecessarily so.
>
> Glad to hear it!
>
>>> A tracing-only engine that just wants to see the syscall that is going
>>> to be done can just do:
>>>
>>>     if (utrace_resume_action(action) == UTRACE_STOP)
>>>         return UTRACE_REPORT;
>>>
>>> at the top of report_syscall_entry, so it just doesn't think about it
>>> until it thinks the call will now go through.
>> Systemtap currently doesn't support changing syscall arguments; if it
>> did, obviously a few things would need to change.
>>
>> But, I think systemtap would probably fall here - only see the syscall
>> that is actually going to be done. So systemtap could possibly get
>> multiple callbacks for the same syscall, but only pay attention to the
>> last one, correct?
>
> Correct. The advice quoted above is what its callbacks would do to ignore
> the callbacks before the last one.
>
> Note that you'll only be sure you're seeing "actually going to be done"
> state if yours is the "first" engine attached. (Thus, by the new special
> case calling order, yours will be the last report_syscall_entry callback to
> run.) This is just the general "engine priority" thing, not anything new.
>
> In cases like ptrace and kmview (Renzo's thing), even if these engines are
> first (i.e. called after yours), you will still be seeing the "final" state
> because they did their changes asynchronously before resuming. But some
> other engine might do its changes directly in its own callback instead
> (whether it used UTRACE_STOP and got a repeat callback, or just on the
> first time through without stopping), so those changes would happen only
> after your "last" callback.
>
> In the same vein, "earlier" engines (i.e. here called after yours) might
> use UTRACE_STOP after your first callback had every reason to believe it
> was the "last" one (i.e. the UTRACE_STOP check above did not hit). In that case, you will get
> a repeat call (with UTRACE_SYSCALL_RESUMED flag). On that call, you need
> to cope with the fact that you already did your entry tracing work before
> (but now things may have changed).
>
> If the theory is that you want to respect your place in the engine order,
> whatever that is (i.e., if your tracing just reported a lie, it was the lie
> you were supposed to believe), then "coping" just means ignoring the
> repeat. (This is no different in kind from an "earlier" engine/later
> callback changing the registers after your callback and never stopping.)
>
> For that you need to keep track of whether you already handled it or not.
> (Depending on your relative order and the actions of the other engines, you
> might get either UTRACE_STOP or UTRACE_SYSCALL_RESUMED either before or
> after "you handled it". So you can't use those alone.) You can do this in
> two ways. One is to use your own per-thread state (engine->data, etc.).
> The other is to disable the SYSCALL_ENTRY event when you've handled it, so
> you won't get more callbacks. Then you can re-enable the event in your
> report_syscall_exit callback (or report_quiesce/report_signal, or whatever
> is most convenient to be sure you'll run before it goes back to user mode).
> i.e., use utrace_set_events() from the callbacks.
It sounds like disabling SYSCALL_ENTRY then re-enabling it in the
report_syscall_exit() callback is a reasonable way to go.
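To make that concrete, here is a rough user-space model of the idea, not kernel code. Every name in it (the sim_ functions, SIM_ actions, EV_ bits, struct sim_engine) is a hypothetical stand-in invented for illustration; a real engine would change its event mask with utrace_set_events() from the callbacks, as described above. The model only shows the bookkeeping: defer while a stop is pending, trace once, turn off the entry event, and turn it back on at syscall exit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for utrace event bits; the values are arbitrary. */
enum { EV_SYSCALL_ENTRY = 1 << 0, EV_SYSCALL_EXIT = 1 << 1 };

/* Hypothetical stand-ins for the resume actions discussed in the thread. */
enum sim_action { SIM_RESUME, SIM_REPORT, SIM_STOP };

struct sim_engine {
    unsigned events; /* events currently enabled for this engine */
    int traced;      /* how many syscalls the entry handler recorded */
};

/* Models a report_syscall_entry callback for a tracing-only engine. */
enum sim_action sim_report_syscall_entry(struct sim_engine *e,
                                         enum sim_action pending)
{
    assert(e != NULL);
    /* Someone wants a stop: don't trace yet, ask for the repeat callback. */
    if (pending == SIM_STOP)
        return SIM_REPORT;
    e->traced++;                                /* do the tracing work once */
    e->events &= ~(unsigned)EV_SYSCALL_ENTRY;   /* no more entry callbacks */
    return SIM_RESUME;
}

/* Models a report_syscall_exit callback: re-arm entry tracing. */
enum sim_action sim_report_syscall_exit(struct sim_engine *e)
{
    assert(e != NULL);
    e->events |= EV_SYSCALL_ENTRY;
    return SIM_RESUME;
}

/* Models the core delivering callbacks only for enabled events. */
void sim_deliver_entry(struct sim_engine *e, enum sim_action pending)
{
    if (e->events & EV_SYSCALL_ENTRY)
        sim_report_syscall_entry(e, pending);
}

void sim_deliver_exit(struct sim_engine *e)
{
    if (e->events & EV_SYSCALL_EXIT)
        sim_report_syscall_exit(e);
}

/* Two syscalls; the first one sees a stop and an extra repeat callback. */
int sim_run_scenario(void)
{
    struct sim_engine e = { EV_SYSCALL_ENTRY | EV_SYSCALL_EXIT, 0 };

    sim_deliver_entry(&e, SIM_STOP);   /* stop pending: deferred */
    sim_deliver_entry(&e, SIM_RESUME); /* looks final: traced, event disabled */
    sim_deliver_entry(&e, SIM_RESUME); /* later repeat: not delivered */
    sim_deliver_exit(&e);              /* entry event re-enabled */

    sim_deliver_entry(&e, SIM_RESUME); /* next syscall: traced */
    sim_deliver_exit(&e);

    return e.traced;
}
```

Calling sim_run_scenario() counts each syscall once, even though the first one produced three entry callbacks. The trade-off against keeping a "handled" flag in per-thread state is that no extra state is needed, at the cost of changing the event mask twice per syscall.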
>> This is understandable, but does hurt my head a *little* bit. I think
>> if you put the above full text somewhere and provided some examples this
>> would make sense to people.
>
> The utrace-syscall-resumed branch puts this in the kerneldoc text for
> struct utrace_engine_ops (where callback return values and common arguments
> are described):
>
> * When %UTRACE_STOP is used in @report_syscall_entry, then @task
> + * stops before attempting the system call. In this case, another
> + * @report_syscall_entry callback follows after @task resumes; in a
> + * second or later callback, %UTRACE_SYSCALL_RESUMED is set in the
> + * @action argument to indicate a repeat callback still waiting to
> + * attempt the same system call invocation. This repeat callback
> + * gives each engine an opportunity to reexamine registers another
> + * engine might have changed while @task was held in %UTRACE_STOP.
> + *
> + * In other cases, the resume action does not take effect until @task
> + * is ready to check for signals and return to user mode. If there
> + * are more callbacks to be made, the last round of calls determines
> + * the final action. A @report_quiesce callback with @event zero, or
> + * a @report_signal callback, will always be the last one made before
> + * @task resumes. Only %UTRACE_STOP is "sticky"--if @engine returned
> + * %UTRACE_STOP then @task stays stopped unless @engine returns
> + * different from a following callback.
>
> I don't know where the longer explanation and/or examples belong.
> Perhaps in a new section in utrace.tmpl? We could start with putting
> together some text on the wiki. Another idea is to add a few example
> modules in samples/utrace/. Those can illustrate things with good
> comments, and also could be built verbatim to load multiple
> ones/instances in different orders and demonstrate what happens, etc.
The wiki would be fine - just somewhere that people could see this stuff.
> It would be nice to have folks like you and Renzo work up this text
> and/or examples. What's needed is stuff that makes sense to you guys
> as users of the API, rather than what makes sense to me who has
> thought too much already about all this stuff.
We should probably just dump your email into the wiki.
--
David Smith
dsmith@redhat.com