This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



systemtap bug #6500 fallout


Frank/Ananth,

I've fixed bug #6500 (utrace support doesn't handle multithreaded apps
properly), but the fix has caused some interesting fallout.

Originally, here's how the utrace probe layer worked with the task
finder layer.  The task finder attached a utrace engine to all
non-kernel threads.  A utrace probe registered itself with the task
finder layer by telling it which threads it found "interesting" (by
specifying either a pid or a pathname).  When the task finder layer
found an interesting thread, it called a callback function provided by
the utrace probe layer.  Its signature looks like this:

typedef int (*stap_task_finder_callback)(struct task_struct *tsk,
                                         int register_p,
                                         struct stap_task_finder_target *tgt);

If register_p is 1, the utrace probe layer attaches an engine to the
thread for any other events it is interested in (such as syscall
events).  If the user was interested in an exec event, the probe handler
gets called directly in the callback (if you add a utrace engine that
wants to handle exec while you are already in an exec event, it won't
get called).

If register_p is 0, the utrace probe layer is notified that the thread
is dying and cleans up.  It also calls any death probe handler.

This system worked well for traditional fork/exec-style programs (except
I now realize you'd miss syscalls executed in the child process between
the fork and the exec).


So, to fix this problem for multi-threaded programs, the callback now
gets called (with register_p=1) for clone *and* exec events on
"interesting" threads.  This seems to work well for multi-threaded
programs.  However, for traditional fork/exec-style programs, it has
caused a behavior change (beyond the fact that any exec probes are
probably getting called too often).  Let's say you have the following
script:

  probe process("/bin/bash").syscall {
    printf("/bin/bash(%d) syscall\n", tid())
  }
  probe process("/bin/ls").syscall {
    printf("/bin/ls(%d) syscall\n", tid())
  }

Then, from a bash session, run ls.  Originally the output would look
something like this:

1: the syscalls for bash (from pid X - note that you wouldn't see
syscalls in the newly forked bash)
2: the syscalls for the exec'ed ls (from new pid Y)
3: the syscalls for bash (from original pid X)

After the bug #6500 fix, you'll see something like this:

1: the syscalls for bash (from pid X)
2: after the fork, syscalls from the new bash (from new pid Y)
3: after the exec, the syscalls for ls (from new pid Y) reported for
both bash *and* ls
4: the syscalls for bash (from original pid X)

Number 3 above doesn't seem correct to me.  Once the exec happens, the
process is no longer bash.  But because the bash utrace engine wasn't
detached from that pid, its events still get reported.


So, my tentative plan is to have the task finder layer call the callback
(potentially) twice when an exec happens - once to unregister the "old"
path and once to register the "new" path.

Does this seem reasonable?

Thanks.

-- 
David Smith
dsmith@redhat.com
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)

