This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: [PATCH] stap/staprun do not terminate properly


David Smith writes:
 > On 03/06/2014 03:30 PM, Torsten Polle wrote:
 >> Hi,
 >> 
 >> I'm using the uprobes-inode with task_finder2.c and had two problems,
 >> when I wanted to terminate my probe runs.
 >> 
 >> I tested the patches with uprobes-inode and the utrace based version.
 >> 
 >> Kind Regards,
 >> Torsten

 > Torsten,

 > Thanks *so* much for the patches. I've seen a hang in stap around
 > this area, but I could never reproduce it.

David,

I have been able to reproduce the problem every time for half a year
now, but I never found the time to track down the root cause.

 > I checked the 1st patch in as commit e695d46 and the 2nd patch
 > (tweaked) in as commit 9ee1bfe.

 > I tweaked the 2nd patch just a bit. Originally the flow went like:

 > ====
 > stap_stop_task_finder()
 > {
 >   // ...

 >   // Note that utrace_exit() calls stp_task_work_exit()
 >   utrace_exit();

 >   __stp_tf_cancel_task_work();
 > }
 > ====

 > Your patch changed it to this:

 > ====
 > stap_stop_task_finder()
 > {
 >   // ...

 >   utrace_exit();

 >   // Note that __stp_tf_cancel_task_work() calls
 >   // stp_task_work_exit()
 >   __stp_tf_cancel_task_work();
 > }
 > ====

 > I saw what you were doing, but that didn't "feel" quite right.
 > utrace_init() calls stp_task_work_init(), so it made sense for
 > utrace_exit() to call stp_task_work_exit().

 > So, instead I did this:

 > ====
 > stap_stop_task_finder()
 > {
 >   // ...

 >   __stp_tf_cancel_task_work();

 >   // Note that utrace_exit() calls stp_task_work_exit()
 >   utrace_exit();
 > }
 > ====

 > This cancels all outstanding task_work items before shutting down
 > utrace (and calling stp_task_work_exit()). I think the end result is
 > the same as with your patch, and the ordering makes a little more
 > sense: every task_work item is canceled before utrace is shut down
 > and the memory allocated for utrace is freed.

 > If this doesn't work for you or you see a hole in this logic please
 > let me know.

I can't beat your logic. It should work for me. Unfortunately, I don't
have direct access to my target for two weeks.
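
Until then, the ordering argument can be checked in miniature with a
user-space sketch. This is only a toy model, not the runtime code: the
names engine, work_item, queue_work(), cancel_all_work() and
engine_exit() are made up for illustration, with cancel_all_work()
standing in for __stp_tf_cancel_task_work() and engine_exit() for
utrace_exit(). Where the real shutdown would wait on outstanding work,
the model simply refuses to tear down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the state that utrace_exit() frees. */
struct engine {
    int unused;
};

/* Hypothetical stand-in for a queued task_work item. */
struct work_item {
    struct work_item *next;
    struct engine *engine;   /* a callback would dereference this */
};

static struct work_item *pending;   /* queued but not yet run */
static struct engine *engine;       /* NULL once torn down */

static bool engine_init(void)
{
    engine = calloc(1, sizeof(*engine));
    return engine != NULL;
}

static bool queue_work(void)
{
    struct work_item *w = malloc(sizeof(*w));
    if (w == NULL)
        return false;
    w->engine = engine;
    w->next = pending;
    pending = w;
    return true;
}

/* Models __stp_tf_cancel_task_work(): drop every item that has not run. */
static int cancel_all_work(void)
{
    int cancelled = 0;
    while (pending != NULL) {
        struct work_item *w = pending;
        pending = w->next;
        free(w);
        cancelled++;
    }
    return cancelled;
}

/* Models utrace_exit(): refuses while work items still reference it. */
static bool engine_exit(void)
{
    if (pending != NULL)
        return false;
    free(engine);
    engine = NULL;
    return true;
}

/* Shutdown in the committed order: cancel first, then exit. */
static bool stop_in_committed_order(void)
{
    cancel_all_work();
    return engine_exit();
}
```

With two items queued, calling engine_exit() first is refused, while
stop_in_committed_order() leaves nothing pending and frees the engine.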

 > BTW, if you have a good idea for a reproducer for the original
 > problem I'd like to see it. Perhaps I could add a test case for it.

I simply define a process probe and cross-compile the module "foo" for
an ARM target. Then I run "staprun -o /tmp/probes.txt foo". After a
while I (try to) terminate the run with Ctrl-C.

If there is a process that is never scheduled, the task worker for the
process is never executed. Thus, staprun hangs. Usually, there are a few
processes that exhibit this behaviour on my target.
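
The mechanism can be sketched as a small user-space model. All names
here (struct task, add_work(), schedule_once(), wait_for_drain(),
cancel_pending()) are hypothetical and only mimic the behaviour
described above: a callback runs only when its task is scheduled, so a
wait for the outstanding count to reach zero never finishes if one
task never runs. The wait is bounded by max_ticks so that the model
terminates where the real shutdown would hang.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NTASKS 3

/* Hypothetical model of a task with one queued task_work callback. */
struct task {
    bool runnable;      /* false: the scheduler never picks this task */
    bool work_pending;  /* callback queued but not yet executed */
};

static int outstanding;  /* callbacks queued and not yet finished */

static void add_work(struct task *t)
{
    t->work_pending = true;
    outstanding++;
}

/* One scheduler pass: a callback only runs if its task is runnable. */
static void schedule_once(struct task *tasks, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (tasks[i].runnable && tasks[i].work_pending) {
            tasks[i].work_pending = false;
            outstanding--;
        }
    }
}

/* Wait for the count to drain, giving up after max_ticks passes. */
static bool wait_for_drain(struct task *tasks, size_t n, int max_ticks)
{
    for (int tick = 0; tick < max_ticks && outstanding > 0; tick++)
        schedule_once(tasks, n);
    return outstanding == 0;
}

/* Explicit cancellation does not depend on the task ever running. */
static void cancel_pending(struct task *tasks, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (tasks[i].work_pending) {
            tasks[i].work_pending = false;
            outstanding--;
        }
    }
}
```

With NTASKS tasks of which one is never runnable, wait_for_drain()
gives up with one callback still outstanding; after cancel_pending()
the same wait returns at once.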

 > Thanks again for the patches!

 > -- 
 > David Smith
 > dsmith@redhat.com
 > Red Hat
 > http://www.redhat.com
 > 256.217.0141 (direct)
 > 256.837.0057 (fax)

Torsten

