This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: User application hangs with systemtap 2.3


Thanks for your help.  I tried it on a VM where I had installed Ubuntu
from scratch, and it works fine there.  The machine where it's failing
is our continuous integration server, running on AWS.  It was created
from scratch maybe 6 months ago, but originally had an earlier kernel
and I upgraded the kernel in the hopes of getting systemtap to work.

Thanks for the note about -g.  In our actual application we need it,
but I just tried without it, and got the same problem.

Here's the dmesg output on the problem machine:

stap_1f8d3a6e59c6857e0a55d3689cfb95_15281: systemtap: 2.3/0.152, base: ffffffffa0156000, memory: 54data/40text/0ctx/2058net/9alloc kb, probes: 1

and on the working machine:

stap_f91baab8de4703e9d5e2594fc982b45_3593: systemtap: 2.3/0.152, base: ffffffffa02d9000, memory: 54data/40text/1ctx/2058net/17alloc kb, probes: 1

Is the "1ctx" vs "0ctx" significant?  What about "9alloc kb" vs "17alloc kb"?

Where do I pass -DSTP_ALIBI?  When compiling systemtap, when
compiling my user code, or on the stap command line?  From poking
around, it seems to belong on the stap command line.  Here's the
result on the broken system:

$ g++-4.7 -Wall -Wextra ./test.cpp -o test && sudo stap -DSTP_ALIBI -c './test' temp.stp
About to hit probe.
Back from probe.
^CWARNING: Child process exited with signal 2 (Interrupt)
----- probe hit report:
WARNING: /usr/bin/staprun exited with status: 1
Pass 5: run failed.  [man error::pass5]

and on the working system:

$ g++-4.7 -Wall -Werror ./test.cpp -o test && sudo stap -DSTP_ALIBI=1 -c ./test temp.stp
About to hit probe.
Back from probe.
This output doesn't show up.
----- probe hit report:
process("/home/likewise-open/SILVERLINING/martin/test").statement(0x40077f),
(temp.stp:1:1), hits: 1, from: process("./test").statement(0x40077f)
from: process("./test").provider("nfs").mark("writeBackendFailed")
from: process("./test").provider("nfs").mark("writeBackendFailed"),
index: 0

I just tried uninstalling and reinstalling systemtap on the broken
system, with no luck.  Any suggestions for what I should try next?

Thanks,
Martin


On Thu, Sep 26, 2013 at 3:28 PM, Josh Stone <jistone@redhat.com> wrote:
> On 09/26/2013 10:02 AM, Martin Martin wrote:
>> I just realized I forgot to include the systemtap script in my last
>> email.  Here it is:
>>
>> probe process("./test").provider("nfs").mark("writeBackendFailed") {
>>   print("Hello!\n")
>> }
>
> Note, nothing about this script requires the -g flag for guru mode.  And
> if you add yourself to stapusr and stapdev, you won't need to use sudo
> with stap (and it will do only the bare minimum with root).  But neither
> of those things should cause the trouble you're seeing.
>
>> On Thu, Sep 26, 2013 at 11:40 AM, Martin Martin <martin@infinio.com> wrote:
>>> Hi,
>>>
>>> I'm trying to use systemtap on Ubuntu 12.04.3 LTS, Linux kernel
>>> 3.5.0-40-generic.  I installed systemtap from source.  Currently, a
>>> simple DTRACE_PROBE is causing my application to hang.  Here's a
>>> simple reproduction:
>>>
>>> $ cat test.cpp
>>> #include <sys/sdt.h>
>>> #include <iostream>
>>>
>>> using namespace std;
>>>
>>> int main() {
>>>   cerr << "About to hit probe.\n";
>>>   DTRACE_PROBE(nfs, writeBackendFailed);
>>>   cerr << "Back from probe.\n";
>>>   cerr << "This output doesn't show up.\n";
>>> }
>>>
>>> $ g++ -Wall -Wextra ./test.cpp -o test && sudo stap -c './test' -g temp.stp
>>> About to hit probe.
>>> Back from probe.
>>> [my application hangs here, then I hit Ctrl-C]
>>> ^CWARNING: Child process exited with signal 2 (Interrupt)
>>> WARNING: /usr/bin/staprun exited with status: 1
>>> Pass 5: run failed.  [man error::pass5]
>
> On kernel-3.11.1-200.fc19.x86_64, it works for me:
>
> $ g++ -Wall -Wextra ./test.cpp -o test && stap -c './test' temp.stp
> About to hit probe.
> Back from probe.
> This output doesn't show up.
> Hello!
>
> The placement of the "Hello!" doesn't mean the probe was in the wrong
> place -- it's just due to buffered output.
>
> Odd that you don't even get the "Hello!" anywhere in yours though, since
> the probe point was clearly passed, and it doesn't seem that stap was
> hung itself since it cleaned up ok.
>
>>> What can I do to track down the problem?
>
> Was anything noted in dmesg?
>
> We have an option -DSTP_ALIBI which nullifies most of stap's handler, so
> you can try to test if this is a kernel issue.
>
> You can also try probing it manually with perf or debugfs tracing, to
> completely rule out stap, as mjw did to reproduce bug 15972.
>   https://sourceware.org/bugzilla/show_bug.cgi?id=15972#c4
>
> Note that Linux 3.5 is the first release that had in-kernel uprobes, so
> it's quite possible you're just hitting some early bug there.  I have no
> idea whether Ubuntu LTS is tracking any of those fixes.
>
>>> Here's how I installed it:
>>>
>>>     sudo apt-get install elfutils linux-headers-$(uname -r)
>>>     sudo apt-get build-dep systemtap
>>>     wget --no-check-certificate https://sourceware.org/systemtap/ftp/releases/systemtap-2.3.tar.gz
>>>     tar xavf systemtap-2.3.tar.gz
>>>     cd systemtap-2.3 && ./configure --prefix=/usr && make all && sudo make install
>
> This procedure looks fine.
>
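
The suggestion above to probe manually through debugfs tracing, to rule
out stap entirely, might look roughly like this.  It is only a sketch
and not the procedure from bug 15972: the helper name uprobe_spec and
the event name sdt_test are made up for illustration, 0x40077f is the
statement address from the working machine's probe hit report, and the
0x400000 load base is an assumption that holds only for a typical
non-PIE x86_64 binary (check with readelf -l).  The uprobe_events file
expects an offset into the file rather than a virtual address, so the
helper subtracts the load base.

```shell
#!/bin/sh
# Hypothetical helper: turn a probe's virtual address into a uprobe_events
# definition.  ASSUMPTION: the text segment is mapped at $base with file
# offset 0 (typical for a non-PIE x86_64 binary; verify with `readelf -l`).
uprobe_spec() {
    binary=$1
    vaddr=$2
    base=${3:-0x400000}
    # uprobe_events takes PATH:OFFSET, where OFFSET is relative to the file.
    printf 'p:sdt_test %s:0x%x\n' "$binary" "$(( $vaddr - $base ))"
}

# Print the definition for the probe address reported by stap.
uprobe_spec "$PWD/test" 0x40077f

# To use it (as root, with debugfs mounted at /sys/kernel/debug):
#   uprobe_spec "$PWD/test" 0x40077f >> /sys/kernel/debug/tracing/uprobe_events
#   echo 1 > /sys/kernel/debug/tracing/events/uprobes/sdt_test/enable
#   ./test
#   cat /sys/kernel/debug/tracing/trace
```

If ./test still hangs with only this uprobe enabled and no stap module
loaded, that would point at the kernel's uprobes rather than systemtap.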

