This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: How to get the detailed kernel stack trace if specified fuction takes too long to finish?


I tried the last scheduler script, the output just like this:
# stap --all-modules y.stp
WARNING: missing unwind/symbol data for module 'uprobes'
long sys_write in tid 10162
Systemtap probe: kernel.trace("sched_switch")
long sys_write in tid 11510
Systemtap probe: kernel.trace("sched_switch")
long sys_write in tid 11288
Systemtap probe: kernel.trace("sched_switch")

I could not figure out why "print_backtrace()" didn't output any log:)

On Sat, Mar 29, 2014 at 12:30 AM, Josh Stone <jistone@redhat.com> wrote:
> On 03/27/2014 09:22 PM, liang xie wrote:
>> Hi all,
>>
>> Probably it's a newbie question, to be honest, i am new to SystemTap,
>
> No problem, welcome!
>
>> currently i want to capture the kernel stack trace if "sys_write" runs
>> too long, if that i could know where it sucks.
>>
>> My OS:2.6.32-220.el6.x86_64
>> # stap -V
>> Systemtap translator/driver (version 1.6/0.152 non-git sources)
>>
>> My tried script:
>> global i=0
>> global tb=0
>> global te=0
>> probe kernel.function("sys_write") {
>>   tb=gettimeofday_us();
>> }
>> probe kernel.function("sys_write").return {
>>   te=gettimeofday_us();
>>   if(te-tb > 100000) {
>>    i++;
>>    printf("time_us: %d\n", te-tb);
>>    print_backtrace();
>>    if(i > 15) {
>>     exit();
>>    }
>>   }
>> }
>>
>> # stap -x <pid> xxx.stp
>> time_us: 226795
>> Returning from:  0xffffffff81176f90 : sys_write+0x0/0x90 [kernel]
>> Returning to  :  0xffffffff8100b0f2 : system_call_fastpath+0x16/0x1b [kernel]
>> time_us: 101400
>> Returning from:  0xffffffff81176f90 : sys_write+0x0/0x90 [kernel]
>> Returning to  :  0xffffffff8100b0f2 : system_call_fastpath+0x16/0x1b [kernel]
>> time_us: 527853
>> Returning from:  0xffffffff81176f90 : sys_write+0x0/0x90 [kernel]
>> Returning to  :  0xffffffff8100b0f2 : system_call_fastpath+0x16/0x1b [kernel]
>> time_us: 133948
>> Returning from:  0xffffffff81176f90 : sys_write+0x0/0x90 [kernel]
>> Returning to  :  0xffffffff8100b0f2 : system_call_fastpath+0x16/0x1b [kernel]
>>
>> It doesn't show my "expected" kernel stack trace,e.g. file system or
>> block layer related stack trace. How to archive that? Any pointers
>> will be highly appreciated!
>
> Right, since you waited until return to print_backtrace, it will only be
> from the end of sys_write.  (Though it appears to be the beginning,
> sys_write+0x0, because of the way a kretprobe captures returns.)
>
> Since you want info from deeper kernel internals, you'll need additional
> probes.  One option is to use the para-callgraph example script, where
> sys_write is your first argument to start a trace, and the second
> argument gives a broad subsystem to dig through.  Like:
>
> $ stap para-callgraph.stp 'kernel.function("sys_write")' \
>         'kernel.function("*@fs/*")'
>
> This script will show you timestamps, but you'll get all output whether
> it's slow or not.  Maybe you can modify it to filter some of the
> information.
>
> Another option is to use timer.profile, and then you can watch for your
> long waits.  Something like:
>
> global start_time
> probe syscall.write { start_time[tid()] = gettimeofday_us() }
> probe syscall.write.return { delete start_time[tid()] }
> probe timer.profile {
>   t = tid()
>   if (t in start_time && gettimeofday_us() - start_time[t] > 100000)
>   {
>     printf("long sys_write in tid %d\n", t)
>     print_backtrace()
>     delete start_time[t]
>   }
> }
>
> Even this assumes that the long wait is busy on the cpu, but I think
> most likely it's an IO wait, which means you'll be scheduled off while
> it completes.  So then instead of timer.profile, or in addition, you
> could use a scheduler tracepoint.  Like:
>
> global start_time
> probe syscall.write { start_time[tid()] = gettimeofday_us() }
> probe syscall.write.return { delete start_time[tid()] }
> probe kernel.trace("sched_switch") {
>   t = task_tid($next)
>   if (t in start_time && gettimeofday_us() - start_time[t] > 100000)
>   {
>     printf("long sys_write in tid %d\n", t)
>     print_backtrace()
>     delete start_time[t]
>   }
> }
>
> So as your thread is scheduled back onto the cpu, if it has been a long
> time, the backtrace will tell you what path it took to get scheduled off
> in the first place.
>
>
> Hope that's enough to give you some direction...
>
> Josh


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]