This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: How to get the detailed kernel stack trace if specified fuction takes too long to finish?


On 03/27/2014 09:22 PM, liang xie wrote:
> Hi all,
> 
> Probably it's a newbie question, to be honest, i am new to SystemTap,

No problem, welcome!

> currently i want to capture the kernel stack trace if "sys_write" runs
> too long, if that i could know where it sucks.
> 
> My OS:2.6.32-220.el6.x86_64
> # stap -V
> Systemtap translator/driver (version 1.6/0.152 non-git sources)
> 
> My tried script:
> global i=0
> global tb=0
> global te=0
> probe kernel.function("sys_write") {
>   tb=gettimeofday_us();
> }
> probe kernel.function("sys_write").return {
>   te=gettimeofday_us();
>   if(te-tb > 100000) {
>    i++;
>    printf("time_us: %d\n", te-tb);
>    print_backtrace();
>    if(i > 15) {
>     exit();
>    }
>   }
> }
> 
> # stap -x <pid> xxx.stp
> time_us: 226795
> Returning from:  0xffffffff81176f90 : sys_write+0x0/0x90 [kernel]
> Returning to  :  0xffffffff8100b0f2 : system_call_fastpath+0x16/0x1b [kernel]
> time_us: 101400
> Returning from:  0xffffffff81176f90 : sys_write+0x0/0x90 [kernel]
> Returning to  :  0xffffffff8100b0f2 : system_call_fastpath+0x16/0x1b [kernel]
> time_us: 527853
> Returning from:  0xffffffff81176f90 : sys_write+0x0/0x90 [kernel]
> Returning to  :  0xffffffff8100b0f2 : system_call_fastpath+0x16/0x1b [kernel]
> time_us: 133948
> Returning from:  0xffffffff81176f90 : sys_write+0x0/0x90 [kernel]
> Returning to  :  0xffffffff8100b0f2 : system_call_fastpath+0x16/0x1b [kernel]
> 
> It doesn't show my "expected" kernel stack trace,e.g. file system or
> block layer related stack trace. How to archive that? Any pointers
> will be highly appreciated!

Right, since you waited until return to print_backtrace, it will only be
from the end of sys_write.  (Though it appears to be the beginning,
sys_write+0x0, because of the way a kretprobe captures returns.)

Since you want info from deeper kernel internals, you'll need additional
probes.  One option is to use the para-callgraph example script, where
sys_write is your first argument to start a trace, and the second
argument gives a broad subsystem to dig through.  Like:

$ stap para-callgraph.stp 'kernel.function("sys_write")' \
        'kernel.function("*@fs/*")'

This script will show you timestamps, but you'll get all output whether
it's slow or not.  Maybe you can modify it to filter some of the
information.

Another option is to use timer.profile, and then you can watch for your
long waits.  Something like:

global start_time
probe syscall.write { start_time[tid()] = gettimeofday_us() }
probe syscall.write.return { delete start_time[tid()] }
probe timer.profile {
  t = tid()
  if (t in start_time && gettimeofday_us() - start_time[t] > 100000)
  {
    printf("long sys_write in tid %d\n", t)
    print_backtrace()
    delete start_time[t]
  }
}

Even this assumes that the long wait is busy on the cpu, but I think
most likely it's an IO wait, which means you'll be scheduled off while
it completes.  So then instead of timer.profile, or in addition, you
could use a scheduler tracepoint.  Like:

global start_time
probe syscall.write { start_time[tid()] = gettimeofday_us() }
probe syscall.write.return { delete start_time[tid()] }
probe kernel.trace("sched_switch") {
  t = task_tid($next)
  if (t in start_time && gettimeofday_us() - start_time[t] > 100000)
  {
    printf("long sys_write in tid %d\n", t)
    print_backtrace()
    delete start_time[t]
  }
}

So as your thread is scheduled back onto the cpu, if it has been a long
time, the backtrace will tell you what path it took to get scheduled off
in the first place.


Hope that's enough to give you some direction...

Josh


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]