This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: Top syscalls
- From: fche at redhat dot com (Frank Ch. Eigler)
- To: Hien Nguyen <hien at us dot ibm dot com>
- Cc: SystemTAP <systemtap at sources dot redhat dot com>
- Date: 13 Sep 2005 12:06:32 -0400
- Subject: Re: Top syscalls
- References: <43260ED4.9070203@us.ibm.com>
hien wrote:
> This script will list the top 20 system calls during the interval of
> 2000 jiffies continuously. [...]
>
> function syscall_name:string () %{
> [...]
> %}
I expect that when the system call tapset is finally constructed,
strings such as the plain system call names will be made available in
the new probe point aliases. Is anyone now working on this?
> function reset_maxaction () %{
> if (CONTEXT && CONTEXT->actioncount)
> CONTEXT->actioncount=0;
> %}
As you realize, this defeats the infinite loop detection logic. I
suspect we'll need a formally supported function to do this, and not
just with the broad feature described in bug #1182. (Does that answer
your subsequent question, Tom?)
> [...]
> function print_top () {
> [...]
> reset_maxaction ()
> foreach ([syscall] in syscall_count) {
> sys_cnt = syscall_count[syscall]
> if (sys_cnt > lcnt) {
> lsyscall = syscall
> lcnt = sys_cnt
> }
> }
> [...]
Yeah, no wonder this, repeated 20 times, is running out of the
MAXACTION quota, which is around 1000 statements. I hope we can think
of a way of expressing "top N" type queries using statistics objects,
or a sorting primitive, or something.
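To make the cost concrete, here is a minimal standalone sketch in plain C (not SystemTap code, and not the original script). The struct and function names are hypothetical. It contrasts the repeated full scan used by the quoted print_top() with a single sort, and counts the elements visited as a rough stand-in for statements executed. The table size is an assumption: with a few hundred entries and 20 passes, the scan visits several thousand elements, well beyond a quota of about 1000.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record type: one syscall name and its hit count. */
struct entry {
    const char *name;
    long count;
};

/*
 * Repeated-scan approach, modelled on the quoted loop: for each of the n
 * output slots, walk the whole table to find the largest remaining count,
 * then mark that entry as used so the next pass finds the runner-up.
 * Assumes all counts are >= 0. Returns the number of elements visited.
 */
static long top_n_by_rescan(struct entry *tab, size_t len, size_t n,
                            struct entry *out)
{
    long visited = 0;

    for (size_t k = 0; k < n && k < len; k++) {
        size_t best = 0;

        for (size_t i = 0; i < len; i++) {
            visited++;
            if (tab[i].count > tab[best].count)
                best = i;
        }
        out[k] = tab[best];
        tab[best].count = -1;   /* exclude from later passes */
    }
    return visited;
}

/* qsort comparator: larger counts first. */
static int by_count_desc(const void *a, const void *b)
{
    const struct entry *x = a;
    const struct entry *y = b;

    return (x->count < y->count) - (x->count > y->count);
}

/* Sort-once approach: after this call the top N are tab[0..N-1]. */
static void top_n_by_sort(struct entry *tab, size_t len)
{
    qsort(tab, len, sizeof tab[0], by_count_desc);
}
```

The scan costs len * n visits, while the sort is done once and the top entries are then read off the front of the table. A sorting primitive in the script language would give the same saving without hand-written loops.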
Be aware that this processing loop in fact does/should (bug #1275)
lock the syscall_count array for the duration of the iteration. That
means that concurrent processes on other CPUs trying to perform
syscalls will be held in a spinlock. So the MAXACTION limit performs
a useful function even in this case.
> function reset_syscall_count () {
> foreach ([syscall] in syscall_name)
> syscall_count[syscall] = 0
> }
You could say "delete syscall_count" instead.
> [...]
> probe kernel.function("sys_*") {
> if (pid() != get_daemon_pid())
> accumulate ()
> }
Did you find that this filtering was necessary? Was stpd showing up a
lot? I thought it had learned to batch its kernel-to-user message
passing, so as to produce a smaller footprint.
> probe timer.jiffies(2000) [...]
Perhaps someone would undertake the first part of my bug #1276 - to
add support for timing intervals measured in seconds and milliseconds.
It would be a good warmup exercise for translator hacking.
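As a small worked example of why wall-clock units would help: the real duration of a jiffies interval depends on the kernel's HZ setting. The sketch below is plain C with hypothetical helper names; it is neither a kernel nor a SystemTap API. The HZ values in the comments (100, 250, 1000) are common kernel configurations, assumed here for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert an interval in milliseconds to timer ticks
 * for a kernel running at hz ticks per second, rounding up so that a timer
 * set this way never fires early.
 */
static uint64_t ms_to_ticks(uint64_t ms, uint64_t hz)
{
    return (ms * hz + 999) / 1000;
}

/*
 * Hypothetical inverse: how many milliseconds a tick count spans at hz.
 * 2000 ticks is 2000 ms at hz = 1000, 8000 ms at hz = 250, and
 * 20000 ms at hz = 100.
 */
static uint64_t ticks_to_ms(uint64_t ticks, uint64_t hz)
{
    return ticks * 1000 / hz;
}
```

So timer.jiffies(2000) means anything from 2 to 20 seconds across those configurations, whereas an interval given in seconds or milliseconds would be converted once by the translator or runtime and mean the same thing everywhere.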
- FChE