This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: get the mapping of <key, syscallname>


Frank Ch. Eigler ??:
Hi -

Why not use the "name" variable set by the syscalls tapset?
LKET does not currently use the syscall tapset because it doesnt use the extra information that this tapset collects from each systemcall. [...]

But, due to the optimization pass in the translator, only that portion of the information actually *used* by your script is actually collected. Please try it out (check out stap -p2 or -p3 output to confirm).

I tried and found it works really well. It compiles away all the unnecessary codes.

But it still has one issue. e.g, I tried to use:
probe syscall.* { ... } to trace all system calls on ppc64. But it failed since sys64_time(defined in SystemTap's ppc64/syscall.stp) is removed from 2.6.16.5/ppc64. I think this issues has been addressed in bug 2645.


If this bug #2645 is resolved , then I think I can use the syscall.stp tapsets shipped with SystemTap. Another advantage of using it is that I can simply use the $name instead of getting it using lines of codes.

But one disadvantage is that if using "syscall.*", then every syscall will use a different function to be its probe handler. It causes the generated much larger and compiling time much longer compared with current LKET's syscall tapsets:

/usr/local/share/systemtap/tapset> ll -h /tmp/stap6HKAzF/*
  127 May 13 07:36 /tmp/stap6HKAzF/Makefile
 2.0M May 13 07:36 /tmp/stap6HKAzF/stap_5910.c
 4.1M May 13 07:37 /tmp/stap6HKAzF/stap_5910.ko
 2.8K May 13 07:37 /tmp/stap6HKAzF/stap_5910.mod.c
  60K May 13 07:37 /tmp/stap6HKAzF/stap_5910.mod.o
 4.0M May 13 07:37 /tmp/stap6HKAzF/stap_5910.o
 712K May 13 07:36 /tmp/stap6HKAzF/stap-symbols.h
 518K May 13 07:36 /tmp/stap6HKAzF/symbols.sorted

[...]  Its not just the string printing that's costly, its also
searching the system call name with in CONTEXT->probe_point.  [...]

Rolling your own code to manually scan through "probe_point" is not really necessary or appropriate. There is a "probefunc()" function you can use for now.

I once used the code of probefunc() inside log_syscall_extra() function. But during the performance analysis of LKET, I found it's a little expensive considered that log_syscall_extra() will be called very frequently. So I change to use these ugly manual written codes for a better performance.

I think use SystemTap as a trace tool cares much about the performance considered that it will be used in a heavy loaded system and a slight performance differece will add up to a big one. But probefunc() is not a problem if SystemTap is used as a diagnosis purpose.

Below is the email I sent out internally on this. I used specjbb as the probed application.

>>> FYI:

  As discovered in the former analysis of LKET, strpbrk() is one of
the most costly operations. It is a kernel function and is called by
strsep() to get the current syscall name.

I use another way to get the syscall name:

ptr = CONTEXT->probe_point;

while(*ptr++ != '"');

        do  {
                buffer[i++] = *ptr++;
        }  while(*ptr != '@');

buffer[i] = '\0';

  And I did some testings to compare the new method and the old one. I
also use get_cycles() to get the cycles consumed by these two method.

================= old method using strsep() ================

OProfile:

samples  %        app name                 symbol name
4239923  67.2099  java                     (no symbols)
207747    3.2931  lglzy                    ._stp_vsnprintf
109242    1.7317  vmlinux-2.6.16.5.debug   .strpbrk
58739     0.9311  libc-2.3.4.so            memset
58344     0.9249  lglzy                    .dwarf_kprobe_1_enter
56300     0.8924  libjvm.so                fastLocalMark
55565     0.8808  libjvm.so               cacheAllocWithInitialization
54983     0.8716  vmlinux-2.6.16.5.debug   restore
54123     0.8579  vmlinux-2.6.16.5.debug   .kernel_unknown_bucket
50853     0.8061  libjitc.so               sd9

get_cycles(): ~165

================= new method ======================


OProfile:


samples  %        app name                 symbol name
4373512  69.2051  java                     (no symbols)
211307    3.3437  lglzy                    ._stp_vsnprintf
59605     0.9432  lglzy                .function_log_syscall_tracedata
58020     0.9181  libc-2.3.4.so            memset
57804     0.9147  vmlinux-2.6.16.5.debug   restore
56514     0.8943  libjvm.so                fastLocalMark
55085     0.8716  libjvm.so              cacheAllocWithInitialization
53948     0.8537  lglzy                    .dwarf_kprobe_1_enter
50175     0.7940  libjitc.so               sd9
~

get_cycles(): ~40

================= SUMMARY ====================

oprofile shows that the total samples in lglzy & vmlinux decrease
about 2% using the new method.

And get_cycles() shows that the total cycles using the new method is
~40, much less than ~165 with the old method.

I've committed this change into TLA
- FChE



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]