This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: syscall tracing overheads: utrace vs. kprobes


Frank Ch. Eigler wrote:
> Hi -
> 
> In a few contexts, it comes up as to whether it is faster to probe
> process syscalls with kprobes or with something higher level such as
> utrace.  (There are other hypothetical options too (per-syscall
> tracepoints) that could be measured this way in the future.)  

These scenarios are a bit wrong:

> Now we compare these scenarios:
> 
> # stap -e 'probe never {}' -t --vp 00001 -c a.out
> 
> Here, no actual probing occurs so we get a measurement of the plain
> uninstrumented run time of ten million close(2)s.

The above one is fine.

> # stap -e 'probe process.syscall {}' -t --vp 00001 -c a.out
> 
> Here, we intercept sys_close with a kprobe.  If the system is not too
> busy, we should pick up only the close(2)s coming from a.out, though a
> few close(2)'s executed by other processes may show up.
> 
> # stap -e 'probe syscall.close {}' -t --vp 00001 -c a.out
> 
> Here, we intercept all a.out's syscalls with utrace.  Other processes
> are not affected at all, but other syscalls by a.out would be --
> though in our test, there are hardly any of those.

These 2 are swapped:  the 'process.syscall' probe is a utrace-based
probe and the 'syscall.close' probe is a kprobe-based probe.

Note that in the results, the description and probe types matched correctly.

> Some typical results on my 2.66GHz 2*Xeon5150 machine runnin Fedora 9 -
> 2.6.27.12:
> 
> never:  
> Pass 5: run completed in 740usr/3310sys/4155real ms.
> 
> kprobe: 
> probe syscall.close (<input>:1:1), hits: 10000028, cycles: 176min/202avg/3632max
> Pass 5: run completed in 750usr/9320sys/10193real ms.
> 
> utrace: 
> probe process.syscall (<input>:1:1), hits: 10000025, cycles: 176min/209avg/184392max
> Pass 5: run completed in 1670usr/6860sys/8645real ms.
> 
> So utrace added 4.5 seconds, and kprobes added 6.0 seconds to the
> uninstrumented 4.1 second run time.  But wait: we should subtract the
> time taken by the probe handler itself: 200ish cycles at 2.66 GHz,
> which is about 0.75 seconds.  So the overheads are approximately:
> 
> never: n/a
> kprobe: 5.2 seconds => 0.52 us per hit
> utrace: 3.6 seconds => 0.36 us per hit
> 
> 
> Note that these are microbenchmarks that represent an ideal case
> compared to a larger run, since they probably fit comfily inside
> caches.  They probably also undercount the probe handler's run time.

-- 
David Smith
dsmith@redhat.com
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]