This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: syscall tracing overheads: utrace vs. kprobes
- From: David Smith <dsmith at redhat dot com>
- To: "Frank Ch. Eigler" <fche at redhat dot com>
- Cc: systemtap at sources dot redhat dot com, utrace-devel at redhat dot com
- Date: Tue, 28 Apr 2009 13:19:47 -0500
- Subject: Re: syscall tracing overheads: utrace vs. kprobes
- References: <20090428170117.GA30001@redhat.com>
Frank Ch. Eigler wrote:
> Hi -
>
> In a few contexts, it comes up as to whether it is faster to probe
> process syscalls with kprobes or with something higher level such as
> utrace. (There are other hypothetical options too (per-syscall
> tracepoints) that could be measured this way in the future.)
These scenarios are a bit wrong:
> Now we compare these scenarios:
>
> # stap -e 'probe never {}' -t --vp 00001 -c a.out
>
> Here, no actual probing occurs so we get a measurement of the plain
> uninstrumented run time of ten million close(2)s.
The above one is fine.
> # stap -e 'probe process.syscall {}' -t --vp 00001 -c a.out
>
> Here, we intercept sys_close with a kprobe. If the system is not too
> busy, we should pick up only the close(2)s coming from a.out, though a
> few close(2)'s executed by other processes may show up.
>
> # stap -e 'probe syscall.close {}' -t --vp 00001 -c a.out
>
> Here, we intercept all a.out's syscalls with utrace. Other processes
> are not affected at all, but other syscalls by a.out would be --
> though in our test, there are hardly any of those.
These 2 are swapped: the 'process.syscall' probe is a utrace-based
probe and the 'syscall.close' probe is a kprobe-based probe.
Note that in the results, the description and probe types matched correctly.
> Some typical results on my 2.66GHz 2*Xeon5150 machine runnin Fedora 9 -
> 2.6.27.12:
>
> never:
> Pass 5: run completed in 740usr/3310sys/4155real ms.
>
> kprobe:
> probe syscall.close (<input>:1:1), hits: 10000028, cycles: 176min/202avg/3632max
> Pass 5: run completed in 750usr/9320sys/10193real ms.
>
> utrace:
> probe process.syscall (<input>:1:1), hits: 10000025, cycles: 176min/209avg/184392max
> Pass 5: run completed in 1670usr/6860sys/8645real ms.
>
> So utrace added 4.5 seconds, and kprobes added 6.0 seconds to the
> uninstrumented 4.1 second run time. But wait: we should subtract the
> time taken by the probe handler itself: 200ish cycles at 2.66 GHz,
> which is about 0.75 seconds. So the overheads are approximately:
>
> never: n/a
> kprobe: 5.2 seconds => 0.52 us per hit
> utrace: 3.6 seconds => 0.36 us per hit
>
>
> Note that these are microbenchmarks that represent an ideal case
> compared to a larger run, since they probably fit comfily inside
> caches. They probably also undercount the probe handler's run time.
--
David Smith
dsmith@redhat.com
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)