This is the mail archive of the systemtap@sources.redhat.com mailing list for the systemtap project.
Re: Experiences with kprobes
- From: Richard J Moore <richardj_moore at uk dot ibm dot com>
- To: Baruch Even <baruch at ev-en dot org>
- Cc: systemtap at sources dot redhat dot com
- Date: Tue, 22 Mar 2005 11:53:45 +0000
- Subject: Re: Experiences with kprobes
We're in the process of moving the design point of kprobes from
debugging alone to both performance measurement and debugging. To address
the performance-measurement needs, we will need to smarten up the probing
mechanisms.
--
Richard J Moore
IBM Advanced Linux Response Team - Linux Technology Centre
MOBEX: 264807; Mobile (+44) (0)7739-875237
Office: (+44) (0)1962-817072
Baruch Even <baruch@ev-en.org>
Sent by: systemtap-owner@sources.redhat.com
22/03/2005 11:22
To: systemtap@sources.redhat.com
Subject: Experiences with kprobes
Hello,
I thought I'd share my current experience with kprobes; it might
interest some of you.
I'm trying to improve the performance of the Linux TCP stack (as an
end host, not a router), so I need to measure the current
performance in order to search for bottlenecks.
I had a first version where I simply wrapped the calls I needed with
inline rdtsc calls and added other measurements (the number of packets
acked by each ACK packet and such). This worked beautifully, and I got
some nice results and some pretty good improvements as well.
They say "if it ain't broken don't fix it", but if it's not broken it's
no fun[0], so I tried to use kprobes as a way to get the measurement
code out of my current code patches. The thinking was that this would
make it a lot easier to keep the patches ready for LKML submission.
I also ported my code to 2.6.11 (since that's where kprobes is
available; I was on 2.6.6 before, with no kprobes there[1]) and got
abysmal performance. After a bit of digging, the overhead of the kprobes
approach was the only possible cause: with the old method I got a
timing of about 3000 clocks on my machine[2], while with the new one I
got at least 10000 with about 3 kprobes and 3 jprobes.
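To put the figures quoted above in perspective, here is a rough back-of-the-envelope sketch. It uses only the numbers from the message (3000 and 10000 clocks, 3 kprobes plus 3 jprobes, a 3GHz CPU per footnote [2]). Spreading the extra cost evenly over six probes that each fire once per measured path is an assumption made purely for illustration; the message does not say how the overhead is distributed.

```python
# Rough overhead estimate from the figures quoted in the message.
CPU_HZ = 3_000_000_000   # 3GHz P-IV Xeon, per footnote [2]
INLINE_CLOCKS = 3000     # timing with inline rdtsc instrumentation
PROBED_CLOCKS = 10000    # lower bound reported with kprobes/jprobes
NUM_PROBES = 3 + 3       # "about 3 kprobes and 3 jprobes"


def clocks_to_us(clocks, hz):
    """Convert a cycle count to microseconds at the given clock rate."""
    return clocks * 1e6 / hz


extra_clocks = PROBED_CLOCKS - INLINE_CLOCKS

# ASSUMPTION (not stated in the message): every probe fires once per
# measured path and all probes cost the same.
per_probe_clocks = extra_clocks / NUM_PROBES

print(f"inline:    {clocks_to_us(INLINE_CLOCKS, CPU_HZ):.2f} us")
print(f"probed:    {clocks_to_us(PROBED_CLOCKS, CPU_HZ):.2f} us")
print(f"extra:     {extra_clocks} clocks")
print(f"per probe: {per_probe_clocks:.0f} clocks (under the assumption above)")
```

Under that assumption the extra cost comes to roughly 1200 clocks per probe hit, which is a large share of the 3000 clocks the measured path took on its own.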
I then ported kprobes to 2.6.6, and the same performance pattern appeared
on the formerly working code. The only conclusion left is that kprobes is
not suitable for this kind of performance measurement under very high
load.
The specifics in my case are that the tests run over a dummynet
network simulating a very high-speed, long-distance link (about 300ms
rtt and 300Mbit/s bandwidth), so the packet rates are very high, with a
BDP of about 8000 packets, i.e. lots of ACK packets to process.
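The BDP figure can be checked with a short calculation. The sketch below takes the RTT and bandwidth from the message; the packet size of 1500 bytes is an assumption (a typical Ethernet-sized packet), since the message does not state one.

```python
# Bandwidth-delay product for the simulated link described in the message.
RTT_MS = 300                  # about 300ms round-trip time
BANDWIDTH_BPS = 300_000_000   # about 300Mbit/s
PACKET_BYTES = 1500           # ASSUMPTION: Ethernet-sized packets, not stated in the message


def bdp_packets(rtt_ms, bandwidth_bps, packet_bytes):
    """Return how many packets must be in flight to fill the link."""
    bdp_bytes = bandwidth_bps * rtt_ms / 1000 / 8   # bits in flight -> bytes
    return bdp_bytes / packet_bytes


packets = bdp_packets(RTT_MS, BANDWIDTH_BPS, PACKET_BYTES)
print(f"BDP: {packets:.0f} packets")
```

With these inputs the result is 7500 packets, in line with the "about 8000" quoted above; a slightly smaller per-packet payload moves the figure closer to 8000.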
Baruch
[0] As a grad student, at least part of the idea is to have fun :-)
[1] If someone needs a back port of kprobes for 2.6.6 on i386 send me a
note.
[2] 3GHz P-IV Xeon