This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.
[Bug kprobes/2062] Return probes does not scale well on SMP box
- From: "jkenisto at us dot ibm dot com" <sourceware-bugzilla at sourceware dot org>
- To: systemtap at sources dot redhat dot com
- Date: 13 Jul 2006 19:44:02 -0000
- Subject: [Bug kprobes/2062] Return probes does not scale well on SMP box
- References: <20051216010933.2062.anil.s.keshavamurthy@intel.com>
- Reply-to: sourceware-bugzilla at sourceware dot org
------- Additional Comments From jkenisto at us dot ibm dot com 2006-07-13 19:44 -------
(In reply to comment #19)
> ====== 2.6.17.3/ppc64/8-way, without Jim's patch ================
>
> no probe
> Total cpus: loops = 40000000, average = 6202 ns
>
> kretprobe using stap:
> Total cpus: loops = 40000000, average = 43702 ns
>
> kretprobe using getsid.c:
> Total cpus: loops = 40000000, average = 36456 ns
>
>
> ======= 2.6.17.4/ppc64/8-way, with Jim's patch ===================
>
> kretprobe using stap:
> Total cpus: loops = 40000000, average = 26621 ns
>
> kretprobe using getsid.c:
> Total cpus: loops = 40000000, average = 24975 ns

This is pretty much what I'd expect to see. All the CPUs are hitting the same
probepoint repeatedly, so they pile up on the same per-kretprobe lock. Even so,
the performance gain is reasonably good, and it is better still when stap is
involved: the stap-generated handler takes longer than the empty kprobes
handler, so we get more benefit from not holding a lock while the handler runs.

We should see more benefit with multiple kretprobes, e.g.

    probe syscall.*.return { ... }
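
To put rough numbers on the gain, here is a small sketch that works only from the averages quoted from comment #19. It reports how much the total time per loop dropped and how much the probe overhead dropped (probed average minus the no-probe average). The helper names are made up for illustration. It also assumes the no-probe baseline, which was measured on 2.6.17.3 only, is close enough to apply to the 2.6.17.4 run, so treat the overhead figures as approximate.

```python
# Average ns per loop, copied from the results quoted from comment #19.
BASELINE_NS = 6202  # no probe; measured on 2.6.17.3 only (assumed valid for 2.6.17.4)

# name -> (average without the patch, average with the patch)
RESULTS_NS = {
    "stap": (43702, 26621),
    "getsid.c": (36456, 24975),
}


def reduction_pct(before, after):
    """Percentage by which `after` is lower than `before`."""
    return 100.0 * (before - after) / before


def summarize(before, after, baseline=BASELINE_NS):
    """Return (drop in total time, drop in probe overhead), both in percent."""
    total = reduction_pct(before, after)
    overhead = reduction_pct(before - baseline, after - baseline)
    return total, overhead


if __name__ == "__main__":
    for name, (before, after) in RESULTS_NS.items():
        total, overhead = summarize(before, after)
        print(f"{name}: total -{total:.1f}%, probe overhead -{overhead:.1f}%")
```

Under that assumption the stap case comes out ahead of the getsid.c case on both measures, which matches the explanation above: the longer the handler runs, the more is gained by not holding the lock around it.
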
Keep the comments and improvements coming. Thanks.
--
http://sourceware.org/bugzilla/show_bug.cgi?id=2062