This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: [PATCH -tip v9 25/26] kprobes: Introduce kprobe cache to reduce cache misshits
- From: Ingo Molnar <mingo at kernel dot org>
- To: Masami Hiramatsu <masami dot hiramatsu dot pt at hitachi dot com>
- Cc: linux-kernel at vger dot kernel dot org, Andi Kleen <andi at firstfloor dot org>, Ananth N Mavinakayanahalli <ananth at in dot ibm dot com>, Sandeepa Prabhu <sandeepa dot prabhu at linaro dot org>, Frederic Weisbecker <fweisbec at gmail dot com>, x86 at kernel dot org, Steven Rostedt <rostedt at goodmis dot org>, fche at redhat dot com, mingo at redhat dot com, systemtap at sourceware dot org, "H. Peter Anvin" <hpa at zytor dot com>, Thomas Gleixner <tglx at linutronix dot de>
- Date: Thu, 24 Apr 2014 11:01:34 +0200
- Subject: Re: [PATCH -tip v9 25/26] kprobes: Introduce kprobe cache to reduce cache misshits
- Authentication-results: sourceware.org; auth=none
- References: <20140417081636 dot 26341 dot 87858 dot stgit at ltc230 dot yrl dot intra dot hitachi dot co dot jp> <20140417081931 dot 26341 dot 47154 dot stgit at ltc230 dot yrl dot intra dot hitachi dot co dot jp>
* Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> wrote:
> Introduce a kprobe cache to reduce cache misses with a
> massive number of kprobes.
> For stress testing kprobes, we need to activate as many
> kprobes as possible. This situation causes a cache-miss
> storm on the kprobe hash list. The kprobe hash list has already
> been enlarged to 4k entries, but this is still too small for 40k
> kprobes.
>
> For example, when registering 40k probes on the hlist and
> enabling 20k probes, perf tools still shows that a lot of
> cache misses occur in get_kprobe.
> ----
> Samples: 633 of event 'cache-misses', Event count (approx.): 3414776
> + 68.13% [k] get_kprobe
> + 4.38% [k] ftrace_lookup_ip
> + 2.54% [k] kprobe_ftrace_handler
> ----
>
> Also, I found that most of the kprobes are not hit.
> In that case, to reduce cache misses, we can reduce the
> random memory accesses by introducing a per-cpu cache which
> caches the address of each frequently used kprobe data structure
> and its probe address.
>
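To make the idea concrete, here is a minimal userspace sketch of a small
direct-mapped cache placed in front of a chained hash table. It is not
the code from the patch: struct probe, probe_register(),
probe_lookup_slow(), probe_lookup_cached(), the 256-slot cache size and
the hash function are all hypothetical names and values chosen for
illustration, and _Thread_local only stands in for a real per-cpu
variable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_BITS 12                  /* 4096 buckets, as in "4k entries" */
#define TABLE_SIZE (1u << TABLE_BITS)
#define CACHE_BITS 8                   /* assumed cache size: 256 slots */
#define CACHE_SIZE (1u << CACHE_BITS)

/* Hypothetical stand-in for a probe record keyed by its address. */
struct probe {
    uintptr_t addr;
    struct probe *next;                /* chain within one hash bucket */
};

struct cache_slot {
    uintptr_t addr;                    /* probed address */
    struct probe *p;                   /* NULL means the slot is empty */
};

static struct probe *table[TABLE_SIZE];

/* One cache per thread; a kernel version would use a per-cpu variable. */
static _Thread_local struct cache_slot cache[CACHE_SIZE];

/* Counters so the effect of the cache can be observed. */
static unsigned long cache_hits, slow_lookups;

/* Multiplicative hash that keeps the top 'bits' bits of the product. */
static unsigned hash_addr(uintptr_t addr, unsigned bits)
{
    return (unsigned)(((uint64_t)addr * 0x9E3779B97F4A7C15ull) >> (64 - bits));
}

/* Insert a probe at the head of its bucket chain. */
static void probe_register(struct probe *p)
{
    assert(p != NULL);
    unsigned b = hash_addr(p->addr, TABLE_BITS);
    p->next = table[b];
    table[b] = p;
}

/* Slow path: walk the bucket chain (pointer chasing, cache-unfriendly). */
static struct probe *probe_lookup_slow(uintptr_t addr)
{
    slow_lookups++;
    for (struct probe *p = table[hash_addr(addr, TABLE_BITS)]; p; p = p->next)
        if (p->addr == addr)
            return p;
    return NULL;
}

/* Fast path: check one cache slot, fall back to the hash table on a miss. */
static struct probe *probe_lookup_cached(uintptr_t addr)
{
    struct cache_slot *s = &cache[hash_addr(addr, CACHE_BITS)];

    if (s->p != NULL && s->addr == addr) {
        cache_hits++;
        return s->p;
    }

    struct probe *p = probe_lookup_slow(addr);
    if (p != NULL) {                   /* only successful lookups are cached */
        s->addr = addr;
        s->p = p;
    }
    return p;
}
```

With 40k probes spread over 4096 buckets, the slow path walks chains of
roughly ten entries on average, while a hit in the cache touches a
single slot. The sketch leaves out unregistration: any real version
would have to invalidate matching slots in every cache before a probe
is freed.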
> With kpcache enabled, get_kprobe_cached goes down to
> around 4-5% of cache misses with 20k probes.
> ----
> Samples: 729 of event 'cache-misses', Event count (approx.): 690125
> + 14.49% [k] ftrace_lookup_ip
> + 5.61% [k] kprobe_trace_func
> + 5.17% [k] kprobe_ftrace_handler
> + 4.62% [k] get_kprobe_cached
> ----
>
> Of course this reduces the enabling time too.
>
> Without this fix (just enlarging the hash table):
> (2934 sec, 1 min intervals for each 2000 probes enabled)
>
> ----
> Enabling trace events: start at 1393921862
> 0 1393921864 a2mp_chan_alloc_skb_cb_38581
> ...
> 19999 1393924928 nfs4_open_confirm_done_11785
> ----
>
> With this fix:
> (2025 sec, 1 min intervals for each 2000 probes enabled)
That's a nice speedup.
So I don't think this should be a Kconfig entry; just enable it
unconditionally. That will further simplify the code.
Thanks,
Ingo