


Re: kernel function probe overhead


aubreylee wrote:

>> A couple of things come to mind.  The gettimeofday_ns() function uses
>> a software-estimated real-time clock, which may not be steady enough for
>> fine-grained measurements like this.
>
> Timestamp function: gettimeofday_ns()
> [...]
> Timestamp function: get_cycles()
> [...]
> Timestamp function: local_clock_ns()
> [...]

Thanks for testing!  The distributions of those three seem remarkably
similar - a nice central peak, and a second small peak at 4-8x the
first.  That suggests we can probably rule out clock problems.


>>  The average is pretty low.  Consider printing a
>> histogram of t_end, or (if using git systemtap) its @variance.
>>
>> Another possibility is contention over the t_start[] variable, if
>> multiple processors are running this region of code and so trigger the
>> systemtap probes concurrently.  systemtap automatically locks such
>> global variables, esp. if they contain non-aggregate values like the
>> scalar timestamps.
>
> Do we have an approach to probe the function on CPU0 only?

Not directly; the kernel does not give us such control.  The probe could
include a

   if (cpu() != 0) next;

at the top of the probe handler body.  That way the locks are still taken,
but released quickly on the other CPUs.
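
To make that concrete, something along these lines might do (a rough,
untested sketch; it uses local_touch_nmi from your -t output below as the
probed function and local_clock_ns() as the clock, but any of the three
timestamp functions you tried would slot in the same way):

   global t_start
   global t_end

   probe kernel.function("local_touch_nmi").call
   {
     if (cpu() != 0) next    # other CPUs take and drop the locks immediately
     t_start[tid()] = local_clock_ns()
   }

   probe kernel.function("local_touch_nmi").return
   {
     if (cpu() != 0) next
     if (tid() in t_start) {
       t_end <<< local_clock_ns() - t_start[tid()]
       delete t_start[tid()]
     }
   }

   probe end
   {
     print(@hist_log(t_end))    # or @variance(t_end) on git systemtap
   }

Note that t_start[tid()] here is just the ordinary locked/shared
array-indexed-by-tid() access mentioned below.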

Another solution would be to have per-thread global variables.  Since
the kernel doesn't help us out with this, we'd have to emulate it
somehow, and for this to be worthwhile, it'd have to work faster than
ordinary locked/shared array[tid()] accesses.


>> Consider using "stap -t FOO.stp" to have the tool print internal
>> estimates of its own probes' runtimes.  That would include lock
>> contention, but exclude kernel-side overheads like the code involved in
>> the kernel's kprobes machinery.
>
> "-t" option output.

Ah, I note that we don't track occurrences of lock contention,
even though we could.  Opened https://sourceware.org/bugzilla/show_bug.cgi?id=20879

> [...]
> kernel.function("arch_cpu_idle_enter@arch/x86/kernel/process.c:277").call,
> (test2.stp:13:1), hits: 2801, cycles: 560min/34290avg/230611max, from:
> [...]
> kernel.function("local_touch_nmi@arch/x86/kernel/nmi.c:561").call,
> (test2.stp:18:1), hits: 2801, cycles: 396min/5118avg/160096max, from:
> [...]

That inconsistency suggests dramatic contention or caching or some other
source of jitter.  Having the variance figure in there would be useful though.
Would anyone like to give that a try?  (translate.cxx line 2283ish and
runtime/linux/common_session_state.h _stp_stat_init area).

- FChE

