Re: kernel function probe overhead
- From: Aubrey Li <aubreylee at gmail dot com>
- To: "Frank Ch. Eigler" <fche at redhat dot com>
- Cc: systemtap at sourceware dot org
- Date: Tue, 29 Nov 2016 11:30:54 +0800
- Subject: Re: kernel function probe overhead
- References: <CAGPKeUKWAvBv=SL_ebVkR3ZwCeGZoCQ1MEJ5xQi-VXuF=kt3Wg@mail.gmail.com> <y0m1sxzpj9i.fsf@fche.csb> <CAGPKeUJjf5qqxO2Gqo-_LzzaFbwQ=-wZ=sq0fwTzm=c+MEAtSw@mail.gmail.com> <y0mmvgjo1jl.fsf@fche.csb>
On Tue, Nov 29, 2016 at 3:19 AM, Frank Ch. Eigler <fche@redhat.com> wrote:
>
>
> aubreylee wrote:
>
> >> A couple of things come to mind. The gettimeofday_ns() function uses
> >> a software-estimated real-time clock, which may not be steady enough
> >> for fine-grained measurements like this.
> >
> > Timestamp function: gettimeofday_ns()
> > [...]
> > Timestamp function: get_cycles()
> > [...]
> > Timestamp function: local_clock_ns()
> > [...]
>
> Thanks for testing! The distributions of those three seem remarkably
> similar - a nice central peak, and a second small peak at 4-8x the
> first. That suggests we can probably rule out clock problems.
>
>
> >> The average is pretty low. Consider printing a
> >> histogram of t_end, or (if using git systemtap) its @variance.
> >>
> >> Another possibility is contention over the t_start[] variable: if
> >> multiple processors are running this region of code, they trigger the
> >> systemtap probes concurrently. systemtap automatically locks such
> >> global variables, especially if they contain non-aggregate values
> >> like the scalar timestamps.
> >
> > Do we have an approach to probe the function on CPU0 only?
>
> Not directly; the kernel does not give us such control. The probe could
> include a
>
> if (cpu() != 0) next;
>
> at the top of the probe handler body. That way locks are still taken,
> but released quickly on the other CPUs.
>
Thanks for the suggestion.
Original result:
t_end @count=4800 @min=740 @max=300281 @sum=90589877 @avg=18872
After adding "if (cpu() != 0) next" at the top of the probe handler:
t_end @count=161 @min=653 @max=114856 @sum=1780522 @avg=11059
And if I add "maxcpus=1" to the kernel cmdline:
t_end @count=1190 @min=573 @max=14529 @sum=1406437 @avg=1181
The last one is much better, so it looks like we have a scalability
opportunity here.
> Another solution would be to have per-thread global variables. Since
> the kernel doesn't help us out with this, we'd have to emulate it
> somehow, and for this to be worthwhile, it'd have to work faster than
> ordinary locked/shared array[tid()] accesses.
>
Is this a suggestion for me? If so, how would I go about doing it?
>
> >> Consider using "stap -t FOO.stp" to have the tool print internal
> >> estimates of its own probes' runtimes. That would include lock
> >> contention, but exclude kernel-side overheads like the code involved
> >> in its kprobes machinery.
> >
> > "-t" option output.
>
> Ah, I note that we don't track occurrences of lock contention,
> even though we could. Opened https://sourceware.org/bugzilla/show_bug.cgi?id=20879
okay, thanks, I'll keep an eye on this ticket.
>
> > [...]
> > kernel.function("arch_cpu_idle_enter@arch/x86/kernel/process.c:277").call,
> > (test2.stp:13:1), hits: 2801, cycles: 560min/34290avg/230611max, from:
> > [...]
> > kernel.function("local_touch_nmi@arch/x86/kernel/nmi.c:561").call,
> > (test2.stp:18:1), hits: 2801, cycles: 396min/5118avg/160096max, from:
> > [...]
>
> That inconsistency suggests dramatic contention, caching effects, or
> some other source of jitter. Having the variance figure in there would
> be useful, though.
> Would anyone like to give that a try? (translate.cxx line 2283ish and
> runtime/linux/common_session_state.h _stp_stat_init area).
>
If there is a patch delivered against git head, I can apply it and
paste the output here.
Thanks,
-Aubrey
> - FChE