This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: Some newbie questions




On 08/11/2016 10:51 AM, Avi Kivity wrote:


On 08/10/2016 07:47 PM, Frank Ch. Eigler wrote:
Hi -

On Wed, Aug 10, 2016 at 06:40:02PM +0300, Avi Kivity wrote:
[...]
Yes.  The problem is that if the function is called often (with a usual
short running time), then systemtap will eat all of the cpu time
spinning on an internal lock.
Well, not just that ...  trapping each function entry/exit has
unavoidable kernel uprobes context-switchy-type overheads.

Like you say, those are unavoidable. But at least those costs are handled by scalable resources.


Your perf report may well be misattributing the cost.

I think it's unlikely. When perf says __raw_spin_lock is guilty, it usually is.

(Have you tried a stap script that merely traps all the same
function calls, and has empty probe handlers?)

I can try it.


It was actually pretty bad with empty handlers:

#
66.25% scylla [kernel.kallsyms] [k] _raw_spin_lock
                     |
                     ---_raw_spin_lock
                        |
                        |--49.95%-- 0x62ab
                        |          syscall_trace_leave
                        |          int_check_syscall_exit_work
                        |          |
                        |          |--99.08%-- epoll_pwait
                        |          |          _ZN21reactor_backend_epoll16wait_and_processEiPK10__sigset_t
                        |          |          _ZNSt17_Function_handlerIFbvEZN7reactor3runEvEUlvE5_E9_M_invokeERKSt9_Any_data
                        |          |          _ZN7reactor3runEv
                        |          |          |
                        |          |          |--91.80%-- _ZZN3smp9configureEN5boost15program_options13variables_mapEENKUlvE1_clEv.constprop.2783
                        |          |          |          _Z19dpdk_thread_adaptorPv
                        |          |          |          eal_thread_loop
                        |          |          |          start_thread
                        |          |          |          __clone
                        |          |          |
                        |          |           --8.20%-- _ZN12app_template14run_deprecatedEiPPcOSt8functionIFvvEE
                        |          |                     |
                        |          |                      --100.00%-- main
                        |          |                                  __libc_start_main
                        |          |                                  _start
                        |           --0.92%-- [...]
                        |
                        |--49.46%-- 0x619b
                        |          syscall_trace_enter_phase2
                        |          tracesys_phase2
                        |          |
                        |          |--98.82%-- epoll_pwait
                        |          |          _ZN21reactor_backend_epoll16wait_and_processEiPK10__sigset_t
                        |          |          _ZNSt17_Function_handlerIFbvEZN7reactor3runEvEUlvE5_E9_M_invokeERKSt9_Any_data
                        |          |          _ZN7reactor3runEv
                        |          |          |
                        |          |          |--91.94%-- _ZZN3smp9configureEN5boost15program_options13variables_mapEENKUlvE1_clEv.constprop.2783
                        |          |          |          _Z19dpdk_thread_adaptorPv
                        |          |          |          eal_thread_loop
                        |          |          |          start_thread
                        |          |          |          __clone
                        |          |          |
                        |          |           --8.06%-- _ZN12app_template14run_deprecatedEiPPcOSt8functionIFvvEE
                        |          |                     |
                        |          |                      --100.00%-- main
                        |          |                                  __libc_start_main
                        |          |                                  _start
                        |           --1.18%-- [...]
                         --0.58%-- [...]


I don't have any system call probes. Just two empty static probes, and a timer.profile handler.
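
As a rough reading of the report above: assuming perf's call-graph percentages are relative to their parent entry (the top-level children summing to roughly 100% suggests this, but it is an assumption), each of the two syscall-tracing paths accounts for about a third of all samples. A minimal sketch of that arithmetic, with figures copied from the report; the helper name absolute_share is purely illustrative:

```python
# Figures copied from the perf report above.
# Assumption: call-graph percentages are relative to the parent entry.
SPIN_LOCK_TOTAL = 66.25  # % of all samples attributed to _raw_spin_lock

# Relative share of each direct caller chain under _raw_spin_lock.
callers = {
    "syscall_trace_leave": 49.95,
    "syscall_trace_enter_phase2": 49.46,
    "other": 0.58,
}


def absolute_share(parent_pct, child_pct):
    """Convert a parent-relative percentage into a share of all samples."""
    return parent_pct * child_pct / 100.0


shares = {name: absolute_share(SPIN_LOCK_TOTAL, pct) for name, pct in callers.items()}

for name, share in shares.items():
    print(f"{name}: {share:.2f}% of all samples")
```

If the report was instead generated with absolute call-graph percentages, the conversion does not apply and the figures should be read as printed.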




Note though that such analysis probably cannot be performed based only
upon PC samples - or even backtrace samples.  We seem to require
trapping individual function entry/exit events.
That's why I tried systemtap.  It worked well on my desktop, but very
badly in production.
It may be worth experimenting with "stap --runtime=dyninst" if your
function analysis were restricted to basic Cish userspace that dyninst
can handle.


Will timer.profile work with dyninst?


