This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: Some newbie questions




On 08/26/2016 11:07 PM, David Smith wrote:
> On 08/26/2016 01:39 PM, Avi Kivity wrote:
>> On 08/25/2016 08:37 PM, David Smith wrote:
>>> On 08/25/2016 11:21 AM, Avi Kivity wrote:
>>>> Hi,
>>>>
>>>> Should I wait for ongoing improvement here, or shall I look elsewhere
>>>> for my tracing needs?
>>>>
>>>> It would be a pity (for me) if I have to find another solution, because
>>>> systemtap has all the expressiveness and integration I need.  But it has
>>>> a dramatic impact on my application runtime.
>>>>
>>>> I was able to extract some useful data with perf probe/perf record, but
>>>> as soon as I need to qualify a probe point with runtime information,
>>>> perf falls short.
>>> As Frank mentioned in a previous email, it might be possible for us to
>>> switch to using straight kprobes instead of syscall tracing to handle
>>> mmap tracing. In your use case of calling epoll_wait() lots of times per
>>> second, that might be a *big* win.
>>>
>>> I'll see what can be done to add that feature.
>>
>> Thanks a lot.  I'll be happy to test patches.
> OK, since you asked...
>
> Here's a patch I'm testing that tries to do prefiltering when a syscall
> occurs, so we don't have to take that lock.
>
> Please rebuild with it, and let me know if it (a) works, and (b) has
> lower overhead in your situation.
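
For readers following the thread, the prefiltering idea quoted above can be modelled roughly as follows. This is only a toy sketch in Python, not the patch itself (the patch is not reproduced in this message). The class name, the counters, and the choice of syscall numbers (x86_64 values for mmap and epoll_pwait) are illustrative assumptions. The point it shows is that an uninteresting syscall is rejected by a lock-free membership test, so the shared lock is taken only for syscalls the tracer actually cares about.

```python
import threading

# x86_64 syscall numbers, used here purely as illustrative constants.
SYS_MMAP = 9
SYS_EPOLL_PWAIT = 281


class PrefilteredTracer:
    """Toy model of syscall prefiltering (hypothetical, not systemtap code)."""

    def __init__(self, interesting_syscalls):
        # Immutable set: safe to read from many threads without locking.
        self._interesting = frozenset(interesting_syscalls)
        self._lock = threading.Lock()
        self.lock_acquisitions = 0
        self.handled = []

    def on_syscall(self, nr):
        """Return True if the syscall was handled under the lock."""
        # Prefilter: cheap check before touching any shared state.
        if nr not in self._interesting:
            return False
        # Only interesting syscalls pay for the lock.
        with self._lock:
            self.lock_acquisitions += 1
            self.handled.append(nr)
        return True


def demo(iterations=100000):
    """Simulate a tight epoll_pwait loop with a single mmap call."""
    tracer = PrefilteredTracer({SYS_MMAP})
    for _ in range(iterations):
        tracer.on_syscall(SYS_EPOLL_PWAIT)
    tracer.on_syscall(SYS_MMAP)
    return tracer.lock_acquisitions
```

In this model a loop that issues only epoll_pwait never touches the lock, which is the behaviour the patch is aiming for on the syscall path. It says nothing about the uprobe handlers, which take a different lock.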


With an unloaded system, systemtap almost vanishes from the profile. This is on a 2s24c48t system, running epoll_pwait() and polling on user memory locations in a tight loop.

When I load my system, I start to see contention on systemtap locks:

     9.26%  scylla         [kernel.kallsyms]    [k] delay_tsc
            |
            ---delay_tsc
               |
                --9.25%--__const_udelay
                          stp_lock_probe.constprop.104
                          |
                          |--4.94%--probe_2819
                          |          stapiu_probe_prehandler
                          |          uprobe_notify_resume
                          |          exit_to_usermode_loop
                          |          prepare_exit_to_usermode
                          |          retint_user
                          |          reactor::run_tasks
                          |          reactor::run
                          |          |
                          |           --4.94%--smp::configure(boost::program_options::variables_map)::{lambda()#3}::operator()
                          |                     posix_thread::start_routine
                          |                     start_thread
                          |                     __clone
                          |
                           --4.30%--probe_2822
                                     stapiu_probe_prehandler
                                     uprobe_notify_resume
                                     exit_to_usermode_loop
                                     prepare_exit_to_usermode
                                     retint_user
                                     reactor::run_tasks
                                     reactor::run
                                     |
                                      --4.30%--smp::configure(boost::program_options::variables_map)::{lambda()#3}::operator()
                                                posix_thread::start_routine
                                                start_thread
                                                __clone



     3.75%  scylla         [kernel.kallsyms]    [k] queued_spin_lock_slowpath
            |
            ---queued_spin_lock_slowpath
               |
                --3.74%--_raw_spin_lock
                          |
                           --3.63%--task_utrace_struct
                                     |
                                     |--2.49%--utrace_report_syscall_entry
                                     |          syscall_trace_enter_phase2
                                     |          syscall_trace_enter
                                     |          do_syscall_64
                                     |          return_from_SYSCALL_64
                                     |          |
                                     |           --2.46%--epoll_pwait
                                     |                     reactor_backend_epoll::wait_and_process
                                     |                     |
                                     |                     |--1.65%--std::_Function_handler<bool (), reactor::run()::{lambda()#7}>::_M_invoke
                                     |                     |          reactor::run
                                     |                     |          |
                                     |                     |           --1.55%--smp::configure(boost::program_options::variables_map)::{lambda()#3}::operator()
                                     |                     |                     posix_thread::start_routine
                                     |                     |                     start_thread
                                     |                     |                     __clone
                                     |                     |
                                     |                      --0.81%--std::_Function_handler<bool (), reactor::run()::{lambda()#8}>::_M_invoke
                                     |                                logalloc::tracker::impl::compact_on_idle
                                     |                                reactor::run
                                     |                                |
                                     |                                 --0.76%--smp::configure(boost::program_options::variables_map)::{lambda()#3}::operator()
                                     |                                           posix_thread::start_routine
                                     |                                           start_thread
                                     |                                           __clone
                                     |
                                      --1.14%--utrace_report_syscall_exit
                                                syscall_slow_exit_work
                                                do_syscall_64
                                                return_from_SYSCALL_64
                                                |
                                                 --1.12%--epoll_pwait
                                                           reactor_backend_epoll::wait_and_process
                                                           |
                                                            --0.75%--std::_Function_handler<bool (), reactor::run()::{lambda()#7}>::_M_invoke
                                                                      reactor::run
                                                                      |
                                                                       --0.69%--smp::configure(boost::program_options::variables_map)::{lambda()#3}::operator()
                                                                                 posix_thread::start_routine
                                                                                 start_thread
                                                                                 __clone


I may be able to move the probe points to a less frequently accessed point (at some loss in accuracy), so this patch ought to get me started. Thanks a lot!

