This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: Some newbie questions
On 08/26/2016 11:07 PM, David Smith wrote:
> On 08/26/2016 01:39 PM, Avi Kivity wrote:
>> On 08/25/2016 08:37 PM, David Smith wrote:
>>> On 08/25/2016 11:21 AM, Avi Kivity wrote:
>>>> Hi,
>>>> Should I wait for ongoing improvement here, or shall I look elsewhere
>>>> for my tracing needs?
>>>> It would be a pity (for me) if I have to find another solution, because
>>>> systemtap has all the expressiveness and integration I need. But it has
>>>> a dramatic impact on my application runtime.
>>>> I was able to extract some useful data with perf probe/perf record, but
>>>> as soon as I need to qualify a probe point with runtime information,
>>>> perf falls short.
>>> As Frank mentioned in a previous email, it might be possible for us to
>>> switch to using straight kprobes instead of syscall tracing to handle
>>> mmap tracing. In your use case of calling epoll_wait() lots of times per
>>> second, that might be a *big* win.
>>> I'll see what can be done to add that feature.
>> Thanks a lot. I'll be happy to test patches.
> OK, since you asked...
> Here's a patch I'm testing that tries to do prefiltering when a syscall
> occurs, so we don't have to take that lock.
> Please rebuild with it, and let me know if it (a) works, and (b) has
> lower overhead in your situation.
With an unloaded system, systemtap almost vanishes from the profile.
This is on a 2s24c48t (2 sockets, 24 cores, 48 threads) system, running
epoll_pwait() and polling on user memory locations in a tight loop.
When I load my system, I start to see contention on systemtap locks:
     9.26%  scylla  [kernel.kallsyms]  [k] delay_tsc
            |
            ---delay_tsc
               |
                --9.25%--__const_udelay
                          stp_lock_probe.constprop.104
                          |
                          |--4.94%--probe_2819
                          |          stapiu_probe_prehandler
                          |          uprobe_notify_resume
                          |          exit_to_usermode_loop
                          |          prepare_exit_to_usermode
                          |          retint_user
                          |          reactor::run_tasks
                          |          reactor::run
                          |          |
                          |           --4.94%--smp::configure(boost::program_options::variables_map)::{lambda()#3}::operator()
                          |                     posix_thread::start_routine
                          |                     start_thread
                          |                     __clone
                          |
                           --4.30%--probe_2822
                                     stapiu_probe_prehandler
                                     uprobe_notify_resume
                                     exit_to_usermode_loop
                                     prepare_exit_to_usermode
                                     retint_user
                                     reactor::run_tasks
                                     reactor::run
                                     |
                                      --4.30%--smp::configure(boost::program_options::variables_map)::{lambda()#3}::operator()
                                                posix_thread::start_routine
                                                start_thread
                                                __clone
     3.75%  scylla  [kernel.kallsyms]  [k] queued_spin_lock_slowpath
            |
            ---queued_spin_lock_slowpath
               |
                --3.74%--_raw_spin_lock
                          |
                           --3.63%--task_utrace_struct
                                     |
                                     |--2.49%--utrace_report_syscall_entry
                                     |          syscall_trace_enter_phase2
                                     |          syscall_trace_enter
                                     |          do_syscall_64
                                     |          return_from_SYSCALL_64
                                     |          |
                                     |           --2.46%--epoll_pwait
                                     |                     reactor_backend_epoll::wait_and_process
                                     |                     |
                                     |                     |--1.65%--std::_Function_handler<bool (), reactor::run()::{lambda()#7}>::_M_invoke
                                     |                     |          reactor::run
                                     |                     |          |
                                     |                     |           --1.55%--smp::configure(boost::program_options::variables_map)::{lambda()#3}::operator()
                                     |                     |                     posix_thread::start_routine
                                     |                     |                     start_thread
                                     |                     |                     __clone
                                     |                     |
                                     |                      --0.81%--std::_Function_handler<bool (), reactor::run()::{lambda()#8}>::_M_invoke
                                     |                                logalloc::tracker::impl::compact_on_idle
                                     |                                reactor::run
                                     |                                |
                                     |                                 --0.76%--smp::configure(boost::program_options::variables_map)::{lambda()#3}::operator()
                                     |                                           posix_thread::start_routine
                                     |                                           start_thread
                                     |                                           __clone
                                     |
                                      --1.14%--utrace_report_syscall_exit
                                                syscall_slow_exit_work
                                                do_syscall_64
                                                return_from_SYSCALL_64
                                                |
                                                 --1.12%--epoll_pwait
                                                           reactor_backend_epoll::wait_and_process
                                                           |
                                                            --0.75%--std::_Function_handler<bool (), reactor::run()::{lambda()#7}>::_M_invoke
                                                                      reactor::run
                                                                      |
                                                                       --0.69%--smp::configure(boost::program_options::variables_map)::{lambda()#3}::operator()
                                                                                 posix_thread::start_routine
                                                                                 start_thread
                                                                                 __clone
I may be able to move the probe points to a less frequently accessed
point (at some loss in accuracy), so this patch ought to get me
started. Thanks a lot!
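
For anyone following along, the prefiltering idea mentioned in the quoted patch description (decide cheaply whether a syscall matters before touching a contended lock) can be modelled in user space roughly as below. This is only a sketch under stated assumptions, not the actual patch or the systemtap runtime: the names interesting, state_lock, mark_interesting() and on_syscall_entry() are hypothetical, a pthread mutex stands in for the kernel spinlock, and the two syscall numbers are the x86_64 values, used purely as examples.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model constants; not taken from systemtap. */
#define MODEL_NR_SYSCALLS 512
enum { MODEL_SYS_MMAP = 9, MODEL_SYS_EPOLL_PWAIT = 281 }; /* x86_64, examples only */

/* One bit per syscall number, set only for syscalls a probe cares about. */
static _Atomic uint64_t interesting[MODEL_NR_SYSCALLS / 64];

/* Stand-in for the contended lock seen in the profile. */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long slow_path_hits; /* protected by state_lock */

/* Register a syscall number as interesting (done once, at setup). */
static void mark_interesting(unsigned nr)
{
    assert(nr < MODEL_NR_SYSCALLS);
    atomic_fetch_or(&interesting[nr / 64], UINT64_C(1) << (nr % 64));
}

/* Lock-free prefilter: a single relaxed load, no shared write. */
static bool is_interesting(unsigned nr)
{
    if (nr >= MODEL_NR_SYSCALLS)
        return false;
    uint64_t word = atomic_load_explicit(&interesting[nr / 64],
                                         memory_order_relaxed);
    return (word >> (nr % 64)) & 1;
}

/* Models a syscall-entry hook. Returns true if the locked path ran. */
static bool on_syscall_entry(unsigned nr)
{
    if (!is_interesting(nr))
        return false;             /* hot path: the lock is never touched */

    pthread_mutex_lock(&state_lock);
    slow_path_hits++;             /* placeholder for the real bookkeeping */
    pthread_mutex_unlock(&state_lock);
    return true;
}
```

In this model, a thread that calls epoll_pwait() in a tight loop only ever does the relaxed load as long as just MODEL_SYS_MMAP has been marked, so it would not contend on state_lock with other threads. Whether the real patch filters on the syscall number or on something else is not stated in the thread.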