This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
[Bug runtime/12960] New: _stp_ctl_send tries to msleep when out of memory
- From: "mjw at redhat dot com" <sourceware-bugzilla at sourceware dot org>
- To: systemtap at sourceware dot org
- Date: Tue, 5 Jul 2011 11:00:35 +0000
- Subject: [Bug runtime/12960] New: _stp_ctl_send tries to msleep when out of memory
- Auto-submitted: auto-generated
http://sourceware.org/bugzilla/show_bug.cgi?id=12960
Summary: _stp_ctl_send tries to msleep when out of memory
Product: systemtap
Version: unspecified
Status: NEW
Severity: normal
Priority: P2
Component: runtime
AssignedTo: systemtap@sourceware.org
ReportedBy: mjw@redhat.com
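_stp_ctl_send tries to msleep when it is out of memory (err -12, i.e. -ENOMEM),
even when called from a context that must not sleep. This leads to "scheduling
while atomic" errors and eventually to kernel crashes. It isn't reliably
reproducible, but the following triggers it for me pretty often: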
/usr/local/install/systemtap/bin/stap \
  -d /usr/lib64/python2.7/lib-dynload/_ssl.so \
  -d /usr/lib64/libssl.so.1.0.0d \
  -d /lib64/libcrypto.so.1.0.0d \
  -DMAXTRACE=128 \
  -d /usr/bin/gdb --ldd \
  -e 'probe syscall.open { if (pid() == target()) { log(filename); print_ubacktrace(); log("--"); } }' \
  -c 'gdb --version'
[ 1534.651888] stap_afa62ad505a7aaf8c957387db22ba031_16869: systemtap:
1.6/0.152, base: ffffffffa08a0000, memory: 6199data/35text/26ctx/13net/34alloc
kb, probes: 2
[ 1534.711077] ctl_send msleep because of err: -12
[ 1534.712955] BUG: scheduling while atomic: kworker/0:0/0/0x00000100
[ 1534.715360] Modules linked in: stap_afa62ad505a7aaf8c957387db22ba031_16869
uprobes netconsole configfs nfs lockd fscache nfs_acl auth_rpcgss sco bnep
l2cap bluetooth sunrpc rfkill ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
ip6table_filter ip6_tables snd_intel8x0 snd_ac97_codec i2c_piix4 ac97_bus
snd_seq snd_seq_device snd_pcm snd_timer 8139too i2c_core 8139cp mii snd
soundcore snd_page_alloc virtio_balloon microcode ipv6 [last unloaded:
stap_a893656df50ecd18787fb7e563e535cc_16869]
[ 1534.731101] CPU 1
[ 1534.731552] Modules linked in: stap_afa62ad505a7aaf8c957387db22ba031_16869
uprobes netconsole configfs nfs lockd fscache nfs_acl auth_rpcgss sco bnep
l2cap bluetooth sunrpc rfkill ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
ip6table_filter ip6_tables snd_intel8x0 snd_ac97_codec i2c_piix4 ac97_bus
snd_seq snd_seq_device snd_pcm snd_timer 8139too i2c_core 8139cp mii snd
soundcore snd_page_alloc virtio_balloon microcode ipv6 [last unloaded:
stap_a893656df50ecd18787fb7e563e535cc_16869]
[ 1534.739225]
[ 1534.739453] Pid: 0, comm: kworker/0:0 Not tainted 2.6.38.8-32.fc15.x86_64 #1
Bochs Bochs
[ 1534.740674] RIP: 0010:[<ffffffff8102a145>] [<ffffffff8102a145>]
native_safe_halt+0xb/0xd
[ 1534.741835] RSP: 0018:ffff88007a81dee8 EFLAGS: 00000246
[ 1534.742524] RAX: 0000000000000000 RBX: ffffffff810b1374 RCX:
0000016553d82980
[ 1534.743451] RDX: 000000f800000000 RSI: 0000000000000001 RDI:
0000000000000001
[ 1534.744408] RBP: ffff88007a81dee8 R08: 0000000000000000 R09:
ffffffff81b3a320
[ 1534.745393] R10: 00000000008849e7 R11: ffff880078922a00 R12:
ffffffff8100a58e
[ 1534.746342] R13: ffff88007a81de68 R14: ffffffff81010150 R15:
ffff88007a81de48
[ 1534.747281] FS: 0000000000000000(0000) GS:ffff88007fc80000(0000)
knlGS:0000000000000000
[ 1534.748333] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1534.749124] CR2: 0000003ceecab970 CR3: 0000000033c55000 CR4:
00000000000006e0
[ 1534.750072] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[ 1534.750995] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[ 1534.751956] Process kworker/0:0 (pid: 0, threadinfo ffff88007a81c000, task
ffff88007a820000)
[ 1534.753099] Stack:
[ 1534.753400] ffff88007a81def8 ffffffff81010d36 ffff88007a81df28
ffffffff81008321
[ 1534.754561] ffff88007a81df18 ab7ab4059d172ee8 0000000000000000
0000000000000000
[ 1534.755772] ffff88007a81df48 ffffffff81464dba 0000000000000000
e5c548fc55f4519c
[ 1534.756920] Call Trace:
[ 1534.757304] [<ffffffff81010d36>] default_idle+0x4e/0x86
[ 1534.758022] [<ffffffff81008321>] cpu_idle+0xa5/0xdf
[ 1534.758700] [<ffffffff81464dba>] start_secondary+0x20c/0x20e
[ 1534.759446] Code: 1f 44 00 00 57 9d 5d c3 55 48 89 e5 0f 1f 44 00 00 fa 5d
c3 55 48 89 e5 0f 1f 44 00 00 fb 5d c3 55 48 89 e5 0f 1f 44 00 00 fb f4 <5d> c3
55 48 89 e5 0f 1f 44 00 00 f4 5d c3 55 48 89 e5 0f 1f 44
[ 1534.765194] Call Trace:
[ 1534.765541] [<ffffffff81010d36>] default_idle+0x4e/0x86
[ 1534.766260] [<ffffffff81008321>] cpu_idle+0xa5/0xdf
[ 1534.766921] [<ffffffff81464dba>] start_secondary+0x20c/0x20e
[ 1534.767683] bad: scheduling from the idle thread!
[ 1534.768298] Pid: 0, comm: kworker/0:0 Not tainted 2.6.38.8-32.fc15.x86_64 #1
[ 1534.769245] Call Trace:
[ 1534.769593] <IRQ> [<ffffffff810425b3>] dequeue_task_idle+0x29/0x35
[ 1534.770489] [<ffffffff81047e92>] dequeue_task+0x85/0x94
[ 1534.771203] [<ffffffff81047ecb>] deactivate_task+0x2a/0x32
[ 1534.771965] [<ffffffff81473f16>] schedule+0x22b/0x66a
[ 1534.772630] [<ffffffff81474752>] schedule_timeout+0xa7/0xde
[ 1534.773420] [<ffffffff81060bb4>] ? process_timeout+0x0/0x10
[ 1534.774198] [<ffffffff814747a7>] schedule_timeout_uninterruptible+0x1e/0x20
[ 1534.775158] [<ffffffff810615dd>] msleep+0x1b/0x22
[ 1534.775814] [<ffffffffa08a06b0>] _stp_ctl_send+0x3f/0x9c
[stap_afa62ad505a7aaf8c957387db22ba031_16869]
[ 1534.777013] [<ffffffffa08a1099>] _stp_ctl_work_callback+0x81/0xa6
[stap_afa62ad505a7aaf8c957387db22ba031_16869]
[ 1534.778353] [<ffffffff81061378>] run_timer_softirq+0x1a4/0x266
[ 1534.779145] [<ffffffff81076a8c>] ? timekeeping_get_ns+0x18/0x3a
[ 1534.779924] [<ffffffffa08a1018>] ? _stp_ctl_work_callback+0x0/0xa6
[stap_afa62ad505a7aaf8c957387db22ba031_16869]
[ 1534.781240] [<ffffffff8105ae4c>] __do_softirq+0xd2/0x19d
[ 1534.782165] [<ffffffff81072750>] ? hrtimer_interrupt+0x11a/0x1b5
[ 1534.783289] [<ffffffff8100aadc>] call_softirq+0x1c/0x30
[ 1534.784387] [<ffffffff8100c101>] do_softirq+0x46/0x81
[ 1534.785238] [<ffffffff8105afd0>] irq_exit+0x49/0x8b
[ 1534.786143] [<ffffffff8147c09b>] smp_apic_timer_interrupt+0x7e/0x8c
[ 1534.787246] [<ffffffff8100a593>] apic_timer_interrupt+0x13/0x20
[ 1534.788754] <EOI> [<ffffffff810b1374>] ? rcu_needs_cpu+0x111/0x1c2
[ 1534.790608] [<ffffffff8102a145>] ? native_safe_halt+0xb/0xd
[ 1534.792171] [<ffffffff81010d36>] default_idle+0x4e/0x86
[ 1534.793133] [<ffffffff81008321>] cpu_idle+0xa5/0xdf
[ 1534.793960] [<ffffffff81464dba>] start_secondary+0x20c/0x20e
[ 1534.796124] BUG: unable to handle kernel NULL pointer dereference at
(null)
[ 1534.797096] IP: [< (null)>] (null)
[ 1534.797096] PGD 0
[ 1534.797096] Oops: 0010 [#1] SMP
[ 1534.797096] last sysfs file:
/sys/module/virtio_balloon/sections/__mcount_loc
[ 1534.797096] CPU 1
[ 1534.797096] Modules linked in: stap_afa62ad505a7aaf8c957387db22ba031_16869
uprobes netconsole configfs nfs lockd fscache nfs_acl auth_rpcgss sco bnep
l2cap bluetooth sunrpc rfkill ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
ip6table_filter ip6_tables snd_intel8x0 snd_ac97_codec i2c_piix4 ac97_bus
snd_seq snd_seq_device snd_pcm snd_timer 8139too i2c_core 8139cp mii snd
soundcore snd_page_alloc virtio_balloon microcode ipv6 [last unloaded:
stap_a893656df50ecd18787fb7e563e535cc_16869]
[ 1534.797096]
[ 1534.797096] Pid: 686, comm: rs:main Q:Reg Not tainted
2.6.38.8-32.fc15.x86_64 #1 Bochs Bochs
[ 1534.797096] RIP: 0010:[<0000000000000000>] [< (null)>]
(null)
[ 1534.797096] RSP: 0018:ffff8800789ff738 EFLAGS: 00010046
[ 1534.797096] RAX: ffffffff8160a4e0 RBX: ffff88007a820000 RCX:
ffff88007fc80000
[ 1534.797096] RDX: 0000000000000001 RSI: ffff88007a820000 RDI:
ffff88007fc93840
[ 1534.797096] RBP: ffff8800789ff760 R08: ffff88007fc8dbb0 R09:
000000000000024b
[ 1534.797096] R10: 0000000000000010 R11: ffff88007a820000 R12:
ffff88007fc93840
[ 1534.797096] R13: 0000000000000001 R14: 0000000000000001 R15:
0000000000000001
[ 1534.797096] FS: 00007ff1a617d700(0000) GS:ffff88007fc80000(0000)
knlGS:0000000000000000
[ 1534.797096] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1534.797096] CR2: 0000000000000000 CR3: 000000007a0a8000 CR4:
00000000000006e0
[ 1534.797096] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[ 1534.797096] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[ 1534.797096] Process rs:main Q:Reg (pid: 686, threadinfo ffff8800789fe000,
task ffff880037960000)
[ 1534.797096] Stack:
[ 1534.797096] ffffffff81047f30 ffff8800789ff750 ffffffff00000001
ffff88007fc93840
[ 1534.797096] ffff88007fc93840 ffff8800789ff780 ffffffff81047f69
ffff88007fc8dbb0
[ 1534.797096] ffff88007a820000 ffff8800789ff7e0 ffffffff8104df47
ffffffff00000001
[ 1534.797096] Call Trace:
[ 1534.797096] [<ffffffff81047f30>] ? enqueue_task+0x5d/0x6b
[ 1534.797096] [<ffffffff81047f69>] activate_task+0x2b/0x33
[ 1534.797096] [<ffffffff8104df47>] try_to_wake_up+0x1f7/0x226
[ 1534.797096] [<ffffffff8102ac09>] ? pvclock_clocksource_read+0x48/0xb7
[ 1534.797096] [<ffffffff8104df9f>] wake_up_process+0x15/0x17
[ 1534.797096] [<ffffffff81060bc2>] process_timeout+0xe/0x10
[ 1534.797096] [<ffffffff81061378>] run_timer_softirq+0x1a4/0x266
[ 1534.797096] [<ffffffff81076a8c>] ? timekeeping_get_ns+0x18/0x3a
[ 1534.797096] [<ffffffff81060bb4>] ? process_timeout+0x0/0x10
[ 1534.797096] [<ffffffff8105ae4c>] __do_softirq+0xd2/0x19d
[ 1534.797096] [<ffffffff81072750>] ? hrtimer_interrupt+0x11a/0x1b5
[ 1534.797096] [<ffffffff8100aadc>] call_softirq+0x1c/0x30
[ 1534.797096] [<ffffffff8100c101>] do_softirq+0x46/0x81
[ 1534.797096] [<ffffffff8105afd0>] irq_exit+0x49/0x8b
[ 1534.797096] [<ffffffff8147c09b>] smp_apic_timer_interrupt+0x7e/0x8c
[ 1534.797096] [<ffffffff8100a593>] apic_timer_interrupt+0x13/0x20
[ 1534.797096] [<ffffffff811f51ab>] ? avtab_search_node+0x69/0x7a
[ 1534.797096] [<ffffffff811971b4>] ? ext4_mark_iloc_dirty+0x4db/0x543
[ 1534.797096] [<ffffffff811fde0e>] ? cond_compute_av+0x26/0x8c
[ 1534.797096] [<ffffffff811fa8af>] ? context_struct_compute_av+0x16f/0x257
[ 1534.797096] [<ffffffff811fb4a9>] ? security_compute_av+0xf9/0x20d
[ 1534.797096] [<ffffffff811e9d52>] ? avc_has_perm_noaudit+0x104/0x389
[ 1534.797096] [<ffffffff811aaf93>] ? __ext4_journal_stop+0x76/0x7c
[ 1534.797096] [<ffffffff811ea00a>] ? avc_has_perm+0x33/0x63
[ 1534.797096] [<ffffffff811eb0eb>] ? inode_has_perm+0x76/0x8c
[ 1534.797096] [<ffffffff8122c8bc>] ? radix_tree_lookup_slot+0xe/0x10
[ 1534.797096] [<ffffffff8104127e>] ? should_resched+0xe/0x2d
[ 1534.797096] [<ffffffff81474408>] ? _cond_resched+0xe/0x22
[ 1534.797096] [<ffffffff810d9d64>] ? filemap_fault+0x20d/0x36c
[ 1534.797096] [<ffffffff811ee75d>] ? selinux_inode_permission+0x82/0xa2
[ 1534.797096] [<ffffffff811e7f4a>] ? security_inode_exec_permission+0x2a/0x2c
[ 1534.797096] [<ffffffff81129ae2>] ? exec_permission+0x71/0x80
[ 1534.797096] [<ffffffff8112b5e5>] ? link_path_walk+0x85/0x3b8
[ 1534.797096] [<ffffffff8104127e>] ? should_resched+0xe/0x2d
[ 1534.797096] [<ffffffff8112ac3d>] ? path_init_rcu+0x87/0x192
[ 1534.797096] [<ffffffff812324e1>] ? might_fault+0x21/0x23
[ 1534.797096] [<ffffffff8112bb4b>] ? do_path_lookup+0x4d/0xf6
[ 1534.797096] [<ffffffff8112c810>] ? user_path_at+0x57/0x94
[ 1534.797096] [<ffffffff811131d7>] ? __kmalloc_track_caller+0xf7/0x109
[ 1534.797096] [<ffffffff810ec1ef>] ? kmemdup+0x20/0x35
[ 1534.797096] [<ffffffff811ed04e>] ? selinux_cred_prepare+0x1c/0x32
[ 1534.797096] [<ffffffff81074263>] ? override_creds+0x28/0x3d
[ 1534.797096] [<ffffffff811205be>] ? sys_faccessat+0xa0/0x162
[ 1534.797096] [<ffffffff81120698>] ? sys_access+0x18/0x1a
[ 1534.797096] [<ffffffff81009bc2>] ? system_call_fastpath+0x16/0x1b
[ 1534.797096] Code: Bad RIP value.
[ 1534.797096] RIP [< (null)>] (null)
[ 1534.797096] RSP <ffff8800789ff738>
[ 1534.797096] CR2: 0000000000000000
[ 1534.797096] ---[ end trace 1b2381b9c932a61a ]---
[ 1534.797096] Kernel panic - not syncing: Fatal exception in interrupt
[ 1534.797096] Pid: 686, comm: rs:main Q:Reg Tainted: G D
2.6.38.8-32.fc15.x86_64 #1
[ 1534.797096] Call Trace:
[ 1534.797096] [<ffffffff8146c6e6>] panic+0x91/0x19c
[ 1534.797096] [<ffffffff81476cc6>] oops_end+0xb4/0xc5
[ 1534.797096] [<ffffffff8146c06e>] no_context+0x203/0x212
[ 1534.797096] [<ffffffff8146c211>] __bad_area_nosemaphore+0x194/0x1b7
[ 1534.797096] [<ffffffff810d1e17>] ? __perf_event_task_sched_out+0x27/0x2c
[ 1534.797096] [<ffffffff8146c247>] bad_area_nosemaphore+0x13/0x15
[ 1534.797096] [<ffffffff81478d9d>] do_page_fault+0x1c5/0x37a
[ 1534.797096] [<ffffffff814761d5>] page_fault+0x25/0x30
[ 1534.797096] [<ffffffff81047f30>] ? enqueue_task+0x5d/0x6b
[ 1534.797096] [<ffffffff81047f69>] activate_task+0x2b/0x33
[ 1534.797096] [<ffffffff8104df47>] try_to_wake_up+0x1f7/0x226
[ 1534.797096] [<ffffffff8102ac09>] ? pvclock_clocksource_read+0x48/0xb7
[ 1534.797096] [<ffffffff8104df9f>] wake_up_process+0x15/0x17
[ 1534.797096] [<ffffffff81060bc2>] process_timeout+0xe/0x10
[ 1534.797096] [<ffffffff81061378>] run_timer_softirq+0x1a4/0x266
[ 1534.797096] [<ffffffff81076a8c>] ? timekeeping_get_ns+0x18/0x3a
[ 1534.797096] [<ffffffff81060bb4>] ? process_timeout+0x0/0x10
[ 1534.797096] [<ffffffff8105ae4c>] __do_softirq+0xd2/0x19d
[ 1534.797096] [<ffffffff81072750>] ? hrtimer_interrupt+0x11a/0x1b5
[ 1534.797096] [<ffffffff8100aadc>] call_softirq+0x1c/0x30
[ 1534.797096] [<ffffffff8100c101>] do_softirq+0x46/0x81
[ 1534.797096] [<ffffffff8105afd0>] irq_exit+0x49/0x8b
[ 1534.797096] [<ffffffff8147c09b>] smp_apic_timer_interrupt+0x7e/0x8c
[ 1534.797096] [<ffffffff8100a593>] apic_timer_interrupt+0x13/0x20
[ 1534.797096] [<ffffffff811f51ab>] ? avtab_search_node+0x69/0x7a
[ 1534.797096] [<ffffffff811971b4>] ? ext4_mark_iloc_dirty+0x4db/0x543
[ 1534.797096] [<ffffffff811fde0e>] ? cond_compute_av+0x26/0x8c
[ 1534.797096] [<ffffffff811fa8af>] ? context_struct_compute_av+0x16f/0x257
[ 1534.797096] [<ffffffff811fb4a9>] ? security_compute_av+0xf9/0x20d
[ 1534.797096] [<ffffffff811e9d52>] ? avc_has_perm_noaudit+0x104/0x389
[ 1534.797096] [<ffffffff811aaf93>] ? __ext4_journal_stop+0x76/0x7c
[ 1534.797096] [<ffffffff811ea00a>] ? avc_has_perm+0x33/0x63
[ 1534.797096] [<ffffffff811eb0eb>] ? inode_has_perm+0x76/0x8c
[ 1534.797096] [<ffffffff8122c8bc>] ? radix_tree_lookup_slot+0xe/0x10
[ 1534.797096] [<ffffffff8104127e>] ? should_resched+0xe/0x2d
[ 1534.797096] [<ffffffff81474408>] ? _cond_resched+0xe/0x22
[ 1534.797096] [<ffffffff810d9d64>] ? filemap_fault+0x20d/0x36c
[ 1534.797096] [<ffffffff811ee75d>] ? selinux_inode_permission+0x82/0xa2
[ 1534.797096] [<ffffffff811e7f4a>] ? security_inode_exec_permission+0x2a/0x2c
[ 1534.797096] [<ffffffff81129ae2>] ? exec_permission+0x71/0x80
[ 1534.797096] [<ffffffff8112b5e5>] ? link_path_walk+0x85/0x3b8
[ 1534.797096] [<ffffffff8104127e>] ? should_resched+0xe/0x2d
[ 1534.797096] [<ffffffff8112ac3d>] ? path_init_rcu+0x87/0x192
[ 1534.797096] [<ffffffff812324e1>] ? might_fault+0x21/0x23
[ 1534.797096] [<ffffffff8112bb4b>] ? do_path_lookup+0x4d/0xf6
[ 1534.797096] [<ffffffff8112c810>] ? user_path_at+0x57/0x94
[ 1534.797096] [<ffffffff811131d7>] ? __kmalloc_track_caller+0xf7/0x109
[ 1534.797096] [<ffffffff810ec1ef>] ? kmemdup+0x20/0x35
[ 1534.797096] [<ffffffff811ed04e>] ? selinux_cred_prepare+0x1c/0x32
[ 1534.797096] [<ffffffff81074263>] ? override_creds+0x28/0x3d
[ 1534.797096] [<ffffffff811205be>] ? sys_faccessat+0xa0/0x162
[ 1534.797096] [<ffffffff81120698>] ? sys_access+0x18/0x1a
[ 1534.797096] [<ffffffff81009bc2>] ? system_call_fastpath+0x16/0x1b
This is on f15 (2.6.38.8-32.fc15.x86_64), but I have also seen it happen on f14
(2.6.35.13-92.fc14.x86_64).
I am using the following workaround at the moment, but it only seems to work
because we no longer run out of buffers (at least in my environment):
diff --git a/runtime/transport/debugfs.c b/runtime/transport/debugfs.c
index 6bbef53..0897fe5 100644
--- a/runtime/transport/debugfs.c
+++ b/runtime/transport/debugfs.c
@@ -12,7 +12,7 @@
 #include <linux/debugfs.h>
 #include "transport.h"
 
-#define STP_DEFAULT_BUFFERS 50
+#define STP_DEFAULT_BUFFERS 1024
 
 inline static int _stp_ctl_write_fs(int type, void *data, unsigned len)
 {
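
The trace shows msleep being reached from _stp_ctl_work_callback inside
run_timer_softirq, where sleeping is not allowed, so a larger buffer pool only
makes the failure rarer. One possible direction is to leave the message pending
and retry on the next timer tick instead of sleeping. The following is a small
userspace C model of that idea, not runtime code: every name in it
(model_write, model_send, model_timer_tick, POOL_SIZE, free_buffers, pending)
is hypothetical and not part of SystemTap.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in SystemTap. */
#define POOL_SIZE 2

int free_buffers = POOL_SIZE;  /* buffers left in the (modelled) pool */
int pending = 0;               /* messages still waiting to be sent   */

/* Stand-in for a pool-backed write: fails with -ENOMEM when the pool is empty. */
int model_write(void)
{
    if (free_buffers == 0)
        return -ENOMEM;
    free_buffers--;
    assert(free_buffers >= 0);
    return 0;
}

/* Sender that retries only when the caller is allowed to sleep. */
int model_send(bool may_sleep)
{
    int err = model_write();
    int retries = 5;

    while (err == -ENOMEM && may_sleep && retries-- > 0) {
        /* Real code would sleep here; the model just pretends the
         * reader drained one buffer in the meantime. */
        free_buffers++;
        err = model_write();
    }
    return err;
}

/* Timer-callback model: atomic context, so never sleep. On failure the
 * remaining messages stay pending and are retried on the next tick. */
void model_timer_tick(void)
{
    while (pending > 0) {
        if (model_send(false) != 0)
            break;
        pending--;
    }
}
```

With pending set to 3 and two free buffers, one call to model_timer_tick()
sends two messages and leaves one pending without ever taking the sleeping
path. Once a buffer is returned to the pool, the next tick sends the rest.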