This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
[PATCH] fix for 'spinlock bad magic' error with itrace probe point
- From: Maynard Johnson <maynardj at us dot ibm dot com>
- To: systemtap at sourceware dot org
- Cc: "Frank Ch. Eigler" <fche at redhat dot com>
- Date: Wed, 18 Mar 2009 11:45:54 -0500
- Subject: [PATCH] fix for 'spinlock bad magic' error with itrace probe point
- Reply-to: maynardj at us dot ibm dot com
After Dave Nomura first committed the itrace support back in August 2008, Frank
reported in an Aug 6 posting that the itrace test failed on x86_64 with "BUG:
soft lockup - CPU#0 stuck for 10s!". Using a current SystemTap git pull on a
ppc64 Fedora 11 (rawhide) system, I have encountered a 'spinlock bad magic'
error when testing the itrace probe point. The system was configured with
various debugging options (e.g., CONFIG_LOCKDEP, CONFIG_DEBUG_SPINLOCK). This
error does not occur on systems running production kernels vs a debug kernel.
I believe the same underlying bug (an uninitialized spinlock in
systemtap/runtime/itrace.c) was the cause of both symptoms -- the one Frank
reported and the one I'm seeing now. The attached patch fixes this problem.
For the current problem, an error message is sent to the screen, with more
detailed information sent to the system log as follows:
---- Screen output -----
Message from syslogd@localhost at Mar 18 10:38:16 ...
kernel:BUG: spinlock bad magic on CPU#1, probeme4/2703 (Not tainted)
Message from syslogd@localhost at Mar 18 10:38:16 ...
kernel: lock: d000000000c0e758, .magic: 00000000, .owner: probeme4/2703,
.owner_cpu: 1
---- System log -----
Mar 17 20:19:19 localhost kernel: create_itrace_info: tid=3725
Mar 17 20:19:19 localhost kernel: BUG: spinlock bad magic on CPU#0,
probeme4/3725 (Not tainted)
Mar 17 20:19:19 localhost kernel: lock: d000000000f16758, .magic: 00000000,
.owner: probeme4/3725, .owner_cpu: 0
Mar 17 20:19:19 localhost kernel: Call Trace:
Mar 17 20:19:19 localhost kernel: [c00000002cba76c0] [c000000000012530]
.show_stack+0x98/0x188 (unreliable)
Mar 17 20:19:19 localhost kernel: [c00000002cba7770] [c00000000060db1c]
.dump_stack+0x28/0x3c
Mar 17 20:19:19 localhost kernel: [c00000002cba77f0] [c0000000002b5120]
.spin_bug+0xd4/0x108
Mar 17 20:19:19 localhost kernel: [c00000002cba78a0] [c0000000002b5190]
._raw_spin_unlock+0x3c/0xd8
Mar 17 20:19:19 localhost kernel: [c00000002cba7930] [c000000000601114]
._spin_unlock+0x48/0x64
Mar 17 20:19:19 localhost kernel: [c00000002cba79d0] [d000000000ee89a8]
._stp_itrace_probe_cb+0x168/0x360 [stap_484930766b74f315cd7f061cecaed62f_46246]
Mar 17 20:19:19 localhost kernel: [c00000002cba7a90] [d000000000ee5278]
.__stp_utrace_task_finder_target_quiesce+0x178/0x394
[stap_484930766b74f315cd7f061cecaed62f_46246]
Mar 17 20:19:19 localhost kernel: [c00000002cba7b70] [c0000000001060a8]
.start_callback+0x90/0x11c
Mar 17 20:19:19 localhost kernel: [c00000002cba7c20] [c00000000010862c]
.utrace_resume+0xfc/0x160
Mar 17 20:19:19 localhost kernel: [c00000002cba7ce0] [c000000000015298]
.do_signal+0x308/0x334
Mar 17 20:19:19 localhost kernel: [c00000002cba7e30] [c000000000008c48]
do_work+0x24/0x28
Mar 17 20:19:19 localhost kernel: usr_itrace_init: completed for tid = 3725
Mar 17 20:19:20 localhost kernel: cleanup_usr_itrace called
Mar 17 20:19:20 localhost kernel: remove_usr_itrace_info: tid=3725
Mar 17 20:19:42 localhost kernel: create_itrace_info: tid=3742
Mar 17 20:19:42 localhost kernel: usr_itrace_init: completed for tid = 3742
=====================================================================
diff -paur systemtap/runtime/itrace.c systemtap-fix/runtime/itrace.c
--- systemtap/runtime/itrace.c 2009-03-17 20:29:44.000000000 -0400
+++ systemtap-fix/runtime/itrace.c 2009-03-17 20:30:48.000000000 -0400
@@ -275,6 +275,7 @@ static int usr_itrace_init(int single_st
struct itrace_info *ui;
struct task_struct *tsk;
+ spin_lock_init(&itrace_lock);
rcu_read_lock();
tsk = find_task_by_vpid(tid);
if (!tsk) {
@@ -293,11 +294,6 @@ static int usr_itrace_init(int single_st
put_task_struct(tsk);
rcu_read_unlock();
- spin_lock_init(&itrace_lock);
-
- /* set initial state */
- spin_lock(&itrace_lock);
- spin_unlock(&itrace_lock);
printk(KERN_INFO "usr_itrace_init: completed for tid = %d\n", tid);
return 0;
@@ -314,7 +310,6 @@ void static remove_usr_itrace_info(struc
if (debug)
printk(KERN_INFO "remove_usr_itrace_info: tid=%d\n", ui->tid);
- spin_lock(&itrace_lock);
if (ui->tsk && ui->engine) {
status = utrace_control(ui->tsk, ui->engine, UTRACE_DETACH);
if (status < 0 && status != -ESRCH && status != -EALREADY)
@@ -322,6 +317,7 @@ void static remove_usr_itrace_info(struc
"utrace_control(UTRACE_DETACH) returns %d\n",
status);
}
+ spin_lock(&itrace_lock);
list_del(&ui->link);
spin_unlock(&itrace_lock);
kfree(ui);