This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[PATCH] fix for 'spinlock bad magic' error with itrace probe point


After Dave Nomura first committed the itrace support back in August 2008, Frank reported in an Aug 6 posting that the itrace test failed on x86_64 with "BUG: soft lockup - CPU#0 stuck for 10s!". Using a current SystemTap git pull on a ppc64 Fedora 11 (rawhide) system, I have encountered a 'spinlock bad magic' error when testing the itrace probe point. The system was configured with various debugging options (e.g., CONFIG_LOCKDEP, CONFIG_DEBUG_SPINLOCK). This error does not occur on systems running production kernels vs a debug kernel.

I believe the same underlying bug (an uninitialized spinlock in systemtap/runtime/itrace.c) was the cause of both symptoms -- the one Frank reported and the one I'm seeing now. The attached patch fixes this problem.

For the current problem, an error message is sent to the screen, with more detailed information sent to the system log as follows:

---- Screen output -----
Message from syslogd@localhost at Mar 18 10:38:16 ...
 kernel:BUG: spinlock bad magic on CPU#1, probeme4/2703 (Not tainted)

Message from syslogd@localhost at Mar 18 10:38:16 ...
kernel: lock: d000000000c0e758, .magic: 00000000, .owner: probeme4/2703, .owner_cpu: 1


---- System log -----
Mar 17 20:19:19 localhost kernel: create_itrace_info: tid=3725
Mar 17 20:19:19 localhost kernel: BUG: spinlock bad magic on CPU#0, probeme4/3725 (Not tainted)
Mar 17 20:19:19 localhost kernel: lock: d000000000f16758, .magic: 00000000, .owner: probeme4/3725, .owner_cpu: 0
Mar 17 20:19:19 localhost kernel: Call Trace:
Mar 17 20:19:19 localhost kernel: [c00000002cba76c0] [c000000000012530] .show_stack+0x98/0x188 (unreliable)
Mar 17 20:19:19 localhost kernel: [c00000002cba7770] [c00000000060db1c] .dump_stack+0x28/0x3c
Mar 17 20:19:19 localhost kernel: [c00000002cba77f0] [c0000000002b5120] .spin_bug+0xd4/0x108
Mar 17 20:19:19 localhost kernel: [c00000002cba78a0] [c0000000002b5190] ._raw_spin_unlock+0x3c/0xd8
Mar 17 20:19:19 localhost kernel: [c00000002cba7930] [c000000000601114] ._spin_unlock+0x48/0x64
Mar 17 20:19:19 localhost kernel: [c00000002cba79d0] [d000000000ee89a8] ._stp_itrace_probe_cb+0x168/0x360 [stap_484930766b74f315cd7f061cecaed62f_46246]
Mar 17 20:19:19 localhost kernel: [c00000002cba7a90] [d000000000ee5278] .__stp_utrace_task_finder_target_quiesce+0x178/0x394 [stap_484930766b74f315cd7f061cecaed62f_46246]
Mar 17 20:19:19 localhost kernel: [c00000002cba7b70] [c0000000001060a8] .start_callback+0x90/0x11c
Mar 17 20:19:19 localhost kernel: [c00000002cba7c20] [c00000000010862c] .utrace_resume+0xfc/0x160
Mar 17 20:19:19 localhost kernel: [c00000002cba7ce0] [c000000000015298] .do_signal+0x308/0x334
Mar 17 20:19:19 localhost kernel: [c00000002cba7e30] [c000000000008c48] do_work+0x24/0x28
Mar 17 20:19:19 localhost kernel: usr_itrace_init: completed for tid = 3725
Mar 17 20:19:20 localhost kernel: cleanup_usr_itrace called
Mar 17 20:19:20 localhost kernel: remove_usr_itrace_info: tid=3725
Mar 17 20:19:42 localhost kernel: create_itrace_info: tid=3742
Mar 17 20:19:42 localhost kernel: usr_itrace_init: completed for tid = 3742


=====================================================================
diff -paur systemtap/runtime/itrace.c systemtap-fix/runtime/itrace.c
--- systemtap/runtime/itrace.c	2009-03-17 20:29:44.000000000 -0400
+++ systemtap-fix/runtime/itrace.c	2009-03-17 20:30:48.000000000 -0400
@@ -275,6 +275,7 @@ static int usr_itrace_init(int single_st
 	struct itrace_info *ui;
 	struct task_struct *tsk;
 
+	spin_lock_init(&itrace_lock);
 	rcu_read_lock();
 	tsk = find_task_by_vpid(tid);
 	if (!tsk) {
@@ -293,11 +294,6 @@ static int usr_itrace_init(int single_st
 	put_task_struct(tsk);
 	rcu_read_unlock();
 
-	spin_lock_init(&itrace_lock);
-
-	/* set initial state */
-	spin_lock(&itrace_lock);
-	spin_unlock(&itrace_lock);
 	printk(KERN_INFO "usr_itrace_init: completed for tid = %d\n", tid);
 
 	return 0;
@@ -314,7 +310,6 @@ void static remove_usr_itrace_info(struc
 	if (debug)
 		printk(KERN_INFO "remove_usr_itrace_info: tid=%d\n", ui->tid);
 
-	spin_lock(&itrace_lock);
 	if (ui->tsk && ui->engine) {
 		status = utrace_control(ui->tsk, ui->engine, UTRACE_DETACH);
 		if (status < 0 && status != -ESRCH && status != -EALREADY)
@@ -322,6 +317,7 @@ void static remove_usr_itrace_info(struc
 			       "utrace_control(UTRACE_DETACH) returns %d\n",
 			       status);
 	}
+	spin_lock(&itrace_lock);
 	list_del(&ui->link);
 	spin_unlock(&itrace_lock);
 	kfree(ui);

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]