Summary: | simple script oops box recursively | ||
---|---|---|---|
Product: | systemtap | Reporter: | James Dickens <jamesd.wi> |
Component: | kprobes | Assignee: | Prasanna S Panchamukhi <prasanna> |
Status: | RESOLVED DUPLICATE | ||
Severity: | normal | CC: | amavin, jkenisto |
Priority: | P2 | ||
Version: | unspecified | ||
Target Milestone: | --- | ||
Host: | Target: | ||
Build: | Last reconfirmed: |
Description
James Dickens
2005-10-28 21:31:09 UTC
Prasanna, can you look into this bug? Note that the variable "called" is not declared global.

I am seeing different behaviours when a kernel module is loaded to insert probes on sched_clock() and the handlers call printk. On an i386 uniprocessor box, I see lots of oops messages. On an i386 SMP box, the probe handlers are executed and I can see messages on the console; the system is not hung, but it does not allow me to remove the loaded kernel module either.

If no printks are used in the handlers, it runs fine on both the uniprocessor and the SMP box. One solution would be to prevent printk calls from probe handlers of sched_clock(). Another solution would be to avoid probes on sched_clock().

Also, there are situations where inserted probes cannot be removed from the command line. In such situations we need to provide SysRq key support to remove all the probes from the kernel.

-Prasanna

Subject: Re: simple script oops box recursively

On 11 Nov 2005 14:38:43 -0000, prasanna at in dot ibm dot com <sourceware-bugzilla@sourceware.org> wrote:
> ------- Additional Comments From prasanna at in dot ibm dot com 2005-11-11 14:38 -------
> I am seeing different behaviours when a kernel module is loaded to insert probes
> on sched_clock() and the handlers call printk. On an i386 uniprocessor box, I see
> lots of oops messages. On an i386 SMP box, the probe handlers are executed and I
> can see messages on the console; the system is not hung, but it does not allow me
> to remove the loaded kernel module either.
>
> If no printks are used in the handlers, it runs fine on both the uniprocessor and
> the SMP box. One solution would be to prevent printk calls from probe handlers of
> sched_clock(). Another solution would be to avoid probes on sched_clock().

Please see my comments in http://sourceware.org/bugzilla/show_bug.cgi?id=1776 about printk requirements.

> Also, there are situations where inserted probes cannot be removed from the
> command line. In such situations we need to provide SysRq key support to
> remove all the probes from the kernel.

This isn't a solution if you ever wish systemtap to be useful on any system other than a single developer's box. Most production boxes don't have a keyboard attached. Are you going to add a quick hack to ssh that sends the SysRq key to a task? You need to solve the real problem rather than add another quick hack that hides or removes the symptoms.

James Dickens
uadmin.blogspot.com

> -Prasanna
>
> http://sourceware.org/bugzilla/show_bug.cgi?id=1594

Below is the stack trace when I run the given script. I also get a double fault, as seen in bug #1776.

    wks126319wss.in.ibm.com login: double fault, gdt at c04ea000 [255 bytes]
    Kernel panic - not syncing: kernel/sched.c:357: spin_lock(kernel/sched.c:c04eac40) already locked by kernel/sched.c/357. (Not tainted)
     [<c0127ba8>] panic+0x45/0x1b4
     [<c01207fa>] wake_up_process+0x0/0x10
     [<c01519d7>] autoremove_wake_function+0x15/0x37
     [<c012186b>] __wake_up_common+0x39/0x59
     [<c012191d>] __wake_up+0x92/0x252
     [<c0128c93>] call_console_drivers+0x7e/0x149
     [<c01298c3>] release_console_sem+0x283/0x41c
     [<c01292b0>] vprintk+0x401/0x70e
     [<c0128eab>] printk+0x1b/0x1f
     [<c010dc16>] doublefault_fn+0x36/0xf0
    <0>Kernel panic - not syncing: kernel/sched.c:3063: spin_lock(kernel/printk.c:c0458d00) already locked by kernel/sched.c/3063. (Not tain)
     [<c0127ba8>] panic+0x45/0x1b4
     [<c0271f16>] vgacon_dummy+0x0/0xa
     [<c0121add>] __wake_up_locked+0x0/0x21
     [<c01298c3>] release_console_sem+0x283/0x41c
     [<c01292b0>] vprintk+0x401/0x70e
     [<c01292b0>] vprintk+0x401/0x70e
     [<c025ce8a>] vsnprintf+0x32c/0x624

I get the following trace when I run this simple script. I am not sure which code added by the translator is causing the problem.

    probe kernel.function("sched_clock") { }

    login: double fault, gdt at c04ea000 [255 bytes]
    Kernel panic - not syncing: kernel/sched.c:357: spin_lock(kernel/sched.c:c04eac40) already locked by kernel/sched.c/357. (Not tainted)

-Prasanna
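The lock recursion reported above can be illustrated with a small user-space model. This is only a sketch in Python, not kernel code, and it rests on one assumption drawn from the traces: the code path that calls sched_clock() already holds a scheduler lock, and a printk issued from the probe handler ends up needing that same lock. All names here (DebugSpinLock, rq_lock, probes, schedule) are hypothetical stand-ins. The model shows why a printing handler fails and why a handler that only records its message for later output does not.

```python
class DebugSpinLock:
    """Non-recursive lock that raises when re-acquired, like a debug spinlock."""

    def __init__(self, name):
        self.name = name
        self.held = False

    def __enter__(self):
        if self.held:
            raise RuntimeError(f"spin_lock({self.name}) already locked")
        self.held = True
        return self

    def __exit__(self, *exc):
        self.held = False
        return False  # never swallow exceptions


rq_lock = DebugSpinLock("runqueue")  # stand-in for the scheduler lock
console = []                         # stand-in for the console log
probes = {}                          # function name -> handler (assumed registry)


def printk(msg):
    """Log a message; waking the log reader is assumed to need rq_lock."""
    console.append(msg)
    with rq_lock:
        pass


def sched_clock():
    """Probed function: run the registered handler, if any, before the body."""
    handler = probes.get("sched_clock")
    if handler is not None:
        handler()
    return 0


def schedule():
    """Caller that holds rq_lock while it reads the clock."""
    with rq_lock:
        return sched_clock()


if __name__ == "__main__":
    # Handler that prints directly: re-enters rq_lock and fails.
    probes["sched_clock"] = lambda: printk("probe hit")
    try:
        schedule()
    except RuntimeError as err:
        print("printing handler:", err)

    # Handler that only records the message; output happens after unlock.
    pending = []
    probes["sched_clock"] = lambda: pending.append("probe hit")
    schedule()
    for msg in pending:
        printk(msg)
    print("deferred handler logged", len(pending), "message(s)")
```

The deferred variant corresponds to the first workaround proposed in this thread (no printk from sched_clock() handlers); leaving "sched_clock" out of the probes table corresponds to the second (not probing sched_clock() at all).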