This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



ia64 hang when using itrace (was Re: Backward compatibility for insn probe point)


David Smith wrote:
> David Smith wrote:
>> Maynard Johnson wrote:
>>> David Smith wrote:
>>>> One last thing.  I thought I'd try block stepping, so I got access to an
>>>> ia64 machine.  Unfortunately, using systemtap insn probes (either single
>>>> or block step) locks up the system with a spinlock lockup.  Sigh.
>>> Does anyone know who maintains ia64/utrace?  David, was the above error
>>> on "old" utrace or "new"?
>> The error is on "old" utrace.  I'm trying to look into the ia64 utrace
>> problem now.
> 
> Here's what I see on the console (running lockdep enabled
> 2.6.18-146.el5debug):
> 
> ====
> BUG: spinlock lockup on CPU#0, ls/2576, e0000040fe1092d8 (Tainted: G)
> 
> Call Trace:
>  [<a000000100013b40>] show_stack+0x40/0xa0
>                                 sp=e0000003f640f870 bsp=e0000003f6409440
>  [<a000000100013bd0>] dump_stack+0x30/0x60
>                                 sp=e0000003f640fa40 bsp=e0000003f6409428
>  [<a0000001002de200>] _raw_spin_lock+0x200/0x260
>                                 sp=e0000003f640fa40 bsp=e0000003f64093e8
>  [<a00000010065ff50>] _spin_lock_irqsave+0x30/0x60
>                                 sp=e0000003f640fa40 bsp=e0000003f64093c0
>  [<a00000010009c730>] force_sig_info+0x30/0x160
>                                 sp=e0000003f640fa40 bsp=e0000003f6409380
>  [<a000000100661450>] ia64_fault+0xff0/0x1280
>                                 sp=e0000003f640fa40 bsp=e0000003f6409328
>  [<a00000010000bfe0>] __ia64_leave_kernel+0x0/0x280
>                                 sp=e0000003f640fc60 bsp=e0000003f6409328
>  [<a0000001002de0d0>] _raw_spin_lock+0xd0/0x260
>                                 sp=e0000003f640fe30 bsp=e0000003f64092c0
>  [<a00000010065ff50>] _spin_lock_irqsave+0x30/0x60
>                                 sp=e0000003f640fe30 bsp=e0000003f6409298
>  [<a00000010009c730>] force_sig_info+0x30/0x160
>                                 sp=e0000003f640fe30 bsp=e0000003f6409258
>  [<a00000010009c890>] force_sig+0x30/0x60
>                                 sp=e0000003f640fe30 bsp=e0000003f6409230
>  [<a00000010002cfe0>] syscall_trace_leave+0x100/0x140
>                                 sp=e0000003f640fe30 bsp=e0000003f64091d0
>  [<a00000010000bda0>] __ia64_trace_syscall+0x100/0x110
>                                 sp=e0000003f640fe30 bsp=e0000003f64091d0
>  [<a000000000010620>] __start_ivt_text+0xffffffff00010620/0x400
>                                 sp=e0000003f6410000 bsp=e0000003f64091d0
> ====
> 
> From what I can tell, the stuck spinlock is
> current->sighand->siglock.  force_sig_info() (from kernel/signal.c:739)
> grabs the spinlock, but we apparently take a fault somewhere and end up in
> __ia64_leave_kernel() (from arch/ia64/kernel/entry.S:813).  The fault
> handling in ia64_fault() then calls force_sig_info() again, which tries to
> grab the same spinlock a second time.
> 
> If anyone has a better understanding of this, I'd love to know how we
> ended up in __ia64_leave_kernel().

I should have included some other information I have.  This always happens
after a call to set_tid_address(), which is the 79th syscall that 'ls'
runs.  By this point the insn probe has been hit at least 555391 times
(my test script prints the number of instructions at every syscall entry
and exit).

-- 
David Smith
dsmith@redhat.com
Red Hat
http://www.redhat.com
256.217.0141 (direct)
256.837.0057 (fax)

