This is the mail archive of the gdb-patches@sources.redhat.com mailing list for the GDB project.



Re: RFC: nptl threading patch for linux


On Fri, May 09, 2003 at 07:38:34PM -0400, J. Johnston wrote:
> Daniel Jacobowitz wrote:
> >On Thu, Apr 24, 2003 at 04:52:04PM -0400, J. Johnston wrote:
> >
> >>The following is the last part of my revised nptl patch that has
> >>been broken up per Daniel J.'s suggestion.  There are no generated
> >>files included in the patch.
> >
> >
> >Well, this patch doesn't work for me :(  Using 2.5.69, since I don't
> >have any of the Red Hat kernels available here at the moment.  It looks
> >like GDB bellies up around the second thread creation.
> >
> 
> Is this one of the gdb.threads testcases?  If not, do any of those run
> for you and/or can you send me a testcase for the problem below so we can
> at least have something common to compare?

Sorry, I forgot to say.  This is just pthreads.exp, with a breakpoint
on common_routine.

> 
> -- Jeff J.
> 
> >A backtrace looks like:
> >#0  0xffffe402 in ?? ()
> >#1  0x080e1332 in stop_wait_callback (lp=0x0, data=0xbffff450)
> >    at /opt/src/gdb/src-gdblinks/gdb/lin-lwp.c:708
> >#2  0x080e159a in stop_wait_callback (lp=0x0, data=0xbffff450)
> >    at /opt/src/gdb/src-gdblinks/gdb/lin-lwp.c:870
> >#3  0x080e159a in stop_wait_callback (lp=0x0, data=0xbffff450)
> >    at /opt/src/gdb/src-gdblinks/gdb/lin-lwp.c:870
> >#4  0x080e159a in stop_wait_callback (lp=0x0, data=0xbffff450)
> >    at /opt/src/gdb/src-gdblinks/gdb/lin-lwp.c:870
> >
> >And that's not just the stack unwinder getting confused.  We really did
> >recurse until we ran out of stack.
> >
> >The superficial reason is this:
> >SWC: Pending event Segmentation Fault (stopped) in LWP 4490
> >
> >i.e. every time we resume it with no signal it SIGSEGV's again, and we
> >never get the SIGSTOP.
> >
> >Here's some more of the log:
> >(gdb) c
> >Continuing.
> >LLR: PTRACE_SINGLESTEP process 4498, 0 (resume event thread)
> >LLW: waitpid 4498 received Trace/breakpoint trap (stopped)
> >LLTA: PTRACE_PEEKUSER LWP 4498, 0, 0 (OK)
> >LLW: Candidate event Trace/breakpoint trap (stopped) in LWP 4498.
> >SEL: Select single-step LWP 4498
> >LLW: trap_ptid is LWP 4498.
> >RC:  PTRACE_CONT LWP 4497, 0, 0 (resume sibling)
> >LLR: PTRACE_CONT process 4498, 0 (resume event thread)
> >LLW: waitpid 4497 received Trace/breakpoint trap (stopped)
> >LLTA: PTRACE_PEEKUSER LWP 4497, 0, 0 (OK)
> >LLW: Candidate event Trace/breakpoint trap (stopped) in LWP 4497.
> >SC:  kill LWP 4498 **<SIGSTOP>**
> >SC:  lwp kill 0 ERRNO-OK
> >SWC: waitpid LWP 4498 received Stopped (signal) (stopped)
> >LLTA: PTRACE_PEEKUSER LWP 4498, 0, 0 (OK)
> >LLW: trap_ptid is LWP 4497.
> >[New Thread 1077276112 (LWP 4499)]
> >LLAL: PTRACE_ATTACH LWP 4499, 0, 0 (OK)
> >LLAL: waitpid LWP 4499 received Stopped (signal) (stopped)
> >LLR: PTRACE_SINGLESTEP process 4497, 0 (resume event thread)
> >LLW: waitpid 4497 received Trace/breakpoint trap (stopped)
> >LLTA: PTRACE_PEEKUSER LWP 4497, 0, 0 (OK)
> >LLW: Candidate event Trace/breakpoint trap (stopped) in LWP 4497.
> >SEL: Select single-step LWP 4497
> >LLW: trap_ptid is LWP 4497.
> >RC:  PTRACE_CONT LWP 4499, 0, 0 (resume sibling)
> >RC:  PTRACE_CONT LWP 4498, 0, 0 (resume sibling)
> >LLR: PTRACE_CONT process 4497, 0 (resume event thread)
> >LLW: waitpid 4499 received Trace/breakpoint trap (stopped)
> >LLTA: PTRACE_PEEKUSER LWP 4499, 0, 0 (OK)
> >LLW: Candidate event Trace/breakpoint trap (stopped) in LWP 4499.
> >SC:  kill LWP 4498 **<SIGSTOP>**
> >SC:  lwp kill 0 ERRNO-OK
> >SC:  kill LWP 4497 **<SIGSTOP>**
> >SC:  lwp kill 0 ERRNO-OK
> >SWC: waitpid LWP 4498 received Stopped (signal) (stopped)
> >LLTA: PTRACE_PEEKUSER LWP 4498, 0, 0 (OK)
> >SWC: waitpid LWP 4497 received Trace/breakpoint trap (stopped)
> >LLTA: PTRACE_PEEKUSER LWP 4497, 0, 0 (OK)
> >PTRACE_CONT LWP 4497, 0, 0 (OK)
> >SWC: Candidate SIGTRAP event in LWP 4497
> >SWC: waitpid LWP 4497 received Trace/breakpoint trap (stopped)
> >LLTA: PTRACE_PEEKUSER LWP 4497, 0, 0 (OK)
> >PTRACE_CONT LWP 4497, 0, 0 (OK)
> >SWC: Candidate SIGTRAP event in LWP 4497
> >SWC: waitpid LWP 4497 received Segmentation fault (stopped)
> >LLTA: PTRACE_PEEKUSER LWP 4497, 0, 0 (OK)
> >SWC: Pending event Segmentation fault (stopped) in LWP 4497
> >SWC: PTRACE_CONT LWP 4497, 0, 0 (OK)
> >SWC: waitpid LWP 4497 received Segmentation fault (stopped)
> >LLTA: PTRACE_PEEKUSER LWP 4497, 0, 0 (OK)
> >
> >
> >A little interpretation: 4497 hits the creation breakpoint.  We attach
> >to 4499.  4499 hits the common_routine breakpoint.  We stop 4497.  It
> >hits the breakpoint at thread creation again for the next thread.  We
> >PTRACE_CONT 4497 again trying to get the SIGSTOP, and get another
> >SIGTRAP - probably we were backed up from the breakpoint last time so
> >we hit it again.  We try _again_, and SIGSEGV because we're on the
> >second byte of a multi-byte instruction, the first byte having been
> >replaced by a breakpoint.
> >
> >Life explodes.
> >
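The interpretation above turns on how a breakpoint trap moves the program counter. A minimal sketch of that arithmetic, assuming an x86 target (suggested by the addresses in the backtrace) where the breakpoint is the one-byte int3 instruction; the function names and the example instruction length are hypothetical and are not taken from lin-lwp.c:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: x86, where the breakpoint instruction int3 (0xCC) is one byte. */
#define X86_BREAKPOINT_LEN 1u

/* After int3 executes, the reported PC is one byte past the breakpoint. */
uint32_t pc_reported_after_trap(uint32_t bp_addr)
{
    return bp_addr + X86_BREAKPOINT_LEN;
}

/* PC the thread resumes from: either backed up onto the breakpoint
   address, or left exactly where the trap reported it. */
uint32_t resume_pc(uint32_t reported_pc, bool back_up)
{
    assert(reported_pc >= X86_BREAKPOINT_LEN);
    return back_up ? reported_pc - X86_BREAKPOINT_LEN : reported_pc;
}

/* True if pc lies strictly inside the instruction that starts at
   insn_addr and is insn_len bytes long. */
bool pc_is_mid_instruction(uint32_t pc, uint32_t insn_addr, uint32_t insn_len)
{
    return pc > insn_addr && pc < insn_addr + insn_len;
}
```

With the PC backed up, a plain PTRACE_CONT lands on the still-inserted breakpoint and traps again. Without the back-up, execution restarts inside the original multi-byte instruction, which matches the SIGSEGV described above.
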
> >
> >So:
> >  - stop_wait_callback should be fixed to not be so dumb when this
> >    happens.
> >  - we need to figure out how we got into this mess.
> >  - and why the SIGSTOP never showed up.
> >
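One way to read the first item in the list above is to replace the unbounded recursion with a bounded loop. This is only a sketch under stated assumptions: next_signal_fn stands in for a PTRACE_CONT followed by waitpid, and every name here (wait_for_sigstop, struct replay, replay_next) is hypothetical rather than taken from lin-lwp.c.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical stand-in for "resume the LWP and wait for its next stop";
   returns the signal the LWP stopped with. */
typedef int (*next_signal_fn)(void *ctx);

enum wait_result { GOT_SIGSTOP, GAVE_UP };

/* Wait for the SIGSTOP we sent, iterating instead of recursing and
   giving up after max_attempts.  The first non-SIGSTOP signal seen is
   stored in *first_other so it can be reported later. */
enum wait_result
wait_for_sigstop(next_signal_fn next, void *ctx, int max_attempts,
                 int *first_other)
{
    *first_other = 0;
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        int sig = next(ctx);
        if (sig == SIGSTOP)
            return GOT_SIGSTOP;
        if (*first_other == 0)
            *first_other = sig;
    }
    return GAVE_UP;
}

/* Test helper: replays a fixed list of signals, then repeats the last. */
struct replay { const int *sigs; size_t len; size_t pos; };

int replay_next(void *ctx)
{
    struct replay *r = ctx;
    assert(r->len > 0);
    size_t i = r->pos < r->len ? r->pos++ : r->len - 1;
    return r->sigs[i];
}
```

Replaying the sequence from the log (SIGTRAP, SIGTRAP, then SIGSEGV repeatedly) makes the loop give up after the bound instead of exhausting the stack. What the caller should do after giving up is left open here.
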
> >I avoid this entire foul issue in gdbserver by not backtracking and
> >resuming the application; instead I just set a flag marking the next
> >SIGSTOP as "expected".  It's still not perfect but it's a great deal
> >better.  I can do even better when I have some time to play with
> >PTRACE_GETSIGINFO.
> >
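The "expected SIGSTOP" idea described above could be sketched roughly as follows. This is not gdbserver's code: struct lwp_state, request_stop and handle_stop are hypothetical names, and the real kill and waitpid calls are left out so that only the bookkeeping is shown.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-LWP bookkeeping. */
struct lwp_state {
    int  lwpid;
    bool stop_expected;   /* a SIGSTOP was sent and has not been seen yet */
    int  pending_signal;  /* last event kept for later reporting, 0 if none */
};

/* Note that a SIGSTOP has been sent to this LWP.  A real debugger
   would also deliver the signal here. */
void request_stop(struct lwp_state *lwp)
{
    assert(lwp != NULL);
    lwp->stop_expected = true;
}

/* Classify a stop reported by the wait call.  Returns false if the stop
   is the SIGSTOP we asked for and should be swallowed, true if it is a
   real event to keep.  The LWP is never resumed just to fish for the
   SIGSTOP; the flag stays set until the SIGSTOP eventually arrives. */
bool handle_stop(struct lwp_state *lwp, int sig)
{
    assert(lwp != NULL);
    if (sig == SIGSTOP && lwp->stop_expected) {
        lwp->stop_expected = false;
        return false;
    }
    lwp->pending_signal = sig;
    return true;
}
```

In the scenario from the log, the SIGTRAP that arrives before the SIGSTOP is kept as a pending event and the thread is left stopped where it is. The SIGSTOP that turns up after a later resume is then quietly discarded.
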
> >I'm waiting for GDB to tell me how we got here.  The backtrace is more
> >than 40K frames, since I forgot to shrink the stack limit.  50K...
> >170K... ooh!
> >
> >#174697 0x080e1724 in stop_wait_callback (lp=0x0, data=0xbffff450)
> >    at /opt/src/gdb/src-gdblinks/gdb/lin-lwp.c:830
> >#174698 0x080e033d in iterate_over_lwps (callback=0x80e12d0 <stop_wait_callback>, data=0x1181)
> >    at /opt/src/gdb/src-gdblinks/gdb/lin-lwp.c:293
> >#174699 0x080e251e in lin_lwp_wait (ptid={pid = -1, lwp = 0, tid = 0}, ourstatus=0x72)
> >    at /opt/src/gdb/src-gdblinks/gdb/lin-lwp.c:1499
> >#174700 0x08128ca3 in thread_db_wait (ptid={pid = -1, lwp = 0, tid = 0}, ourstatus=0xffffffff)
> >    at /opt/src/gdb/src-gdblinks/gdb/thread-db.c:846
> >#174701 0x080bc19e in wait_for_inferior ()
> >    at /opt/src/gdb/src-gdblinks/gdb/infrun.c:1003
> >#174702 0x080bbf13 in proceed (addr=3221222720, siggnal=144, step=0)
> >    at /opt/src/gdb/src-gdblinks/gdb/infrun.c:814
> >#174703 0x080b8fb0 in continue_command (proc_count_exp=0x0, from_tty=1)
> >    at /opt/src/gdb/src-gdblinks/gdb/infcmd.c:539
> >
> >It wasn't worth the wait.  That didn't help much.
> >
> >
> 
> 
> 

-- 
Daniel Jacobowitz
MontaVista Software                         Debian GNU/Linux Developer

