This is the mail archive of the
mailing list for the GDB project.
Re: [rfa/mips] Stop backtraces when we've lost the PC
On Sun, Mar 07, 2004 at 10:23:24PM -0500, Daniel Jacobowitz wrote:
> On Sun, Mar 07, 2004 at 07:56:18PM -0500, Andrew Cagney wrote:
> > >Here's an updated version of a little hack I've been using since GDB 6.0.
> > >If we are in a nested normal frame, i.e. something whose next frame is a
> > >function that it called in the normal way, and we didn't find a saved PC,
> > >we're going to be stuck in a loop. We might have been able to figure out
> > >the frame size, but not where the return address was stored; as the comment
> > >says, this happens in glibc's clone function. Of course the problem there
> > >is that it _doesn't_ save $ra in the normal fashion; it won't return.
> > >
> > >Without this patch schedlock.exp falls apart, because backtraces continue
> > >forever printing "clone()" on every line.
> > >
> > >OK?
> > No!
> > >+ if (frame_relative_level (next_frame) >= 0
> > >+ && get_frame_type (next_frame) == NORMAL_FRAME
> > >+ && !trad_frame_addr_p (info->saved_regs, NUM_REGS + PC_REGNUM))
> > The whole point of this recursive code is that you don't need to add
> > hacks that look down the stack at the type of more inner frames :-/
> > Can you post the relevant assembler so that we can determine why the
> > prologue analyzer is failing.
> This is clone. It can't be unwound past - there is no stack behind it
> (we're in the child, here).
> 00113730 <__clone>:
> 113730: 3c1c000c lui gp,0xc
> 113734: 279ca5c0 addiu gp,gp,-23104
> 113738: 0399e021 addu gp,gp,t9
> 11373c: 27bdffe8 addiu sp,sp,-24
> 113740: afbc0014 sw gp,20(sp)
> 113744: 10800010 beqz a0,113788 <__clone+0x58>
> 113748: 24020016 li v0,22
> 11374c: 10a0000e beqz a1,113788 <__clone+0x58>
> 113750: 00000000 nop
> 113754: 24a5ffe0 addiu a1,a1,-32
> 113758: aca40000 sw a0,0(a1)
> 11375c: aca70004 sw a3,4(a1)
> 113760: 00c02021 move a0,a2
> 113764: 24021018 li v0,4120
> 113768: 0000000c syscall
> 11376c: 14e00006 bnez a3,113788 <__clone+0x58>
> 113770: 00000000 nop
> 113774: 10400008 beqz v0,113798 <__thread_start>
> 113778: 00000000 nop
> 11377c: 03e00008 jr ra
> 113780: 27bd0018 addiu sp,sp,24
> 113784: 00000000 nop
> 113788: 8f998290 lw t9,-32112(gp)
> 11378c: 27bd0018 addiu sp,sp,24
> 113790: 03200008 jr t9
> 113794: 00000000 nop
> 00113798 <__thread_start>:
> 113798: afbc0014 sw gp,20(sp)
> 11379c: 8fb90000 lw t9,0(sp)
> 1137a0: 8fa40004 lw a0,4(sp)
> 1137a4: 0320f809 jalr t9
> 1137a8: 00000000 nop
> 1137ac: 8fbc0014 lw gp,20(sp)
> 1137b0: 00402021 move a0,v0
> 1137b4: 8f9991f4 lw t9,-28172(gp)
> 1137b8: 00000000 nop
> 1137bc: 0320f809 jalr t9
> 1137c0: 00000000 nop
> The last jalr is exit.
> And here's its PDR (little endian):
> 0e0c0 30371100 00000000 00000000 00000000 07..............
> 0e0d0 00000000 10000000 1d000000 1d000000 ................
> 0e0e0 98371100 00000000 00000000 00000000 .7..............
> 0e0f0 00000000 00000000 00000000 00000000 ................
> Frame register is sp, stack size is 16, no registers saved for clone.
> Then stack size 0, no registers saved, frame register is zero for
> __thread_start, which is where we are. I don't know why that didn't
> terminate the stack trace right away, I'll bring the board back up
> tomorrow to find out.
OK, two things.
The first problem is that __thread_start is a local symbol, and the
copy of the library I'm using on my target is stripped of local symbols
but has a .pdr section. We pass the PC to find_proc_desc, which is
fine, but non_heuristic_proc_desc does:
find_pc_partial_function (pc, NULL, &startaddr, NULL);
and looks for a PDR associated with startaddr. Oopsie.
The problem is, GAS's .pdr information does not include a size or an
end address. It's slightly risky to search for the latest .pdr entry
preceding PC, because that might be for a completely different
function, but not very risky - .frame will generally be used
consistently if it was used at all.
I believe the attached patch is safe in a superset of circumstances to
where the previous code was safe. It causes us to select the correct
.pdr entry in this circumstance. What do you think of it?
[BTW: I verified that without a .pdr section we notice that we can't
find a prologue for clone, and give up unwinding semi-gracefully. I
also verified that with local symbols, we find the correct .pdr.]
The other thing:
I hypothesize that if two consecutive frames, regardless of their type,
claim to save the PC register at the same location, then unwinding is
hosed. If we changed the interface for gdbarch_unwind_pc to return
information on where, if anywhere, the PC was unwound from, we could
check this. This handles sigtramp frames (which save the PC at a
particular memory buffer), dummy frames (which would say not_lval, i.e.
the PC is not saved in a simple fashion, i.e. skip this test), typical
normal frames (the sentinel frame says it's saved in a register, the
outermost frame says it's saved in the return address register or in a
local stack slot, normal frames which called other normal frames will
have it in a local stack slot.
Obviously for some architecture with a reconstructed not_lval PC this
wouldn't work, but I think it could still be useful as a safety check
on the unwinder. Not a safe termination - this sort of hack doesn't
have any business being relied on - but at least a call to error() to
prevent spiralling off into infinity.
Does my hypothesis make sense? Shall I try to implement it?
MontaVista Software Debian GNU/Linux Developer
2004-03-08 Daniel Jacobowitz <email@example.com>
* mips-tdep.c (non_heuristic_proc_desc): Search using the specified
PC rather than the partial function start address. Use the start
address to sanity check the found PDR.
RCS file: /big/fsf/rsync/src-cvs/src/gdb/mips-tdep.c,v
retrieving revision 1.283
diff -u -p -r1.283 mips-tdep.c
--- mips-tdep.c 17 Feb 2004 15:21:21 -0000 1.283
+++ mips-tdep.c 8 Mar 2004 15:33:25 -0000
@@ -2289,10 +2289,6 @@ non_heuristic_proc_desc (CORE_ADDR pc, C
if (DEPRECATED_PC_IN_CALL_DUMMY (pc, 0, 0))
- find_pc_partial_function (pc, NULL, &startaddr, NULL);
- if (addrptr)
- *addrptr = startaddr;
priv = NULL;
sec = find_pc_section (pc);
@@ -2352,38 +2348,60 @@ non_heuristic_proc_desc (CORE_ADDR pc, C
int low, mid, high;
+ CORE_ADDR pdr_pc;
low = 0;
high = priv->size / 32;
+ /* First, find the last .pdr entry starting at or before PC. */
- CORE_ADDR pdr_pc;
mid = (low + high) / 2;
ptr = priv->contents + mid * 32;
pdr_pc = bfd_get_signed_32 (sec->objfile->obfd, ptr);
pdr_pc += ANOFFSET (sec->objfile->section_offsets,
- if (pdr_pc == startaddr)
+ if (pdr_pc == pc)
- if (pdr_pc > startaddr)
+ if (pdr_pc > pc)
high = mid;
low = mid + 1;
while (low != high);
- if (low != high)
+ /* If we got an exact match on PC, low != high. Otherwise,
+ both low and high point one past the PDR of interest. */
+ if (low == high && low > 0)
+ ptr = priv->contents + (low - 1) * 32;
+ pdr_pc = bfd_get_signed_32 (sec->objfile->obfd, ptr);
+ pdr_pc += ANOFFSET (sec->objfile->section_offsets,
+ SECT_OFF_TEXT (sec->objfile));
+ /* We don't have a range, so we have no way to know for sure
+ whether we're in the correct PDR or a PDR for a preceding
+ function and the current function was a stripped local
+ symbol. But if the PDR's PC is at least as great as the
+ best guess from the symbol table, assume that it does cover
+ the right area; if a .pdr section is present at all then
+ nearly every function will have an entry. */
+ find_pc_partial_function (pc, NULL, &startaddr, NULL);
+ if (startaddr <= pdr_pc)
struct symbol *sym = find_pc_function (pc);
+ if (addrptr)
+ *addrptr = pdr_pc;
/* Fill in what we need of the proc_desc. */
proc_desc = (mips_extra_func_info_t)
sizeof (struct mips_extra_func_info));
- PROC_LOW_ADDR (proc_desc) = startaddr;
+ PROC_LOW_ADDR (proc_desc) = pdr_pc;
/* Only used for dummy frames. */
PROC_HIGH_ADDR (proc_desc) = 0;
@@ -2412,6 +2430,10 @@ non_heuristic_proc_desc (CORE_ADDR pc, C
if (b == NULL)
+ find_pc_partial_function (pc, NULL, &startaddr, NULL);
+ if (addrptr)
+ *addrptr = startaddr;
if (startaddr > BLOCK_START (b))