This is the mail archive of the gdb@sources.redhat.com mailing list for the GDB project.
Re: backtrace through 'sleep', (1255 and 1253)
- From: Elena Zannoni <ezannoni at redhat dot com>
- To: Michael Elizabeth Chastain <mec at shout dot net>
- Cc: gdb at sources dot redhat dot com
- Date: Mon, 4 Aug 2003 11:40:47 -0400
- Subject: Re: backtrace through 'sleep', (1255 and 1253)
- References: <200308021518.h72FISaN031424@duracef.shout.net>
Michael Elizabeth Chastain writes:
> Here's what I've learned so far.
>
> This is the code for 'sleep' in /lib/i686/libc.so.6:
>
> push %ebp
> xor %ecx, %ecx
> mov %esp, %ebp
> push %edi
> xor %edx, %edx
> ...
> call __i686.get_pc_thunk.bx
> add $0x7bfab, %ebx
> sub $0x1cc, %esp
> ...
>
> This is on a Red Hat Linux 8 system, native i686-pc-linux-gnu.
>
> This is C code, not hand-coded assembler! The "xor" instructions have been
> mixed into the prologue. They are just setting some variables to zero.
> The call to __i686.get_pc_thunk.bx comes from gcc -fpic.
>
> Here is the code in i386_frame_cache:
>
> frame_unwind_register (next_frame, I386_EBP_REGNUM, buf);
> cache->base = extract_unsigned_integer (buf, 4);
> if (cache->base == 0)
> return cache;
>
> cache->save_regs[I386_EIP_REGNUM] = 4;
>
> cache->pc = frame_func_unwind (next_frame);
> if (cache->pc != 0)
> i386_analyze_prologue (cache->pc, frame_pc_unwind (next_frame), cache);
>
> if (cache->locals < 0)
> {
> /* We didn't find a valid frame, which means that CACHE->base
> currently holds the frame pointer for our calling frame. If
> we're at the start of a function, or somewhere half-way its
> prologue, the function's frame probably hasn't been fully
> setup yet. Try to reconstruct the base address for the stack
> frame by looking at the stack pointer. For truly "frameless"
> functions this might work too. */
>
> frame_unwind_register (next_frame, I386_ESP_REGNUM, buf);
> cache->base = extract_unsigned_integer (buf, 4) + cache->sp_offset;
> }
>
> The etiology is:
>
> The prologue analyzer fails on this function because of the
> 'xor %ecx, %ecx'.
>
> So cache->locals == -1.
>
> /* We didn't find a valid frame ... */
>
> So the code behaves like it's in a frameless function. It grabs
> the stack pointer and adds an offset to it and uses that for a frame.
>
> Whereas, in reality, the pc is in the middle of 'sleep' (well past the
> prologue), and there is a perfectly good frame. In fact if I undo the
> bogus re-assignment to cache->base in this case then the stack trace
> works fine.
>
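
To make the failure mode described above concrete, here is a minimal
sketch of just the base-address decision. It is a toy model and not
GDB's code: toy_frame_base and its parameters are hypothetical names,
and the register values in the usage note are invented for
illustration.

```c
#include <stdint.h>

/* Toy model (hypothetical, not a GDB function) of the decision quoted
   above.  EBP and ESP are the unwound register values; LOCALS is -1
   when the prologue analyzer did not recognize a frame setup.  */
static uint32_t
toy_frame_base (uint32_t ebp, uint32_t esp, long locals, long sp_offset)
{
  uint32_t base = ebp;          /* start from the frame pointer */

  if (locals < 0)
    {
      /* "We didn't find a valid frame": treat the function as
         frameless and rebuild the base from the stack pointer.  */
      base = (uint32_t) ((int64_t) esp + sp_offset);
    }

  return base;
}
```

With an unrecognized prologue (locals == -1) and the pc well past the
prologue, toy_frame_base (0xbffff0a8, 0xbfffeec0, -1, -4) yields
0xbfffeebc, an address derived from the current stack pointer, even
though %ebp already held a perfectly good frame base.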
> Now, what to do about it ...
>
> Red Hat Linux 8 has an rpm for a debug version of glibc. The
> glibc-debug rpm installs libraries in /usr/lib/debug, rather than
> overwriting /lib/i686. I installed glibc-debug and set LD_LIBRARY_PATH
> to /usr/lib/debug, and it worked! The test cases in gdb/1253 and
> gdb/1255 both backtraced just fine!
FWIW, in general the Red Hat debug rpms contain only debug info, and a
section in the /lib/i686 libraries contains a pointer to them. You
shouldn't need to set LD_LIBRARY_PATH at all; just do a 'set
debug-file-directory /usr/lib/debug' and gdb should be able to
integrate the two. For glibc, though, they provide three flavors of
rpms: one without debug info (glibc); one that includes both the
libraries and the debug info (glibc-debug), which is the one you
installed; and one that includes only the debug info
(glibc-debuginfo), for which you can do what I described. The
glibc-debug files get installed in /usr/lib/debug/; the
glibc-debuginfo files get installed in /usr/lib/debug/lib/.
>
> Also, static-linking with glibc works, because the static version
> of 'sleep' has different code (no -fpic) with a prologue that gdb
> can digest.
>
> So we can either:
>
> . Document the problem and tell people to use a debugging glibc or
> static-link their program. Also send a message to vendors that they may
> want to make the debugging glibc the default glibc. Vendors may even
> want to patch their gcc to not mix other instructions into the prologue,
> because gdb is a lot more sensitive to un-analyzable prologues now.
>
Unlikely to happen, I am afraid :-(
> . Ask the gcc guys directly to not schedule any instructions between
> 'push %ebp' and 'mov %esp, %ebp'.
>
More likely.
> . Change gdb so that the prologue reader is more powerful. It doesn't
> take much to get through the 'xor %ecx, %ecx' instruction. The
> trouble is that there could be a billion different instructions
> in there ('mov any-register, immediate'). The advantage is that
> this would work without any changes to external software.
>
Yes. How did the prologue analyzer change between 5.3 and now?
elena
> . Do nothing, let the users suffer.
>
> . Something else?
>
> Michael C
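
For the "more powerful prologue reader" option, a rough sketch of what
getting through the 'xor %ecx, %ecx' could look like is given below.
This is not how i386_analyze_prologue is written: is_self_xor and
find_frame_setup are hypothetical helpers, and the scanner only skips
register-zeroing xor instructions between 'push %ebp' and
'mov %esp, %ebp'. It assumes the usual IA-32 encodings (0x55 for
push %ebp, 0x89 0xe5 or 0x8b 0xec for mov %esp, %ebp, and 0x31 or 0x33
followed by a ModRM byte for a register-to-register xor).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if the bytes at P encode
   "xor %reg, %reg" with the same register as source and target.  */
static int
is_self_xor (const uint8_t *p, size_t len)
{
  uint8_t mod, reg, rm;

  if (len < 2 || (p[0] != 0x31 && p[0] != 0x33))
    return 0;

  mod = p[1] >> 6;              /* 3 means register-direct operand */
  reg = (p[1] >> 3) & 7;
  rm = p[1] & 7;
  return mod == 3 && reg == rm;
}

/* Hypothetical scanner: return the offset just past "mov %esp, %ebp"
   if CODE starts with a frame setup, allowing self-xor instructions
   in between; return -1 if no frame setup is recognized.  */
static long
find_frame_setup (const uint8_t *code, size_t len)
{
  size_t i;

  if (len < 1 || code[0] != 0x55)       /* push %ebp */
    return -1;

  i = 1;
  while (is_self_xor (code + i, len - i))
    i += 2;                             /* skip xor %reg, %reg */

  if (len - i >= 2
      && ((code[i] == 0x89 && code[i + 1] == 0xe5)
          || (code[i] == 0x8b && code[i + 1] == 0xec)))
    return (long) (i + 2);              /* past mov %esp, %ebp */

  return -1;
}
```

Any other instruction scheduled into that gap still makes the scanner
give up, which is exactly the drawback of this option: every further
instruction form the compiler might place there would need its own
case.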