This is the mail archive of the gdb@sources.redhat.com mailing list for the GDB project.



Re: backtrace through 'sleep', (1255 and 1253)


Michael Elizabeth Chastain writes:
 > Here's what I've learned so far.
 > 
 > This is the code for 'sleep' in /lib/i686/libc.so.6:
 > 
 >   push %ebp
 >   xor  %ecx, %ecx
 >   mov  %esp, %ebp
 >   push %edi
 >   xor  %edx, %edx
 >   ...
 >   call __i686.get_pc_thunk.bx
 >   add  $0x7bfab, %ebx
 >   sub  $0x1cc, %esp
 >   ...
 > 
 > This is on a red hat linux 8 system, native i686-pc-linux-gnu.
 > 
 > This is C code, not hand-coded assembler!  The "xor" instructions have been
 > mixed into the prologue.  They are just setting some variables to zero.
 > The call to __i686.get_pc_thunk.bx comes from gcc -fpic.
 > 
 > Here is the code in i386_frame_cache:
 > 
 >   frame_unwind_register (next_frame, I386_EBP_REGNUM, buf);
 >   cache->base = extract_unsigned_integer (buf, 4);
 >   if (cache->base == 0)
 >     return cache;
 > 
 >   cache->save_regs[I386_EIP_REGNUM] = 4;
 > 
 >   cache->pc = frame_func_unwind (next_frame);
 >   if (cache->pc != 0)
 >     i386_analyze_prologue (cache->pc, frame_pc_unwind (next_frame), cache);
 > 
 >   if (cache->locals < 0)
 >     {
 >       /* We didn't find a valid frame, which means that CACHE->base
 >          currently holds the frame pointer for our calling frame.  If
 >          we're at the start of a function, or somewhere half-way its
 >          prologue, the function's frame probably hasn't been fully
 >          setup yet.  Try to reconstruct the base address for the stack
 >          frame by looking at the stack pointer.  For truly "frameless"
 >          functions this might work too.  */
 > 
 >       frame_unwind_register (next_frame, I386_ESP_REGNUM, buf);
 >       cache->base = extract_unsigned_integer (buf, 4) + cache->sp_offset;
 >     }
 > 
 > The etiology is:
 > 
 >   The prologue analyzer fails on this function because of the 
 >   'xor %ecx, %ecx'.
 > 
 >   So cache->locals == -1.
 > 
 >   /* We didn't find a valid frame ... */
 > 
 >   So the code behaves like it's in a frameless function.  It grabs
 >   the stack pointer and adds an offset to it and uses that for a frame.
 > 
 > Whereas, in reality, the pc is in the middle of 'sleep' (well past the
 > prologue), and there is a perfectly good frame.  In fact if I undo the
 > bogus re-assignment to cache->base in this case then the stack trace
 > works fine.
 > 
 > Now, what to do about it ...
 > 
 > Red Hat Linux 8 has an rpm for a debug version of glibc.  The
 > glibc-debug rpm installs libraries in /usr/lib/debug, rather than
 > overwriting /lib/i686.  I installed glibc-debug and set LD_LIBRARY_PATH
 > to /usr/lib/debug, and it worked!  The test cases in gdb/1253 and
 > gdb/1255 both backtraced just fine!

FWIW, in general the Red Hat debug rpms contain only debug info, and a
section in the /lib/i686 libraries contains a pointer to them. You
shouldn't need to set LD_LIBRARY_PATH at all; just do a 'set
debug-file-directory /usr/lib/debug' and gdb should be able to
integrate the two.  For glibc, though, they provide 3 flavors
of rpms: one without debug info (glibc), one which includes the debug
info and the rest (glibc-debug), which is the one you installed, and
one which includes only the debug info (glibc-debuginfo), for which
you can do what I described. The glibc-debug files get installed in
/usr/lib/debug/. The glibc-debuginfo files get installed in
/usr/lib/debug/lib/.


 > 
 > Also, static-linking with glibc works, because the static version
 > of 'sleep' has different code (no -fpic) with a prologue that gdb
 > can digest.
 > 
 > So we can either:
 > 
 > . Document the problem and tell people to use a debugging glibc or
 >   static-link their program.  Also send a message to vendors that they may
 >   want to make the debugging glibc the default glibc.  Vendors may even
 >   want to patch their gcc to not mix other instructions into the prologue,
 >   because gdb is a lot more sensitive to un-analyzable prologues now.
 > 

Unlikely to happen, I am afraid :-(

 > . Ask the gcc guys directly to not schedule any instructions between
 >   'push %ebp' and 'mov %esp, %ebp'.
 > 

More likely.

 > . Change gdb so that the prologue reader is more powerful.  It doesn't
 >   take much to get through the 'xor %ecx, %ecx' instruction.  The
 >   trouble is that there could be a billion different instructions
 >   in there ('mov any-register, immediate').  The advantage is that
 >   this would work without any changes to external software.
 > 

Yes. How did the prologue analyzer change between 5.3 and now?

elena


 > . Do nothing, let the users suffer.
 > 
 > . Something else?
 > 
 > Michael C

