This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Shared library call problems on PowerPC with current binutils/gdb


Hello,

using current binutils and gdb head on powerpc-linux, you see the
following quite annoying effect(s) with shared library calls.

I'm starting out with a simple "hello world" program (compiled with
-ffreestanding to avoid the compiler optimizing the printf to puts),
and the following debugging session:

>GNU gdb 6.8.50.20080408
>Copyright (C) 2008 Free Software Foundation, Inc.
>License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>This is free software: you are free to change and redistribute it.
>There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
>and "show warranty" for details.
>This GDB was configured as "powerpc64-linux"...
>(gdb) break printf
>Function "printf" not defined.
>Make breakpoint pending on future shared library load? (y or [n]) n

Hmmm.  Ok, maybe because libc is not loaded yet ...

>(gdb) start
>Breakpoint 1 at 0x10000474: file hello.c, line 6.
>Starting program: /home/uweigand/a.out
>main () at hello.c:6
>6         printf ("Hello, world!\n");
>(gdb) info sharedlibrary
>From        To          Syms Read   Shared Object Library
>0x0ffc1960  0x0ffda700  Yes         /lib/ld.so.1
>0x0fe3db20  0x0ff5e230  Yes         /lib/libc.so.6
>(gdb) break printf
>Function "printf" not defined.
>Make breakpoint pending on future shared library load? (y or [n]) n

Well, libc is definitely loaded now, but I still cannot set a
breakpoint ...   Maybe stepping in works?

>(gdb) s
>0x10000800 in call___do_global_ctors_aux ()

Not really.

>(gdb) start
>The program being debugged has been started already.
>Start it from the beginning? (y or n) y
>Breakpoint 2 at 0x10000474: file hello.c, line 6.
>Starting program: /home/uweigand/a.out
>main () at hello.c:6
>6         printf ("Hello, world!\n");
>(gdb) n
>0x10000800 in call___do_global_ctors_aux ()

Huh.  Not even stepping over works ...

>(gdb) bt
>#0  0x10000800 in call___do_global_ctors_aux ()
>#1  0x0fe3de0c in generic_start_main () from /lib/libc.so.6
>#2  0x0fe3e060 in __libc_start_main () from /lib/libc.so.6
>#3  0x00000000 in ?? ()

... and I guess that's the reason why.

>(gdb) n
>Single stepping until exit from function call___do_global_ctors_aux,
>which has no line number information.
>0x0ffd544c in _dl_runtime_resolve () from /lib/ld.so.1
>(gdb)
>Single stepping until exit from function _dl_runtime_resolve,
>which has no line number information.
>0x0fe75820 in printf@@GLIBC_2.4 () from /lib/libc.so.6
>(gdb)
>Single stepping until exit from function printf@@GLIBC_2.4,
>which has no line number information.
>Hello, world!
>main () at hello.c:7
>7       }

Also, it's quite tedious to get back.



So, what's going on here?  It looks like a combination of multiple
different problems.

1) Setting a breakpoint on a shared library function (that is called by
   the main program) before libraries are loaded used to work because
   of the so-called "solib trampoline" minimal symbols.

   These are generated by GDB when it finds an *undefined* symbol in
   the dynamic symbol table with a *non-zero* value.  Those are a special
   "hack" used by the linker to implement function pointer comparison
   correctly; their "value" points to the PLT call stub in the main
   executable used to call the shared library function.

   However, over time BFD has been optimized to only use this hack when
   it is actually necessary, i.e. when the symbol is in fact used for
   purposes of function pointer comparisons.  In simple cases like this
   where the function is just called, the value of the undefined symbol
   is now always 0 when using current binutils.

   On the other hand, BFD now provides "synthetic symbols" that point to
   those same PLT call stubs (on many targets).  In fact, the synthetic
   symbol "printf@plt" is actually defined.  However, elfread.c does not
   consider this to be a "solib trampoline" symbol for printf.

   Even if it did, there is an additional complication on PowerPC:
   when using the new-style "secure" PLTs, the "printf@plt" entry point
   actually points to a *data* variable holding a pointer to the "glink"
   stub, not the original PLT call stub itself.

   This could still work, as ppc-linux-tdep.c actually contains code to
   treat this as a case of "function descriptors" and would resolve to
   the real target.  However, "break printf" actually still wouldn't work
   because linespec.c:minsym_found does not handle function descriptors ...

2) What about when libc is already loaded?  Why is printf still not found?
   This is because the (static) symbol table of libc.so on current powerpc
   systems does not contain a symbol "printf", only "printf@@GLIBC_2.4" and
   "printf@GLIBC_2.0".  The versioning is needed to handle both 128-bit
   and 64-bit long double types.

   Now, the *dynamic* table *does* contain "printf" (twice, with different
   version information), but elfread.c ignores the dynamic table "as the
   dynamic symbol table is usually a subset of the main symbol table."

   Note that even if full debug information for libc.so is available, we
   do not get a debug symbol "printf" either -- the two entry points are
   called __printf and __nldbl_printf in the original source, and that's
   what debug symbols show.

3) As to stepping in and/or over the printf call, the primary reason why
   this doesn't work is that unwinding breaks.  The immediate target of 
   the call instruction is a PLT call stub; these stubs are part of the
   .text section (when using the secure PLT scheme) and have no symbols.

   The immediately preceding symbol happens to be call___do_global_ctors_aux
   from GCC's crtend.o, which is compiled with -finhibit-size-directive,
   so that function is considered by GDB to span until the end of .text.

   Thus prologue parsing of call___do_global_ctors_aux detects the setup
   of a stack frame, and GDB assumes that the PLT call stubs have one as
   well, which they really don't.

4) Even if that worked, stepping *into* the call would require support
   for gdbarch_skip_trampoline_code, and the current ppc32 implementation
   of that (for secure PLTs) simply calls find_solib_trampoline_target
   -- which requires the old-style "solib trampoline" symbols to work.
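
To make the effect described in 3) concrete, here is a minimal sketch of
an address-to-symbol lookup in which a symbol of unknown size is assumed
to extend to the end of its section.  This is not GDB's actual lookup
code; struct sketch_sym and sketch_lookup are hypothetical names, and
all addresses except the stub address 0x10000800 from the session above
are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified symbol record (not GDB's minimal_symbol).
   A size of 0 means "size unknown", as for a function assembled
   without a .size directive.  */
struct sketch_sym
{
  const char *name;
  unsigned long addr;
  unsigned long size;
};

/* Return the symbol considered to contain PC, or NULL if there is none.
   The nearest symbol at or below PC is chosen.  If its size is known,
   PC must fall inside it; if its size is unknown, it is assumed to
   extend up to SECTION_END.  */
const struct sketch_sym *
sketch_lookup (const struct sketch_sym *syms, size_t n,
               unsigned long pc, unsigned long section_end)
{
  const struct sketch_sym *best = NULL;
  size_t i;

  assert (syms != NULL || n == 0);

  for (i = 0; i < n; i++)
    if (syms[i].addr <= pc && (best == NULL || syms[i].addr > best->addr))
      best = &syms[i];

  if (best == NULL)
    return NULL;
  if (best->size != 0)
    return pc < best->addr + best->size ? best : NULL;
  return pc < section_end ? best : NULL;
}
```

With a sizeless call___do_global_ctors_aux as the last symbol before the
stubs, a lookup of the stub address returns that function, so its
prologue analysis gets applied to code that is really a frameless stub.
If the symbol carried a size, the same lookup would find no enclosing
symbol.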



To fix this, I'd propose the following implementation steps:

- Extend elf_symtab_read to treat a synthetic symbol XXX@plt as a
  mst_solib_trampoline symbol for XXX.

- Change elf_symtab_read to register symbols that carry a version
  suffix (e.g. printf@@GLIBC_2.4) simply under their base name.

- Include something along the lines of Markus' "multiply-defined
  symbol" patch so that "break printf" will break on *both* definitions
  (created by the two symbols with different version names) after libc
  has been loaded.

- Teach minsym_found about function descriptors.

- Add extra unwinders to ppc-linux-tdep that recognize the various
  PLT call and glink stubs, and properly treat them as frameless.
  (This will probably require code reading ...)

- Extend ppc_skip_trampoline_code to likewise handle those stubs.
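
The name handling behind the first two steps could look roughly like the
following sketch.  It only shows the string manipulation: mapping
"printf@plt", "printf@@GLIBC_2.4" and "printf@GLIBC_2.0" to the base
name "printf" and reporting which kind of decoration was found.  The
enum and the function sketch_base_name are hypothetical and not part of
GDB; real code in elf_symtab_read would record minimal symbols (e.g. of
type mst_solib_trampoline) instead of returning a classification.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification used only in this sketch.  */
enum sketch_kind
{
  SKETCH_PLAIN,       /* "main": no decoration */
  SKETCH_VERSIONED,   /* "printf@GLIBC_2.0", "printf@@GLIBC_2.4" */
  SKETCH_PLT_STUB     /* "printf@plt": BFD synthetic symbol */
};

/* Copy the base name of NAME (everything before the first '@') into
   OUT, which has room for OUT_SIZE bytes including the terminator.
   Return one of the enum sketch_kind values, or -1 if OUT is too
   small; OUT is left untouched in that case.  */
int
sketch_base_name (const char *name, char *out, size_t out_size)
{
  const char *at;
  size_t len;
  int kind = SKETCH_PLAIN;

  assert (name != NULL && out != NULL);

  at = strchr (name, '@');
  len = at != NULL ? (size_t) (at - name) : strlen (name);

  if (at != NULL)
    kind = strcmp (at, "@plt") == 0 ? SKETCH_PLT_STUB : SKETCH_VERSIONED;

  if (len + 1 > out_size)
    return -1;

  memcpy (out, name, len);
  out[len] = '\0';
  return kind;
}
```

Note that both versioned names map to the same base name, which is why
the "multiply-defined symbol" handling in the third step is needed once
libc has been loaded.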


Does this look reasonable?  Am I overlooking anything?

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com

