This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: The future of dwarf2_physname


Hi Keith,

On Thu, 19 May 2011 00:34:43 +0200, Keith Seitz wrote:
[...]
> It appears that computing the physname offers gdb some buffer from
> compiler changes/bugs. At least with gcc.

I find any difference from compiler's idea of a (demangled) linkage name to be
a bug.  When nm, objdump, other tools deal with some C++ name I believe GDB
also should.

If the linkage name is wrong for whatever reason this is an ABI issue for GCC,
it is out of scope of GDB.  Linkage name is defined by the ISO C++ standard,
neither GCC nor GDB can change it.  If GCC is not ISO C++ compliant then GCC
should be fixed (and possibly deal with the ABI breakage) but such ISO C++
compliance issue is out of scope of GDB.

Trying to introduce new "better" naming for C++ methods which is in 90% the
same as GCC's and ISO C++'s idea seems a bit messy for me.

That is the GCC and ISO C++ naming should be always available in GDB.  I am
not against allowing also other aliases for the same symbol.  The availability
of symbol aliases is discussed more at the end of this mail.


As parts of code have DWARF (=full symbols) and part only ELF .symtab
(=minimal symbols) the physname code (creating linkage names based on DWARF
DIEs) just cannot apply to ELF symbols.  Thefore if it creates different
physname from DWARF than what can be figured out from the library.

#include <sstream>
std::stringstream v;
int
main ()
{
  v.unsetf (std::ios_base::hex);
}
x86_64-fedora15-linux-gnu with no /usr/lib/debug

GDB with
	Re: [RFA] Typedef'd method parameters [0/4]
	http://sourceware.org/ml/gdb-patches/2011-05/msg00318.html
without my patch
	[rfc] physname cross-check [Re: [RFA] Typedef'd method parameters [0/4]]
	http://sourceware.org/ml/gdb-patches/2011-05/msg00358.html
(gdb) complete p 'std::ios_base::unsetf
p 'std::ios_base::unsetf(std::_Ios_Fmtflags)
p 'std::ios_base::unsetf(std::ios_base::fmtflags)
(gdb) p 'std::ios_base::unsetf(std::_Ios_Fmtflags)'
$2 = {<text variable, no debug info>} 0x400768 <std::ios_base::unsetf(std::ios_base::fmtflags)>
(gdb) ptype 'std::ios_base::unsetf(std::_Ios_Fmtflags)'
type = int (void)
(gdb) p 'std::ios_base::unsetf(std::ios_base::fmtflags)'
No symbol "std::ios_base::unsetf(std::ios_base::fmtflags)" in current context.
(gdb) ptype 'std::ios_base::unsetf(std::ios_base::fmtflags)'
No symbol "std::ios_base::unsetf(std::ios_base::fmtflags)" in current context.

while pre-physname GDB:
(gdb) complete p 'std::ios_base::unsetf
p 'std::ios_base::unsetf(std::_Ios_Fmtflags)
(gdb) p 'std::ios_base::unsetf(std::_Ios_Fmtflags)' 
$1 = {void (class std::ios_base * const, std::ios_base::fmtflags)} 0x400768 <std::ios_base::unsetf(std::_Ios_Fmtflags)>
(gdb) ptype 'std::ios_base::unsetf(std::_Ios_Fmtflags)'
type = void (class std::ios_base * const, std::ios_base::fmtflags)

This is a regression.  And it will always be a regression for any
physname != DW_AT_linkage_name as with my cross-check patch it prints:
Computed physname <std::ios_base::unsetf(enum std::_Ios_Fmtflags)> does not match demangled <std::ios_base::unsetf(std::_Ios_Fmtflags)> (from linkage <_ZNSt8ios_base6unsetfESt13_Ios_Fmtflags>)


The idea is simple - the compiled application with DWARF will contain:

 <1><135>: Abbrev Number: 17 (DW_TAG_namespace)
    <136>   DW_AT_name        : std     
 <2><44a>: Abbrev Number: 23 (DW_TAG_class_type)
    <44b>   DW_AT_name        : ios_base        
 <3><454>: Abbrev Number: 24 (DW_TAG_subprogram)
    <455>   DW_AT_external    : 1       
    <456>   DW_AT_name        : unsetf   
    <45a>   DW_AT_decl_file   : 1
    <45b>   DW_AT_decl_line   : 612     
    <45d>   DW_AT_MIPS_linkage_name: _ZNSt8ios_base6unsetfESt13_Ios_Fmtflags    
    <461>   DW_AT_accessibility: 1      (public)
    <462>   DW_AT_declaration : 1       
    <463>   DW_AT_object_pointer: <0x46b>
 <4><46b>: Abbrev Number: 25 (DW_TAG_formal_parameter)
    <46c>   DW_AT_type        : <0x18bb> 
    <470>   DW_AT_artificial  : 1       
 <4><471>: Abbrev Number: 26 (DW_TAG_formal_parameter)
    <472>   DW_AT_type        : <0x476> 
 <4><476>: Abbrev Number: 8 (DW_TAG_typedef)
    <477>   DW_AT_name        : fmtflags        
    <47b>   DW_AT_decl_file   : 1
    <47c>   DW_AT_decl_line   : 257
    <47e>   DW_AT_type        : <0x32e>

It does not (currently cannot - discussed at GCC PR debug/40040) contain
DW_AT_low_pc+DW_AT_high_pc for std::io_base::unsetf as it is DW_AT_external.
Therefore the linkage name (physname) which GDB thinks about must be always
exactly DW_AT_linkage_name (*) (**).

(*) As long as the GCC produced DW_AT_MIPS_linkage_name matches the .symtab
    ELF symbol name but there have never been bugs in this part in GCC AFAIK.

(**) DW_AT_MIPS_linkage_name always equals DW_AT_linkage_name.

So physname must be always the same as DW_AT_linkage_name.  Even if GCC
computes DW_AT_linkage_name wrongly then GDB has to be bug-to-bug compatible
with that GCC.  I found the only benefits of physname to reduce the debuginfo
size by omitting DW_AT_linkage_name from DWARF as it is redundant.  But in
such case there should be a verification of GCC-output DW_AT_linkage_name
names if they really match to what GDB thinkgs they should be.  That should
done during distro build of every package before stripping DW_AT_linkage_name.
A rough guess is saving something between 3% to 10% of .debug files size which
may be worth it with all the .debug size reduction efforts.  (As long as other
DWARF consumers do not need DW_AT_linkage_name but that is a bit out of scope
of this discussion, currently I am not aware of any.)


> The bigger issue is what to do for 7.3, which has been in limbo for
> quite some time awaiting resolution of c++/12506 (for which I have
> posted patches).
> 
> I guess there are basically two options:
> 1) Keep physname & apply the 12266/12506 patchset (when approved)
> 2) Adopt Jan's patch and fix the fallout now
> 
> My take on this today is that I think the least risky/most timely
> solution is #1.

As it introduces symbol names which are not compliant with ISO C++ I find it
harmful to get people used for the new names in any new release.  So far the
non-compliant symbol names are present only in FSF gdb-7.2 which could be
forgotten IMO but releasing another stable FSF GDB release with non-compliant
symbol names seems to me making backward compatibility difficult.  Still it is
questionable whether the GDB-only symbol variants can create any compatibility
issues - such as some .gdbinit scripts using such names.


> Gdb has been used with the dwarf2_physname patchset
> for over a year now, and it is not altogether broken. We are seeing
> some regressions from previous releases (none of these are in the
> test suite), and I have been trying to keep on top of them.

There are some regressions when using DW_AT_linkage_name by that patch of mine
but according to quick analysis IIUC they seem to be caused by coding the
testcases based on the ISO C++ non-compliant naming introduced by physname.
Therefore these should be fixed in the testcases.  But these regressions need
to be (attempted to be) fixed first to get definitive answer.


> The 12266/12506 patchset is almost orthogonal to this decision. It
> fixes several additional typedef-related problems. 12506 is fixed by
> using DW_AT_linkage_name, but 12266 is not.

12266 is about `break func(typedefname)' which never worked before physname
(before gdb-7.2).  This is about providing canonicalization in linespec which
you have implemented in the current pending patchset.
decode_variable->cp_canonicalize_string_no_typedefs in:
	Re: [RFA] Typedef'd method parameters [3/4]
	http://sourceware.org/ml/gdb-patches/2011-05/msg00317.html
typedef int t; void f(t p) {} int main () {}
(gdb) b f(t)
Breakpoint 1 at 0x4004db: file 4.C, line 1.

This is a kind of "symbol alias" and I do not find it much critical if it does
not exactly match in some cases - it is just user convenience and it does not
break any symbol references used in the inferior code.


> The only hesitation I have is the compiler issues. Just using
> DW_AT_linkage_name instead of computing it has lead to many test
> regressions with differing compiler versions. [Again, I have only
> tested gcc.]
+
> My big fear with #2 is "fixing" linespecs (yet again). Any changes
> to this area take an unimaginably large amount of time and effort.

I do not see any real regressions except incorrect testcase assumptions.

$ nm -C gdb.cp/cp-relocate.o
0000000000000000 W int func<1>(int)
print func<1>(int)
No symbol "func<1>" in current context.
(gdb) FAIL: gdb.cp/cp-relocate.exp: get address of func<1>(int)

but:
(gdb) print 'int func<1>(int)'
$1 = {int (int)} 0x28 <func<1>(int)>

This is a regression against gdb-7.2 (physname) but in pre-physname it also
did not work.  The symbols canonicalization in decode_variable could also
strip the leading return types.


Thanks,
Jan


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]