This is the mail archive of the elfutils-devel@sourceware.org mailing list for the elfutils project.
Re: dwarf_output overview

From: Mark Wielaard <mjw at redhat dot com>
To: elfutils-devel at lists dot fedorahosted dot org
Date: Wed, 11 Aug 2010 13:10:34 +0200
Subject: Re: dwarf_output overview
On Fri, 2010-08-06 at 15:21 -0700, Roland McGrath wrote:
> Likewise, consider:
> 
> 	<compile_unit>
> 	  <namespace name="A">
> 	    <structure_type name="empty" byte_size=0/>
> 	  </namespace>
> 	  <namespace name="B">
> 	    <structure_type name="empty" byte_size=0/>
> 	  </namespace>
> 	</compile_unit>
> 
> This is from the C++:
> 
> 	namespace A { struct empty {}; };
> 	namespace B { struct empty {}; };
> 
> but glossing over the DECL attributes (which would only all match with some
> #include use and so forth).
> 
> Now, for "correctness of the tree transformation", if that is the whole
> tree, then the <structure_type/> can be shared.  The physical tree becomes:
> 
> 	<compile_unit>
> 	  <namespace name="A">
> 	    <imported_unit import="#pu"/>
> 	  </namespace>
> 	  <namespace name="B">
> 	    <imported_unit import="#pu"/>
> 	  </namespace>
> 	</compile_unit>
> 	<partial_unit id="pu">
> 	  <structure_type name="empty" byte_size=0/>
> 	</partial_unit>
> 
> In the logical view, that's the same tree as above.  The only difference is
> if you look at the debug_info_entry::identity () (i.e., dwarf_dieoffset,
> for a consumer)--do you think the two "empty"'s are the same DIE or not?
> (More on that later.)
> 
> When it clearly matters is when there are references.
> So add to the C++:
> 
> 	A::empty a;
> 	B::empty b;
> 
> So the tree looks like:
> 
> 	<compile_unit>
> 	  <namespace name="A">
> 	    <structure_type id="atype" name="empty" byte_size=0/>
> 	  </namespace>
> 	  <namespace name="B">
> 	    <structure_type id="btype" name="empty" byte_size=0/>
> 	  </namespace>
> 	</compile_unit>
> 	<variable name="a" type="#atype"/>
> 	<variable name="b" type="#btype"/>
> 
> Now it becomes clear that type="#atype" and type="#btype" are distinct
> references that should not be conflated.  If you conflated them, then
> the decompression of the compressed tree would have:
> 
> 	<variable name="b" type="#atype"/>
> 
> and that's wrong.
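
To make that failure mode concrete, here is a toy model of it. This is only a sketch and not how dwarf_output is implemented: toy_die and merge_by_content are names invented for illustration, and using the enclosing namespace as the distinguishing context is an assumption of the sketch.

```cpp
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Toy stand-in for a DIE (hypothetical, not an elfutils type).
struct toy_die
{
  std::string id;      // label such as "atype"
  std::string parent;  // name of the enclosing namespace
  std::string tag;     // e.g. "structure_type"
  std::string name;    // e.g. "empty"
};

// Merge entries whose tag and name match; if include_parent is true,
// the enclosing namespace must match as well.  Returns a map from each
// original id to the id of the entry it was merged into.
std::map<std::string, std::string>
merge_by_content (const std::vector<toy_die> &dies, bool include_parent)
{
  std::map<std::tuple<std::string, std::string, std::string>, std::string> seen;
  std::map<std::string, std::string> remap;
  for (const toy_die &d : dies)
    {
      auto key = std::make_tuple (include_parent ? d.parent : std::string (),
                                  d.tag, d.name);
      // emplace keeps the first id seen for each key.
      auto ins = seen.emplace (key, d.id);
      remap[d.id] = ins.first->second;
    }
  return remap;
}
```

With include_parent false, a reference to "btype" gets remapped to "atype", which is the conflation described above; with it true, the two references stay distinct.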

With some trickery one can construct an example that expresses the
above with identical declaration coordinates (decl_file and decl_line)
for both types. (For some reason gcc splits the record struct type in
two, one DIE with DW_AT_declaration and one with DW_AT_specification,
but that seems an insignificant detail.)

::doublename.h::
namespace MY_NAMESPACE { struct record { int id; char *name; }; };

::doublename.cpp::
#define MY_NAMESPACE A
#include "doublename.h"
#undef MY_NAMESPACE

#define MY_NAMESPACE B
#include "doublename.h"

int main (int argc, char **argv)
{
  A::record a;
  B::record b;
  return (a.id - b.id);
}

$ g++ -g -o doublename doublename.cpp
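
As a quick sanity check that the compiler really treats the two records as distinct types even though their declarations are textually identical, something like the following sketch could be used. To stay self-contained it replaces the double #include with a macro; MAKE_RECORD_NS and records_are_distinct are names made up here and are not part of the test case above.

```cpp
#include <type_traits>

// Hypothetical stand-in for including doublename.h twice with
// MY_NAMESPACE defined to A and then to B.
#define MAKE_RECORD_NS(ns) \
  namespace ns { struct record { int id; char *name; }; }

MAKE_RECORD_NS (A)
MAKE_RECORD_NS (B)

// Same layout, but two different types as far as C++ is concerned.
static_assert (sizeof (A::record) == sizeof (B::record),
               "both records have the same layout");
static_assert (!std::is_same<A::record, B::record>::value,
               "A::record and B::record are distinct types");

// Runtime-visible form of the second check.
constexpr bool records_are_distinct ()
{
  return !std::is_same<A::record, B::record>::value;
}
```

So whatever the debug info does with them, the source-level types must not collapse into one.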

$ tests/dwarf-print doublename
<compile_unit producer="GNU C++ 4.4.4 20100726 (Red Hat 4.4.4-13)"
 language=C_plus_plus name="doublename.cpp" comp_dir="." low_pc=0x400554
 high_pc=0x40056d stmt_list=[{1 dirs}, {4 line entries}]>
  <namespace name="A" decl_file="./doublename.h" decl_line=1>
    <structure_type ref="ref3" name="record" declaration=1/>
  </namespace>
  <structure_type ref="ref7" specification="#ref3" byte_size=0x10
   decl_file="./doublename.h" decl_line=1>
    <member name="id" decl_file="./doublename.h" decl_line=1
     type="#ref1" data_member_location=0/>
    <member name="name" decl_file="./doublename.h" decl_line=1
     type="#ref2" data_member_location=0x8/>
  </structure_type>
  <base_type ref="ref1" byte_size=0x4 encoding=signed name="int"/>
  <pointer_type ref="ref2" byte_size=0x8 type="#ref4"/>
  <base_type ref="ref4" byte_size=0x1 encoding=signed_char name="char"/>
  <namespace name="B" decl_file="./doublename.h" decl_line=1>
    <structure_type ref="ref5" name="record" declaration=1/>
  </namespace>
  <structure_type ref="ref8" specification="#ref5" byte_size=0x10
   decl_file="./doublename.h" decl_line=1>
    <member name="id" decl_file="./doublename.h" decl_line=1
     type="#ref1" data_member_location=0/>
    <member name="name" decl_file="./doublename.h" decl_line=1
     type="#ref2" data_member_location=0x8/>
  </structure_type>
  <subprogram external=1 name="main" decl_file="./doublename.cpp"
   decl_line=8 type="#ref1" low_pc=0x400554 high_pc=0x40056d
   frame_base={locexpr}>
    <formal_parameter name="argc" decl_file="./doublename.cpp"
     decl_line=8 type="#ref1" location={locexpr}/>
    <formal_parameter name="argv" decl_file="./doublename.cpp"
     decl_line=8 type="#ref6" location={locexpr}/>
    <lexical_block low_pc=0x40055f high_pc=0x40056b>
      <variable name="a" decl_file="./doublename.cpp" decl_line=10
       type="#ref7" location={locexpr}/>
      <variable name="b" decl_file="./doublename.cpp" decl_line=11
       type="#ref8" location={locexpr}/>
    </lexical_block>
  </subprogram>
  <pointer_type ref="ref6" byte_size=0x8 type="#ref2"/>
</compile_unit>

(Slightly reformatted, with absolute path names made relative.)

So the interesting types are ref3/ref7 and ref5/ref8. Variable a in the
main function has type ref7, whose specification is ref3; variable b has
type ref8, whose specification is ref5.

Pushing this through dwarf_output, we get:

$ tests/dwarf-print --output doublename

<compile_unit name="doublename.cpp"
 stmt_list=[{1 dirs}, {4 line entries}] low_pc=0x400554 high_pc=0x40056d
 language=C_plus_plus comp_dir="."
 producer="GNU C++ 4.4.4 20100726 (Red Hat 4.4.4-13)">
  <namespace name="A" decl_file="./doublename.h" decl_line=1>
    <structure_type ref="ref3" name="record" declaration=1/>
  </namespace>
  <structure_type ref="ref6" byte_size=0x10 decl_file="./doublename.h"
   decl_line=1 specification="#ref3">
    <member name="id" data_member_location=0 decl_file="./doublename.h"
     decl_line=1 type="#ref1"/>
    <member name="name" data_member_location=0x8
     decl_file="./doublename.h" decl_line=1 type="#ref2"/>
  </structure_type>
  <base_type ref="ref1" name="int" byte_size=0x4 encoding=signed/>
  <pointer_type ref="ref2" byte_size=0x8 type="#ref4"/>
  <base_type ref="ref4" name="char" byte_size=0x1 encoding=signed_char/>
  <namespace name="B" decl_file="./doublename.h" decl_line=1>
    <structure_type ref="ref3" name="record" declaration=1/>
  </namespace>
  <structure_type ref="ref6" byte_size=0x10 decl_file="./doublename.h"
   decl_line=1 specification="#ref3">
    <member name="id" data_member_location=0 decl_file="./doublename.h"
     decl_line=1 type="#ref1"/>
    <member name="name" data_member_location=0x8
     decl_file="./doublename.h" decl_line=1 type="#ref2"/>
  </structure_type>
  <subprogram name="main" low_pc=0x400554 high_pc=0x40056d
   decl_file="./doublename.cpp" decl_line=8 external=1
   frame_base={locexpr} type="#ref1">
    <formal_parameter location={locexpr} name="argc"
     decl_file="./doublename.cpp" decl_line=8 type="#ref1"/>
    <formal_parameter location={locexpr} name="argv"
     decl_file="./doublename.cpp" decl_line=8 type="#ref5"/>
    <lexical_block low_pc=0x40055f high_pc=0x40056b>
      <variable location={locexpr} name="a" decl_file="./doublename.cpp"
       decl_line=10 type="#ref6"/>
      <variable location={locexpr} name="b" decl_file="./doublename.cpp"
       decl_line=11 type="#ref6"/>
    </lexical_block>
  </subprogram>
  <pointer_type ref="ref5" byte_size=0x8 type="#ref2"/>
</compile_unit>

Something odd seems to have happened here. Both a and b now point to the
same structure_type, #ref6. But there are actually two structure_types
with ref="ref6", both with specification="#ref3". And the record
structure_type declarations inside namespaces A and B both have ref="ref3".
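
To check this mechanically rather than by eye, one could scan the printed tree for ref= labels that appear on more than one entry. The sketch below assumes that a ref="..." attribute in the dwarf-print output is meant to label exactly one entry, which is the very thing in question here; duplicate_ref_labels is a hypothetical helper, not part of elfutils.

```cpp
#include <map>
#include <regex>
#include <string>
#include <vector>

// Return every label that is defined by more than one ref="..."
// attribute in the given dump text, in sorted order.
std::vector<std::string>
duplicate_ref_labels (const std::string &dump)
{
  // Requiring whitespace before "ref=" skips uses such as type="#ref1"
  // and specification="#ref3", which only refer to a label.
  static const std::regex label_re ("\\sref=\"([^\"]+)\"");

  std::map<std::string, int> counts;
  for (std::sregex_iterator it (dump.begin (), dump.end (), label_re), end;
       it != end; ++it)
    ++counts[(*it)[1].str ()];

  std::vector<std::string> dups;
  for (const auto &entry : counts)
    if (entry.second > 1)
      dups.push_back (entry.first);
  return dups;
}
```

Fed the first dump above this should report nothing, and fed the --output dump it should report ref3 and ref6.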

Did dwarf_output mangle things, or am I misinterpreting the output of
dwarf-print?

Thanks,

Mark