This is the mail archive of the elfutils-devel@sourceware.org mailing list for the elfutils project.
Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]
dwarf_output overview

From: Roland McGrath <roland at redhat dot com>
To: elfutils-devel at lists dot fedorahosted dot org
Date: Tue, 20 Jul 2010 05:22:31 -0700
Subject: dwarf_output overview
https://fedorahosted.org/elfutils/wiki/DwarfOutput is still accurate in its
basic description about the collector (class dwarf_output_collector).
We don't really need to get into right now why it's a separate object
from dwarf_output.  For uses in the time being, we'll use one collector
with one dwarf_output for one input file.

The simple attribute values, and some bits of interface glue, are shared
via template use from dwarf_data, with dwarf_edit.  The only difference is
that in dwarf_output we never have mutable versions of thse objects, only
const pointers/references to immutable copies sitting in the collector.

Whereas dwarf_edit has mutable containers, there is no modification you can
do to a dwarf_output object once it's created.  Instead, like dwarf_edit
you can copy-construct a dwarf_output object from any dwarf-compatible
class (i.e. dwarf, dwarf_edit, or dwarf_output)--this is the only way to
get a nonempty dwarf_output object.

The dwarf_output constructor takes two arguments, unlike the simple
dwarf_edit constructor.  The second argument is the collector object.

This actually uses a second helper object, whose only purpose is to live as
long as copy construction is going on.  That's dwarf_output::copier, which
is actually dwarf_output::copier<input> parameterized by the particular
top-level input type (dwarf or dwarf_edit).  (This will one day be exposed
outside dwarf_output for some of the same cases where a separate collector
is really useful, but we won't get into that now.)

The copy construction is doing the whole file, or at least a whole CU, from
the root of the tree down to the leaves and all attribute values.  The
copier holds some bookkeeping and some caches during that process.

The essential method is that the tree from the CU on down has its
debug_info_entry objects constructed recursively, with
debug_info_entry::attr_value objects being constructed for each node.  Each
attr_value on up to each debug_info_entry is "constructed" by finding its
exact match stored in the collector, inserting it only if it's not already
there, and then returning the pointer (i.e. actually all C++ references) to
the (now) existing data structure.

The copier's part in this is twofold.  First, there is some simple caching
related to copying the simple attribute values.  (All the value types other
than references are "simple".)  Second, there is all the hairy bookkeeping
for the complexity brought on by references.

The copier's caching is a simple optimization, so I'll describe it first.
This is for strings, identifier strings (which we distinguish from other
strings, though they are just strings in DWARF), and source files.  In one
input file, even across many CUs (because of linker SHF_MERGE|SHF_STRINGS
merging), every DW_AT_name "foobar" will point to point to the same
"foobar" in the .debug_str table.  Likewise, a dwarf::source_file object
that represents a source file from the .debug_line table is what we get for
every DW_AT_decl_file attribute on the input tree, but there will be many
DIEs with that attribute pointing to the same file entry (same index into
the CU's .debug_ling table, so getting the same dwarf::source_file again
and again).  Hence, we expect to see the actual same pointer (in a dwarf
reader object, really the same pointer into the mmapped DWARF file data)
appearing many times in the input tree's attr_values.  Matching pointers
you've seen before is quicker than matching string values you've seen
before.  So the copier keeps maps keyed by these string (and file entry)
pointers into the input data, to find exact attr_value's already
found/created to match other instances of the same input string (or file
entry).  (This sped things up a fair bit when I added it, though I don't
recall the details.)

For all the simple attribute values the method is simple.  For each flavor
of attr_value, the collector holds a set of values seen before.  The
subr::value_set container type is used for this.  It's based on the STL
unordered_set, which is a hash table where we supply the key type and the
hash and comparison functions on that type.  In these sets, our key types
are dwarf_output::value::value_*, which are the flavor-specific subtypes
that an attr_value actually points to.  So, each of those types has to
provide an integer hash function (a "hasher" inner class) and a comparison
function (operator==).

The copier does a set insertion of the input value object, which installs a
new dwarf_output::value::value_foobar object if none matched, or doesn't if
one did, and then produces an attr_value that points to that.  Note that
since this is the only way any dwarf_output::attr_value objects can ever
get created, within one dwarf_output object (really, one collector), simple
pointer equality is exactly correct as a semantic equality check for the
non-reference attribute values.  A corollary is that just this pointer
value itself can be used as the hash value for the attr_value object (a
perfect hash function).

If there are no reference attributes to consider, then it proceeds from the
bottom up in the same straightforward way.  That is, the entire map of
attributes (considered as a sequence of <int name, attr_value> pairs in a
canonical name-keyed order) can be hashed together to get a hash value for
the attributes container.  Then a leaf node can combine the tag into the
that hash value to get a hash value for the whole DIE.  Then, a sequence of
sibling leaf nodes' hash values are combined to form a hash value for the
whole children container.  Then that hash value is combined with the tag
and the attribute map hash value to form a hash value for the whole DIE.
Then on up.

So this is what we do.  The simple attribute values are hashed on the
actual values (string data, integer values, etc.), and then those pointer
values along with the attribute name (integers) in a canonical order are
combined with the DIE tag into a single integer hash.  The _m_attr_sets set
(i.e. hash table) holds these <attribute map, tag integer> pairs.  Then
children vectors, starting with the empty vector for leaf DIEs, are hashed
into the _m_broods set.  (The empty vector has hash value zero.)  Finally,
a dwarf_output::debug_info_entry just consists of pointers to the
attributes map and children vector, and the DIE tag integer.  We just
combine those two pointer values and the tag to make the integer hash of
the debug_info_entry object itself.  _m_unique holds the set of unique
debug_info_entry objects.  So, a leaf DIE gets created with an attributes
map pointer and a pointer to the one and only empty children vector object.
Then a non-empty children vector is simply a vector of pointers to
debug_info_entry objects like that one.  Like for the other sets, a pointer
to a particular debug_info_entry in the collector serves as a perfect
integer hash, since by definition pointer equality of the debug_info_entry
object exactly maps to semantic equivalence.  So, the children vector is a
simple container of pointers, and the vector itself has a hash value made
just by combining the pointer values as so many constituent integer hash
values.

So, it's like that when no attribute values are references.  The exact
control flow between the constructors of the dwarf_output inner classes and
dwarf_output::copier is extremely hairy.  But the extreme template
weirdness is mostly just to get the copier pointer passed down the levels
of constructors to where it calls an add_* method on the copier.  It's
there, and the collector methods they call, where the real work happens.

It's even like that when there are some references, for simple cases.

So, start with a completely simple leaf node:

	<base_type name="int" byte_size=0x4 encoding=signed/>

That's all simple, so its attribute values, map, empty children vector,
debug_info_entry will all be hashed and inserted in the collector as above.
Now there is a known debug_info_entry pointer for that DIE.

Next, say you are copying:

	<variable name="x" type="#int" location=.../>

The various simple attributes all resolve to known value pointers.  The
type="#int" attribute is the only reference value, pointing to the DIE we
just mentioned.  Since that's a fully resolved known DIE now, we can in
fact treat it the same way as we treat simple attribute values.  Along with
the one debug_info_entry object in the collector, we only need to have
exactly one dwarf_output::value::value_reference object that points to it.
So here, that value object pointer is perfect hash and equality test for a
reference attribute that refers to a DIE identical to the "int" above, just
like the value object pointers for the simple value flavors are.

If that's all there were to do, it would be easy.  
Of course, there are circular references.

So say we have:

	<structure_type id="t1" name="list">
	  <member name="next" type="#p1"/>
	</structure_type>
	<pointer_type id="p1" type="#t1"/>

Now, the structure_type's own attribute map is simple, so that hashes up
nicely.  But, the member's attribute map includes a circular reference.
So, what we do is still hash up the whole attribute map, but use zero as
the hash value for the reference value.  Now <member/> as a whole has a
hash value that contributes to the children vector hash value, and that
contributes to the structure_type's overall hash value.

Two entries that look like this but have different references in the type
attribute--but both circular references in the overall graph--would have
the same hash value.  So when comparing those, we'd get a hash collision
and have to do a full comparison.  That would find the two attribute maps
identical, so look at the two children vectors.  The corresponding member
entries' attribute maps would then be compared for equivalence, the
non-reference attribute values (i.e. name="next") matching by simple
pointer comparison.  But for the reference, we have to do a ref-walking
comparison in the manner of dwarf_comparator.

This plan means that for <pointer_type type="something circular"/> and
<reference_type type="something circular"/>, we'll actually have dozens to
thousands of entries with the same hash value--basically every single
pointer_type to a nontrivial class/struct is a hash collision with every
other.  If that's the plan, then the ref-walking comparisons we do have to
be very well-optimized.  We'll be doing many, many of them every time we
come across another pointer_type entry in the input.

Picayune as I've been, this is just the broad overview.  It seems solid,
and the obvious thing to do, up to the point of the plan for circular
references.  It's not entirely clear that having so many hash collisions in
this approach is a reasonable way to go.  But I don't know what a different
plan would be for efficiently identifying equivalent subtrees in the
presence of circular references.

Most of the real complexity in the copier has to do with handling forward
references in attributes, including circular references.  Aside from the
simple-value mapping caches I described earlier, everything else is pretty
much just interface glue for making the dwarf_output constructors work.
The possibility of forward references means we have to partially construct
entries while waiting to resolve their references, so there are lots of
internal data structures for that.

In valid input data, a single entire (logical) CU tree has no dangling
references.  So all the partially-built tracking data structures only need
to live (at most) during the construction of one compile_unit tree in the
dwarf_output object.  The only things that live longer in the copier are
just caches keyed on parts of the input that might be shared across input
logical CUs (like the strings, but also input DIEs--for when eventually
reading compressed format input).  For following the control flow, you can
start at dwarf_output::copier::make_unit.  You can read the classes it
calls to see how it's all organized.  (Note how at various levels, most of
the actual work happens inside a foo_copier constructor.)

For all of the entry-copying hair inside dwarf_output::copier, I think it
will be easiest to just answer questions as you read through the code,
rather than trying to describe a structure and a plan.  I can explain it
all piece by piece, but at this point will largely be recovering the
knowledge from re-reading the code anyway, so we can read it together.
How that all works is quite ad hoc (and about the third generation of how I
organized it), and I probably couldn't articulate an overall algorithm it
is implementing.  There is:

#if 0  // Toggle this to enable massive debugging spew during construction.

and making that 1 will get you quite a lot of spew that may help get a
handle on how the partial entry resolution control flow goes.  The usual
ways to drive a test of this is:

	tests/dwarf-print --output file

For nontrivial CUs in the input file, the debugging spew is truly massive.


Thanks,
Roland
Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]