This is the mail archive of the
elfutils-devel@sourceware.org
mailing list for the elfutils project.
Re: dwarf_edit_output testcase sanity checking
- From: Mark Wielaard <mjw at redhat dot com>
- To: elfutils-devel at lists dot fedorahosted dot org
- Date: Wed, 10 Nov 2010 13:42:12 +0100
- Subject: Re: dwarf_edit_output testcase sanity checking
On Tue, 2010-11-09 at 12:57 -0800, Roland McGrath wrote:
> Perhaps you can write up a clean description/pseudo-code of the whole
> algorithm that we can all consider from the ground up.
The idea is to walk the DIE tree as a directed graph only. But to since
we want hashes that depend on the structure of the DIEs (children and
references) to prevent collisions of non-identical DIE structures as
much as possible we need to walk the tree at least twice to calculate
two hashes that get combined to create the final hash for each DIE.
- First we walk the DIE tree ignoring any reference attributes.
Since we ignore reference attributes, this is just walking the tree,
so no cycles and we are guaranteed to terminate.
- For basic attributes we calculate the hash as before, likewise for
attr_sets of basic attributes.
- The first hash of a DIE is calculated based on the first hash of its
children set (broods), again all reference attributes are ignored
- Secondly we walk the DIE tree just creating an additional hash just
for the reference attributes. Here each DIE hash is extended with the
hash based on all just reference attributes. We do need the results of
the first hash calculated for each DIE, but we can just enumerate each
DIE, so again no cyclic walks.
- Each reference attribute's hash is based on its name and the
referent DIE first hash. We can create attr_sets of just these
reference attributes like we did for the basic attributes sets.
- In case the above isn't unique enough we could extend the algorithm
with a third (or even fourth) pass like the first or the second.
Where either for each DIE the final hash is calculated based on both
hashes plus the combination of both hashes of its children. And/Or
the final hash is calculated based on both hashes plus the second hash
of all the DIEs that its reference attributes point at. But I expect
that for larger structures the internal representation of the children
die basic attributes will be different enough already. It could
however help if there are lots of longer simple reference chains
(without any basic attributes and/or children DIEs to help in the
first two passes of hashing).
> > > That sounds nice, but the "structural tree" <singleton_tag/> with "all
> > > basic attributes" being the empty set is a common case.
> >
> > Maybe I am missing the actual example you have in mind. Could you post
> > an example tree that shows it?
>
> I just meant <pointer_type type=.../> and the like (no children,
> no non-reference attributes).
So in the above scheme the final hash for such singleton_tags with just
one reference attribute is based on the (first) hash of the type DIE. So
if they point at the same basic_type they would hash the same, just like
when they point to a structure_type with the same children. But they
will hash differently if the basic_types are not equal or the
structure_type has different children.
Cheers,
Mark