This is the mail archive of the elfutils-devel@sourceware.org mailing list for the elfutils project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: dwarf_edit_output testcase sanity checking

From: Mark Wielaard <mjw at redhat dot com>
To: elfutils-devel at lists dot fedorahosted dot org
Date: Wed, 10 Nov 2010 13:42:12 +0100
Subject: Re: dwarf_edit_output testcase sanity checking

On Tue, 2010-11-09 at 12:57 -0800, Roland McGrath wrote:
> Perhaps you can write up a clean description/pseudo-code of the whole
> algorithm that we can all consider from the ground up.

The idea is to walk the DIE tree as a directed graph only. But to since
we want hashes that depend on the structure of the DIEs (children and
references) to prevent collisions of non-identical DIE structures as
much as possible we need to walk the tree at least twice to calculate
two hashes that get combined to create the final hash for each DIE.

- First we walk the DIE tree ignoring any reference attributes.
  Since we ignore reference attributes, this is just walking the tree,
  so no cycles and we are guaranteed to terminate.
  - For basic attributes we calculate the hash as before, likewise for
    attr_sets of basic attributes.
  - The first hash of a DIE is calculated based on the first hash of its
    children set (broods), again all reference attributes are ignored

- Secondly we walk the DIE tree just creating an additional hash just
  for the reference attributes. Here each DIE hash is extended with the
  hash based on all just reference attributes. We do need the results of
  the first hash calculated for each DIE, but we can just enumerate each
  DIE, so again no cyclic walks.
  - Each reference attribute's hash is based on its name and the
    referent DIE first hash. We can create attr_sets of just these
    reference attributes like we did for the basic attributes sets.

- In case the above isn't unique enough we could extend the algorithm
  with a third (or even fourth) pass like the first or the second.
  Where either for each DIE the final hash is calculated based on both
  hashes plus the combination of both hashes of its children. And/Or
  the final hash is calculated based on both hashes plus the second hash
  of all the DIEs that its reference attributes point at. But I expect
  that for larger structures the internal representation of the children
  die basic attributes will be different enough already. It could
  however help if there are lots of longer simple reference chains
  (without any basic attributes and/or children DIEs to help in the
   first two passes of hashing).

> > > That sounds nice, but the "structural tree" <singleton_tag/> with "all
> > > basic attributes" being the empty set is a common case.
> > 
> > Maybe I am missing the actual example you have in mind. Could you post
> > an example tree that shows it?
> 
> I just meant <pointer_type type=.../> and the like (no children,
> no non-reference attributes).

So in the above scheme the final hash for such singleton_tags with just
one reference attribute is based on the (first) hash of the type DIE. So
if they point at the same basic_type they would hash the same, just like
when they point to a structure_type with the same children. But they
will hash differently if the basic_types are not equal or the
structure_type has different children.

Cheers,

Mark

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]