This is the mail archive of the elfutils-devel@sourceware.org mailing list for the elfutils project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: dwarf_edit_output testcase sanity checking

From: Mark Wielaard <mjw at redhat dot com>
To: elfutils-devel at lists dot fedorahosted dot org
Date: Wed, 17 Nov 2010 18:17:42 +0100
Subject: Re: dwarf_edit_output testcase sanity checking

On Mon, 2010-11-15 at 18:16 -0800, Roland McGrath wrote:
> This plan of attack hinges on the "simple enough to be sure it's
> correct" comparison implementation not being unusably slow on our
> real-world data.  I'm not really very sure about that at all.  An
> alternative plan is to stub it out to assume that all hash collision
> comparisons are false, and try proceed in parallel on those other
> things from there.  But that is going to conflate all a ** with b **
> and suchlike, so it seems unlikely to be very usable with real data.
>
> What do you think?

I think I (once again) forgot about the importance of the fact that when
a reference attribute points to a DIE the context of that DIE always
matters. That makes any comparison of DIEs not very "simple enough". So
then you get to drag in the reference tracker and use dwarf_comparator
for correctness. But I would really like to have the dwarf_output
algorithm be independent from the dwarf_comparator implementation. Not
just because we then have an independent checker. But also because it
feels that what dwarf_output should be simpler and more concrete.

Since in the proposed algorithm the first pass does a full walk over the
DIE tree already, it should be simple to just store the full context of
every DIE. Then in the second pass when we hash all reference attributes
we can hash in this context. Which should reduce collisions again.

I am pondering if we could just rely on the fact that a pure reference
chain (ignoring child DIEs) will never loop.  I am not sure what to do
if we ever encounter a real reference chain loop. But it should be easy
to detect. But if we can then we would have for each DIE a full hash for
the whole "child chain" and for the whole "reference chain", which
should give us "perfect" hashes, since they would describe each DIE
completely.

But I am note sure we can get rid of ignoring hash-collisions though.
Because I don't see how you ever make a distinction between a hash
collision because two DIEs are indeed completely identical and when
there is false collision because the hash function isn't perfect.

Cheers,

Mark

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]