This is the mail archive of the elfutils-devel@sourceware.org mailing list for the elfutils project.



Re: dwarfcmp performance


28.01.2011 00:30, Roland McGrath wrote:
>> The obvious solution is to do an initial pass over all dies, noting
>> forward references, and when we hit the die that is forward-referred to,
>> remembering the path for later.  That way we would avoid all the
>> repetitive walking that context extraction entails.  I suspect that
>> should help a great deal with the overall performance.
>
> I think you have misconstrued the data here.  dwarf_path_finder
> (libdw/c++/dwarf_tracker) already does caching for these context paths.
> It's not done by a preemptive walk, but by opportunistic caching along the
> way of any walks being done.  When a forward reference necessitates finding
> a context not already known, it clones and spins forward the current walk,
> caching all the nodes seen along the way, until it finds the referent.
> Unless there are bugs in that logic, we visit each node at most twice (once
> in the comparison walk, and once on a side walk looking for a referent) for
> the mere purpose of finding the context.
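
To make sure I understand: the scheme you describe boils down to something like this, a simplified sketch with made-up names, not the actual libdw/c++ code:

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical miniature of the opportunistic caching: every node stepped
// over during any walk gets its context path recorded, keyed by identity
// (its address), so a later lookup for the same identity is a hash hit.
struct node
{
  int id;
  std::vector<node *> children;
};

struct path_finder
{
  // Stands in for dwarf_path_finder::_m_seen.
  std::unordered_map<const node *, std::vector<const node *>> seen;
  unsigned long steps = 0;

  // Record N's context path; called on every step of every walk.
  void note (const node *n, const std::vector<const node *> &path)
  {
    ++steps;
    seen.emplace (n, path);
  }

  // Depth-first side walk from ROOT, caching everything along the way,
  // stopping once TARGET has been seen.
  bool walk_to (const node *root, const node *target,
		std::vector<const node *> &path)
  {
    note (root, path);
    if (root == target)
      return true;
    path.push_back (root);
    for (const node *c : root->children)
      if (walk_to (c, target, path))
	{
	  path.pop_back ();
	  return true;
	}
    path.pop_back ();
    return false;
  }

  // Like path_to: hash lookup first, side walk only on a miss.
  const std::vector<const node *> *
  path_to (const node *root, const node *target)
  {
    auto it = seen.find (target);
    if (it == seen.end ())
      {
	std::vector<const node *> path;
	if (!walk_to (root, target, path))
	  return nullptr;
	it = seen.find (target);
      }
    return &it->second;
  }
};
```

i.e. once any walk has stepped over a node, path_to for that node costs nothing further.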

The cache in question being dwarf_path_finder::_m_seen, the one filled in 
dwarf_path_finder::step::step?  Then I guess we hit the case where 
referents lie further and further ahead, so each of our caching walks 
ends just before reaching them.  On one of the bigger C++ objects, I 
would routinely see a single DIE being walked over 90+ times (and since 
that's for the dwarfcmp X X case, that actually means 45+ times for each 
of the two files).

> With a little more instrumentation, you can investigate whether there are
> really redundant walks being kicked off from {left,right}_context calls.

Yeah, I believe that actually is the case.  I added some logging so that 
step::step tells me where it's constructed from.  That can be either 
dwarf_path_finder::walk_to or dwarf_ref_tracker::step::step.  The latter 
case is rather rare (on my arbitrary test case, which is 
libdw/edit-values.o, it's about 10% of all step constructions).  The 
former case is always inside {left,right}_context calls.

After reading further and annotating some more, I see that some of these 
*_context calls are actually triggered by the sub-comparator.  But the 
walking itself is still initiated by *_context.

FWIW the caching does seem to work, eventually.  The last 
{left,right}_context calls that end up doing more than 0 steps occur 
within the first 8% of my log file, and there are big stretches of 
0-step *_context calls even before that point.  The problem is that 
before we get there, we do more steps than there are DIEs.

Actually it seems to me like we always start walking from "here" when we 
look for referents, instead of resuming from the last referent onward 
(since everything up to that point is already cached).  Could it be 
something as simple as that?
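
To illustrate what I mean, a toy model (hypothetical names, nothing like the real code): if every side walk restarts from the current position, the combined side walks revisit nodes quadratically, but resuming from wherever the previous side walk stopped makes them one linear pass overall:

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Toy model: DIEs flattened into walk order, with a persistent resume
// cursor shared by all side walks.  Each side walk picks up where the
// previous one stopped, since everything before the cursor is cached.
struct side_walker
{
  const std::vector<int> *dies;	// flattened DIE order
  std::size_t resume = 0;	// furthest point any side walk reached
  unsigned long steps = 0;

  // Walk forward until TARGET is cached, starting from the cursor
  // instead of from the caller's current position.
  bool find (int target, std::unordered_set<int> &cache)
  {
    if (cache.count (target))
      return true;
    while (resume < dies->size ())
      {
	++steps;
	cache.insert ((*dies)[resume++]);
	if ((*dies)[resume - 1] == target)
	  return true;
      }
    return false;
  }
};
```

With forward references to positions 3, 6, and 9 in a ten-DIE "file", the three side walks together take exactly ten steps instead of 4 + 7 + 10.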

> Those calls use dwarf_path_finder::path_to.  That starts with a lookup in
> the _m_seen map, which is an unordered_map, i.e. a hash table.  (Each key
> in this table is a referent DIE's identity, which is a unique pointer and
> thus its own perfect hash.)  If that lookup fails--as it always will for
> the first forward reference--then it kicks off a side walk.  So, you can
> instrument path_to and collect the data on success/failure of that hash
> lookup (i.e. cache hit/miss) for each identity.  If there is ever more than
> one cache miss for a given identity, then there is a bug somewhere.  If
> there is ever any cache miss for an identity that is not a forward
> reference, then there is a bug somewhere.  If there are cache hits for
> identities that are indeed forward references, then the side-caching done
> along the extra walks for prior cache misses are doing a good job.
>
> I believe the reason we have repeated walks is not due to looking for
> context, it's due to repeated actual comparison walks.  There is the main
> tree walk comparing DIEs.  When that hits reference attributes that are
> forward references, dwarf_comparator::reference_match does another subtree
> walk rooted at the referent DIEs.  It's all those walks where we have the
> massive redundancy.

> It could well be that the most costly individual component of those extra
> walks is dwarf_path_finder::step.  It's worth looking into how costly that
> is for avoidable reasons.  But it's not the chief problem.  The chief
> problem is an algorithmic one, which I will get back to later.

Yes, slicing 20% off a week's worth of runtime won't help that much :) 
I snipped that bit for now.

> Now, back to the larger point.  Optimizing context collection is all well
> and good and worth a little bit of investigation, but it's not the big
> thing.  The big thing is the walks done inside the "subcomparator" DIE
> comparisons from dwarf_comparator::reference_match.
>
> When comparison hits two reference attributes with the same attribute name,
> it compares them by comparing their referent DIEs.  That's where the
> "subcomparator" comes in.  This is where we do another comparison walk on
> the left-hand vs right-hand referent subtrees.  For trivial mismatches
> (different tag, one has children and the other doesn't), we always
> short-circuit this walk quickly at the top of reference_match.  For
> nontrivial cases, including all comparisons that will ultimately succeed,
> we have to use a subcomparator walk.
>
> When we have only non-circular backward references, then the reference
> attribute comparison could short-circuit quickly and safely, at this code:
>
>        // Maybe the tracker has already cached a correspondence of references.
>        typename tracker::reference_match matched;
>        if (_m_tracker.reference_matched (matched, ref1, ref2))
> 	return true;
>
> That works when we've previously called tracker::notice_match to record a
> successful comparison.  If we only had non-circular backward references,
> then that's what we'd be doing.  However, you'll notice in dwarf_tracker:
>
>      inline bool notice_match (reference_match &/*matched*/,
> 			      const die1 &, const die2 &/*b*/, bool matches)
>      {
>        /* XXX not reliable!
> 	 This match could be predicated on a tentative match of a
> 	 circular ref inside.  We can't cache that!
>        if (matches && matched._m_lhs != NULL)
> 	matched._m_lhs->second.insert (b);
>        */
>        return matches;
>      }

Oh, I'm actually using roland/dwarf-hacking, so I don't have this 
commented out.  Even the week-long run that I talked about before 
happened with that caching enabled.
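
For what it's worth, my mental model of that caching is roughly this (a hypothetical sketch, not the actual tracker code): short-circuit when a referent pair is already recorded, and only record matches that were not predicated on a tentative circular match, which is exactly the condition the XXX comment says we can't check:

```cpp
#include <cassert>
#include <set>
#include <utility>

// Sketch of the pair cache that reference_matched/notice_match implement
// when enabled: once a left/right referent pair has compared equal, later
// reference-attribute comparisons of the same pair short-circuit without
// another subtree walk.  Identities stand in for the DIE pointers.
struct pair_cache
{
  std::set<std::pair<const void *, const void *>> matched;

  // Short-circuit check at the top of a reference comparison.
  bool reference_matched (const void *lhs, const void *rhs) const
  {
    return matched.count ({lhs, rhs}) != 0;
  }

  // Record a comparison result; caching a match that hinged on a
  // tentative circular ref would be unsound, so skip those.
  void notice_match (const void *lhs, const void *rhs,
		     bool matches, bool tentative_circular)
  {
    if (matches && !tentative_circular)
      matched.insert ({lhs, rhs});
  }
};
```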

> I've posted here before about the reasons this caching is not correct in
> the general case and hence is now disabled in the main dwarf branch.
> Perhaps Mark has the archive link closer to hand than I do.
> I'll just repeat it now in summary.

Yeah, I remember seeing those emails.  But I didn't remember the 
details, so thanks for reiterating.

PM
