This is the mail archive of the elfutils-devel@sourceware.org mailing list for the elfutils project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

dwarfcmp performance


Hi there,

about two weeks ago I let dwarfcmp spin on one of the difficult cases. 
After a week or so it answered that the files are equal.  Another case 
ended up after about a day.  So I suppose the algorithms are sound, it's 
just that it takes ages to get through it all.

OProfile shows that one major time sink is dwarf_path_finder::step ctor. 
  That is called from a bunch of places, but the interesting cases come 
from dwarf_comparator::reference_match where it calls 
dwarf_ref_tracker::left_context and dwarf_ref_tracker::right_context. 
This gets called when the comparator hits forward reference. The 
algorithm then starts from where in the tree we are (since it's 
guaranteed to not need any of the already-seen dies, these are cached), 
and traverses it in a linear fashion, looking for the referenced die.

I added counters to see how long these walks were.  Average over our 
build tree (excluding the cases that would take ages, where "ages" for 
my purposes was anything above 20 seconds) is 18 steps, but many C++ 
modules make easily 100+ steps on average.  Average over dwarflint 
objects is 90 steps.  Clearly C++ produces forward references that lead 
further ahead on average (or there's more detailed and deeper structure 
between here and there).

So the problem is we simply end up doing walks over those same dies 
again and again and again, as we look for some far-ahead die.  I don't 
know if there are other time sinks, but getting rid of this traversing 
should help a great deal.

The obvious solution is to do an initial pass over all dies, noting 
forward references, and when we hit the die that is forward-referred to, 
remembering the path for later.  That way we would avoid all the 
repetitve walking that context extraction entails.  I suspect that 
should help a great deal with the overall performance.

Thanks,
PM

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]