This is the mail archive of the elfutils-devel@sourceware.org mailing list for the elfutils project.



entry vs pending_entry in dwarf_output::copier


dwarf_output::copier::entry is the main struct describing each input DIE
encountered.  These live as long as the copier lives, and thus are the
"permanent" record about each input DIE.  These are created via
copier::enter from either copier::add_child or copier::add_reference.
Thus, an entry will exist for an input DIE not yet encountered in the
copying walk, if another DIE's attributes contained a reference to it.

All entry objects live in the copier::_m_entries map, a hash table
keyed by the <impl>::debug_info_entry::identity () value, which is in
fact a globally-unique pointer.  For <impl>=dwarf, this is the
Dwarf_Die::addr value, i.e. the pointer to the file data mapped in
memory.  For the other implementations, it's the address of the
debug_info_entry object itself.
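To illustrate, here is a heavily simplified sketch of that arrangement
(the member names beyond _m_entries and _m_offset, e.g. seen_in_walk,
are invented for the example; the real entry has much more state):

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical, simplified model: one entry per input DIE identity,
// created on first sight whether that sight is the copying walk
// reaching the DIE or another DIE's attribute referring to it.
struct entry
{
  uint64_t _m_offset;        // input DIE offset, kept for debugging output
  bool seen_in_walk = false; // invented flag: set when add_child gets here
};

struct copier
{
  // Keyed by identity (), i.e. a globally-unique pointer value.
  std::unordered_map<uintptr_t, entry> _m_entries;

  entry &enter (uintptr_t identity, uint64_t offset)
  {
    // emplace is a no-op if the identity is already present,
    // so a DIE first seen via a reference gets no duplicate entry.
    auto ins = _m_entries.emplace (identity, entry{offset});
    return ins.first->second;
  }

  void add_reference (uintptr_t identity, uint64_t offset)
  { enter (identity, offset); } // may create the entry before the walk arrives

  void add_child (uintptr_t identity, uint64_t offset)
  { enter (identity, offset).seen_in_walk = true; }
};
```

The point of the sketch is just the ordering property: a reference can
create the entry first, and the later walk finds the same entry again.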

Once a new DIE has been finalized, entry::_m_final points to its
instance in the collector.  Before that, entry::_m_pending might point
to a pending_entry.  If both of those are NULL, then this is an entry
that has only been encountered in a reference attribute, and no other
members are set except for _m_offset (used in the debugging output).
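So each entry is in one of three states, which a sketch can make
explicit (final_die and pending_entry are empty stand-ins here, not the
real collector or pending types):

```cpp
#include <cassert>
#include <cstdint>

struct final_die {};      // stand-in for the finalized DIE in the collector
struct pending_entry {};  // stand-in for the in-progress copy

// Hypothetical sketch of the three entry states described above.
struct entry
{
  uint64_t _m_offset = 0;
  final_die *_m_final = nullptr;
  pending_entry *_m_pending = nullptr;

  enum state { reference_only, pending, finalized };

  state current_state () const
  {
    if (_m_final != nullptr)
      return finalized;
    if (_m_pending != nullptr)
      return pending;
    return reference_only;  // only _m_offset is meaningful in this state
  }
};
```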

dwarf_output::copier::pending_entry describes an input DIE currently
being copied, but not yet made final.  This first exists while the
copier is copying that entry on the initial copying walk.  It lives
until the entry can be finalized, when it is freed and replaced by
setting _m_final.  It's somewhat important to do this as early as we can
manage: identical duplicate input DIEs each have their own entry, and
thus their own pending_entry before they're final, and the
pending_entry's members are where most of the copier's memory is
consumed.  The sooner we can finalize entries, the less memory we
consume during the copying of the whole tree.
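The hand-off itself is simple; a sketch (again with placeholder types,
and using unique_ptr where the real code may manage the memory
differently):

```cpp
#include <cassert>
#include <memory>

struct final_die {};
struct pending_entry { /* attribute maps, children vectors, ... */ };

// Hypothetical sketch: finalizing installs the collector's DIE in
// _m_final and frees the pending_entry, releasing its (comparatively
// large) storage.
struct entry
{
  final_die *_m_final = nullptr;
  std::unique_ptr<pending_entry> _m_pending;

  void finalize (final_die *die)
  {
    _m_final = die;
    _m_pending.reset ();   // the pending_entry's memory goes away here
  }
};
```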

In the current code, a pending_entry lives until the hairy bookkeeping
figures out that it can do the finalization.  In the new plan's first
cut, each pending_entry will live until the whole first copying walk is
done.  (The exception being that if the input uses DW_TAG_imported_unit,
then any shared duplicate entries will have been finalized in the
copying of the first CU containing them, and thereafter another CU's
copying walk will encounter an entry that already has its _m_final set.)
In the new plan's second walk, each entry will be finalized so its
pending_entry can be freed.
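In outline, that second walk is just a sweep over _m_entries; a sketch,
with the entry reduced to two invented flags:

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical sketch of the new plan's second walk: finalize whatever
// is still pending, skipping entries already finalized (e.g. shared
// duplicates finalized while copying an earlier CU under
// DW_TAG_imported_unit) and reference-only entries.
struct entry
{
  bool has_pending = false;
  bool is_final = false;

  void finalize () { is_final = true; has_pending = false; }
};

void second_walk (std::unordered_map<uintptr_t, entry> &entries)
{
  for (auto &pair : entries)
    {
      entry &e = pair.second;
      if (!e.is_final && e.has_pending)
        e.finalize ();
    }
}
```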

Because of the aforementioned memory consumption, we will want to do
early finalization during the first copying walk whenever possible.
That is, do the "second walk" work for any subtree that has no
references to other entries that are not yet final.
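That test is a straightforward recursion over the pending subtree; a
sketch (can_finalize_early is an invented name for the check, not a
function in the current code):

```cpp
#include <vector>

struct entry { bool is_final; };

// Hypothetical flattened view of a pending_entry's subtree: the
// reference attributes it holds, plus its pending children.
struct pending_node
{
  std::vector<entry *> references;
  std::vector<pending_node *> children;
};

// A pending subtree can be finalized during the first walk iff every
// reference it makes, directly or in any descendant, is already final.
bool can_finalize_early (const pending_node &node)
{
  for (const entry *ref : node.references)
    if (!ref->is_final)
      return false;
  for (const pending_node *child : node.children)
    if (!can_finalize_early (*child))
      return false;
  return true;
}
```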

Is that all clear?  I'm not sure it's entirely necessary to wrap one's
head around all the hairy bookkeeping of the current plan, since the new
plan is replacing it with something easier to understand anyway.

There will still be some hairy magic for cases of circular references.
An entry can't be truly final until all its references are final, which
is itself a circular requirement.  That's why we need the "placed"
stage, where a value_reference already in the collector gets reified
with the actual iterator into its referent's parent's children vector.
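A sketch of that two-step reference (using a container pointer plus an
index rather than a real iterator, for simplicity; the real
value_reference is not shaped like this):

```cpp
#include <cstddef>
#include <vector>

struct die {};
using children_vector = std::vector<die>;

// Hypothetical sketch of the "placed" stage: the reference first exists
// in the collector without a concrete position, and is later reified
// once its referent's final place among its parent's children is known.
struct value_reference
{
  children_vector *container = nullptr;
  std::size_t index = 0;

  bool placed () const { return container != nullptr; }

  void place (children_vector &parent_children, std::size_t i)
  {
    container = &parent_children;
    index = i;
  }

  die &referent () const { return (*container)[index]; }
};
```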


Thanks,
Roland
