This is the mail archive of the
elfutils-devel@sourceware.org
mailing list for the elfutils project.
Re: dwarf_output overview
- From: Mark Wielaard <mjw at redhat dot com>
- To: elfutils-devel at lists dot fedorahosted dot org
- Date: Wed, 06 Oct 2010 17:07:11 +0200
- Subject: Re: dwarf_output overview
On Tue, 2010-07-20 at 05:22 -0700, Roland McGrath wrote:
> For all the simple attribute values the method is simple. For each
> flavor of attr_value, the collector holds a set of values seen before.
> The subr::value_set container type is used for this. It's based on
> the STL unordered_set, which is a hash table where we supply the key
> type and the hash and comparison functions on that type. In these
> sets, our key types are dwarf_output::value::value_*, which are the
> flavor-specific subtypes that an attr_value actually points to. So,
> each of those types has to provide an integer hash function (a
> "hasher" inner class) and a comparison function (operator==).
>
> The copier does a set insertion of the input value object, which
> installs a new dwarf_output::value::value_foobar object if none
> matched, or doesn't if one did, and then produces an attr_value that
> points to that. Note that since this is the only way any
> dwarf_output::attr_value objects can ever get created, within one
> dwarf_output object (really, one collector), simple pointer equality
> is exactly correct as a semantic equality check for the non-reference
> attribute values. A corollary is that just this pointer value itself
> can be used as the hash value for the attr_value object (a perfect
> hash function).
I found this hard to follow in the source code.
We are using a subr::value_set class which will be called with an input
value and type:
// Set of hashed_value's.
template<typename value_type>
class value_set
: public std::tr1::unordered_set<hashed_value<value_type>,
struct hashed_value<value_type>::hasher>
[...]
const value_type *add (const value_type &v)
{
std::pair<class _base::iterator, bool> p
= _base::insert (hashed_value_type (v));
if (p.second)
{
// XXX hook for collection: abbrev building, etc.
}
return &p.first->second;
};
[...]
template<typename input, typename arg_type>
const value_type *add (const input &v, arg_type &arg)
{
return add (value_type (v, arg));
}
So it will always create a new value_type (which is always of a
dwarf_output::value::subtype?) and ultimately calls the (sub)struct
hasher () operator on that value:
template<typename T>
struct hash : public T::hasher {};
template<typename T>
static inline size_t hash_this (const T &v)
{
return hash<T> () (v);
}
So for a simple attr_value we don't care about the hasher of the input
(dwarf/dwarf_edit) variant, but will always create a new
dwarf_output::attr_value from the given input variant. Which just
records the given input attr_value variant:
inline attr_value (const attr_value &other)
: _base ()
{
_m_value = other._m_value;
}
So we will use the dwarf_output::value_attr hasher struct:
/* We can use the _m_value pointer itself as a perfect hash,
because all identical values share the same cell in the
collector. The exception to this is for references. See
value_reference::hash. */
struct hasher : public std::unary_function<attr_value, size_t>
{
inline size_t operator () (const attr_value &v) const
{
const value::value_reference *ref
= dynamic_cast<const value::value_reference *> (v._m_value);
return ref == NULL ? (uintptr_t) v._m_value : ref->hash ();
}
};
So in the end we will indeed just return the recorded original _m_value
value_attr (except if it can be expressed as a value_reference).
I don't understand what guarantees that the original given value_attr is
unique. This can be any dwarf::value_attr or dwarf_edit::value_attr from
the input. Do they all have to property that they represent any equal
value_attr uniquely with one object instance?