This is the mail archive of the elfutils-devel@sourceware.org mailing list for the elfutils project.



Re: mjw/dwarf_output branch


> So all hashers used for putting things into the dwarf_output_collector
> are defined on the types we actually put into the collector. Which does
> make kind of sense, 

Right.

> but does raise the question whether we aren't
> creating objects to match against only to immediately throw them away
> because they are already in the collector? 

We are.  AFAICT this is the way STL maps/sets are normally used.  Otherwise
their interfaces make you do the hashing and table lookup twice, once to
check if it exists and a second time to insert if it doesn't.  Furthermore,
you then copy the new object into the container and destroy the object you
passed to the STL insert method.  Unless there are STL methods I have
overlooked, the only way to avoid that last copy is to make all the
containers be containers of pointers.
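To make the single-lookup point concrete, here is a minimal sketch.  It assumes std::unordered_set (the C++11 spelling of a hashed set), and the names entry, entry_hasher, collector_set and collect are made up for illustration; none of this is the actual dwarf_output code.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>
#include <utility>

// Hypothetical stand-in for an object stored in the collector.
struct entry
{
  std::string name;
  bool operator== (const entry &other) const { return name == other.name; }
};

// Hasher defined on the type we actually put into the set.
struct entry_hasher
{
  std::size_t operator() (const entry &e) const
  {
    return std::hash<std::string> () (e.name);
  }
};

typedef std::unordered_set<entry, entry_hasher> collector_set;

// Build a temporary, hand it to insert (), and get back a pointer to the
// collected element.  That is one hash and one table lookup.  If an equal
// element is already present, the temporary is simply thrown away.
inline const entry *
collect (collector_set &set, const std::string &name, bool *was_new = nullptr)
{
  std::pair<collector_set::iterator, bool> result
    = set.insert (entry { name });
  if (was_new != nullptr)
    *was_new = result.second;	// False when the object already existed.
  return &*result.first;
}
```

Calling collect twice with the same name yields the same pointer both times, and the second call reports that nothing new was added.  The alternative of calling find () first and insert () afterwards hashes and probes the table twice.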

Something to realize here is that some of these objects are really small
and thus should be cheap to construct/copy/destroy.  For example,
dwarf_output::debug_info_entry is just four words, including the cached
hash value.

The children_type and attributes_type objects are somewhat larger.  And
those are not the granularities of sharing in the file format.  So the only
benefit we are getting from independent sharing at those granularities is
memory savings.  For example, there is just one empty children_type object
in the collector and all debug_info_entry objects share the pointer to
that.  Likewise for less trivial ones, like an attributes_type of just
{name="x", type="#ref_to_int"} and a whole children_type for a parameter
list of (int, int, int) appearing in numerous prototypes' subprogram dies,
etc.  But it could well be that the memory savings there are not worth the
computation time to do the two separate hash tables.  At the moment
probably the most important consideration is whatever makes the code easier
to write.
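As a rough illustration of that sharing, here is a sketch in which entries hold only a pointer to a collected children list.  The names toy_entry, toy_collector and add_children are hypothetical, and std::set stands in for the hash table purely to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy children list: just the type names of the child dies.
typedef std::vector<std::string> children_type;

// An entry is tiny: a tag plus a pointer to the shared children list.
struct toy_entry
{
  int tag;
  const children_type *children;
};

class toy_collector
{
  // One copy of each distinct children list lives here.
  std::set<children_type> children_;

public:
  // Return the collected copy, inserting it only if it is new.
  const children_type *add_children (const children_type &c)
  {
    return &*children_.insert (c).first;
  }

  std::size_t distinct_children () const { return children_.size (); }
};
```

Every prototype taking (int, int, int) then points at the same collected list, and every childless entry points at the single empty one.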

> > It's certainly possible to do it the other way.  You just don't break up
> > the hash calculation along the lines of the attributes_type/children_type
> > objects.  Instead, do it all at the entry level.
> 
> I am not sure if I understand precisely what you mean here. We certainly
> need a hash based on the new local hash idea for the attributes_type,
> since that is where most of the collisions happen now. Do you mean,
> calculate the local hash for/at the entry/die level, and leave it at
> that, then use that local hash value in the calculation of the
> attributes_type hash?

I guess what I mean is that we would not have separate hash tables for
attributes_type and children_type objects in the collector (as I just
mentioned above).  Instead, only the whole debug_info_entry would have its
hashes collected from the contents of its two constituent containers.
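A sketch of what I have in mind, with made-up types (toy_die, hash_combine and hash_die are illustrative names, and the mixing constant is just the commonly used one, not anything the branch uses):

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Toy entry that owns its attributes and points at its children.
struct toy_die
{
  int tag;
  std::map<std::string, std::string> attributes;
  std::vector<const toy_die *> children;
};

// Fold one more value into a running hash.
inline void
hash_combine (std::size_t &seed, std::size_t value)
{
  seed ^= value + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

// One hash per entry, computed straight from the contents of its two
// containers.  There is no separate hash (or hash table) for the
// attribute map or the children list on their own.
inline std::size_t
hash_die (const toy_die &die)
{
  std::size_t seed = std::hash<int> () (die.tag);
  for (const auto &attr : die.attributes)
    {
      hash_combine (seed, std::hash<std::string> () (attr.first));
      hash_combine (seed, std::hash<std::string> () (attr.second));
    }
  for (const toy_die *child : die.children)
    hash_combine (seed, hash_die (*child));
  return seed;
}
```

Two entries with equal contents hash the same, so a single table keyed on whole entries is enough.  The cost is that equal attribute sets or child lists are no longer stored just once.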

> Right, yes that would work. But isn't using friend considered breaking
> abstraction? Maybe it is common usage in c++, but it feels very much
> like arbitrarily breaking data encapsulation. 

Well, sure.  But you have to look at the practical realities of where we
actually have information hiding.  The several levels of objects are not
really all independent moving parts.  Most of the layers are just because
that's how it has to work to make the end-user interfaces come out nicely.
If you are bending over backwards to funnel things through more purely
objecty method styles with no real benefit, then it's not an improvement.

> I admit I find this very hard to reason about.
> So if we have the following:
> 
> class dwarf_data
> {
>   template<class impl, typename vw = value<impl> >
>   class attr_value
>   {
>     typedef typename impl::debug_info_entry::pointer die_ptr;
>     inline die_ptr &reference ()
>     {
>       return variant<typename vw::value_reference> ().ref;
>     }
>   }
> }
> 
> class dwarf_output
> {
>   class attr_value
>     : public dwarf_data::attr_value<dwarf_output, value>
>   {
>   }
> }
> 
> class pending_dwarf
> {
>   struct attr_value
>     : public dwarf_output::attr_value
>   {
>     inline typename debug_info_entry::const_pointer reference () const
>     {
>     }
>   }
> }
> 
> How do I tell that the return type of reference () is not inherited
> through the template base (base) class, but is actually a template value
> itself. And what is the return type template value precisely for
> reference () in this case?

There is obviously base class inheritance here too.  But that is really
just an implementation detail for how we're getting our code-sharing.
dwarf_data is nothing but an implementation detail for dwarf_edit and
dwarf_output.  The only level at which there is an overall interface
abstraction being matched is dwarf, dwarf_edit, dwarf_output, and
pending_dwarf.  All of those except dwarf_edit are entirely read-only.
So what matters is that <impl>::attr_value::reference (), as used for
reading the value, is used the same way by the caller across each <impl>.

If there were a module type system, then there would be an overall
"dwarf-alike" signature that all these implementations match.  That
would say that <impl>::attr_value::reference () returns a thing you can
store in a variable of type <impl>::debug_info_entry::const_pointer and
that on such a thing you can use the ==, !=, and * operators (and
->method to get <impl>::debug_info_entry methods, by implication), and
that's it.  In the C++ standards world there was a notion of such a
module type system, which they called "concepts".  But they couldn't
figure it out enough, so they dropped it from the C++0x (1x now?)
standard.

It's more like an SML or Modula-3 module signature than like a C++ base
class, just in case that helps you at all.  And I don't actually know
Java deeply enough to say whether a Java interface is more like one or
the other.
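To illustrate the "signature" idea in plain C++, here is a sketch with two unrelated toy implementations and one template that uses only the operations listed above.  impl_a, impl_b and refers_to_tag are hypothetical names, not the real classes.

```cpp
#include <cassert>

// Toy implementation whose const_pointer is a raw pointer.
struct impl_a
{
  struct debug_info_entry
  {
    typedef const debug_info_entry *const_pointer;
    int tag;
  };
  struct attr_value
  {
    debug_info_entry::const_pointer target;
    debug_info_entry::const_pointer reference () const { return target; }
  };
};

// Unrelated toy implementation whose const_pointer is a small handle class.
struct impl_b
{
  struct debug_info_entry
  {
    class const_pointer
    {
      const debug_info_entry *p_;
    public:
      explicit const_pointer (const debug_info_entry *p) : p_ (p) {}
      bool operator== (const const_pointer &o) const { return p_ == o.p_; }
      bool operator!= (const const_pointer &o) const { return p_ != o.p_; }
      const debug_info_entry &operator* () const { return *p_; }
      const debug_info_entry *operator-> () const { return p_; }
    };
    int tag;
  };
  struct attr_value
  {
    const debug_info_entry *target;
    debug_info_entry::const_pointer reference () const
    {
      return debug_info_entry::const_pointer (target);
    }
  };
};

// Relies only on the implied signature: reference () yields something
// storable in <impl>::debug_info_entry::const_pointer, on which ==, !=,
// * and -> work.  No common base class is involved.
template<class impl>
bool
refers_to_tag (const typename impl::attr_value &value, int tag)
{
  typename impl::debug_info_entry::const_pointer p = value.reference ();
  typename impl::debug_info_entry::const_pointer q = value.reference ();
  return p == q && !(p != q) && (*p).tag == tag && p->tag == tag;
}
```

The template compiles against either implementation even though the two return types have nothing in common.  Since <impl> cannot be deduced from the argument, callers name it explicitly, as in refers_to_tag<impl_a> (value, tag).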


Thanks,
Roland
