This is the mail archive of the elfutils-devel@sourceware.org mailing list for the elfutils project.


Re: denser dwarf macro information


> As far as I can see this idea was never implemented. I don't know why
> though. Maybe a simpler approach that just gets the strings shared is
> more effective?

I did some experiments to calculate the likely savings from a simple
format change to use .debug_str offsets instead of direct strings in
.debug_macinfo.  I don't remember the details (you might be able to
find them on some RH-internal list where I posted).  But my vague
recollection is that it didn't help nearly as much as I thought it
would.

That simple change saves roughly the length of the string, less the
four bytes of offset, each time a string is duplicated.  So its
effectiveness is limited to that.
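
To make that arithmetic concrete, here is a rough sketch of the
estimate.  It is not the original experiment: the function names and
sample strings are made up for illustration, it assumes 4-byte
.debug_str offsets (32-bit DWARF), and it ignores the opcode and line
number bytes, which are the same in both encodings, as well as any
suffix merging the string table might do.

```python
OFFSET_SIZE = 4  # assumed: bytes per .debug_str offset in 32-bit DWARF


def string_bytes(s):
    """Size of a NUL-terminated string in bytes."""
    return len(s.encode("utf-8")) + 1


def inline_size(strings):
    """Bytes used when every macro string is stored inline."""
    return sum(string_bytes(s) for s in strings)


def offset_size(strings):
    """Bytes used when each occurrence is a .debug_str offset and
    each distinct string is stored once in the string table."""
    table = sum(string_bytes(s) for s in set(strings))
    return OFFSET_SIZE * len(strings) + table


def savings(strings):
    """Positive means the offset form is smaller."""
    return inline_size(strings) - offset_size(strings)


if __name__ == "__main__":
    long_dups = ["LONG_MACRO_NAME 12345"] * 3
    short_mix = ["FOO 1"] * 3 + ["BAR 2"]
    print(savings(long_dups), savings(short_mix))
```

Note that in this model every occurrence pays the four bytes, including
the first one and strings that are never duplicated, so short or
mostly unique strings can come out larger than the inline form.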

An approach based on putting macro information into the DIE tree has
the potential for a very different kind of savings.  When there are
long identical sequences of numerous macros, as is normal for any
repeated uses of header files, then normal DIE compression (i.e.
DW_TAG_imported_unit grafting) saves not just the strings but all
the space cost of the structural encoding.  So it seems like there
could be a much greater potential savings available there.
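
A toy cost model shows why sharing whole sequences differs in kind from
sharing strings.  Everything here is an assumption for illustration,
not a measurement or a proposed encoding: the per-entry overhead of 2
bytes and the 5-byte cost of one import reference are placeholder
figures, and the function names are hypothetical.

```python
def entry_size(macro, entry_overhead=2):
    """Assumed cost of one macro entry: structural bytes plus the
    NUL-terminated string."""
    return entry_overhead + len(macro.encode("utf-8")) + 1


def copied_size(header_macros, n_units, entry_overhead=2):
    """Every compilation unit carries its own copy of the header's
    macro sequence."""
    one_copy = sum(entry_size(m, entry_overhead) for m in header_macros)
    return one_copy * n_units


def shared_size(header_macros, n_units, entry_overhead=2, import_cost=5):
    """The sequence is stored once in a shared unit; each compilation
    unit pays only for a reference to it (assumed import_cost bytes)."""
    one_copy = sum(entry_size(m, entry_overhead) for m in header_macros)
    return one_copy + import_cost * n_units


if __name__ == "__main__":
    macros = ["A 1", "B 2"]
    print(copied_size(macros, 10), shared_size(macros, 10))
```

In this model the cost per use of a header is constant no matter how
many macros the header defines, so both the strings and the structural
bytes are paid for only once.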

I don't recall the details of Jim's proposal at the moment.  My
recollection from when I looked at it before is that I would do some
things a bit differently.  But the essential notion of putting macro
information into the DIE tree seems sound and has the potential for
great space savings through the normal compression mechanisms.  It
also seems more in keeping with the rest of DWARF, where most things
are done with DIEs rather than with entirely separate formats.  The
.debug_line and .debug_frame
sections are notable exceptions because the nature of their
information lends itself to their "program encoding" style of
keeping the data small in ways that other kinds of information do not.

> Same seems to be true for .debug_line, which also doesn't reference
> strings through .debug_str. Is that a good or bad idea?

.debug_line compression ideas are another subject we should discuss
separately.  Using .debug_str has some obvious potential for
savings, but that might be overwhelmed by other approaches for
making that data smaller.


Thanks,
Roland
