This is the mail archive of the mailing list for the Archer project.


DWARF compaction

> This is not really GCC (I think it can be done in the linker): word on
> the street is that for C++ programs, GCC generates a lot of duplicate
> debuginfo, especially for header files. DWARF has mechanisms to avoid
> such duplication. I'd like to investigate more about this and see what
> can be done...

This subject is well understood among DWARF experts, and the solutions have
long been planned.  Indeed the great bulk of the work here is on the producer side.
There are ways that involve new compiler and linker smarts cooperating
(this scheme is described in an appendix to the DWARF spec), as well as
ways done in a pure post-processing step without the compiler or linker
doing anything different (though post-processing integrated into linking is
also an offshoot of this option).

I think we at Red Hat will (finally) be ramping up work on this soon.
Our first tack will be pure post-processing, which is what we need for
other reasons anyway.  We're planning to implement this in elfutils, not in
the BFD/binutils universe.  If you want to do DWARF-rewriting in BFD, knock
yourself out.  

The DWARF-appendix/Sun plan using compiler-generated section groups
and linker features might also not be that hard to get going, if you
wanted to take that tack.  That might exercise the linker in new ways
that need a little work, but most of the work is in the compiler.  (We
need the post-processing approach for other purposes anyway, so we're
not planning to work on this route.)
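
To make the section-group route concrete, here is a toy model (not real
linker code) of what the linker's side amounts to: the compiler puts each
header's debug info in its own COMDAT group, and the linker keeps one
group per signature and drops the rest.  The SectionGroup type, the
signature strings, and the sizes are all made up for illustration, and
keeping the first copy seen is an assumption about linker behavior.

```python
from typing import NamedTuple


class SectionGroup(NamedTuple):
    signature: str  # hypothetical COMDAT group signature (e.g. a hash of the header's DIEs)
    origin: str     # object file that contributed this group
    size: int       # bytes of .debug_info carried by the group


def link_groups(groups):
    """Keep one group per signature; here, the first one encountered."""
    kept = {}
    for g in groups:
        kept.setdefault(g.signature, g)
    return list(kept.values())


# Three objects include the same header; two of them also share a second one.
objects = [
    SectionGroup("hdr.vector.abc123", "a.o", 4000),
    SectionGroup("hdr.string.def456", "a.o", 2500),
    SectionGroup("hdr.vector.abc123", "b.o", 4000),
    SectionGroup("hdr.vector.abc123", "c.o", 4000),
]

kept = link_groups(objects)
total_before = sum(g.size for g in objects)
total_after = sum(g.size for g in kept)
print(total_before, "->", total_after)
```

The hard part, as noted, is on the compiler side: deciding what goes in
each group and computing a signature that matches across translation units.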

We don't really need to get into all the details of the producer-side
implementation plans here.  All those plans produce the same kind of
results in the final DWARF format, which is stuff already described
fully by the spec.

What GDB needs to do is cope with all the new wrinkles of the DWARF
format that can be used in the most compact encodings.  These are
mainly DW_FORM_ref_addr, DW_TAG_partial_unit, DW_TAG_imported_unit,
and not getting confused about semantics of inter-CU pointers in the DIEs.
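
As a rough illustration of the inter-CU pointer issue, here is a toy model
of reference resolution.  It is a sketch under simplifying assumptions, not
GDB or elfutils code: the Unit class, the resolve and visible_dies helpers,
and all offsets are hypothetical, and an import is stored as the imported
unit's header offset rather than as a reference to its unit DIE.  The point
it shows is that a CU-relative form is offset from the referring unit's own
start, while ref_addr is an offset into the whole section and may land in
a different unit.

```python
from dataclasses import dataclass, field


@dataclass
class Unit:
    offset: int   # start of this unit within the toy ".debug_info"
    length: int   # total size of the unit
    tag: str      # "compile_unit" or "partial_unit"
    dies: dict = field(default_factory=dict)     # unit-relative offset -> DIE name
    imports: list = field(default_factory=list)  # offsets of units pulled in via imported_unit


def resolve(units, referrer, form, value):
    """Return (owning unit, DIE name) for a reference attribute value."""
    if form == "ref4":        # CU-relative: add the referring unit's base
        target = referrer.offset + value
    elif form == "ref_addr":  # section-relative: used as-is
        target = value
    else:
        raise ValueError(form)
    for u in units:
        if u.offset <= target < u.offset + u.length:
            return u, u.dies[target - u.offset]
    raise KeyError(target)


def visible_dies(by_offset, unit, seen=None):
    """Names visible from a unit, following imports transitively."""
    seen = set() if seen is None else seen
    if unit.offset in seen:
        return []
    seen.add(unit.offset)
    names = list(unit.dies.values())
    for imp in unit.imports:
        names += visible_dies(by_offset, by_offset[imp], seen)
    return names


partial = Unit(0, 100, "partial_unit",
               {11: "partial_unit", 30: "struct shared_t"})
cu_a = Unit(100, 80, "compile_unit",
            {11: "compile_unit a.c", 40: "variable x"}, imports=[0])
units = [partial, cu_a]
by_offset = {u.offset: u for u in units}

local = resolve(units, cu_a, "ref4", 40)       # stays inside cu_a
shared = resolve(units, cu_a, "ref_addr", 30)  # crosses into the partial unit
```

Treating the value 30 from cu_a as CU-relative instead would point at
offset 130, inside cu_a, where no DIE starts; that is the kind of confusion
a consumer has to avoid.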

You can sort of work on this in parallel, though it's a bit hard to
finish any new support when you don't have anything that produces that
format to test it with.

