This is the mail archive of the dwarf2@corp.sgi.com mailing list for the dwarf2 project.



Re: duplicate dwarf2 reduction via comdat [overlooked CC to dwarf2 list]



>>> from the header CUs to the primary CU use internal symbols; there is no
>>> need for them to be consistent between two CUs for the same header, so long
>>> as they refer to (semantically) the same thing.
>
>> What do you mean by "internal symbols"?
>
>In the sense that they are local to a single .o.  ".L1234" and the like.
>
>> The "no need for consistency" assertion just doesn't seem obvious.
>> Is there some way to explain this in more detail?
>
>The symbols exported from a COMDAT CU must be consistent between multiple
>instances, so that they are interchangeable.  If wa.h defines a class A,
>and the DIE for A uses the symbol _DW.wa.h.92485121.4, other COMDAT CUs
>for wa.h must use the same symbol if they are to be combined by the linker
>(of course, different macro definitions can produce different information
>that should not be combined; this will be handled by the checksum).
>
>The symbols that a particular CU references need not be consistent, so
>long as they *refer* to the same thing.  For instance, my current
>implementation leaves the base types in the primary CU.  So if in one
>compilation, my CU for wa.h refers to the DIE for int using .LDIE0 and in
>another compilation uses .LDIE25, this does not matter; the CUs are still
>equivalent, and can be combined at link time.

Ah, the light finally dawns. The key point that finally clarified things
for me is this: Suppose we start with interface file I.h that is included
in two different modules A.c and B.c according to your proposal. Then we
get

    From compilation of A.c

	DW_TAG_compile_unit for A

	DW_TAG_compile_unit for I	// aka I#1

    From compilation of B.c

	DW_TAG_compile_unit for B

	DW_TAG_compile_unit for I	// aka I#2


Further, assume I#1 refers to something in A and I#2 refers to something
in B. Finally the key point: after the linker has eliminated one of I#1 or
I#2, the remaining I will still refer to an entity within the same unit with
which it was co-compiled. Whether you think of that reference in terms of
symbols or offsets within sections does not matter--all that matters is
that whatever I#1 referred to (which can occur anywhere in A) is semantically
equivalent to whatever I#2 referred to (which can occur anywhere in B).
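
To make that concrete, here is a minimal sketch in Python of the situation as I understand it. Everything in it is a toy model, not anyone's actual linker: the class names, the "I.h" group key, and the keep-the-first-copy policy are my own assumptions for illustration. Only the labels .LDIE0 and .LDIE25 come from your example.

```python
from dataclasses import dataclass


@dataclass
class HeaderCU:
    comdat_key: str   # assumed group identity (header name; a real key would include the checksum)
    origin: str       # object file this copy was compiled into
    refs: list        # local labels this CU references in its own primary CU


@dataclass
class ObjectFile:
    name: str
    primary_dies: dict   # local label -> what the DIE describes
    header_cus: list


def link(objects):
    """Keep the first copy seen of each COMDAT group (assumed policy)."""
    kept = {}
    for obj in objects:
        for cu in obj.header_cus:
            kept.setdefault(cu.comdat_key, cu)
    return kept


def resolve(cu, objects):
    """Local labels are looked up in the object file the surviving CU came from."""
    by_name = {obj.name: obj for obj in objects}
    home = by_name[cu.origin]
    return [home.primary_dies[label] for label in cu.refs]


# I#1 refers to int as .LDIE0 in A.o; I#2 refers to it as .LDIE25 in B.o.
a = ObjectFile("A.o", {".LDIE0": "int"}, [HeaderCU("I.h", "A.o", [".LDIE0"])])
b = ObjectFile("B.o", {".LDIE25": "int"}, [HeaderCU("I.h", "B.o", [".LDIE25"])])

for order in ([a, b], [b, a]):
    survivor = link(order)["I.h"]
    print(survivor.origin, resolve(survivor, order))
```

Whichever copy survives, its local reference resolves inside the object file it was co-compiled with, and both resolve to the same meaning.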

Seems obvious now that it finally registers, but I think it leads to a not
quite so obvious next question. Suppose there are two files I.h and
J.h that are included in both A.c and B.c. So long as nothing in I refers to
anything in J and vice versa, then what you describe works just fine. But
what if something in J does refer to something in I?

The concern is that there is nothing about the underlying COMDAT
mechanism that assures that "consistent" choices will be made together.
That is, that either I#1 and J#1 are chosen or I#2 and J#2 are chosen,
but not (I#1 and J#2), nor (I#2 and J#1).
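
A small sketch of the four possible outcomes, in Python, under the assumption that J reaches into I through a label that is local to its own object file. The names here are hypothetical and the model is deliberately crude: a local reference is treated as valid only if the copy of I from the same object file survived.

```python
from itertools import product

# Each header CU has one copy per object file that included the header.
COPIES = {"I": ["A.o", "B.o"], "J": ["A.o", "B.o"]}


def j_reference_resolves(kept_i, kept_j):
    """A local label in J only makes sense if the kept I came from the same object file."""
    return kept_i == kept_j


def survey():
    """Map every (kept I, kept J) combination to whether J's reference resolves."""
    return {
        (kept_i, kept_j): j_reference_resolves(kept_i, kept_j)
        for kept_i, kept_j in product(COPIES["I"], COPIES["J"])
    }


for choice, ok in survey().items():
    print(choice, "ok" if ok else "dangling")
```

Two of the four combinations leave J pointing at a copy of I that was discarded, which is exactly the worry.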

I 'spose the alternatives in this kind of situation are:

 1) eliminate from J anything that requires a reference into I
 2) merge the I and J compilation units into a single composite unit
 3) implement the reference from J into I using some kind of global
    (in the object file sense) necessarily mangled name (because of
    the use of a global symbol)

But, the third approach is what must be used to refer into I (whether I#1
or I#2) from either A or B in any case. So, no new problems arise.
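
As a sanity check on that claim, here is a sketch in Python of the third alternative. It assumes every copy of I exports the same global name and every copy of J imports it; the name _DW.I.h.CKSUM.1 is a made-up placeholder in the style of your _DW.wa.h example, and the dictionaries are a toy stand-in for a real symbol table.

```python
from itertools import product

GLOBAL_NAME = "_DW.I.h.CKSUM.1"   # hypothetical placeholder, not a real mangled name


def make_copy(header, origin, exports=(), imports=()):
    """One COMDAT copy of a header CU, with the global symbols it defines and uses."""
    return {
        "header": header,
        "origin": origin,
        "exports": set(exports),
        "imports": set(imports),
    }


def unresolved(kept_copies):
    """Return the imported global names that no surviving copy defines."""
    defined = set()
    for cu in kept_copies:
        defined |= cu["exports"]
    return [sym for cu in kept_copies for sym in cu["imports"] if sym not in defined]


def all_choices_resolve():
    """Check all four (I copy, J copy) combinations the linker might keep."""
    i_copies = [make_copy("I", o, exports=[GLOBAL_NAME]) for o in ("A.o", "B.o")]
    j_copies = [make_copy("J", o, imports=[GLOBAL_NAME]) for o in ("A.o", "B.o")]
    return all(unresolved([i, j]) == [] for i, j in product(i_copies, j_copies))


print(all_choices_resolve())
```

Because the name is the same in every copy, it does not matter which I and which J the linker happens to keep.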

Finally, I think I do see a real semantic limitation of your scheme.
That is that there is nothing in the representation that conveys the
fact that the declarations of, say, included file I.h should be "visible"
from the module scope of A.c and B.c but *not* from some other unit C.c
that does not include I.h. (Indeed, scope information cannot be built
into the global mangled name used to reference into I because the same
name must be used for references from both A and B.)

Perhaps this limitation is acceptable in light of the space benefits obtained?
I am still pondering that one. (You may recall that the enabling mechanisms
I suggested included a new DW_TAG_include or DW_TAG_separate DIE, whose
purpose was to avoid this limitation.)

Am I getting this all correct?

Ron

p.s. I am still sorting through the rest of your example...
