This is the mail archive of the
mailing list for the binutils project.
Re: [Mips}Using DT tags for handling local ifuncs
- From: Richard Sandiford <rdsandiford at googlemail dot com>
- To: Jack Carter <Jack dot Carter at imgtec dot com>
- Cc: "Maciej W. Rozycki" <macro at codesourcery dot com>, "binutils at sourceware dot org" <binutils at sourceware dot org>, Doug Gilmore <Doug dot Gilmore at imgtec dot com>
- Date: Fri, 20 Dec 2013 00:35:29 +0000
- Subject: Re: [Mips}Using DT tags for handling local ifuncs
- References: <4CEFBC1BE64A8048869F799EF2D2EEEE4C6DDC0F at BADAG02 dot ba dot imgtec dot org> <4CEFBC1BE64A8048869F799EF2D2EEEE4C6DE23E at BADAG02 dot ba dot imgtec dot org> <87txef6a07 dot fsf at talisman dot default> <4CEFBC1BE64A8048869F799EF2D2EEEE4C6DE3B8 at BADAG02 dot ba dot imgtec dot org> <87haaf5hbg dot fsf at talisman dot default> <4CEFBC1BE64A8048869F799EF2D2EEEE4C6DE50C at BADAG02 dot ba dot imgtec dot org> <877gbb5c2k dot fsf at talisman dot default> <4CEFBC1BE64A8048869F799EF2D2EEEE4C6DE59A at BADAG02 dot ba dot imgtec dot org> <alpine dot DEB dot 1 dot 10 dot 1312112311480 dot 19368 at tp dot orcam dot me dot uk> <87y53q4czx dot fsf at talisman dot default> <alpine dot DEB dot 1 dot 10 dot 1312121406040 dot 19368 at tp dot orcam dot me dot uk> <87d2kz4uhi dot fsf at talisman dot default> <4CEFBC1BE64A8048869F799EF2D2EEEE4C6DF550 at BADAG02 dot ba dot imgtec dot org> <87ppot6gle dot fsf at talisman dot default> <4CEFBC1BE64A8048869F799EF2D2EEEE4C6DF9BE at BADAG02 dot ba dot imgtec dot org> <87txe5aw74 dot fsf at sandifor-thinkpad dot stglab dot manchester dot uk dot ibm dot com> <4CEFBC1BE64A8048869F799EF2D2EEEE4C6DFD62 at BADAG02 dot ba dot imgtec dot org> <871u18c04i dot fsf at sandifor-thinkpad dot stglab dot manchester dot uk dot ibm dot com> <4CEFBC1BE64A8048869F799EF2D2EEEE4C6DFDA5 at BADAG02 dot ba dot imgtec dot org> <8761qk6045 dot fsf at talisman dot default> <4CEFBC1BE64A8048869F799EF2D2EEEE4C6DFFC2 at BADAG02 dot ba dot imgtec dot org>
Jack Carter <Jack.Carter@imgtec.com> writes:
>> > I also have a hard time with how the GOT is used for binutils. In my
>> > experience and world view, sections have attributes that make them gp
>> > relative or not. All these sections get gathered in gp relative
>> > regions that are 64k from a value that will be in their $GP. If there
>> > are GOT elements that are not gp relative, they should be in another
>> > .got that is not marked SHF_MIPS_GPREL. It will not get laid out and
>> > calibrated with any of the other GOTs. Other sections in my life that
>> > get bundled up in the equation for multigot are .sbss, .sdata,
>> > .lit[4,8,16], .srdata, but only if they are marked SHF_MIPS_GPREL.
>> Just so I understand, do you think that the ABI GOT should always be 64k
>> or smaller? I.e. DT_MIPS_LOCAL_GOTNO + (DT_MIPS_SYMTABNO - DT_MIPS_GOTSYM)
>> should be <= 64 * 1024 / sizeof (void *)? If so, what should happen
>> (under the original or IRIX n32/n64 ABIs) if the number of symbols
>> involved in .rel.dyn relocations exceeds the 64k limit? Is that a
>> link error?
> Yes, because in sgi's case you count all the SHF_MIPS_GPREL sections as
> the GP area. .got is only one of them and sgi just put gp-relative
> entries in it.
But why then do you think the R_MIPS_GOTHI16/R_MIPS_GOTLO16 relocs
and R_MIPS_CALLHI16/R_MIPS_CALLLO16 relocs were defined? (They were
part of the original ABI.) If the intention really was to limit the
ABI GOT to 64k I don't think these "xgot" relocs would be needed.
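For reference, the size limit being asked about can be written out as a small calculation. This is only a sketch of the arithmetic quoted above: the DT_MIPS_* names come from the thread, while the helper names and the sample numbers are made up for illustration.

```python
# Hypothetical helpers; only the DT_MIPS_* tag names come from the thread.
GP_WINDOW_BYTES = 64 * 1024  # the 64k window discussed above


def abi_got_entries(local_gotno, symtabno, gotsym):
    """ABI GOT size in entries: the local entries plus one global entry
    for every dynamic symbol from DT_MIPS_GOTSYM onwards."""
    return local_gotno + (symtabno - gotsym)


def max_entries(pointer_size):
    """Number of GOT entries that fit in a 64k window."""
    return GP_WINDOW_BYTES // pointer_size


def fits_in_64k(local_gotno, symtabno, gotsym, pointer_size):
    """True if the whole ABI GOT would fit under the proposed limit."""
    return abi_got_entries(local_gotno, symtabno, gotsym) <= max_entries(pointer_size)
```

With 4-byte pointers the limit works out to 16384 entries, and with 8-byte pointers to 8192.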
>> > The DT_MIPS_LOCAL_GOTNO describes local got entries. Not other
>> > partitions that we reserve the right to put non-local got entries.
>> I'm still not sure which part you're describing as the local GOT here.
>> Let's go back to the original 32-bit GOT layout, without any GNU extensions:
>>   +------------+     +  <--- DT_PLTGOT
>>   |  entry 0   |     |
>>   +------------+  +  B
>>   |  ........  |  A  |
>>   +------------+  +  +  <--- DT_PLTGOT + DT_MIPS_LOCAL_GOTNO * 4
>>   | Global GOT |
>> The zero entry in the global offset table is reserved to hold the
>> address of the entry point in the dynamic linker to call when lazy
>> resolving text symbols. The dynamic linker must always initialize this
>> entry regardless of whether lazy binding is or is not enabled.
>> Do you see the local GOT as being A or B? I.e. does it include
>> the zero entry?
> It is by definition A and B,
But it was an either-or choice. :-) Does it include entry 0 or not?
If yes, it's B. If no, it's A.
> here is the quote from the pre-sgi System V
> Application Binary Interface Mips Processor Supplement:
> Global Offset Table (5-9, second paragraph)
> "The global offset table is split into two logically separate subtables:
> local and externals. Local entries reside in the first part of the global
> offset table. The value of the dynamic tag DT_MIPS_LOCAL_GOTNO holds the
> number of local global offset table entries."
To me this suggests B if taken at face value.
> The sgi edition is essentially the same but it includes:
> "It (the GOT) is essentially two tables. The first (with DT_MIPS_LOCAL_GOTNO
> entries) consists of local GOT addresses, i.e. non-preemptible (protected)
> addresses defined within the executable/DSO."
And to me this suggests A if taken at face value, since entry 0 isn't a
"non-preemptible (protected) address defined within the executable/DSO".
It's an address in the dynamic linker instead.
Which of A and B seems right to you?
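To make the either-or choice concrete, here is a minimal sketch of the two readings as address ranges. It assumes the 4-byte entries of the 32-bit layout in the diagram; the function name and the include_entry0 flag are hypothetical.

```python
def local_got_range(dt_pltgot, local_gotno, include_entry0, entry_size=4):
    """Return (start, end) addresses of the "local GOT", end exclusive.

    include_entry0=True  is reading B: entry 0 counts as part of the local GOT.
    include_entry0=False is reading A: the local GOT starts at entry 1.
    """
    first = 0 if include_entry0 else 1
    start = dt_pltgot + first * entry_size
    # Under both readings the global GOT begins at the same address.
    end = dt_pltgot + local_gotno * entry_size
    return start, end
```

The two readings differ only in where the range starts; the boundary with the global GOT is DT_PLTGOT + DT_MIPS_LOCAL_GOTNO * 4 either way.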
>> > Dealing with the ifunc "local" entries implicitly will save a
>> > relocation lookup, a tiny blip of time in relation to the other costs
>> > of calling the resolver. So I am arguing about how many angels can
>> > dance on a pin.
>> Yeah, maybe this is one we'll have to agree to disagree on. I think the
>> benefit of having an implicitly-relocated irelative region is small at best.
>> I like the generality of including the R_MIPS_IRELATIVE GOT
>> relocations in the general .rel.dyn pool and sorting them accordingly,
>> because it feels more future-proof. I also think an implicit region is
>> harder to handle in a backward-compatible way, since if we just add new
>> tags, older ld.sos would ignore them and not flag an error.
> Then go the way sgi did and have .dynsym indexed regions for:
Older linkers would ignore those too though.
> For entertainment's sake, here is the comment from my private ELF dumper that I wrote back then:
> Function: mips_print_got
> MIPS has 2 different GOT table variants that are
> pretty much the same except one depends on symbol
> table to got table symmetry for runtime fixup purposes
> and the other uses runtime relocations.
> If there is multigot there will be entries in the first dynamic section
> of type DT_MIPS_AUX_DYNAMIC which point to the other
> dynamic sections which in turn point to and describe their
> associated gots.
> DT_MIPS_LOCAL_GOTNO Starting point for DEFAULT symbols
> DT_MIPS_GOTSYM Index into dsymtab matching DT_MIPS_LOCAL_GOTNO
> DT_MIPS_HIPAGENO Number of page table entries.
> DT_MIPS_LOCALPAGE_GOTIDX Starting point for a local got page table
> DT_MIPS_LOCAL_GOTIDX Starting point for local full addresses
> DT_MIPS_HIDDEN_GOTIDX Starting point for HIDDEN symbols
> DT_MIPS_PROTECTED_GOTIDX Starting point for PROTECTED symbols
> If DT_MIPS_LOCAL_GOTIDX == DT_HIDDEN_GOT_IDX ||
> DT_PROTECTED_GOT_IDX ||
> then there are no local entries. Local in this sense
> means addresses that may or may not have associated
> entries in the symbol table or relocation table. If
> they are present in the symbol table they will be marked
> as STO_INTERNAL and must not be referenced outside of the
> defining dso/a.out in any form.
> If DT_HIDDEN_GOT_IDX == DT_PROTECTED_GOT_IDX ||
> then there are no hidden entries. Hidden symbols
> are those that are marked STO_HIDDEN in the dynamic
> symbol table and are accessible from outside the defining
> dso only non-symbolically, such as through pointers.
> If DT_PROTECTED_GOT_IDX == DT_MIPS_LOCAL_GOTNO
> then there are no protected entries. Protected symbols
> are those that are marked STO_PROTECTED in the dynamic
> symbol table and are accessible from the outside, but
> cannot be preempted during runtime loading and thus are
> @return void.
> Note, for multigot this resulted in multiple dynamic sections, dynsyms and
> relocation fixups for the got entries.
Did it also result in multiple relocation tables, one for each .dynamic
section? Or was there still a single .rel.dyn table?
If just a single .rel.dyn table, did all relocations in the table use
the primary GOT's DT_MIPS_GOTSYM as the local/global threshold? If so,
did that mean that there was no specific limit to the number of distinct
global symbols that could be stored in GOT entries (thanks to multigot),
but that there was a limit of 16k (or 8k for n64) global symbols that
could be used in relocations? (Sorry for the barrage of questions --
the downside of doing this by email.)
If there were multiple .rel.dyn tables, each tied to their own
.dynamic sections, how would we sort them so that all IRELATIVE
relocations in an object are applied after all non-IRELATIVE ones?
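For the single-table case, the ordering requirement can be sketched as a stable sort. This is an illustration only: relocations are modelled as plain dicts with a "type" string, which is an assumption made for the example and not how ld represents them, and it says nothing about how multiple tables would be merged.

```python
def order_relocs(relocs):
    """Move IRELATIVE relocations after all others.

    sorted() is stable, so the relative order inside each group is kept.
    The key is False for ordinary relocations and True for IRELATIVE ones,
    and False sorts before True.
    """
    return sorted(relocs, key=lambda r: r["type"] == "R_MIPS_IRELATIVE")


# Made-up sample table.
sample = [
    {"type": "R_MIPS_IRELATIVE", "offset": 0x10},
    {"type": "R_MIPS_REL32", "offset": 0x20},
    {"type": "R_MIPS_IRELATIVE", "offset": 0x30},
    {"type": "R_MIPS_REL32", "offset": 0x40},
]
ordered = order_relocs(sample)
```

With one table this is a single pass; with one table per .dynamic section there is no obvious place to apply such a global ordering, which is the concern raised above.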
> I am not proposing that we go down this route, but it may give a sense of
> the world I came from. I liked it because (other than that I designed a lot of
> it :-)) of the structure in symbol visibility and that I could dump the entries
> symbolically. Also, each GP region was described by its dynamic section.
> This is not a trivial change and goes beyond the ifunc scope, but it would resolve
> the fixup by relocation issues and usher in GP rel areas that go beyond the GOT.
> I really just want to get ifunc done without messing up future goodness in ld/ld.so.
OK, this scheme seems to create multiple .dynsyms as a way of avoiding
explicit relocations for the multigot entries. Is that right?
I.e. rather than have a .rel.dyn entry for a multigot global GOT entry,
it has an entry in a secondary .dynsym instead?
Does that really pay off though? In ELF32, symbols are 16 bytes in size
but REL relocations are 8 bytes in size. And because the global GOT
acts as a cache, resolving normal global relocations is very cheap.
We only look up the symbol once, when resolving the GOT entry.
(If the same global symbol appeared in two GOTs and .dynsyms, did you
look it up twice, or just once? If twice then the .rel.dyn approach
seems to win there too, as well as on size.)
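The size comparison can be checked against the generic ELF32 record layouts. The sketch below uses Python's struct module purely as a calculator; little-endian is chosen arbitrarily, and table_bytes is a hypothetical helper.

```python
import struct

# Generic ELF32 record layouts, packed without padding.
# Elf32_Sym: st_name, st_value, st_size, st_info, st_other, st_shndx
ELF32_SYM = struct.Struct("<IIIBBH")
# Elf32_Rel: r_offset, r_info
ELF32_REL = struct.Struct("<II")


def table_bytes(n_entries, record):
    """Bytes needed for n_entries records of the given layout."""
    return n_entries * record.size
```

So describing a given number of multigot entries through extra .dynsym entries costs twice the space of describing them through REL relocations, before counting any names in .dynstr.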
I agree that in the specific case of ifuncs it would probably work
to do things this way, since for ifuncs the type of GOT entry needed
can be determined from the symbol type (IFUNC rather than FUNC).
But it wouldn't extend well to other types of relocation. E.g.
TLS GOT entries can't be implied from the symbol type in this way.
It might be that the next relocation type we add also has no associated
symbol type. (The type is only a 4-bit field after all, and most are
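As a reminder of how little room the symbol type has, here is a minimal sketch of the st_info encoding from the generic ELF definitions. STT_FUNC and STT_GNU_IFUNC are the standard values; needs_ifunc_got_entry is a hypothetical name for the "infer the GOT entry kind from the symbol type" idea.

```python
STB_LOCAL = 0
STB_GLOBAL = 1
STT_FUNC = 2
STT_GNU_IFUNC = 10  # GNU extension, in the OS-specific type range


def make_st_info(bind, sym_type):
    """Pack binding (high 4 bits) and type (low 4 bits) into st_info."""
    return (bind << 4) | (sym_type & 0xF)


def st_bind(st_info):
    return st_info >> 4


def st_type(st_info):
    return st_info & 0xF


def needs_ifunc_got_entry(st_info):
    """The kind of GOT entry is implied by the symbol type alone."""
    return st_type(st_info) == STT_GNU_IFUNC
```

This works for ifuncs because the distinction is carried by the symbol itself, and a 4-bit field allows only 16 type values in total. A TLS GOT entry has no such marker, so it could not be inferred this way.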
It would also mean creating .dynsym entries for all ifuncs that were
STB_LOCAL in the original .o (as well as dynsyms for internal and
hidden global symbols). Should those STB_LOCAL-derived dynsyms have
names or be nameless? If there are multiple .os with the same STB_LOCAL
symbol name, should we try to make them unique when converting them to
dynsyms, or keep several dynsyms with the same name?
As for the comment about dumping entries symbolically: like I mentioned
before, we still have local, internal and hidden symbols in .symtab.
But the nice thing about .symtab is that it can be stripped to save space.
If we force the names of local, internal and hidden symbols into
.dynsym then it's harder to get rid of them later.