This is the mail archive of the mailing list for the binutils project.

RE: [Mips}Using DT tags for handling local ifuncs


To be honest I don't agree with many of the conclusions below. No, I
have not added any new comments ;-)

But it doesn't matter. I am going ahead with what was agreed upon.
The mechanism you outlined will work and I will work within the system.


From: Richard Sandiford
Sent: Thursday, December 19, 2013 3:58 AM
To: Jack Carter
Cc: Maciej W. Rozycki;; Doug Gilmore
Subject: Re: [Mips}Using DT tags for handling local ifuncs

Sorry, didn't notice last night that there was more to the message.

Jack Carter writes:
>>>>>> So if we put the relocations after the ABI GOT we would end up forcing
>>>>>> the use of multigots even though the number of "real" GOT entries
>>>>>> (those that need to be accessed $gp-relative) is small enough for
>>>>>> a single GOT.  The idea of the tag is to avoid that.
>>>>>    I see, that makes sense to me.  Do we already care to sort GOT entries
>>>>> appropriately?
>>>> Yeah, this is GGA_NORMAL vs. GGA_RELOC_ONLY in elfxx-mips.c
>>> I don't understand this either. Local GOT and global GOT both are GP
>>> relative.  Don't you have to include both along with any other GP
>>> relative section in the multigot accounting? And if so, don't you have
>>> to include pieces of both the local and global GOT in each multigot
>>> region that needs it?
>>> In my mind it shouldn't matter where you put a GOT entry in terms of multigot
>>> threshold accounting.
>> The point is that if foo binds globally:
>>      .word   foo
>> requires a GOT entry for foo.  But this .word doesn't on its own require
>> the GOT entry to be within range of a $gp-relative access.  It could be at
>> $gp+0x8000, say, without breaking anything.  So we sort the GOT entries
>> that do need to be accessed $gp-relative from those that don't.
>> The ones that don't go after the ones that do.
>> Therefore, when creating multigots for normal local and global entries,
>> we can ignore the entries that don't need to be $gp-relative, because
>> they will always come later.  The problem is that TLS GOT entries need
>> to be after _all_ global GOT entries.  So if the last entry of the
>> global GOT is out of range of $gp, a single TLS reloc in an input bfd
>> will force that bfd to use multigot, even if the last entry of the
>> global GOT doesn't itself need to be accessed $gp-relative.
> Okay, this is legal in the ABI, but SGI/Mips just put gp relative stuff
> in the GOT making it a pure SHF_MIPS_GPREL section. I'll get over it ;-)

Not sure what you mean.  It's always been a requirement (going back to the
original 32-bit psABI) that all globally-binding symbols used in dynamic
relocations must also be in the global GOT, even if there are no GOT
accesses to those symbols.  This isn't an IRIX vs. GNU thing.  From 4-19:

  The R_MIPS_REL32 relocation type is the only relocation performed by
  the dynamic linker. The value EA used by the dynamic linker to
  relocate an R_MIPS_REL32 relocation depends on its r_symndx value. If
  the relocation entry r_symndx is less than DT_MIPS_GOTSYM, the value
  of EA is the symbol st_value plus displacement. Otherwise, the value
  of EA is the value in the GOT entry corresponding to the relocation
  entry r_symndx. The correspondence between the GOT and the dynamic
  symbol table is described in the "Global Offset Table" section in
  Chapter 5.

So there have always been GOT entries that are needed to satisfy
incoming GOT relocations and GOT entries that are needed to get
the correct R_MIPS_REL32 behaviour.  If we only need a GOT entry
for the latter then it doesn't really matter whether it's in the
range of $gp.  (And indeed it can't matter, unless the ABI wants
to restrict the number of relocation symbols to less than
0x10000 / sizeof (void *).)
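
As a rough illustration of the quoted rule (a sketch only, not code from any
dynamic linker), the choice of EA can be modelled as below.  The names
rel32_effective_address, DynSym and global_got are hypothetical, and the
sketch assumes that global_got[i] is the GOT entry for dynamic symbol
DT_MIPS_GOTSYM + i.  The second helper just evaluates the
0x10000 / sizeof (void *) bound mentioned above.

```python
from dataclasses import dataclass


@dataclass
class DynSym:
    """Hypothetical stand-in for a dynamic symbol table entry."""
    st_value: int


def rel32_effective_address(r_symndx, displacement, dynsym, gotsym, global_got):
    """Pick EA for an R_MIPS_REL32 as described in the quoted psABI text.

    Assumption: global_got[i] corresponds to dynamic symbol gotsym + i.
    """
    if r_symndx < gotsym:
        # Symbol is below DT_MIPS_GOTSYM: st_value plus displacement.
        return dynsym[r_symndx].st_value + displacement
    # Otherwise: the value already stored in the symbol's global GOT entry.
    return global_got[r_symndx - gotsym]


def max_gp_addressable_entries(pointer_size):
    """Number of pointer-sized slots that fit in a 0x10000-byte window."""
    return 0x10000 // pointer_size
```

With 4-byte pointers the bound is 16384 entries and with 8-byte pointers it
is 8192, which is why reloc-only entries cannot all be required to sit within
range of $gp.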

> If you cluster the non-gp relative GOT entries at the beginning and end of
> the initial GOT, one can potentially prevent triggering multigot by
> ignoring them.

What do you mean by at the beginning?  The only GOT entries that don't
need to be accessed relative to $gp are those added for the rule above.
I.e. entries for symbols that bind globally, are used in dynamic relocations,
and had no incoming GOT relocations.

All other GOT entries are created in response to incoming GOT relocations
and are therefore accessed relative to $gp.  So...

> Is that what binutils does now? Put all the local non-gp relative
> entries first and all the global non-gp relative entries last?

...there are no local entries that aren't accessed $gp-relative.

Yes, binutils sorts the global GOT so that $gp-relative GOT entries
(i.e. those with incoming GOT relocations) come first.  This is the
GGA_NORMAL vs. GGA_RELOC_ONLY thing in elfxx-mips.c.
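
The ordering can be sketched as follows.  This is a simplified model under
stated assumptions, not the elfxx-mips.c implementation: GlobalSym, got_area,
layout_global_got and gp_relative_count are hypothetical names, and the
numeric values given to GGA_NORMAL and GGA_RELOC_ONLY here are chosen only so
that a sort puts the $gp-relative entries first.

```python
from dataclasses import dataclass

# Illustrative values only; picked so that sorting places NORMAL first.
GGA_NORMAL = 0       # entry has incoming GOT relocations ($gp-relative)
GGA_RELOC_ONLY = 1   # entry exists only to satisfy dynamic relocations


@dataclass
class GlobalSym:
    name: str
    has_got_reloc: bool   # some input has a GOT relocation against the symbol
    has_dyn_reloc: bool   # the symbol is used in a dynamic reloc (.word foo)


def got_area(sym):
    """Classify a globally-binding symbol, or return None if it needs no GOT entry."""
    if sym.has_got_reloc:
        return GGA_NORMAL
    if sym.has_dyn_reloc:
        return GGA_RELOC_ONLY
    return None


def layout_global_got(syms):
    """Order the global GOT so that $gp-relative entries come first."""
    entries = [s for s in syms if got_area(s) is not None]
    return sorted(entries, key=got_area)  # sorted() is stable


def gp_relative_count(syms):
    """Only these entries need to count toward the $gp range."""
    return sum(1 for s in syms if got_area(s) == GGA_NORMAL)
```

In this model the reloc-only entries always land after the entries that must
be within range of $gp, so they can be left out when deciding whether a single
GOT is enough.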

>>> Is there a way that I can reduce the multigot threshold in binutils for
>>> testing purposes?
>> Well, for testsuite tests we want the real threshold to be used.  .rept is
>> your friend here.
> Why wouldn't you want a linktime option to reduce the multigot threshold?
> It could be used for testing as well. Have one test that uses the full blown
> threshold and all the others with reduced ones. That's what I did at sgi and
> it saved a lot of time and effort. But this is a digression from ifunc.

Only the real threshold is interesting.  If you use a fake low threshold
then you don't hit other important limits (e.g. you don't get out-of-range
global GOT accesses, out-of-range TLS GOT relocs, etc.).  And you can
test whatever you wanted to test with the lower limit by bumping your
.rept count to hit the real limit instead.

>>>>>    For executables using traditional SVR4 PIC code we could use the
>>>>> absolute sequence indeed.  However I checked the ifunc ABI description
>>>>> again and no lazy binding is proposed for ifuncs so no stub of any kind
>>>>> will be required for PIC code (be it an executable or a shared library)
>>>>> as all calls are made through the GOT there anyway and this will have
>>>>> been relocated by the time any ifunc call is reached.
>>>> OK, once we start using the new GOT region then I agree we can make
>>>> that the case.  In terms of the ABI and patch as posted though, we used
>>>> IPLTs for all ifuncs defined in executables:
>>>>     Dynamically linked executables with ifunc definitions are for the most
>>>>     part the same as static executables, with the following exceptions:
>>>>       *) No __rel_iplt_start/end symbols. The dynamic linker handles dynamic
>>>>          relocations.
>>>>       *) Generate STT_GNU_IFUNC symbols in the .dynsym section with the value
>>>>          being the iplt entry for this IFUNC and transform them into
>>>>          STT_FUNC symbols. Any references to this IFUNC have to go through
>>>>          the stub. The dynamic linker ( will be doing the fixup at start-
>>>>          up.
>>> If the dso LOCAL/INTERNAL/HIDDEN ifunc GOT entry has a symbol associated
>>> with it, it does not need to go through an iplt stub since the call is
>>> referenced through the GOT value which will be fixed up by
>>> The only time I would see a dso defined ifunc call going through the iplt
>>> is for local calls that go through the local got offset table. That is,
>>> the offset in the GOT is shared among multiple addresses and the call
>>> needs an additional addend to get the final address. Does this occur for
>>> function calls in GCC/binutils or do all function addresses get their own
>>> GOT entry?
>> Either we should disallow GOT page accesses for ifuncs or the static linker
>> should treat them in the same way as globals (i.e. by directing the GOT
>> reloc to the function's own GOT slot and making the offset reloc resolve
>> to 0).
> I don't see your logic. These same functions w/o ifunc are in the local GOT.
> What's changed? Only the fixup.

The offset part of a page/offset pair has to be 0 for an ifunc because
we don't know its address (and therefore its low 16 bits) until runtime.
This is just like a page/offset reference to an external or preemptible
symbol, whose value again we don't know until runtime.  We handle those
kinds of cases by directing the page part of the access to a GOT entry
that contains the full address and making the offset reloc evaluate to 0.
In the case of locally-binding ifuncs, the GOT entry used for the page
access would be one of the new IRELATIVE slots.
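
A small model of the two cases may help.  It is a sketch under stated
assumptions rather than linker code: split_page_offset, got_page_access and
loaded_address are hypothetical names, and the page is assumed to be the
address rounded to the nearest 64KB boundary so that the offset fits in a
signed 16-bit immediate.

```python
def split_page_offset(addr):
    """Split an address into a 64KB-aligned page and a signed 16-bit offset."""
    page = (addr + 0x8000) & ~0xFFFF
    return page, addr - page


def got_page_access(link_time_addr):
    """Return (GOT slot contents, offset immediate) for a page/offset pair.

    link_time_addr is None when the final address is only known at runtime
    (external, preemptible, or ifunc symbols).
    """
    if link_time_addr is None:
        # The slot is filled in at runtime with the full address, so the
        # offset relocation has to evaluate to 0.
        return None, 0
    # Address known at link time: shared page entry plus a per-symbol offset.
    return split_page_offset(link_time_addr)


def loaded_address(got_slot_value, offset):
    """What the code computes: value loaded from the GOT plus the immediate."""
    return got_slot_value + offset
```

For an address known at link time, such as 0x12349abc, the slot holds the page
0x12350000 and the offset is -0x6544.  For an ifunc the slot is whatever the
resolver's result turns out to be at runtime, and only an offset of 0 leaves
that value intact.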

