This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.



Re: [Mips] Using DT tags for handling local ifuncs


On Thu, 12 Dec 2013, Richard Sandiford wrote:

> >  I gather all that is needed is that ifunc pointers are reachable with 
> > gp-relative addressing (so that the same standard calling sequence can be 
> > used, either the SVR4 PIC or the non-PIC PLT type, regardless of whether 
> > calling an ifunc or an ordinary function), so grouping them in a section 
> > called .igot.plt and then either prepending or appending to .got should 
> > do; with a linker script even.  Of course the static linker will have to 
> > ensure that all the pointers in the combined sections are in range from 
> > $gp (and the same with secondary $gp values in the multi-GOT case).
> 
> I don't follow the comment about calling convention, sorry.  The problem
> here is what to do with:
> 
> 	lw	$4,%got_disp(foo)($28)
> 
> in cases where foo is an ifunc that binds locally.  We need some way
> of putting it in the GOT and having an IRELATIVE relocation against it.

 My point is that it doesn't have to be the ABI-defined GOT, just somewhere 
reachable with an offset from $gp.  LD should be able to treat 
R_MIPS_GOT_DISP (and other GOT relocs) specially when it sees that foo is an 
ifunc symbol (it'll have to anyway); it's not like foo will be both an ifunc 
and an ordinary function in a single static link.
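
 To make the range requirement concrete, here is a minimal C sketch of the 
check the static linker would need for such a slot, assuming the usual 
signed 16-bit displacement of MIPS load instructions.  The names 
gp_reachable and gp_reachable_examples are made up for illustration; they 
are not BFD functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if the slot at ADDR can be addressed as
   disp($gp) with a signed 16-bit displacement, i.e. if it lies in
   [gp - 0x8000, gp + 0x7fff].  */
static bool gp_reachable(uint64_t gp, uint64_t addr)
{
    int64_t disp = (int64_t)(addr - gp);
    return disp >= -0x8000 && disp <= 0x7fff;
}

/* Boundary cases for an arbitrary example value of $gp.  */
static void gp_reachable_examples(void)
{
    const uint64_t gp = 0x18000;

    assert(gp_reachable(gp, 0x10000));    /* disp = -0x8000, lowest slot  */
    assert(gp_reachable(gp, 0x1ffff));    /* disp =  0x7fff, highest byte */
    assert(!gp_reachable(gp, 0x20000));   /* disp =  0x8000, out of range */
    assert(!gp_reachable(gp, 0xffff));    /* disp = -0x8001, out of range */
}
```

 In the multi-GOT case the same test would have to hold for whichever 
secondary $gp value the referencing object uses.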

> I think you're suggesting that we allow the ABI-defined GOT to start at
> something other than $gp - 0x7ff0, so that explicitly-relocated data
> could go first.  I think that would be more disruptive in some ways,
> since the 0x7ff0 offset is hard-coded into glibc.  The resolver for
> lazy-binding stubs subtracts 0x7ff0 from the incoming $gp to get the
> start of the ABI-defined GOT and then gets the link map from entry 1
> (assuming that the GNU extension is in use).
> 
> I suppose it'd be possible to adjust $gp in the stub so that $gp - 0x7ff0
> is right on entry to the resolver.  But that would be difficult to do
> cleanly on n32 and n64, where $gp is call-saved.  The resolver would
> probably have to return to the stub, which in turn would mean that the
> stub would need call-frame information.

 Hmm, thanks for reminding me of that; it rules out the space before the 
ABI GOT.  We still have space afterwards for things like this (or e.g. for 
a small-data area if we ever implement it) though.
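
 For illustration, the $gp arithmetic in question can be sketched in C as 
follows.  The 0x7ff0 bias is the one from the quoted text; the names 
GP_BIAS, abi_got_start, got_slot_disp and got_layout_examples are 
hypothetical, and the entry size is simply passed in (8 bytes is assumed 
for the n64 example).

```c
#include <assert.h>
#include <stdint.h>

/* Bias between the start of the ABI-defined GOT and $gp, as hard-coded
   in the lazy-binding resolver according to the quoted text.  */
#define GP_BIAS 0x7ff0

/* Start of the ABI-defined GOT as the resolver derives it from $gp.  */
static uint64_t abi_got_start(uint64_t gp)
{
    return gp - GP_BIAS;
}

/* Displacement from $gp of slot INDEX, counting from the GOT start.
   Slots placed after the ABI GOT just continue this numbering, so they
   remain usable as long as the result stays within 16 signed bits.  */
static int64_t got_slot_disp(uint32_t index, uint32_t entry_size)
{
    return (int64_t)index * (int64_t)entry_size - GP_BIAS;
}

static void got_layout_examples(void)
{
    assert(abi_got_start(0x18000) == 0x10010);
    assert(got_slot_disp(0, 8) == -0x7ff0);   /* first GOT entry          */
    assert(got_slot_disp(1, 8) == -0x7fe8);   /* entry 1 (the link map)   */
}
```

 Anything placed before the GOT would shift abi_got_start away from what 
the resolver expects, whereas appended slots leave it untouched.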

> >  BTW, for loading 64-bit addresses I suggest using two temporaries (we've 
> > got plenty of them) for a sequence that is faster on superscalar 
> > processors, i.e. rather than:
> >
> > static const bfd_vma mips64_exec_iplt_entry[] =
> > {
> >   0x3c0f0000,	/* lui $15, %highest(.got.iplt entry)        */
> >   0x65ef0000,	/* daddiu $15, $15, %higher(.got.iplt entry) */
> >   0x000f7c38,	/* dsll $15,$15, 16                          */
> >   0x65ef0000,	/* daddiu $15, $15, %hi(.got.iplt entry)     */
> >   0x000f7c38,	/* dsll $15,$15, 16                          */
> >   0x01f90000,	/* l[wd] $25, %lo(.got.iplt entry)($15)      */
> >   0x03200008,	/* jr $25                                    */
> >   0x00000000,	/* nop                                       */
> > };
> >
> > use:
> >
> > static const bfd_vma mips64_exec_iplt_entry[] =
> > {
> >   0x3c0f0000,	/* lui $15, %highest(.got.iplt entry)        */
> >   0x3c0e0000,	/* lui $14, %hi(.got.iplt entry)             */
> >   0x25ef0000,	/* addiu $15, $15, %higher(.got.iplt entry)  */
> >   0x000f783c,	/* dsll32 $15, $15, 0x0                      */
> >   0x01ee782d,	/* daddu $15, $15, $14                       */
> >   0xddf90000,	/* ld $25, %lo(.got.iplt entry)($15)         */
> >   0x03200008,	/* jr $25                                    */
> >   0x00000000,	/* nop                                       */
> > };
> >
> > (this also avoids a DADDIU erratum early R4000/R4400 chips had).
> 
> Yeah, I wondered about this when I first saw it too, but Jack optimized
> the sequence based on the address, so that we would only have the full
> thing if %highest really was needed.  Since the usual base address is
> 0x120000000, I think the full sequence will in effect never be used.
> 
> I'm not opposed to having two n64 sequences, one for when %highest
> is needed and one for when it isn't.  It just doesn't seem like a
> priority.

 Fair enough, but then, after a bit of thinking, do we need the 
%highest/%higher stuff in the first place?  For n64 the non-PIC PLT is only 
supported for msym32 binaries anyway, and it doesn't look to me like that is 
ever going to change, so the high 33 address bits will always be zero and 
the 32-bit version (with LD rather than LW) will do.  For SVR4 PIC binaries 
you need to figure out the GOT pointer from $t9 instead (is there any point 
in making a difference between ET_EXEC and ET_DYN binaries here?).  Note 
that this would exclude ifunc calls from being tail calls (breaking the 
standard calling convention), so it looks to me like we'll have to make an 
extra stub to load $gp beforehand.
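
 As a sanity check of the arithmetic, here is a small host-side C model of 
the two-temporary sequence and of the plain LUI/LD pair.  It is only a 
sketch: the rel_* helpers follow the usual definitions of the %highest, 
%higher, %hi and %lo operators, and all function names are invented for 
this example rather than taken from BFD.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 16-bit pieces of a 64-bit address; each rounding constant compensates
   for the sign extension of the pieces below it.  */
static uint16_t rel_lo(uint64_t x)      { return (uint16_t)x; }
static uint16_t rel_hi(uint64_t x)      { return (uint16_t)((x + 0x8000ull) >> 16); }
static uint16_t rel_higher(uint64_t x)  { return (uint16_t)((x + 0x80008000ull) >> 32); }
static uint16_t rel_highest(uint64_t x) { return (uint16_t)((x + 0x800080008000ull) >> 48); }

static uint64_t sext16(uint16_t v) { return (uint64_t)(int64_t)(int16_t)v; }
static uint64_t sext32(uint32_t v) { return (uint64_t)(int64_t)(int32_t)v; }

/* Effective address formed by the two-temporary sequence.  */
static uint64_t two_temp_address(uint64_t x)
{
    uint64_t t15 = sext32((uint32_t)rel_highest(x) << 16);   /* lui $15    */
    uint64_t t14 = sext32((uint32_t)rel_hi(x) << 16);        /* lui $14    */
    t15 = sext32((uint32_t)t15
                 + (uint32_t)sext16(rel_higher(x)));         /* addiu $15  */
    t15 <<= 32;                                              /* dsll32 0   */
    t15 += t14;                                              /* daddu      */
    return t15 + sext16(rel_lo(x));                          /* ld %lo()   */
}

/* True if a LUI/LD pair alone forms the address, which is the case
   relied on for msym32 binaries.  */
static bool fits_hi_lo(uint64_t x)
{
    return sext32((uint32_t)rel_hi(x) << 16) + sext16(rel_lo(x)) == x;
}

static void address_split_examples(void)
{
    assert(two_temp_address(0x120008ff8ull) == 0x120008ff8ull);
    assert(two_temp_address(0x123456789abcdef0ull) == 0x123456789abcdef0ull);
    assert(two_temp_address(0xffffffff80001234ull) == 0xffffffff80001234ull);
    assert(fits_hi_lo(0x10008ff8ull));     /* 32-bit address: pair suffices */
    assert(!fits_hi_lo(0x120008ff8ull));   /* above 4GB: needs more pieces  */
}
```

 The model covers only the address formation; it says nothing about issue 
timing or about the $gp setup needed for the SVR4 PIC case.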

 Have I missed anything?

  Maciej

