This is the mail archive of the
mailing list for the binutils project.
Re: [PATCH 1/2] [RFC] Add IFUNC support for MIPS (v4)
- From: Faraz Shahbazker <faraz dot shahbazker at imgtec dot com>
- To: "Maciej W. Rozycki" <macro at imgtec dot com>
- Cc: "binutils at sourceware dot org" <binutils at sourceware dot org>, Richard Sandiford <rdsandiford at googlemail dot com>
- Date: Mon, 9 Jan 2017 13:10:29 -0800
- Subject: Re: [PATCH 1/2] [RFC] Add IFUNC support for MIPS (v4)
On 12/05/2016 06:43 AM, Maciej W. Rozycki wrote:
>> Also, note that although we have 32/48-bit variants for 64-bit, the IPLT stub
>> size is still the same as the regular entry. Only the # of instructions that
>> need to be executed is fewer. The selection is made based on the actual
>> address bits needed to get to the GOT/IGOT entry of that symbol. I should
>> probably be filling the tail part of these stubs with NOPs.
> Is that to simplify processing? We should be able to handle all exact
> sizes rather than just 2 of them at a time in a given link.
It might be possible to squeeze out the unused bytes, as relaxation
would. The output address is not known at the time of initial
allocation/layout, so we leave enough space for the largest possible
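
To illustrate the late selection described above, here is a minimal
sketch, not code from the patch. The names `iplt_variant',
`pick_variant' and `fill_stub' are hypothetical, the 7-word (28-byte)
reservation is borrowed from the mips64 figure further down, and the
sketch assumes the 32- and 48-bit sequences build sign-extended
addresses. It only shows reserving the largest size up front, picking
the shortest variant once the address is known, and padding the tail
with NOPs (an all-zero word on MIPS).

```c
#include <stdint.h>
#include <stddef.h>

#define IPLT_RESERVED_WORDS 7    /* assumed: 28 bytes reserved per stub */
#define MIPS_NOP 0x00000000u     /* sll $0,$0,0 */

/* Hypothetical variant names; not taken from the patch.  */
enum iplt_variant { IPLT_ADDR32, IPLT_ADDR48, IPLT_ADDR64 };

/* Pick the shortest variant whose range covers ADDR.
   Assumes the short forms sign-extend the address they build.  */
static enum iplt_variant
pick_variant (uint64_t addr)
{
  int64_t s = (int64_t) addr;

  if (s >= -(INT64_C (1) << 31) && s < (INT64_C (1) << 31))
    return IPLT_ADDR32;
  if (s >= -(INT64_C (1) << 47) && s < (INT64_C (1) << 47))
    return IPLT_ADDR48;
  return IPLT_ADDR64;
}

/* Copy NWORDS instruction words into the reserved slot and fill
   whatever is left with NOPs, so the slot size never changes.  */
static void
fill_stub (uint32_t slot[IPLT_RESERVED_WORDS],
           const uint32_t *insns, size_t nwords)
{
  size_t i;

  for (i = 0; i < IPLT_RESERVED_WORDS; i++)
    slot[i] = i < nwords ? insns[i] : MIPS_NOP;
}
```

With this shape the layout pass never needs to know which variant
wins; only the number of words executed differs.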
>> The delay slot optimization itself requires all regular entries to be
>> grouped together. Beyond that, stub sizes would only matter for mips16/64
>> combination. Okay then, to just sort on regular vs. compressed and keep
>> all compressed stubs before all regular ones? Compressed stubs generally
>> don't have the lagging delay-slot problem, except for the micromips_insn32
> Fine with me as long as the alignment constraints I outlined above are
> met and you can handle it all with `_bfd_mips_elf_get_synthetic_symtab'
> too (if applicable; I haven't checked).
Excluding all unsupported ISA combinations, and with the trailing
delay-slot optimization, we'd have 3 possible IPLT sizes:
12 bytes for all things mips32 (except mips1)
28 bytes for all things mips64
16 bytes for mips1
Even if we further optimize the 64-bit case for address range, we
still get one size for all entries within an IPLT section - be it 20
or 12 bytes. Can we thus say cache alignments are irrelevant for IPLT
stubs?
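
As a rough sketch of why a single size per section keeps things
simple (the enum and function names below are hypothetical, not from
the patch; only the three byte counts come from the list above):

```c
#include <stddef.h>

/* Hypothetical ISA classes for illustration only.  */
enum iplt_isa { ISA_MIPS1, ISA_MIPS32, ISA_MIPS64 };

/* IPLT entry size in bytes, per the three cases listed above.  */
static size_t
iplt_entry_size (enum iplt_isa isa)
{
  switch (isa)
    {
    case ISA_MIPS1:
      return 16;
    case ISA_MIPS32:
      return 12;
    case ISA_MIPS64:
      return 28;
    }
  return 0;
}

/* With one size per section, the offset of entry INDEX is a plain
   multiplication; no per-entry size table is needed.  */
static size_t
iplt_entry_offset (enum iplt_isa isa, size_t index)
{
  return index * iplt_entry_size (isa);
}
```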
Currently IPLTs are treated like stubs instead of PLT entries. As far as
I can make out, they do not fit neatly within the general paradigm of
_bfd_mips_elf_get_synthetic_symtab. For one, the original
(non-annotated) IFUNC symbol is present and defined within the object -
for a regular PLT entry this is an undefined external reference.
Secondly, there aren't any relocations to connect a symbol name with an
IPLT entry. Each PLT entry has a JUMP_SLOT relocation which refers to
the symbol. IRELATIVE relocations OTOH, operate directly on entries in
the GOT/IGOT and do not map to specific IFUNC symbols. To be clear, an
IPLT symbol name has no functional significance beyond disassembly, but
if it must have a connection to the IFUNC symbol it targets, then it
must be anointed by the linker with an explicit stub name. This
connection cannot be simply synthesized from other information.
The only disadvantage of not having synthetic symbols I can see is the
inability to disassemble the address field embedded in the last word of
the mips16 IPLT entries. Is there anything else I am missing?
Note that I am still sorting on the basis of compressed vs. regular
entries even when they have the same size, because we can't have a
compressed entry follow a regular entry with its trailing delay slot.
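
The ordering constraint can be sketched as a sort key. This is an
illustration only: `struct iplt_stub' and both function names are
hypothetical, and the real linker data structures differ.

```c
#include <stdlib.h>
#include <stdbool.h>

/* Hypothetical stub record; the real linker state looks different.  */
struct iplt_stub
{
  const char *name;
  bool compressed;      /* true for mips16/microMIPS stubs */
};

/* Order compressed stubs before regular ones, so that no compressed
   entry ever follows a regular entry's trailing delay slot.  */
static int
cmp_compressed_first (const void *a, const void *b)
{
  const struct iplt_stub *x = a;
  const struct iplt_stub *y = b;

  return (int) y->compressed - (int) x->compressed;
}

static void
sort_iplt_stubs (struct iplt_stub *stubs, size_t n)
{
  /* qsort is not stable, so order within each group is unspecified.  */
  qsort (stubs, n, sizeof *stubs, cmp_compressed_first);
}
```

Entry size is deliberately not part of the key: the grouping is
needed even when both kinds of entry have the same size.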