This is the mail archive of the
mailing list for the binutils project.
Re: [PATCH] MIPS support for --hash-style=gnu
- From: Rafael EspÃndola <rafael dot espindola at gmail dot com>
- To: "Neil Schellenberger (neschell)" <neschell at cisco dot com>
- Cc: "binutils at sourceware dot org" <binutils at sourceware dot org>
- Date: Thu, 5 Nov 2015 12:13:57 -0500
- Subject: Re: [PATCH] MIPS support for --hash-style=gnu
- Authentication-results: sourceware.org; auth=none
- References: <b49cd7d6b04d41749d56d557879ad4a6 at XCH-RCD-006 dot cisco dot com>
Wouldn't it be possible to sort the got entries so that the symbols
end up in the same order as the gnu hash table?
On 5 October 2015 at 19:11, Neil Schellenberger (neschell)
> I am tasked with reducing the time spent in the interpreter/loader at program start time for a very large, multi-platform, multi-architecture, legacy system (25+Mloc, 2000+ DSOs, 1+M symbols). On MIPS this loader overhead is several much larger than for other supported architectures, which is not unexpected given the lack of .gnu.hash support.  Measurement shows that a principal factor for the programs most affected is the very large number of DSOs which are directly or indirectly needed -- the chief expense being the cost of successive table misses during symbol resolution. A secondary factor is that some of the executables and DSOs have a very large number of symbols with relocations -- on MIPS these tend to run afoul of the multi-GOT mechanism which places many into secondary GOTs, resulting in unconditionally forcing their resolution at load time.  At this stage in the lifecycle of this particular product, large scale architectural changes tend not to be feasible (e.g. appreciably decreasing either the number of DSOs or the number of relocations) so a more transparent solution is preferable. (c.f. )
> In order to tackle the main problem -- large numbers of needed DSOs -- some means of avoiding unprofitable (i.e. certain miss) hash table probes would help significantly. Since code to support Bloom filtering already exists for GNU-style hash tables (DT_GNU_HASH), it seemed attractive to enable that feature for MIPS (and then also benefit from as much of the hash chain optimization as possible).  As I understand it, the fundamental problem which has historically prevented this support is, briefly, that the MIPS ABI requires that the .dynsym table be sorted in a particular order, while .gnu.hash mandates a different order; this appears to have stymied at least one earlier attempt.  As I am expert neither with MIPS and its many ABI oddities, nor with the implementation of the BFD linker, I have opted to take a very, very simple -- if perhaps non-optimal -- approach. Inspired by the comment made by Richard Sandison to Hiroki Kaminaga concerning external symbol blocks , I chose to reuse GNU-style hash tables, only with the simple addition of a translation table to map the GNU symbol order to the MIPS symbol order. The MIPS .dynsym table proper continues to be completely identical: its sort order and content are entirely unchanged. The proposed changes also leave the output of all other targets/backends entirely unchanged (including MIPS using traditional SysV .hash).
> In an attempt to avoid forward compatibility issues, a new section and related dynamic tag are proposed: .gnu.xhash and DT_GNU_XHASH. (The "X" standing either for "extended" or as a mnemonic for "translated", as the reader prefers.) This ensures that old binutils, glibc, and other third party code do not attempt to behave as if a traditional .gnu.hash/DT_GNU_HASH is present. Likewise, the name of the section was chosen so that it is not a textual prefix of .gnu.hash in an attempt to preclude any insufficiently discriminating pattern from matching inadvertently (e.g. in a script parsing readelf output).
> In practice, though, the .gnu.xhash section is virtually identical to .gnu.hash. 
> // Part 0: .gnu.xhash header
> Elf32_Word ngnusyms; // number of entries in chains (and xlat); dynsymcount=symndx+ngnusyms
> // Part 1: .gnu.hash header
> Elf32_Word nbuckets; // number of hash table buckets
> Elf32_Word symndx; // number of initial .dynsym entires skipped in chains (and xlat)
> Elf32_Word maskwords; // number of ElfW(Addr) words in bitmask
> Elf32_Word shift2; // bit shift of hashval for second Bloom filter bit
> // Part 2: .gnu.hash Bloom filter
> ElfW(Addr) bitmask[maskwords]; // 2 bit Bloom filter on hashval
> // Part 3: .gnu.hash buckets
> Elf32_Word buckets[nbuckets]; // indices into chains
> // Part 4: .gnu.hash chains
> Elf32_Word chains[ngnusyms]; // consecutive hashvals in a given bucket; last entry in chain has LSB set
> // Part 5: .gnu.xhash translation table
> Elf32_Word xlat[ngnusyms]; // parallel to chains; index into .dynsym
> Parts 1 through 4 correspond exactly in layout to the traditional .gnu.hash; part 4 has slightly different semantics since the correct MIPS .dynsym index has to be first looked up in the parallel xlat table i.e. the symbol corresponding to the hashval in chain[N] is .dynsym[xlat[N]]. It is intentional that the .gnu.xhash layout contains a .gnu.hash layout as a subcomponent: it represents an attempt to reuse unchanged as much BFD and readelf code as possible. The space cost of .gnu.xhash relative to .gnu.hash is ngnusyms+1 32 bit words. (For time cost, the L2 cache hit from the xlat table lookup is probably atrocious, but at that point we're already about to perform a string compare which is probably going to be even more atrocious.... In any case, it's a whole lot better than SysV .hash.)
> In order to reuse the maximum amount of existing code with the minimum amount of copying while also maintaining 100% backward compatibility, I chose to implement the support as a BFD ELF backend hook: record_hash_symbol(). For all platforms other than MIPS, this hook is NULL and the behaviour is to write .gnu.hash (or not) exactly as they always have done. For MIPS, this hook is non-NULL and (when --hash-style=gnu) will output a .gnu.xhash section which the MIPS specific ELF backend then updates with a translation table to allow .gnu.xhash symbol indices to be translated to MIPS .dynsym indices at the right time during linking. The principal changes to the common support (in bfd/elflink.c) are in bfd_elf_size_dynsym_hash_dynstr() which computes the correct size for the .gnu.[x]hash section; and in elf_renumber_gnu_hash_syms() which did the sorting for .gnu.hash. On the ELF MIPS specific side, the main changes are to mips_elf_sort_hash_table_f(); and in the addition of the backend function _bfd_mips_elf_record_hash_symbol() with an associated new field in struct mips_elf_link_hash_entry.
> In bfd_elf_size_dynsym_hash_dynstr() the code was modified as little as possible in order to keep the diff small and simple to review; the unfortunate corollary to that is that the changes are a little ugly and somewhat brittle (conditionally shifting the contents pointers along etc.) This also to an extent dictated the layout of the .gnu.xhash section: it is essentially a .gnu.hash section with a leading ngnusyms word and a following translation table. (The basic form of the .gnu.hash section was retained so as also to keep the readelf changes to a minimum.)
> The elf_renumber_gnu_hash_syms() function no longer unconditionally renumbers the symbols. Instead, if the backend supplies record_hash_symbol(), then that is called instead to allow it to record the offset of the translation table entry for that symbol. The MIPS backend will then fill this in later once it has finished fiddling with the GOT(s). I chose to pass an offset here rather than a pointer only because I wasn't entirely certain if it was architecturally acceptable to assume that the content of the section would never be replaced sometime in between (although this is not currently the case).
> On the ELF MIPS side, mips_elf_sort_hash_table_f() now also outputs the final index of each symbol into the .gnu.xhash translation table as required. This is also a bit brittle since it assumes that nothing else is going to come along and change the order yet again afterwards, but that is also true of all of the already existing MIPS backend code.
> The readelf support essentially speaks for itself. Again, it's a bit ugly since I tried to stick only to insertions and not changing existing lines, so as to make the code inspect simpler.
> No changes to gold are proposed here because, if I understand correctly, there is as yet no general MIPS support in any case.
> I have included the glibc changes here only as a convenience to reviewers; I will be posting to libc-alpha as well. (It is perhaps interesting to note in passing that although --hash-style=gnu was disabled in the linker, DT_GNU_HASH support was never removed from the glibc MIPS sysdep dl-lookup.c. This means that on MIPS the new and old hashvals are currently always both computed at runtime for every symbol. Fortunately in practice this cost appears to be entirely negligible. Laterally, I suspect that Jakub Jelinek had independently confirmed this insignificance and is why .hashvals never made it into .gnu.hash.  I experimented with adding .hashvals as well as .gnu.xhash, but it made no appreciable difference.)
> The patch is relative to binutils-2_24 (which is where I'll ultimately need it) but is equally applicable to trunk. (The glibc patch is relative to a lightly customized 2.16 but again is equally applicable to trunk.) As this is my first attempt at a patch for the linker, I've omitted the ChangeLog and test cases for the moment, pending feedback. Technical and procedural criticism is gratefully welcomed. I should very much like to thank the many who have taken the time to post on these and related subjects over the years -- I would have found even this modest attempt very difficult without the historical context. Particular thanks are due to Michael Meeks, Jakub Jelinek, Richard Sandiford, and Hiroki Kaminaga. Errors in comprehension and judgement are entirely my own.
> P.S. It is not entirely clear to me how xgot support does or should interact with multi-GOT.  With xgot supporting about 2**32 entries, shouldn't it be the case that multiple GOTs are almost never needed?
>  Disable .gnu.hash on MIPS targets https://www.sourceware.org/ml/binutils/2006-07/msg00341.html
>  MIPS Multi GOT https://dmz-portal.mips.com/wiki/MIPS_Multi_GOT
>  MIPS multi-got link support https://www.sourceware.org/ml/binutils/2003-01/msg00165.html
>  Description of MIPS/IRIX multigot https://sourceware.org/ml/binutils/2014-03/msg00174.html
>  Speeding up the dynamic linker with 100s of DSOs? https://www.sourceware.org/ml/libc-alpha/2006-01/msg00018.html
>  Re: OO.o / ld / -Bsymbolic patch https://sourceware.org/ml/binutils/2005-07/msg00109.html
>  [PATCH] DT_GNU_HASH: ~ 50% dynamic linking improvement https://www.sourceware.org/ml/binutils/2006-06/msg00418.html
>  [RFC][PATCH][EXPERIMENTAL] enable MIPS gnu hash (was: Re: use of gnu hash for MIPS) https://sourceware.org/ml/binutils/2007-08/msg00387.html
>  Re: use of gnu hash for MIPS https://sourceware.org/ml/binutils/2007-03/msg00126.html
>  Re: GNU_HASH section format https://sourceware.org/ml/binutils/2006-10/msg00377.html
>  binutils/glibc .hashvals section ... https://sourceware.org/ml/binutils/2006-01/msg00171.html
>  Add -mxgot option to mips backend https://gcc.gnu.org/ml/gcc-patches/2003-08/msg01856.html