This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.



[COMMITTED] Fix alpha-elf relaxation


Relaxation has been broken for quite some time, primarily affecting large
programs.  The symptom is that relaxation creates a bunch of GPREL16
relocations, which turn out to be out-of-range, and the linker errors out.

Most folks simply turn off relaxation at this point and move on.  Indeed, I
believe the only remaining distribution supporting alpha (Gentoo) turns off
relaxation by default.

Alpha has a multi-got scheme where each input file is given a .got subsection
which can contain 64k of symbols.  We merge .got subsections between object
files until they reach 64k, both to minimize redundancy in relocations and
to optimize calls between input files.  If a caller and callee are close
enough, and share a .got subsection, then we can optimize to a direct branch,
which eliminates a load and indirect branch on the caller side and computation
of the gp on the callee side.  It may also eliminate the slot in the .got
subsection, which in turn reduces the size of the got and may allow for more
subsection merging and thus more call optimization.  Similarly, we also try
to replace loading an address from the .got with a direct displacement from the
gp register, which can also eliminate a slot and enable merging.
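
As a rough illustration of the call case above, the check can be sketched in
Python as follows.  The function name and the same_got flag are hypothetical
and are not taken from the bfd sources; the sketch relies only on the fact that
an Alpha branch instruction carries a signed 21-bit displacement, counted in
4-byte instructions from the updated PC (roughly +/-4MB of reach).

```python
BRANCH_DISP_BITS = 21   # signed displacement field of an Alpha branch
INSN_BYTES = 4          # the displacement is counted in instructions


def can_relax_call(caller_pc, callee, same_got):
    """Hypothetical check: may an indirect call become a direct branch?"""
    if not same_got:
        # Different .got subsections mean different gp values, so the
        # callee would still have to compute its own gp.
        return False
    disp = callee - (caller_pc + INSN_BYTES)   # relative to the updated PC
    if disp % INSN_BYTES:
        return False                           # target is not insn-aligned
    words = disp // INSN_BYTES
    limit = 1 << (BRANCH_DISP_BITS - 1)
    return -limit <= words < limit


# A callee 1MB away that shares the caller's .got subsection qualifies.
print(can_relax_call(0x120000000, 0x120100000, True))
```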

This is all well and good, except when we try to do too many things at once.
Suppose we have two .got subsections at the end of the .got:

   ----+----------------------+--------------------+--------------------
   ... |  large subsection m  | small subsection n | data ... variable x
   ----+----------------------+--------------------+--------------------

Suppose subsection N has a reference to X.  X is within 32k of the start of
subsection N, so we optimize the reference to a GPREL16 reloc.  Suppose just
enough elimination is done so that subsections M and N can be merged.  However,
the combination of M+N is larger than N alone, so the displacement of X from
the start of M is larger than 32k, and the newly created relocation is now out
of range.
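
To make the failure concrete, here is a small Python model of the layout above.
All sizes are made up for illustration, and the reach is the simplified "32k
from the start of the subsection" used in this description rather than the
exact gp-relative arithmetic the linker performs.

```python
GOT_LIMIT = 64 * 1024      # maximum size of one .got subsection
GPREL16_REACH = 32 * 1024  # simplified reach, measured from the subsection start


def displacement(base_start, x_addr):
    """Byte distance from the start of a subsection to variable X."""
    return x_addr - base_start


def fits_gprel16(disp):
    """True if the displacement is within the simplified 32k reach."""
    return 0 <= disp < GPREL16_REACH


# Hypothetical sizes in bytes, after some slots have been eliminated.
m_start, m_size = 0, 40 * 1024
n_start, n_size = m_start + m_size, 8 * 1024
x_addr = n_start + n_size + 4 * 1024     # X sits 4k past the end of the .got

before = displacement(n_start, x_addr)   # measured from N
can_merge = m_size + n_size <= GOT_LIMIT
after = displacement(m_start, x_addr)    # measured from M once M and N merge

print(before, fits_gprel16(before))      # 12288 True
print(can_merge)                         # True
print(after, fits_gprel16(after))        # 53248 False
```

The displacement grows by exactly the size of M, which is what pushes the
already-created GPREL16 reloc out of range.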

The solution is to split the relaxation into two passes.  In the first pass,
eliminate everything we can that does not involve the creation of GPREL relocs.
This is primarily TLS and call relaxation.  Since most functions are simply
called and not addressed directly (weak functions excepted), this works
well to eliminate those slots.  In the second pass, we do all of the things that
would create GPREL relocs, but we disable .got subsection merging.  That way, X
can only move closer to the start of N as we eliminate slots, eliminating the
case that caused displacement growth.
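
The ordering of the two passes can be sketched as below.  This is a toy model
under stated assumptions, not the bfd implementation: the Subsection class, its
three slot counts, and the greedy merge are hypothetical, every removable slot
is assumed to be successfully relaxed, and merging ignores the deduplication of
shared symbols that a real merge would perform.

```python
from dataclasses import dataclass, replace

SLOT_BYTES = 8
GOT_LIMIT = 64 * 1024


@dataclass(frozen=True)
class Subsection:
    name: str
    call_slots: int   # removable without creating GPREL relocs (calls, TLS)
    addr_slots: int   # removable only by creating GPREL relocs
    fixed_slots: int  # must stay in the .got

    def size(self):
        return SLOT_BYTES * (self.call_slots + self.addr_slots + self.fixed_slots)


def merge_adjacent(subs):
    """Greedily merge neighbours while the result stays within 64k."""
    merged = []
    for s in subs:
        if merged and merged[-1].size() + s.size() <= GOT_LIMIT:
            last = merged[-1]
            merged[-1] = Subsection(last.name + "+" + s.name,
                                    last.call_slots + s.call_slots,
                                    last.addr_slots + s.addr_slots,
                                    last.fixed_slots + s.fixed_slots)
        else:
            merged.append(s)
    return merged


def relax(subs):
    # Pass 1: only relaxations that create no GPREL relocs; merging allowed.
    subs = merge_adjacent([replace(s, call_slots=0) for s in subs])
    # Pass 2: GPREL relaxation with subsection boundaries frozen, so each
    # subsection can only shrink and displacements can only get smaller.
    return [replace(s, addr_slots=0) for s in subs]


result = relax([Subsection("A", 2000, 3000, 1000),
                Subsection("B", 1000, 1000, 500),
                Subsection("C", 500, 4000, 1000)])
print([(s.name, s.size()) for s in result])   # [('A+B', 12000), ('C', 8000)]
```

Note that after the second pass A+B and C would fit in one subsection, but they
are deliberately left separate; that is the price paid for keeping the GPREL
displacements from growing.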

Tested with a gcc bootstrap, which triggered the problem quite easily in
cc1plus and f951, and committed.


r~


PS: Relaxation of cc1plus:

Enabled:
  [ 8] .rela.dyn         RELA             00000001200bb928  000bb928
       0000000000000750  0000000000000018   A       4     0     8
  [ 9] .rela.plt         RELA             00000001200bc078  000bc078
       0000000000001008  0000000000000018  AI       4    11     8
  [25] .got              PROGBITS         0000000121031a90  01021a90
       000000000000b0b0  0000000000000000  WA       0     0     8

Disabled:
  [ 8] .rela.dyn         RELA             00000001200b8828  000b8828
       0000000000000990  0000000000000018   A       4     0     8
  [ 9] .rela.plt         RELA             00000001200b91b8  000b91b8
       0000000000001b18  0000000000000018  AI       4    11     8
  [23] .got              PROGBITS         0000000121018a48  01008a48
       0000000000022f80  0000000000000000  WA       0     0     8

A savings of 98k in the complete .got section.
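
For anyone checking the numbers, the arithmetic on the section sizes above can
be reproduced with a few lines of Python.  The values are copied from the two
listings; the dictionary layout is just a convenience for this sketch.

```python
RELA_ENTRY = 0x18   # entry size column for the RELA sections in the listings

enabled = {".rela.dyn": 0x750, ".rela.plt": 0x1008, ".got": 0xb0b0}
disabled = {".rela.dyn": 0x990, ".rela.plt": 0x1b18, ".got": 0x22f80}

got_saved = disabled[".got"] - enabled[".got"]
print(got_saved)    # 98000 bytes

for name in (".rela.dyn", ".rela.plt"):
    print(name, disabled[name] // RELA_ENTRY, "->", enabled[name] // RELA_ENTRY)
# .rela.dyn 102 -> 78
# .rela.plt 289 -> 171
```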

