Re: [PATCH, AARCH64] align long branch stubs

Jim Wilson writes:

> I got a bug report from Qualcomm that says if you set the A bit in the
> SCTLR register, to trap on unaligned accesses, their code fails,
> because the toolchain itself is emitting unaligned data accesses.
> The problem stems from a pair of patches from Marcus Shawcroft added
> over a year ago.
> The stub support makes a bit of effort to try to align everything to
> an 8 byte boundary, as the size of every stub is rounded up to a
> multiple of 8 bytes.  However, this patch from Marcus adds a 4-byte
> branch instruction before the first stub, and sets the section
> alignment to 4 bytes instead of 8 bytes.  This causes everything to
> end up with 4-byte alignment instead of 8-byte alignment.  This is a
> problem for long branch stubs, as they contain a 64-bit address as
> data, which should be 8-byte aligned.
> You can see the problem in the linker testcase
> ld/testsuite/ld-aarch64/farcall-back.d which has
> 0000000000002034 <__bar3_veneer>:
>     2034:       58000090        ldr     x16, 2044 <__bar3_veneer\+0x10>
>     2038:       10000011        adr     x17, 2038 <__bar3_veneer\+0x4>
>     203c:       8b110210        add     x16, x16, x17
>     2040:       d61f0200        br      x16
>     2044:       ffffffd8        .word   0xffffffd8
>     2048:       00000000        .word   0x00000000
> and you can see that the first ldr is loading unaligned data from 2044.
> One way to fix this is to add a nop after the branch, and return
> section alignment to 8 bytes.  This is fairly simple, though it
> requires a lot of annoying testsuite changes.
> I'm concerned that this might be reintroducing the problem that Marcus
> was trying to fix though, as now we end up with an occasional 4-byte 0
> padding around the stub sections.  I tried adding a
> bfd_arch_aarch64_nop_fill function, but apparently that only works
> inside sections, not between sections.

Hi Jim,

  Reading through your patch, this is my concern too.  After this patch
  there will be a problem if the execution path falls through into the
  stub section, though I haven't seen such a case in the regression tests.

  My understanding is that BFD is not aware of the padding between
  sections: it computes and aligns up the file offset for each section,
  then fseeks to that offset when doing the final section output.  If we
  kept both the old and the aligned offset, we could fseek to the old
  offset and write customized padding there.
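
  To make the idea concrete, here is a rough sketch of what I mean.  It
  is not BFD code: align_up and pad_gap_with_nops are hypothetical names,
  and it assumes we have a plain FILE * and that the gap is a multiple of
  4 bytes.  It seeks back to the old (unaligned) offset and fills up to
  the aligned offset with AArch64 nops instead of leaving zeros.

```c
#include <assert.h>
#include <stdio.h>

/* Round OFFSET up to a multiple of ALIGN (ALIGN must be a power of two).  */
static long
align_up (long offset, long align)
{
  assert (align > 0 && (align & (align - 1)) == 0);
  return (offset + align - 1) & ~(align - 1);
}

/* Hypothetical helper, not part of BFD: fill the gap between OLD_OFFSET
   (where the previous section ended) and the next ALIGN boundary with
   AArch64 nops.  Returns the aligned offset, or -1 on error.  */
static long
pad_gap_with_nops (FILE *f, long old_offset, long align)
{
  /* nop is 0xd503201f; instructions are stored little-endian.  */
  static const unsigned char nop[4] = { 0x1f, 0x20, 0x03, 0xd5 };
  long new_offset = align_up (old_offset, align);
  long pos;

  /* Assumption: the gap holds a whole number of 4-byte instructions.  */
  if ((new_offset - old_offset) % 4 != 0)
    return -1;
  if (fseek (f, old_offset, SEEK_SET) != 0)
    return -1;
  for (pos = old_offset; pos < new_offset; pos += 4)
    if (fwrite (nop, 1, sizeof nop, f) != sizeof nop)
      return -1;
  return new_offset;
}
```

  For example, with a previous section ending at file offset 12 and an
  8-byte alignment, pad_gap_with_nops (f, 12, 8) would write one nop at
  offset 12 and return 16.  Whether the section output code really has
  both offsets available at that point is something I have not checked.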

> It isn't clear why we need the
> branches around the stub sections though.  It isn't normal to expect
> code to fall through the bottom of a section except for
> ctor/dtor/init/fini sections, but a stub section should not appear in
> the middle of one of those, and if it did, it might be best to use
> init_array and fini_array instead, as these are better anyways.
> The attached patch implements the solution of adding a nop after the
> branch.  It passes a build and make check on an aarch64-linux-gnu
> system.
> Jim


