This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: Avoid two SSP ABI's for AArch64.


On 2 December 2013 17:53, Carlos O'Donell <carlos@redhat.com> wrote:
> Kumar, Marcus, Richard,
>
> Is there any way we can avoid having two SSP ABI's for AArch64?
>
> GCC discussion:
> http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02244.html
>
> GLIBC question:
> https://sourceware.org/ml/libc-help/2013-12/msg00000.html
>
> Is it too late and the cat's out of the bag and we need to support,
> test, and ensure both models worked intermixed?
>
> All of this is under the assumption, and there is some data to back
> this up via the pointer guard work, that on all AArch64's a TP+offset
> access is always faster than loading the data from a global symbol.

Hi,

The situation as I understand it is as follows:

The aarch64 target port of GLIBC exports the __stack_chk_guard global
and has done so since the port was first upstreamed.

The original aarch64 target port for GCC had FRAME_GROWS_DOWNWARD set
to 0.  The effect of this was that -fstack-protector support was
disabled.  Attempting to use this option on the original compiler
results in a "not supported on this target" error diagnostic.

The GCC patch at
http://gcc.gnu.org/ml/gcc-patches/2013-08/msg01890.html flips
FRAME_GROWS_DOWNWARD to 1, with the side effect of enabling default
support for -fstack-protector.

This patch was OK'd and committed to gcc trunk around 12th November
2013 as r204737.  This functionality is not present in the FSF gcc 4.8
branch and has not yet been backported to the Linaro gcc 4.8 branch,
although it is scheduled to be backported for the Linaro 13.12
toolchain release.

This means that until early last month no GCC implementation for
aarch64 would generate code that used the __stack_chk_guard symbol
exported by glibc.

I am told that the LLVM implementation for aarch64 implements
-fstack-protector support and assumes the presence of
__stack_chk_guard.

My understanding of the alternative stack protector implementations is
that the stack canary is unique per process.  The canary is set up by
glibc and is available either via the global __stack_chk_guard or at a
defined offset from the thread pointer.  In the latter case the canary
remains unique per process; it is never unique per thread.  Further, my
understanding is that the choice between the two canary locations is
purely one of performance on the target in question; no other factors
influence the choice.
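
To make the mechanism concrete, here is a minimal C sketch of what the
compiler-inserted prologue and epilogue amount to.  It is an
illustration only: demo_guard, struct demo_frame and demo_copy are
hypothetical names, not glibc or GCC symbols, and the frame layout is
modelled with a struct rather than a real stack frame.  Real generated
code calls __stack_chk_fail() on a mismatch instead of returning an
error code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { DEMO_BUF_SIZE = 16 };

/* Hypothetical stand-in for __stack_chk_guard: one value per process,
   set once at startup and only read afterwards. */
static uintptr_t demo_guard = 0x5bd1e995u;

/* Models the frame layout: the buffer sits below the canary, so an
   overflow of buf reaches the canary first. */
struct demo_frame {
    char buf[DEMO_BUF_SIZE];
    uintptr_t canary;
};

static_assert(offsetof(struct demo_frame, canary) >= DEMO_BUF_SIZE,
              "canary must sit above the buffer");

/* Copies n bytes of src over the modelled frame, deliberately without
   checking n against the buffer size.  The copy is clamped to the size
   of the whole frame so the sketch itself stays well defined.
   Returns 0 if the canary survived, -1 if it was overwritten. */
int demo_copy(const char *src, size_t n)
{
    struct demo_frame frame;

    /* Prologue: load the guard and store it in the frame. */
    frame.canary = demo_guard;

    if (n > sizeof frame)
        n = sizeof frame;
    memcpy(&frame, src, n);

    /* Epilogue: reload the guard and compare with the saved copy. */
    return frame.canary == demo_guard ? 0 : -1;
}
```

Calling demo_copy with at most DEMO_BUF_SIZE bytes leaves the canary
intact; a longer copy of bytes that differ from the guard value is
reported as a mismatch.  Only the two loads of demo_guard differ
between the global and the thread-pointer schemes.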

On AArch64 using the default 'small' memory model the relevant code
sequences would be:

1a) Canary in global:

adrp x1, g
ldr x2, [x1, #:lo12:g]

1b) Canary in global via GOT:

adrp x0, :got:g
ldr x1, [x0, #:got_lo12:g]
ldr x2, [x1]

2) Canary in TLS:

mrs x1, tpidr_el0
ldr x2, [x1, #-8]

In practice CSE will often hoist the construction of the canary
address into a register, avoiding a second address construction for
the checking code.
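
For anyone who wants to compare the sequences on their own toolchain,
the following C sketch gives the compiler one load from a global and
one from thread-local storage.  The names are hypothetical and are not
glibc's symbols.  Note the approximation: a TP+offset canary as in (2)
lives in a fixed slot relative to the thread pointer, whereas a
_Thread_local variable is addressed through the TLS model in use, so
the generated code may contain extra address arithmetic.  The exact
output will also depend on the compiler version, optimisation level
and whether -fPIC is used.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical guard copies, for comparing access sequences only. */
uintptr_t demo_guard_global = 0x1234;

/* Every thread starts with the same initial value, mirroring a canary
   that is unique per process rather than per thread. */
static _Thread_local uintptr_t demo_guard_tls = 0x1234;

/* Load via a global symbol, as in sequences 1a/1b. */
uintptr_t load_guard_global(void)
{
    return demo_guard_global;
}

/* Load relative to the thread pointer, approximating sequence 2. */
uintptr_t load_guard_tls(void)
{
    return demo_guard_tls;
}

/* Sanity check: both locations hold the same per-process value. */
void check_guards_agree(void)
{
    assert(load_guard_global() == load_guard_tls());
}
```

Compiling this with something like "gcc -O2 -S" on an AArch64 host, with
and without -fPIC, should show how the two loads are materialised.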

The first sequence is a very common A64 idiom, while the latter
sequence is much less common.  From discussion with various
u-architects I am led to believe that, given a range of different
AArch64 u-architectures each with a unique set of design trade-offs,
the MRS implementation is at best going to have equivalent performance
to the ADRP, but is most likely to show the largest variation in
performance across implementations.

My view therefore is that we should run with the __stack_chk_guard
global data mechanism.

/Marcus

