This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] [BZ 18034] [AArch64] Lazy TLSDESC relocation data race fix
- From: Szabolcs Nagy <szabolcs dot nagy at arm dot com>
- To: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>, "libc-alpha at sourceware dot org" <libc-alpha at sourceware dot org>
- Date: Wed, 22 Apr 2015 17:41:54 +0100
- Subject: Re: [PATCH] [BZ 18034] [AArch64] Lazy TLSDESC relocation data race fix
- Authentication-results: sourceware.org; auth=none
- References: <553793A3 dot 7030206 at arm dot com> <5537B943 dot 2060001 at linaro dot org>
On 22/04/15 16:07, Adhemerval Zanella wrote:
> On 22-04-2015 09:27, Szabolcs Nagy wrote:
>> On the write side I used a full barrier (__sync_synchronize) although
>>
>> dmb ishst
>>
>> would be enough, but write side is not performance critical.
>
> My understanding is you do not need to push a seq-consistent, but rather
> a store release on the 'td->arg', since the code only requires that
> 'td->entry' should not be reordered with 'td->arg'. So, to adjust to
> C11 semantics I would change:
>
> diff --git a/sysdeps/aarch64/tlsdesc.c b/sysdeps/aarch64/tlsdesc.c
> index 4821f8c..f738cc6 100644
> --- a/sysdeps/aarch64/tlsdesc.c
> +++ b/sysdeps/aarch64/tlsdesc.c
> @@ -87,6 +87,8 @@ _dl_tlsdesc_resolve_rela_fixup (struct tlsdesc volatile *td,
> if (!sym)
> {
> td->arg = (void*) reloc->r_addend;
> + /* Barrier so readers see the write above before the one below. */
> + __sync_synchronize ();
>
> To 'atomic_store_relase (td->arg, (void*) reloc->r_addend))'
>
>
i think
atomic_store_release(&td->entry, (void*) reloc->r_addend))
is the correct replacement, because moving stores
before a store_release is valid
>> - Lazy binding for static TLS may be unnecessary complexity: it seems
>> that gcc generates at most two static TLSDESC relocation entries for a
>> translation unit (zero and non-zero initialized TLS), so there has to be
>> a lot of object files with static TLS linked into a shared object to
>> make the load time relocation overhead significant. Unless there is some
>> evidence that lazy static TLS relocation makes sense I would suggest
>> removing that logic (from all archs). (The dynamic case is probably a
>> similar micro-optimization, but there the lazy vs non-lazy semantics are
>> different in case of undefined symbols and some applications may depend
>> on that).
>
> Recently I am seeing that all lazy relocation yields less gains than
> before and add a lot of complexity. I would like to see an usercase
> where lazy TLS or even normal relocation shows a real gain.
>
i'm not sure if glibc considers lazy relocation as part of the abi
contract, but i think static tls is always safe to do non-lazily