This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [PATCH 1/2] aarch64: Disable lazy symbol binding of TLSDESC
- From: Szabolcs Nagy <szabolcs dot nagy at arm dot com>
- To: GNU C Library <libc-alpha at sourceware dot org>
- Cc: nd at arm dot com
- Date: Fri, 20 Oct 2017 15:18:12 +0100
- Subject: Re: [PATCH 1/2] aarch64: Disable lazy symbol binding of TLSDESC
- References: <59D791A6.10507@arm.com> <59D791ED.4060609@arm.com>
On 06/10/17 15:23, Szabolcs Nagy wrote:
> From fcef01e2cdf2a79c1a91d9b0a8b191d0e1a0cdae Mon Sep 17 00:00:00 2001
> From: Szabolcs Nagy <szabolcs.nagy@arm.com>
> Date: Wed, 27 Sep 2017 16:55:14 +0100
> Subject: [PATCH 1/2] aarch64: Disable lazy symbol binding of TLSDESC
>
> Always do TLS descriptor initialization at load time during relocation
> processing to avoid barriers at every TLS access.
>
i'd add here that:
In non-dlopened shared libraries, the overhead of TLS access versus
static global access is more than 3x larger with lazy initialization
than with bind-now, so the barriers dominate TLS access performance.
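(For concreteness, a hypothetical example of the two kinds of access being
compared; this is not part of the patch and the names are made up.)

```c
/* Thread-local variable: in a shared object built with
   -fPIC -mtls-dialect=desc on aarch64, every access goes
   through a TLS descriptor (an R_AARCH64_TLSDESC reloc).  */
static __thread int tls_counter;

/* Ordinary static global for comparison: a plain
   PC-relative/GOT load, no descriptor call.  */
static int global_counter;

int
bump_tls (void)
{
  return ++tls_counter;
}

int
bump_global (void)
{
  return ++global_counter;
}
```

Building this with `gcc -shared -fPIC -mtls-dialect=desc` and running
`readelf -r` on the result should show the TLSDESC relocation for
`tls_counter`; it is accesses like `bump_tls` whose cost the barrier
inflates under lazy initialization.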
> TLSDESC relocs are in DT_JMPREL, which is processed at load time using
> elf_machine_lazy_rel; that is only supposed to do lightweight
> initialization using the DT_TLSDESC_PLT trampoline (the trampoline code
> jumps to the entry point in DT_TLSDESC_GOT, which does the lazy tlsdesc
> initialization at runtime).  This patch changes elf_machine_lazy_rel
> on aarch64 to do the symbol binding and initialization as if DF_BIND_NOW
> were set, so the non-lazy code path of elf/do-rel.h is replicated.
>
> The static linker could be changed to emit TLSDESC relocs in DT_REL*,
> which are processed non-lazily, but the goal of this patch is to always
> guarantee bind-now semantics, even if the binary was produced with an
> old linker, so the barriers can be dropped in tls descriptor functions.
>
> After this change the synchronizing ldar instructions can be dropped
> as well as the lazy initialization machinery including the DT_TLSDESC_GOT
> setup.
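
To make the "why a barrier" point concrete, here is a hypothetical sketch
(not glibc code) of the lazy-initialization pattern in C11 atomics: the
consumer must acquire-load the entry pointer so it also observes the
argument written before the pointer was published. That acquire load is
the ldar the patch allows removing, since with bind-now both fields are
fully written before any application code runs.

```c
#include <stdatomic.h>

/* Hypothetical model of a TLS descriptor: a function pointer
   plus an argument, where a lazy resolver fills in arg first
   and then publishes entry.  */
typedef int (*desc_fn) (int);

struct desc
{
  _Atomic (desc_fn) entry;
  int arg;
};

static int
resolved (int arg)
{
  return arg * 2;
}

/* Lazy resolver: write arg, then release-store entry so the
   write to arg is ordered before the publication.  */
static void
publish (struct desc *d, int arg)
{
  d->arg = arg;
  atomic_store_explicit (&d->entry, resolved, memory_order_release);
}

/* Every access: acquire-load entry (the ldar) to pair with the
   release store; with bind-now a plain load would suffice.  */
static int
call_desc (struct desc *d)
{
  desc_fn f = atomic_load_explicit (&d->entry, memory_order_acquire);
  return f (d->arg);
}
```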
>
> I believe this should be done on all targets, including ones where no
> barrier is needed for lazy initialization.  There is very little gain in
> optimizing for a large number of symbolic tlsdesc relocations, which is
> an extremely uncommon case.  And currently the tlsdesc entries are only
> read-only protected with -z now, and some hardenings against writable
> JUMPSLOT relocs don't work for TLSDESC, so they are a security hazard.
> (But to fix that the static linker has to be changed.)
>
> 2017-09-29 Szabolcs Nagy <szabolcs.nagy@arm.com>
>
> * sysdeps/aarch64/dl-machine.h (elf_machine_lazy_rel): Do symbol
> binding and initialization non-lazily for R_AARCH64_TLSDESC.
> ---
> sysdeps/aarch64/dl-machine.h | 19 ++++++++++++++-----
> 1 file changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/sysdeps/aarch64/dl-machine.h b/sysdeps/aarch64/dl-machine.h
> index b1245476dc..9bd48752e5 100644
> --- a/sysdeps/aarch64/dl-machine.h
> +++ b/sysdeps/aarch64/dl-machine.h
> @@ -428,12 +428,21 @@ elf_machine_lazy_rel (struct link_map *map,
> }
> else if (__builtin_expect (r_type == AARCH64_R(TLSDESC), 1))
> {
> - struct tlsdesc volatile *td =
> - (struct tlsdesc volatile *)reloc_addr;
> + const Elf_Symndx symndx = ELFW (R_SYM) (reloc->r_info);
> + const ElfW (Sym) *symtab = (const void *)D_PTR (map, l_info[DT_SYMTAB]);
> + const ElfW (Sym) *sym = &symtab[symndx];
> + const struct r_found_version *version = NULL;
>
> - td->arg = (void*)reloc;
> - td->entry = (void*)(D_PTR (map, l_info[ADDRIDX (DT_TLSDESC_PLT)])
> - + map->l_addr);
> + if (map->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
> + {
> + const ElfW (Half) *vernum =
> + (const void *)D_PTR (map, l_info[VERSYMIDX (DT_VERSYM)]);
> + version = &map->l_versions[vernum[symndx] & 0x7fff];
> + }
> +
> + /* Always initialize TLS descriptors completely, because lazy
> + initialization requires synchronization at every TLS access. */
> + elf_machine_rela (map, reloc, sym, version, reloc_addr, skip_ifunc);
> }
> else if (__glibc_unlikely (r_type == AARCH64_R(IRELATIVE)))
> {
> -- 2.11.0
>