This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: RFC: TLS improvements for IA32 and AMD64/EM64T
On Sep 17, 2005, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Sep 16, 2005, Alexandre Oliva <aoliva@redhat.com> wrote:
>> On Sep 16, 2005, Alexandre Oliva <aoliva@redhat.com> wrote:
>>> Over the past few months, I've been working on porting to IA32 and
>>> AMD64/EM64T the interesting bits of the TLS design I came up with for
>>> FR-V, achieving some impressive speedups along with slight code size
>>> reductions in the most common cases.
>>> Although the design is not set in stone yet, it's fully implemented
>>> and functional with patches I'm about to post for binutils, gcc and
>>> glibc mainline, as follow-ups to this message, except that the GCC
>>> patch will go to gcc-patches, as expected.
>> This is the glibc portion of the implementation. I've built it for
>> amd64-linux-gnu and i686-pc-linux-gnu, with both the current TLS
>> dialect and the new -mtls-dialect=gnu2, without regressions.
> Revised patch using new relocation and dynamic entry numbers. Tested
> again on all 4 combinations mentioned above.
Revised patch that leaves out unrelated changes, fixes a few
inconsistent uses of addends (copy&pastos), avoids crashing when
lazily resolving TLSDESC relocations against weak symbols that turn
out to be undefined, and, as a bonus, handles them such that their
address maps to NULL in every thread. I know people shouldn't rely on
this, since the linker may very well break it, but hey, since I was
already tweaking the code to avoid crashes, why not take the final
step and make it work as closely as possible to a random person's
expectations? :-)
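To illustrate why an undefined weak symbol ends up at NULL in every thread, here is a rough model of the arithmetic, not the glibc code itself. It assumes the caller always adds the thread pointer to whatever the descriptor's entry function returns. Everything prefixed with sim_ is hypothetical, and the thread pointer (%gs:0 on i386) is modelled as a plain global.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the thread pointer (%gs:0 on i386).  */
static uintptr_t sim_thread_pointer;

/* Simplified descriptor: an entry function plus its argument.  */
struct sim_tlsdesc
{
  ptrdiff_t (*entry) (struct sim_tlsdesc *);
  uintptr_t arg;
};

/* Models _dl_tlsdesc_return: arg already holds the offset from the
   thread pointer, so it is returned unchanged.  */
static ptrdiff_t
sim_tlsdesc_return (struct sim_tlsdesc *td)
{
  return (ptrdiff_t) td->arg;
}

/* Models _dl_tlsdesc_undefweak: arg holds only the addend.
   Subtracting the thread pointer here cancels the thread pointer
   that the caller adds afterwards.  */
static ptrdiff_t
sim_tlsdesc_undefweak (struct sim_tlsdesc *td)
{
  return (ptrdiff_t) td->arg - (ptrdiff_t) sim_thread_pointer;
}

/* Models the access sequence in compiled code: call through the
   descriptor, then add the thread pointer to the returned offset.  */
static uintptr_t
sim_address_in_thread (struct sim_tlsdesc *td, uintptr_t tp)
{
  sim_thread_pointer = tp;
  return sim_thread_pointer + (uintptr_t) td->entry (td);
}
```

With a zero addend, sim_address_in_thread yields 0 for a descriptor using sim_tlsdesc_undefweak whatever thread pointer is passed in, while a descriptor using sim_tlsdesc_return yields the thread pointer plus the stored offset.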
Index: ChangeLog
from Alexandre Oliva <aoliva@redhat.com>
Introduce TLS descriptors for i386 and x86_64.
* elf/dl-reloc.c (_dl_try_allocate_static_tls): Extract from...
(_dl_allocate_static_tls): ... here. Rearrange failure path.
(TRY_STATIC_TLS): New macro.
* elf/dl-conflict.c (TRY_STATIC_TLS): Dummy define.
* elf/elf.h (DT_TLSDESC_GOT, DT_TLSDESC_PLT): Define.
(R_386_TLS_GOTDESC, R_386_TLS_DESC_CALL, R_386_TLS_DESC): Define.
(R_X86_64_PC64, R_X86_64_GOTOFF64, R_X86_64_GOTPC32): Merge from
binutils.
(R_X86_64_GOTPC32_TLSDESC, R_X86_64_TLSDESC_CALL,
R_X86_64_TLSDESC): Define.
(R_386_NUM, R_X86_64_NUM): Adjust.
* sysdeps/i386/Makefile (sysdep-dl-routines, sysdep_routines,
sysdep-rtld-routines): Add tlsdesc and dl-tlsdesc for elf subdir.
(gen-as-const-headers): Add tlsdesc.sym to csu subdir.
* sysdeps/i386/dl-lookupcfg.h: New file. Introduce _dl_unmap to
release tlsdesc_table.
* sysdeps/i386/dl-machine.h: Include dl-tlsdesc.h.
(elf_machine_type_class): Mark R_386_TLS_DESC as PLT class.
(elf_machine_rel): Handle R_386_TLS_DESC.
(elf_machine_rela): Likewise.
(elf_machine_lazy_rel): Likewise.
(elf_machine_lazy_rela): Likewise.
* sysdeps/i386/dl-tls.h (struct dl_tls_index): Name it.
* sysdeps/i386/dl-tlsdesc.S: New file.
* sysdeps/i386/dl-tlsdesc.h: New file.
* sysdeps/i386/tlsdesc.c: New file.
* sysdeps/i386/tlsdesc.sym: New file.
* sysdeps/i386/bits/linkmap.h (struct link_map_machine): Add
tlsdesc_table.
* sysdeps/x86_64/Makefile (sysdep-dl-routines, sysdep_routines,
sysdep-rtld-routines): Add tlsdesc and dl-tlsdesc for elf subdir.
(gen-as-const-headers): Add tlsdesc.sym to csu subdir.
* sysdeps/x86_64/dl-lookupcfg.h: New file. Introduce _dl_unmap to
release tlsdesc_table.
* sysdeps/x86_64/dl-machine.h: Include dl-tlsdesc.h.
(elf_machine_runtime_setup): Set up lazy TLSDESC GOT entry.
(elf_machine_type_class): Mark R_X86_64_TLSDESC as PLT class.
(elf_machine_rel): Handle R_X86_64_TLSDESC.
(elf_machine_rela): Likewise.
(elf_machine_lazy_rel): Likewise.
* sysdeps/x86_64/dl-tls.h (struct dl_tls_index): Name it.
(__tls_get_addr): Do not declare for non-shared compiles.
* sysdeps/x86_64/dl-tlsdesc.S: New file.
* sysdeps/x86_64/dl-tlsdesc.h: New file.
* sysdeps/x86_64/tlsdesc.c: New file.
* sysdeps/x86_64/tlsdesc.sym: New file.
* sysdeps/x86_64/bits/linkmap.h (struct link_map_machine): Add
tlsdesc_table for both 32- and 64-bit structs.
Index: elf/dl-conflict.c
===================================================================
--- elf/dl-conflict.c.orig
+++ elf/dl-conflict.c
@@ -1,5 +1,5 @@
/* Resolve conflicts against already prelinked libraries.
- Copyright (C) 2001, 2002, 2003, 2004 Free Software Foundation, Inc.
+ Copyright (C) 2001, 2002, 2003, 2004, 2005 Free Software Foundation, Inc.
This file is part of the GNU C Library.
Contributed by Jakub Jelinek <jakub@redhat.com>, 2001.
@@ -45,6 +45,7 @@ _dl_resolve_conflicts (struct link_map *
#define RESOLVE_MAP(ref, version, flags) (*ref = NULL, NULL)
#define RESOLVE(ref, version, flags) (*ref = NULL, 0)
#define CHECK_STATIC_TLS(ref_map, sym_map) ((void) 0)
+#define TRY_STATIC_TLS(ref_map, sym_map) (0)
#define RESOLVE_CONFLICT_FIND_MAP(map, r_offset) \
do { \
while ((resolve_conflict_map->l_map_end < (ElfW(Addr)) (r_offset)) \
Index: elf/dl-reloc.c
===================================================================
--- elf/dl-reloc.c.orig
+++ elf/dl-reloc.c
@@ -44,9 +44,9 @@
This function intentionally does not return any value but signals error
directly, as static TLS should be rare and code handling it should
not be inlined as much as possible. */
-void
-internal_function __attribute_noinline__
-_dl_allocate_static_tls (struct link_map *map)
+int
+internal_function
+_dl_try_allocate_static_tls (struct link_map *map)
{
/* If we've already used the variable with dynamic access, or if the
alignment requirements are too high, fail. */
@@ -54,8 +54,7 @@ _dl_allocate_static_tls (struct link_map
|| map->l_tls_align > GL(dl_tls_static_align))
{
fail:
- _dl_signal_error (0, map->l_name, NULL, N_("\
-cannot allocate memory in static TLS block"));
+ return -1;
}
# if TLS_TCB_AT_TP
@@ -109,6 +108,20 @@ cannot allocate memory in static TLS blo
}
else
map->l_need_tls_init = 1;
+
+ return 0;
+}
+
+void
+internal_function __attribute_noinline__
+_dl_allocate_static_tls (struct link_map *map)
+{
+ if (map->l_tls_offset == FORCED_DYNAMIC_TLS_OFFSET
+ || _dl_try_allocate_static_tls (map))
+ {
+ _dl_signal_error (0, map->l_name, NULL, N_("\
+cannot allocate memory in static TLS block"));
+ }
}
/* Initialize static TLS area and DTV for current (only) thread.
@@ -267,6 +280,12 @@ _dl_relocate_object (struct link_map *l,
_dl_allocate_static_tls (sym_map); \
} while (0)
+#define TRY_STATIC_TLS(map, sym_map) \
+ (__builtin_expect ((sym_map)->l_tls_offset \
+ != FORCED_DYNAMIC_TLS_OFFSET, 1) \
+ && (__builtin_expect ((sym_map)->l_tls_offset != NO_TLS_OFFSET, 1) \
+ || _dl_try_allocate_static_tls (sym_map) == 0))
+
#include "dynamic-link.h"
ELF_DYNAMIC_RELOCATE (l, lazy, consider_profiling);
Index: elf/elf.h
===================================================================
--- elf/elf.h.orig
+++ elf/elf.h
@@ -699,6 +699,12 @@ typedef struct
If any adjustment is made to the ELF object after it has been
built these entries will need to be adjusted. */
#define DT_ADDRRNGLO 0x6ffffe00
+#define DT_TLSDESC_PLT 0x6ffffef6 /* Location of PLT entry for
+ TLS descriptor resolver
+ calls. */
+#define DT_TLSDESC_GOT 0x6ffffef7 /* Location of GOT entry used
+ by TLS descriptor resolver
+ PLT entry. */
#define DT_GNU_CONFLICT 0x6ffffef8 /* Start of conflict section */
#define DT_GNU_LIBLIST 0x6ffffef9 /* Library list */
#define DT_CONFIG 0x6ffffefa /* Configuration information. */
@@ -1136,8 +1142,17 @@ typedef struct
#define R_386_TLS_DTPMOD32 35 /* ID of module containing symbol */
#define R_386_TLS_DTPOFF32 36 /* Offset in TLS block */
#define R_386_TLS_TPOFF32 37 /* Negated offset in static TLS block */
+/* 38? */
+#define R_386_TLS_GOTDESC 39 /* GOT offset for TLS descriptor. */
+#define R_386_TLS_DESC_CALL 40 /* Marker of call through TLS
+ descriptor for
+ relaxation. */
+#define R_386_TLS_DESC 41 /* TLS descriptor containing
+ pointer to code and to
+ argument, returning the TLS
+ offset for the symbol. */
/* Keep this the last entry. */
-#define R_386_NUM 38
+#define R_386_NUM 42
/* SUN SPARC specific definitions. */
@@ -2496,8 +2511,17 @@ typedef Elf32_Addr Elf32_Conflict;
#define R_X86_64_GOTTPOFF 22 /* 32 bit signed PC relative offset
to GOT entry for IE symbol */
#define R_X86_64_TPOFF32 23 /* Offset in initial TLS block */
+#define R_X86_64_PC64 24 /* PC relative 64 bit */
+#define R_X86_64_GOTOFF64 25 /* 64 bit offset to GOT */
+#define R_X86_64_GOTPC32 26 /* 32 bit signed pc relative
+ offset to GOT */
+/* 27 .. 33 */
+#define R_X86_64_GOTPC32_TLSDESC 34 /* GOT offset for TLS descriptor. */
+#define R_X86_64_TLSDESC_CALL 35 /* Marker for call through TLS
+ descriptor. */
+#define R_X86_64_TLSDESC 36 /* TLS descriptor. */
-#define R_X86_64_NUM 24
+#define R_X86_64_NUM 37
/* AM33 relocations. */
Index: sysdeps/i386/Makefile
===================================================================
--- sysdeps/i386/Makefile.orig
+++ sysdeps/i386/Makefile
@@ -65,3 +65,13 @@ endif
ifneq (,$(filter -mno-tls-direct-seg-refs,$(CFLAGS)))
defines += -DNO_TLS_DIRECT_SEG_REFS
endif
+
+ifeq ($(subdir),elf)
+sysdep-dl-routines += tlsdesc dl-tlsdesc
+sysdep_routines += tlsdesc dl-tlsdesc
+sysdep-rtld-routines += tlsdesc dl-tlsdesc
+endif
+
+ifeq ($(subdir),csu)
+gen-as-const-headers += tlsdesc.sym
+endif
Index: sysdeps/i386/bits/linkmap.h
===================================================================
--- sysdeps/i386/bits/linkmap.h.orig
+++ sysdeps/i386/bits/linkmap.h
@@ -2,4 +2,5 @@ struct link_map_machine
{
Elf32_Addr plt; /* Address of .plt + 0x16 */
Elf32_Addr gotplt; /* Address of .got + 0x0c */
+ void *tlsdesc_table; /* Address of TLS descriptor hash table. */
};
Index: sysdeps/i386/dl-lookupcfg.h
===================================================================
--- /dev/null
+++ sysdeps/i386/dl-lookupcfg.h
@@ -0,0 +1,28 @@
+/* Configuration of lookup functions.
+ Copyright (C) 2005 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+ 02111-1307 USA. */
+
+#define DL_UNMAP_IS_SPECIAL
+
+#include_next <dl-lookupcfg.h>
+
+struct link_map;
+
+extern void _dl_unmap (struct link_map *map);
+
+#define DL_UNMAP(map) _dl_unmap (map)
Index: sysdeps/i386/dl-machine.h
===================================================================
--- sysdeps/i386/dl-machine.h.orig
+++ sysdeps/i386/dl-machine.h
@@ -25,6 +25,7 @@
#include <sys/param.h>
#include <sysdep.h>
#include <tls.h>
+#include <dl-tlsdesc.h>
/* Return nonzero iff ELF header is compatible with the running host. */
static inline int __attribute__ ((unused))
@@ -248,7 +249,7 @@ _dl_start_user:\n\
# define elf_machine_type_class(type) \
((((type) == R_386_JMP_SLOT || (type) == R_386_TLS_DTPMOD32 \
|| (type) == R_386_TLS_DTPOFF32 || (type) == R_386_TLS_TPOFF32 \
- || (type) == R_386_TLS_TPOFF) \
+ || (type) == R_386_TLS_TPOFF || (type) == R_386_TLS_DESC) \
* ELF_RTYPE_CLASS_PLT) \
| (((type) == R_386_COPY) * ELF_RTYPE_CLASS_COPY))
#else
@@ -375,6 +376,38 @@ elf_machine_rel (struct link_map *map, c
*reloc_addr = sym->st_value;
# endif
break;
+ case R_386_TLS_DESC:
+ {
+ struct tlsdesc volatile *td =
+ (struct tlsdesc volatile *)reloc_addr;
+
+# ifndef RTLD_BOOTSTRAP
+ if (! sym)
+ td->entry = _dl_tlsdesc_undefweak;
+ else
+# endif
+ {
+# ifndef RTLD_BOOTSTRAP
+# ifndef SHARED
+ CHECK_STATIC_TLS (map, sym_map);
+# else
+ if (!TRY_STATIC_TLS (map, sym_map))
+ {
+ td->arg = _dl_make_tlsdesc_dynamic
+ (sym_map, sym->st_value + (ElfW(Word))td->arg);
+ td->entry = _dl_tlsdesc_dynamic;
+ }
+ else
+# endif
+# endif
+ {
+ td->arg = (void*)(sym->st_value - sym_map->l_tls_offset
+ + (ElfW(Word))td->arg);
+ td->entry = _dl_tlsdesc_return;
+ }
+ }
+ break;
+ }
case R_386_TLS_TPOFF32:
/* The offset is positive, backward from the thread pointer. */
# ifdef RTLD_BOOTSTRAP
@@ -488,6 +521,41 @@ elf_machine_rela (struct link_map *map,
Therefore the offset is already correct. */
*reloc_addr = (sym == NULL ? 0 : sym->st_value) + reloc->r_addend;
break;
+ case R_386_TLS_DESC:
+ {
+ struct tlsdesc volatile *td =
+ (struct tlsdesc volatile *)reloc_addr;
+
+# ifndef RTLD_BOOTSTRAP
+ if (!sym)
+ {
+ td->arg = (void*)reloc->r_addend;
+ td->entry = _dl_tlsdesc_undefweak;
+ }
+ else
+# endif
+ {
+# ifndef RTLD_BOOTSTRAP
+# ifndef SHARED
+ CHECK_STATIC_TLS (map, sym_map);
+# else
+ if (!TRY_STATIC_TLS (map, sym_map))
+ {
+ td->arg = _dl_make_tlsdesc_dynamic
+ (sym_map, sym->st_value + reloc->r_addend);
+ td->entry = _dl_tlsdesc_dynamic;
+ }
+ else
+# endif
+# endif
+ {
+ td->arg = (void*)(sym->st_value - sym_map->l_tls_offset
+ + reloc->r_addend);
+ td->entry = _dl_tlsdesc_return;
+ }
+ }
+ }
+ break;
case R_386_TLS_TPOFF32:
/* The offset is positive, backward from the thread pointer. */
/* We know the offset of object the symbol is contained in.
@@ -582,6 +650,55 @@ elf_machine_lazy_rel (struct link_map *m
*reloc_addr = (map->l_mach.plt
+ (((Elf32_Addr) reloc_addr) - map->l_mach.gotplt) * 4);
}
+#ifdef USE_TLS
+ else if (__builtin_expect (r_type == R_386_TLS_DESC, 1))
+ {
+ struct tlsdesc volatile * __attribute__((__unused__)) td =
+ (struct tlsdesc volatile *)reloc_addr;
+
+ /* Handle relocations that reference the local *ABS* in a simple
+ way, so as to preserve a potential addend. */
+ if (ELF32_R_SYM (reloc->r_info) == 0)
+ td->entry = _dl_tlsdesc_resolve_abs_plus_addend;
+ /* Given a known-zero addend, we can store a pointer to the
+ reloc in the arg position. */
+ else if (td->arg == 0)
+ {
+ td->arg = (void*)reloc;
+ td->entry = _dl_tlsdesc_resolve_rel;
+ }
+ else
+ {
+ /* We could handle non-*ABS* relocations with non-zero addends
+ by allocating dynamically an arg to hold a pointer to the
+ reloc, but that sounds pointless. */
+ const Elf32_Rel *const r = reloc;
+ /* The code below was borrowed from elf_dynamic_do_rel(). */
+ const ElfW(Sym) *const symtab =
+ (const void *) D_PTR (map, l_info[DT_SYMTAB]);
+
+#ifdef RTLD_BOOTSTRAP
+ /* The dynamic linker always uses versioning. */
+ assert (map->l_info[VERSYMIDX (DT_VERSYM)] != NULL);
+#else
+ if (map->l_info[VERSYMIDX (DT_VERSYM)])
+#endif
+ {
+ const ElfW(Half) *const version =
+ (const void *) D_PTR (map, l_info[VERSYMIDX (DT_VERSYM)]);
+ ElfW(Half) ndx = version[ELFW(R_SYM) (r->r_info)] & 0x7fff;
+ elf_machine_rel (map, r, &symtab[ELFW(R_SYM) (r->r_info)],
+ &map->l_versions[ndx],
+ (void *) (l_addr + r->r_offset));
+ }
+#ifndef RTLD_BOOTSTRAP
+ else
+ elf_machine_rel (map, r, &symtab[ELFW(R_SYM) (r->r_info)], NULL,
+ (void *) (l_addr + r->r_offset));
+#endif
+ }
+ }
+#endif
else
_dl_reloc_bad_type (map, r_type, 1);
}
@@ -593,6 +710,22 @@ __attribute__ ((always_inline))
elf_machine_lazy_rela (struct link_map *map,
Elf32_Addr l_addr, const Elf32_Rela *reloc)
{
+#ifdef USE_TLS
+ Elf32_Addr *const reloc_addr = (void *) (l_addr + reloc->r_offset);
+ const unsigned int r_type = ELF32_R_TYPE (reloc->r_info);
+ if (__builtin_expect (r_type == R_386_JMP_SLOT, 1))
+ ;
+ else if (__builtin_expect (r_type == R_386_TLS_DESC, 1))
+ {
+ struct tlsdesc volatile * __attribute__((__unused__)) td =
+ (struct tlsdesc volatile *)reloc_addr;
+
+ td->arg = (void*)reloc;
+ td->entry = _dl_tlsdesc_resolve_rela;
+ }
+ else
+ _dl_reloc_bad_type (map, r_type, 1);
+#endif
}
#endif /* !RTLD_BOOTSTRAP */
Index: sysdeps/i386/dl-tls.h
===================================================================
--- sysdeps/i386/dl-tls.h.orig
+++ sysdeps/i386/dl-tls.h
@@ -19,7 +19,7 @@
/* Type used for the representation of TLS information in the GOT. */
-typedef struct
+typedef struct dl_tls_index
{
unsigned long int ti_module;
unsigned long int ti_offset;
Index: sysdeps/i386/dl-tlsdesc.S
===================================================================
--- /dev/null
+++ sysdeps/i386/dl-tlsdesc.S
@@ -0,0 +1,228 @@
+/* Thread-local storage handling in the ELF dynamic linker. i386 version.
+ Copyright (C) 2004, 2005 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+ 02111-1307 USA. */
+
+#include <sysdep.h>
+#include <tls.h>
+#include "tlsdesc.h"
+
+ .text
+#ifdef USE_TLS
+ .hidden _dl_tlsdesc_return
+ .global _dl_tlsdesc_return
+ .type _dl_tlsdesc_return,@function
+ cfi_startproc
+ .align 16
+_dl_tlsdesc_return:
+ movl 4(%eax), %eax
+ ret
+ cfi_endproc
+ .size _dl_tlsdesc_return, .-_dl_tlsdesc_return
+
+ .hidden _dl_tlsdesc_undefweak
+ .global _dl_tlsdesc_undefweak
+ .type _dl_tlsdesc_undefweak,@function
+ cfi_startproc
+ .align 16
+_dl_tlsdesc_undefweak:
+ movl 4(%eax), %eax
+ subl %gs:0, %eax
+ ret
+ cfi_endproc
+ .size _dl_tlsdesc_undefweak, .-_dl_tlsdesc_undefweak
+
+#ifdef SHARED
+ .hidden _dl_tlsdesc_dynamic
+ .global _dl_tlsdesc_dynamic
+ .type _dl_tlsdesc_dynamic,@function
+
+ /* %eax points to the TLS descriptor, such that 0(%eax) points to
+ _dl_tlsdesc_dynamic itself, and 4(%eax) points to a struct
+ tlsdesc_dynamic_arg object. It must return in %eax the offset
+ between the thread pointer and the object denoted by the
+ argument, without clobbering any registers.
+
+ The assembly code that follows is a rendition of the following
+ C code, hand-optimized a little bit.
+
+ptrdiff_t
+__attribute__ ((__regparm__ (1)))
+_dl_tlsdesc_dynamic (struct tlsdesc *tdp)
+{
+ struct tlsdesc_dynamic_arg *td = tdp->arg;
+ dtv_t *dtv = *(dtv_t **)((char *)__thread_pointer + DTV_OFFSET);
+ if (__builtin_expect (td->gen_count <= dtv[0].counter
+ && (dtv[td->tlsinfo.ti_module].pointer.val
+ != TLS_DTV_UNALLOCATED),
+ 1))
+ return dtv[td->tlsinfo.ti_module].pointer.val + td->tlsinfo.ti_offset
+ - __thread_pointer;
+
+ return ___tls_get_addr (&td->tlsinfo) - __thread_pointer;
+}
+*/
+ cfi_startproc
+ .align 16
+_dl_tlsdesc_dynamic:
+ /* Preserve call-clobbered registers.
+ We need two scratch regs anyway.
+ FIXME: maybe remove the requirement to preserve them? */
+ subl $28, %esp
+ cfi_adjust_cfa_offset (28)
+ movl %ecx, 20(%esp)
+ movl %edx, 24(%esp)
+ movl TLSDESC_ARG(%eax), %eax
+ movl %gs:DTV_OFFSET, %edx
+ movl TLSDESC_GEN_COUNT(%eax), %ecx
+ cmpl (%edx), %ecx
+ ja .Lslow
+ movl TLSDESC_MODID(%eax), %ecx
+ movl (%edx,%ecx,8), %edx
+ cmpl $-1, %edx
+ je .Lslow
+ movl TLSDESC_MODOFF(%eax), %eax
+ addl %edx, %eax
+.Lret:
+ movl 20(%esp), %ecx
+ subl %gs:0, %eax
+ movl 24(%esp), %edx
+ addl $28, %esp
+ cfi_adjust_cfa_offset (-28)
+ ret
+ .p2align 4,,7
+.Lslow:
+ cfi_adjust_cfa_offset (28)
+ movl %ebx, 16(%esp)
+ call __i686.get_pc_thunk.bx
+ addl $_GLOBAL_OFFSET_TABLE_, %ebx
+ call ___tls_get_addr@PLT
+ movl 16(%esp), %ebx
+ jmp .Lret
+ cfi_endproc
+ .size _dl_tlsdesc_dynamic, .-_dl_tlsdesc_dynamic
+#endif /* SHARED */
+
+ .hidden _dl_tlsdesc_resolve_abs_plus_addend
+ .global _dl_tlsdesc_resolve_abs_plus_addend
+ .type _dl_tlsdesc_resolve_abs_plus_addend,@function
+ cfi_startproc
+ .align 16
+_dl_tlsdesc_resolve_abs_plus_addend:
+0:
+ pushl %eax
+ cfi_adjust_cfa_offset (4)
+ pushl %ecx
+ cfi_adjust_cfa_offset (4)
+ pushl %edx
+ cfi_adjust_cfa_offset (4)
+ movl $1f - 0b, %ecx
+ movl 4(%ebx), %edx
+ call _dl_tlsdesc_resolve_abs_plus_addend_fixup
+1:
+ popl %edx
+ cfi_adjust_cfa_offset (-4)
+ popl %ecx
+ cfi_adjust_cfa_offset (-4)
+ popl %eax
+ cfi_adjust_cfa_offset (-4)
+ jmp *(%eax)
+ cfi_endproc
+ .size _dl_tlsdesc_resolve_abs_plus_addend, .-_dl_tlsdesc_resolve_abs_plus_addend
+
+ .hidden _dl_tlsdesc_resolve_rel
+ .global _dl_tlsdesc_resolve_rel
+ .type _dl_tlsdesc_resolve_rel,@function
+ cfi_startproc
+ .align 16
+_dl_tlsdesc_resolve_rel:
+0:
+ pushl %eax
+ cfi_adjust_cfa_offset (4)
+ pushl %ecx
+ cfi_adjust_cfa_offset (4)
+ pushl %edx
+ cfi_adjust_cfa_offset (4)
+ movl $1f - 0b, %ecx
+ movl 4(%ebx), %edx
+ call _dl_tlsdesc_resolve_rel_fixup
+1:
+ popl %edx
+ cfi_adjust_cfa_offset (-4)
+ popl %ecx
+ cfi_adjust_cfa_offset (-4)
+ popl %eax
+ cfi_adjust_cfa_offset (-4)
+ jmp *(%eax)
+ cfi_endproc
+ .size _dl_tlsdesc_resolve_rel, .-_dl_tlsdesc_resolve_rel
+
+ .hidden _dl_tlsdesc_resolve_rela
+ .global _dl_tlsdesc_resolve_rela
+ .type _dl_tlsdesc_resolve_rela,@function
+ cfi_startproc
+ .align 16
+_dl_tlsdesc_resolve_rela:
+0:
+ pushl %eax
+ cfi_adjust_cfa_offset (4)
+ pushl %ecx
+ cfi_adjust_cfa_offset (4)
+ pushl %edx
+ cfi_adjust_cfa_offset (4)
+ movl $1f - 0b, %ecx
+ movl 4(%ebx), %edx
+ call _dl_tlsdesc_resolve_rela_fixup
+1:
+ popl %edx
+ cfi_adjust_cfa_offset (-4)
+ popl %ecx
+ cfi_adjust_cfa_offset (-4)
+ popl %eax
+ cfi_adjust_cfa_offset (-4)
+ jmp *(%eax)
+ cfi_endproc
+ .size _dl_tlsdesc_resolve_rela, .-_dl_tlsdesc_resolve_rela
+
+ .hidden _dl_tlsdesc_resolve_hold
+ .global _dl_tlsdesc_resolve_hold
+ .type _dl_tlsdesc_resolve_hold,@function
+ cfi_startproc
+ .align 16
+_dl_tlsdesc_resolve_hold:
+0:
+ pushl %eax
+ cfi_adjust_cfa_offset (4)
+ pushl %ecx
+ cfi_adjust_cfa_offset (4)
+ pushl %edx
+ cfi_adjust_cfa_offset (4)
+ movl $1f - 0b, %ecx
+ movl 4(%ebx), %edx
+ call _dl_tlsdesc_resolve_hold_fixup
+1:
+ popl %edx
+ cfi_adjust_cfa_offset (-4)
+ popl %ecx
+ cfi_adjust_cfa_offset (-4)
+ popl %eax
+ cfi_adjust_cfa_offset (-4)
+ jmp *(%eax)
+ cfi_endproc
+ .size _dl_tlsdesc_resolve_hold, .-_dl_tlsdesc_resolve_hold
+
+#endif /* USE_TLS */
Index: sysdeps/i386/dl-tlsdesc.h
===================================================================
--- /dev/null
+++ sysdeps/i386/dl-tlsdesc.h
@@ -0,0 +1,60 @@
+/* Thread-local storage descriptor handling in the ELF dynamic linker.
+ i386 version.
+ Copyright (C) 2005 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+ 02111-1307 USA. */
+
+#ifndef _I386_DL_TLSDESC_H
+# define _I386_DL_TLSDESC_H 1
+
+/* Type used to represent a TLS descriptor in the GOT. */
+struct tlsdesc
+{
+ ptrdiff_t __attribute__((regparm(1))) (*entry)(struct tlsdesc *);
+ void *arg;
+};
+
+typedef struct dl_tls_index
+{
+ unsigned long int ti_module;
+ unsigned long int ti_offset;
+} tls_index;
+
+/* Type used as the argument in a TLS descriptor for a symbol that
+ needs dynamic TLS offsets. */
+struct tlsdesc_dynamic_arg
+{
+ tls_index tlsinfo;
+ size_t gen_count;
+};
+
+extern ptrdiff_t attribute_hidden __attribute__((regparm(1)))
+ _dl_tlsdesc_return(struct tlsdesc *),
+ _dl_tlsdesc_undefweak(struct tlsdesc *),
+ _dl_tlsdesc_resolve_abs_plus_addend(struct tlsdesc *),
+ _dl_tlsdesc_resolve_rel(struct tlsdesc *),
+ _dl_tlsdesc_resolve_rela(struct tlsdesc *),
+ _dl_tlsdesc_resolve_hold(struct tlsdesc *);
+
+# ifdef SHARED
+extern void *_dl_make_tlsdesc_dynamic (struct link_map *map, size_t ti_offset);
+
+extern ptrdiff_t attribute_hidden __attribute__((regparm(1)))
+ _dl_tlsdesc_dynamic(struct tlsdesc *);
+# endif
+
+#endif
Index: sysdeps/i386/tlsdesc.c
===================================================================
--- /dev/null
+++ sysdeps/i386/tlsdesc.c
@@ -0,0 +1,673 @@
+/* Manage TLS descriptors. i386 version.
+ Copyright (C) 2005 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+ 02111-1307 USA. */
+
+#include <link.h>
+#include <ldsodefs.h>
+#include <elf/dynamic-link.h>
+#include <tls.h>
+#include <dl-tlsdesc.h>
+
+#ifdef USE_TLS
+# ifdef SHARED
+
+extern void weak_function free (void *ptr);
+
+/* The hashcode handling code below is heavily inspired by libiberty's
+ hashtab code, but with most adaptation points and support for
+ deleting elements removed.
+
+ Copyright (C) 1999, 2000, 2001, 2002, 2003 Free Software Foundation, Inc.
+ Contributed by Vladimir Makarov (vmakarov@cygnus.com). */
+
+inline static unsigned long
+higher_prime_number (unsigned long n)
+{
+ /* These are primes that are near, but slightly smaller than, a
+ power of two. */
+ static const unsigned long primes[] = {
+ (unsigned long) 7,
+ (unsigned long) 13,
+ (unsigned long) 31,
+ (unsigned long) 61,
+ (unsigned long) 127,
+ (unsigned long) 251,
+ (unsigned long) 509,
+ (unsigned long) 1021,
+ (unsigned long) 2039,
+ (unsigned long) 4093,
+ (unsigned long) 8191,
+ (unsigned long) 16381,
+ (unsigned long) 32749,
+ (unsigned long) 65521,
+ (unsigned long) 131071,
+ (unsigned long) 262139,
+ (unsigned long) 524287,
+ (unsigned long) 1048573,
+ (unsigned long) 2097143,
+ (unsigned long) 4194301,
+ (unsigned long) 8388593,
+ (unsigned long) 16777213,
+ (unsigned long) 33554393,
+ (unsigned long) 67108859,
+ (unsigned long) 134217689,
+ (unsigned long) 268435399,
+ (unsigned long) 536870909,
+ (unsigned long) 1073741789,
+ (unsigned long) 2147483647,
+ /* 4294967291L */
+ ((unsigned long) 2147483647) + ((unsigned long) 2147483644),
+ };
+
+ const unsigned long *low = &primes[0];
+ const unsigned long *high = &primes[sizeof(primes) / sizeof(primes[0])];
+
+ while (low != high)
+ {
+ const unsigned long *mid = low + (high - low) / 2;
+ if (n > *mid)
+ low = mid + 1;
+ else
+ high = mid;
+ }
+
+#if 0
+ /* If we've run out of primes, abort. */
+ if (n > *low)
+ {
+ fprintf (stderr, "Cannot find prime bigger than %lu\n", n);
+ abort ();
+ }
+#endif
+
+ return *low;
+}
+
+struct hashtab
+{
+ /* Table itself. */
+ void **entries;
+
+ /* Current size (in entries) of the hash table */
+ size_t size;
+
+ /* Current number of elements. */
+ size_t n_elements;
+};
+
+inline static struct hashtab *
+htab_create (void)
+{
+ struct hashtab *ht = malloc (sizeof (struct hashtab));
+
+ if (! ht)
+ return NULL;
+ ht->size = 3;
+ ht->entries = malloc (sizeof (void *) * ht->size);
+ if (! ht->entries)
+ return NULL;
+
+ ht->n_elements = 0;
+
+ memset (ht->entries, 0, sizeof (void *) * ht->size);
+
+ return ht;
+}
+
+/* This is only called from _dl_unmap, so it's safe to call
+ free(). See the discussion below. */
+inline static void
+htab_delete (struct hashtab *htab)
+{
+ int i;
+
+ for (i = htab->size - 1; i >= 0; i--)
+ if (htab->entries[i])
+ free (htab->entries[i]);
+
+ free (htab->entries);
+ free (htab);
+}
+
+/* Similar to htab_find_slot, but without several unwanted side effects:
+ - Does not call htab->eq_f when it finds an existing entry.
+ - Does not change the count of elements/searches/collisions in the
+ hash table.
+ This function also assumes there are no deleted entries in the table.
+ HASH is the hash value for the element to be inserted. */
+
+inline static void **
+find_empty_slot_for_expand (struct hashtab *htab, int hash)
+{
+ size_t size = htab->size;
+ unsigned int index = hash % size;
+ void **slot = htab->entries + index;
+ int hash2;
+
+ if (! *slot)
+ return slot;
+
+ hash2 = 1 + hash % (size - 2);
+ for (;;)
+ {
+ index += hash2;
+ if (index >= size)
+ index -= size;
+
+ slot = htab->entries + index;
+ if (! *slot)
+ return slot;
+ }
+}
+
+/* The following function changes size of memory allocated for the
+ entries and repeatedly inserts the table elements. The occupancy
+ of the table after the call will be about 50%. Naturally the hash
+ table must already exist. Remember also that the place of the
+ table entries is changed. If memory allocation failures are allowed,
+ this function will return zero, indicating that the table could not be
+ expanded. If all goes well, it will return a non-zero value. */
+
+inline static int
+htab_expand (struct hashtab *htab, int (*hash_fn)(void *))
+{
+ void **oentries;
+ void **olimit;
+ void **p;
+ void **nentries;
+ size_t nsize;
+
+ oentries = htab->entries;
+ olimit = oentries + htab->size;
+
+ /* Resize only when table after removal of unused elements is either
+ too full or too empty. */
+ if (htab->n_elements * 2 > htab->size)
+ nsize = higher_prime_number (htab->n_elements * 2);
+ else
+ nsize = htab->size;
+
+ nentries = malloc (sizeof (void *) * nsize);
+ if (nentries == NULL)
+ return 0;
+ memset (nentries, 0, sizeof (void *) * nsize);
+ htab->entries = nentries;
+ htab->size = nsize;
+
+ p = oentries;
+ do
+ {
+ if (*p)
+ *find_empty_slot_for_expand (htab, hash_fn (*p))
+ = *p;
+
+ p++;
+ }
+ while (p < olimit);
+
+#if 0 /* We can't tell whether this was allocated by the malloc()
+ built into ld.so or the one in the main executable or libc,
+ and calling free() for something that wasn't malloc()ed could
+ do Very Bad Things (TM). Take the conservative approach
+ here, potentially wasting as much memory as actually used by
+ the hash table, even if multiple growths occur. That's not
+ so bad as to require some overengineered solution that would
+ enable us to keep track of how it was allocated. */
+ free (oentries);
+#endif
+ return 1;
+}
+
+/* This function searches for a hash table slot containing an entry
+ equal to the given element. To delete an entry, call this with
+ INSERT = 0, then call htab_clear_slot on the slot returned (possibly
+ after doing some checks). To insert an entry, call this with
+ INSERT = 1, then write the value you want into the returned slot.
+ When inserting an entry, NULL may be returned if memory allocation
+ fails. */
+
+inline static void **
+htab_find_slot (struct hashtab *htab, void *ptr, int insert,
+ int (*hash_fn)(void *), int (*eq_fn)(void *, void *))
+{
+ unsigned int index;
+ int hash, hash2;
+ size_t size;
+ void **entry;
+
+ if (htab->size * 3 <= htab->n_elements * 4
+ && htab_expand (htab, hash_fn) == 0)
+ return NULL;
+
+ hash = hash_fn (ptr);
+
+ size = htab->size;
+ index = hash % size;
+
+ entry = &htab->entries[index];
+ if (!*entry)
+ goto empty_entry;
+ else if (eq_fn (*entry, ptr))
+ return entry;
+
+ hash2 = 1 + hash % (size - 2);
+ for (;;)
+ {
+ index += hash2;
+ if (index >= size)
+ index -= size;
+
+ entry = &htab->entries[index];
+ if (!*entry)
+ goto empty_entry;
+ else if (eq_fn (*entry, ptr))
+ return entry;
+ }
+
+ empty_entry:
+ if (!insert)
+ return NULL;
+
+ htab->n_elements++;
+ return entry;
+}
+
+inline static int
+hash_tlsdesc(void *p)
+{
+ struct tlsdesc_dynamic_arg *td = p;
+
+ /* We know all entries are for the same module, so ti_offset is the
+ only distinguishing entry. */
+ return td->tlsinfo.ti_offset;
+}
+
+inline static int
+eq_tlsdesc(void *p, void *q)
+{
+ struct tlsdesc_dynamic_arg *tdp = p, *tdq = q;
+
+ return tdp->tlsinfo.ti_offset == tdq->tlsinfo.ti_offset;
+}
+
+inline static int
+map_generation (struct link_map *map)
+{
+ size_t idx = map->l_tls_modid;
+ struct dtv_slotinfo_list *listp = GL(dl_tls_dtv_slotinfo_list);
+
+ /* Find the place in the dtv slotinfo list. */
+ do
+ {
+ /* Does it fit in the array of this list element? */
+ if (idx < listp->len)
+ {
+ /* We should never get here for a module in static TLS, so
+ we can assume that, if the generation count is zero, we
+ still haven't determined the generation count for this
+ module. */
+ if (listp->slotinfo[idx].gen)
+ return listp->slotinfo[idx].gen;
+ else
+ break;
+ }
+ idx -= listp->len;
+ listp = listp->next;
+ }
+ while (listp != NULL);
+
+ /* If we get to this point, the module still hasn't been assigned an
+ entry in the dtv slotinfo data structures, and it will when we're
+ done with relocations. At that point, the module will get a
+ generation number that is one past the current generation, so
+ return exactly that. */
+ return GL(dl_tls_generation) + 1;
+}
+
+void *
+_dl_make_tlsdesc_dynamic (struct link_map *map, size_t ti_offset)
+{
+ struct hashtab *ht;
+ void **entry;
+ struct tlsdesc_dynamic_arg *td, test;
+
+ /* FIXME: We could use a per-map lock here, but is it worth it? */
+ __rtld_lock_lock_recursive (GL(dl_load_lock));
+
+ ht = map->l_mach.tlsdesc_table;
+ if (! ht)
+ {
+ ht = htab_create ();
+ if (! ht)
+ {
+ __rtld_lock_unlock_recursive (GL(dl_load_lock));
+ return 0;
+ }
+ map->l_mach.tlsdesc_table = ht;
+ }
+
+ test.tlsinfo.ti_module = map->l_tls_modid;
+ test.tlsinfo.ti_offset = ti_offset;
+ entry = htab_find_slot (ht, &test, 1, hash_tlsdesc, eq_tlsdesc);
+ if (entry == NULL || *entry)
+ {
+ td = entry ? *entry : NULL;
+ __rtld_lock_unlock_recursive (GL(dl_load_lock));
+ return td;
+ }
+
+ *entry = td = malloc (sizeof (struct tlsdesc_dynamic_arg));
+ /* This may be higher than the map's generation, but it doesn't
+ matter much. Worst case, we'll have one extra DTV update per
+ thread. */
+ td->gen_count = map_generation (map);
+ td->tlsinfo = test.tlsinfo;
+
+ __rtld_lock_unlock_recursive (GL(dl_load_lock));
+ return td;
+}
+
+# endif /* SHARED */
+
+/* The idea of the following two functions is to stop multiple threads
+ from attempting to resolve the same TLS descriptor without busy
+ waiting. Ideally, we should be able to release the lock right
+ after changing td->entry, and then use, say, a condition variable
+ or a futex wake to wake up any waiting threads, but let's try to
+ avoid introducing such dependencies. */
+
+inline static int
+_dl_tlsdesc_resolve_early_return_p (struct tlsdesc volatile *td, void *caller)
+{
+ if (caller != td->entry)
+ return 1;
+
+ __rtld_lock_lock_recursive (GL(dl_load_lock));
+ if (caller != td->entry)
+ {
+ __rtld_lock_unlock_recursive (GL(dl_load_lock));
+ return 1;
+ }
+
+ td->entry = _dl_tlsdesc_resolve_hold;
+
+ return 0;
+}
+
+inline static void
+_dl_tlsdesc_wake_up_held_fixups (void)
+{
+ __rtld_lock_unlock_recursive (GL(dl_load_lock));
+}
+
+/* The following 4 functions take an entry_check_offset argument.
+ It's computed by the caller as an offset between its entry point
+ and the call site, such that by adding the built-in return address
+ that is implicitly passed to the function with this offset, we can
+ easily obtain the caller's entry point to compare with the entry
+ point given in the TLS descriptor. If it's changed, we want to
+ return immediately. */
+
+/* These macros are copied from elf/dl-reloc.c */
+
+#define CHECK_STATIC_TLS(map, sym_map) \
+ do { \
+ if (__builtin_expect ((sym_map)->l_tls_offset == NO_TLS_OFFSET \
+ || ((sym_map)->l_tls_offset \
+ == FORCED_DYNAMIC_TLS_OFFSET), 0)) \
+ _dl_allocate_static_tls (sym_map); \
+ } while (0)
+
+#define TRY_STATIC_TLS(map, sym_map) \
+ (__builtin_expect ((sym_map)->l_tls_offset \
+ != FORCED_DYNAMIC_TLS_OFFSET, 1) \
+ && (__builtin_expect ((sym_map)->l_tls_offset != NO_TLS_OFFSET, 1) \
+ || _dl_try_allocate_static_tls (sym_map) == 0))
+
+int internal_function _dl_try_allocate_static_tls (struct link_map *map);
+
+/* This function is used to lazily resolve TLS_DESC REL relocations
+ that reference the *ABS* segment in their own link maps. The
+ argument is the addend originally stored there. */
+
+void
+__attribute__ ((regparm (3))) attribute_hidden
+_dl_tlsdesc_resolve_abs_plus_addend_fixup (struct tlsdesc volatile *td,
+ struct link_map *l,
+ ptrdiff_t entry_check_offset)
+{
+ ptrdiff_t addend = (ptrdiff_t) td->arg;
+
+ if (_dl_tlsdesc_resolve_early_return_p (td, __builtin_return_address (0)
+ - entry_check_offset))
+ return;
+
+#ifndef SHARED
+ CHECK_STATIC_TLS (l, l);
+#else
+ if (!TRY_STATIC_TLS (l, l))
+ {
+ td->arg = _dl_make_tlsdesc_dynamic (l, addend);
+ td->entry = _dl_tlsdesc_dynamic;
+ }
+ else
+#endif
+ {
+ td->arg = (void*)(addend - l->l_tls_offset);
+ td->entry = _dl_tlsdesc_return;
+ }
+
+ _dl_tlsdesc_wake_up_held_fixups ();
+}
+
+/* This function is used to lazily resolve TLS_DESC REL relocations
+ that originally had zero addends. The argument location, that
+ originally held the addend, is used to hold a pointer to the
+ relocation, but it has to be restored before we call the function
+ that applies relocations. */
+
+void
+__attribute__ ((regparm (3))) attribute_hidden
+_dl_tlsdesc_resolve_rel_fixup (struct tlsdesc volatile *td,
+ struct link_map *l,
+ ptrdiff_t entry_check_offset)
+{
+ const ElfW(Rel) *reloc = td->arg;
+
+ if (_dl_tlsdesc_resolve_early_return_p (td, __builtin_return_address (0)
+ - entry_check_offset))
+ return;
+
+ /* The code below was borrowed from _dl_fixup(),
+ except for checking for STB_LOCAL. */
+ const ElfW(Sym) *const symtab
+ = (const void *) D_PTR (l, l_info[DT_SYMTAB]);
+ const char *strtab = (const void *) D_PTR (l, l_info[DT_STRTAB]);
+ const ElfW(Sym) *sym = &symtab[ELFW(R_SYM) (reloc->r_info)];
+ lookup_t result;
+
+ /* Look up the target symbol. If the normal lookup rules are not
+ used don't look in the global scope. */
+ if (ELFW(ST_BIND) (sym->st_info) != STB_LOCAL
+ && __builtin_expect (ELFW(ST_VISIBILITY) (sym->st_other), 0) == 0)
+ {
+ const struct r_found_version *version = NULL;
+
+ if (l->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
+ {
+ const ElfW(Half) *vernum =
+ (const void *) D_PTR (l, l_info[VERSYMIDX (DT_VERSYM)]);
+ ElfW(Half) ndx = vernum[ELFW(R_SYM) (reloc->r_info)] & 0x7fff;
+ version = &l->l_versions[ndx];
+ if (version->hash == 0)
+ version = NULL;
+ }
+
+ result = _dl_lookup_symbol_x (strtab + sym->st_name, l, &sym,
+ l->l_scope, version, ELF_RTYPE_CLASS_PLT,
+ DL_LOOKUP_ADD_DEPENDENCY, NULL);
+ }
+ else
+ {
+ /* We already found the symbol. The module (and therefore its load
+ address) is also known. */
+ result = l;
+ }
+
+ if (!sym)
+ {
+ td->arg = 0;
+ td->entry = _dl_tlsdesc_undefweak;
+ }
+ else
+ {
+# ifndef SHARED
+ CHECK_STATIC_TLS (l, result);
+# else
+ if (!TRY_STATIC_TLS (l, result))
+ {
+ td->arg = _dl_make_tlsdesc_dynamic (result, sym->st_value);
+ td->entry = _dl_tlsdesc_dynamic;
+ }
+ else
+# endif
+ {
+ td->arg = (void*)(sym->st_value - result->l_tls_offset);
+ td->entry = _dl_tlsdesc_return;
+ }
+ }
+
+ _dl_tlsdesc_wake_up_held_fixups ();
+}
+
+/* This function is used to lazily resolve TLS_DESC RELA relocations.
+ The argument location is used to hold a pointer to the relocation. */
+
+void
+__attribute__ ((regparm (3))) attribute_hidden
+_dl_tlsdesc_resolve_rela_fixup (struct tlsdesc volatile *td,
+ struct link_map *l,
+ ptrdiff_t entry_check_offset)
+{
+ const ElfW(Rela) *reloc = td->arg;
+
+ if (_dl_tlsdesc_resolve_early_return_p (td, __builtin_return_address (0)
+ - entry_check_offset))
+ return;
+
+ /* The code below was borrowed from _dl_fixup(),
+ except for checking for STB_LOCAL. */
+ const ElfW(Sym) *const symtab
+ = (const void *) D_PTR (l, l_info[DT_SYMTAB]);
+ const char *strtab = (const void *) D_PTR (l, l_info[DT_STRTAB]);
+ const ElfW(Sym) *sym = &symtab[ELFW(R_SYM) (reloc->r_info)];
+ lookup_t result;
+
+ /* Look up the target symbol. If the normal lookup rules are not
+ used don't look in the global scope. */
+ if (ELFW(ST_BIND) (sym->st_info) != STB_LOCAL
+ && __builtin_expect (ELFW(ST_VISIBILITY) (sym->st_other), 0) == 0)
+ {
+ const struct r_found_version *version = NULL;
+
+ if (l->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
+ {
+ const ElfW(Half) *vernum =
+ (const void *) D_PTR (l, l_info[VERSYMIDX (DT_VERSYM)]);
+ ElfW(Half) ndx = vernum[ELFW(R_SYM) (reloc->r_info)] & 0x7fff;
+ version = &l->l_versions[ndx];
+ if (version->hash == 0)
+ version = NULL;
+ }
+
+ result = _dl_lookup_symbol_x (strtab + sym->st_name, l, &sym,
+ l->l_scope, version, ELF_RTYPE_CLASS_PLT,
+ DL_LOOKUP_ADD_DEPENDENCY, NULL);
+ }
+ else
+ {
+ /* We already found the symbol. The module (and therefore its load
+ address) is also known. */
+ result = l;
+ }
+
+ if (!sym)
+ {
+ td->arg = (void*)reloc->r_addend;
+ td->entry = _dl_tlsdesc_undefweak;
+ }
+ else
+ {
+# ifndef SHARED
+ CHECK_STATIC_TLS (l, result);
+# else
+ if (!TRY_STATIC_TLS (l, result))
+ {
+ td->arg = _dl_make_tlsdesc_dynamic (result, sym->st_value
+ + reloc->r_addend);
+ td->entry = _dl_tlsdesc_dynamic;
+ }
+ else
+# endif
+ {
+ td->arg = (void*)(sym->st_value - result->l_tls_offset
+ + reloc->r_addend);
+ td->entry = _dl_tlsdesc_return;
+ }
+ }
+
+ _dl_tlsdesc_wake_up_held_fixups ();
+}
+
+void
+__attribute__ ((regparm (3))) attribute_hidden
+_dl_tlsdesc_resolve_hold_fixup (struct tlsdesc volatile *td,
+ struct link_map *l __attribute__((__unused__)),
+ ptrdiff_t entry_check_offset)
+{
+ /* Maybe we're lucky and can return early. */
+ if (__builtin_return_address (0) - entry_check_offset != td->entry)
+ return;
+
+ /* Locking here will stop execution until the running resolver runs
+ _dl_tlsdesc_wake_up_held_fixups(), releasing the lock.
+
+ FIXME: We'd be better off waiting on a condition variable, such
+ that we didn't have to hold the lock throughout the relocation
+ processing. */
+ __rtld_lock_lock_recursive (GL(dl_load_lock));
+ __rtld_lock_unlock_recursive (GL(dl_load_lock));
+}
+
+#endif /* USE_TLS */
+
+void
+_dl_unmap (struct link_map *map)
+{
+ __munmap ((void *) (map)->l_map_start,
+ (map)->l_map_end - (map)->l_map_start);
+
+#if USE_TLS && SHARED
+ /* _dl_unmap is only called for dlopen()ed libraries, for which
+ calling free() is safe, or before we've completed the initial
+ relocation, in which case calling free() is probably pointless,
+ but still safe. */
+ if (map->l_mach.tlsdesc_table)
+ htab_delete (map->l_mach.tlsdesc_table);
+#endif
+}
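For anyone reviewing the hash table in tlsdesc.c above, here is a minimal standalone sketch of the double-hashing probe that htab_find_slot and find_empty_slot_for_expand rely on. It is not code from the patch: the demo_* names, the fixed table size of 7 and the use of 0 as the "empty" marker are assumptions made only for illustration, and growth via htab_expand is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed-size table.  The size is prime, like the values
   higher_prime_number would pick, so any step in 1..DEMO_SIZE-2
   visits every slot before repeating.  */
#define DEMO_SIZE 7

struct demo_tab
{
  size_t keys[DEMO_SIZE];	/* 0 marks an empty slot.  */
  size_t n_elements;
};

/* Return the slot that holds KEY, or the empty slot where KEY would
   be stored.  Assumes KEY is nonzero and the table is not full.  */
static size_t *
demo_find_slot (struct demo_tab *t, size_t key)
{
  size_t index = key % DEMO_SIZE;
  /* Second hash: a nonzero stride derived from the same key.  */
  size_t step = 1 + key % (DEMO_SIZE - 2);

  assert (key != 0 && t->n_elements < DEMO_SIZE);

  while (t->keys[index] != 0 && t->keys[index] != key)
    {
      index += step;
      if (index >= DEMO_SIZE)
	index -= DEMO_SIZE;
    }
  return &t->keys[index];
}

/* Insert KEY if it is not present.  Returns 1 if a new element was
   added, 0 if KEY was already in the table.  */
static int
demo_insert (struct demo_tab *t, size_t key)
{
  size_t *slot = demo_find_slot (t, key);

  if (*slot == key)
    return 0;
  *slot = key;
  t->n_elements++;
  return 1;
}
```

With this table, keys 8 and 15 both hash to slot 1, so inserting 15 after 8 walks one stride further and lands in slot 2. The patch keys its table on ti_offset in the same way, since all entries in one table belong to the same module.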
Index: sysdeps/i386/tlsdesc.sym
===================================================================
--- /dev/null
+++ sysdeps/i386/tlsdesc.sym
@@ -0,0 +1,20 @@
+#include <stddef.h>
+#include <sysdep.h>
+#include <tls.h>
+#include <link.h>
+#include <dl-tlsdesc.h>
+
+--
+
+-- Abuse tls.h macros to derive offsets relative to the thread register.
+#if defined USE_TLS
+
+DTV_OFFSET offsetof(struct pthread, header.dtv)
+
+TLSDESC_ARG offsetof(struct tlsdesc, arg)
+
+TLSDESC_GEN_COUNT offsetof(struct tlsdesc_dynamic_arg, gen_count)
+TLSDESC_MODID offsetof(struct tlsdesc_dynamic_arg, tlsinfo.ti_module)
+TLSDESC_MODOFF offsetof(struct tlsdesc_dynamic_arg, tlsinfo.ti_offset)
+
+#endif
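A note on what tlsdesc.sym is for: each line pairs a constant name with a C expression, and the build turns those into assembler-visible constants so that dl-tlsdesc.S can address structure fields by name. The sketch below evaluates the same offsetof expressions in plain C. The demo_* structures are stand-ins modeled on the definitions in sysdeps/x86_64/dl-tlsdesc.h from this patch; I am assuming the i386 ones have the same shape, and DTV_OFFSET is left out because struct pthread is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the structures in dl-tlsdesc.h.  */
struct demo_tls_index
{
  unsigned long int ti_module;
  unsigned long int ti_offset;
};

struct demo_tlsdesc
{
  ptrdiff_t (*entry) (struct demo_tlsdesc *);
  void *arg;
};

struct demo_dynamic_arg
{
  struct demo_tls_index tlsinfo;
  size_t gen_count;
};

/* The same expressions tlsdesc.sym lists, evaluated by the compiler.  */
enum
{
  DEMO_TLSDESC_ARG = offsetof (struct demo_tlsdesc, arg),
  DEMO_TLSDESC_GEN_COUNT = offsetof (struct demo_dynamic_arg, gen_count),
  DEMO_TLSDESC_MODID = offsetof (struct demo_dynamic_arg, tlsinfo.ti_module),
  DEMO_TLSDESC_MODOFF = offsetof (struct demo_dynamic_arg, tlsinfo.ti_offset)
};

/* Read the argument word the way the assembly does: descriptor base
   address plus a constant displacement.  */
static void *
demo_load_arg (struct demo_tlsdesc *td)
{
  assert (td != NULL);
  return *(void **) ((char *) td + DEMO_TLSDESC_ARG);
}
```

On an LP64 target DEMO_TLSDESC_ARG comes out as 8, which is the displacement _dl_tlsdesc_return in the x86_64 dl-tlsdesc.S hard-codes in "movq 8(%rax), %rax".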
Index: sysdeps/x86_64/Makefile
===================================================================
--- sysdeps/x86_64/Makefile.orig
+++ sysdeps/x86_64/Makefile
@@ -9,3 +9,13 @@ endif
ifeq ($(subdir),gmon)
sysdep_routines += _mcount
endif
+
+ifeq ($(subdir),elf)
+sysdep-dl-routines += tlsdesc dl-tlsdesc
+sysdep_routines += tlsdesc dl-tlsdesc
+sysdep-rtld-routines += tlsdesc dl-tlsdesc
+endif
+
+ifeq ($(subdir),csu)
+gen-as-const-headers += tlsdesc.sym
+endif
Index: sysdeps/x86_64/bits/linkmap.h
===================================================================
--- sysdeps/x86_64/bits/linkmap.h.orig
+++ sysdeps/x86_64/bits/linkmap.h
@@ -3,6 +3,7 @@ struct link_map_machine
{
Elf64_Addr plt; /* Address of .plt + 0x16 */
Elf64_Addr gotplt; /* Address of .got + 0x18 */
+ void *tlsdesc_table; /* Address of TLS descriptor hash table. */
};
#else
@@ -10,5 +11,6 @@ struct link_map_machine
{
Elf32_Addr plt; /* Address of .plt + 0x16 */
Elf32_Addr gotplt; /* Address of .got + 0x0c */
+ void *tlsdesc_table; /* Address of TLS descriptor hash table. */
};
#endif
Index: sysdeps/x86_64/dl-lookupcfg.h
===================================================================
--- /dev/null
+++ sysdeps/x86_64/dl-lookupcfg.h
@@ -0,0 +1,28 @@
+/* Configuration of lookup functions.
+ Copyright (C) 2005 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+ 02111-1307 USA. */
+
+#define DL_UNMAP_IS_SPECIAL
+
+#include_next <dl-lookupcfg.h>
+
+struct link_map;
+
+extern void _dl_unmap (struct link_map *map);
+
+#define DL_UNMAP(map) _dl_unmap (map)
Index: sysdeps/x86_64/dl-machine.h
===================================================================
--- sysdeps/x86_64/dl-machine.h.orig
+++ sysdeps/x86_64/dl-machine.h
@@ -26,6 +26,7 @@
#include <sys/param.h>
#include <sysdep.h>
#include <tls.h>
+#include <dl-tlsdesc.h>
/* Return nonzero iff ELF header is compatible with the running host. */
static inline int __attribute__ ((unused))
@@ -131,6 +132,10 @@ elf_machine_runtime_setup (struct link_m
got[2] = (Elf64_Addr) &_dl_runtime_resolve;
}
+ if (l->l_info[ADDRIDX (DT_TLSDESC_GOT)] && lazy)
+ *(Elf64_Addr*)(D_PTR (l, l_info[ADDRIDX (DT_TLSDESC_GOT)]) + l->l_addr)
+ = (Elf64_Addr) &_dl_tlsdesc_resolve_rela;
+
return lazy;
}
@@ -194,7 +199,9 @@ _dl_start_user:\n\
# define elf_machine_type_class(type) \
((((type) == R_X86_64_JUMP_SLOT \
|| (type) == R_X86_64_DTPMOD64 \
- || (type) == R_X86_64_DTPOFF64 || (type) == R_X86_64_TPOFF64) \
+ || (type) == R_X86_64_DTPOFF64 \
+ || (type) == R_X86_64_TPOFF64 \
+ || (type) == R_X86_64_TLSDESC) \
* ELF_RTYPE_CLASS_PLT) \
| (((type) == R_X86_64_COPY) * ELF_RTYPE_CLASS_COPY))
#else
@@ -323,6 +330,41 @@ elf_machine_rela (struct link_map *map,
*reloc_addr = sym->st_value + reloc->r_addend;
# endif
break;
+ case R_X86_64_TLSDESC:
+ {
+ struct tlsdesc volatile *td =
+ (struct tlsdesc volatile *)reloc_addr;
+
+# ifndef RTLD_BOOTSTRAP
+ if (! sym)
+ {
+ td->arg = (void*)reloc->r_addend;
+ td->entry = _dl_tlsdesc_undefweak;
+ }
+ else
+# endif
+ {
+# ifndef RTLD_BOOTSTRAP
+# ifndef SHARED
+ CHECK_STATIC_TLS (map, sym_map);
+# else
+ if (!TRY_STATIC_TLS (map, sym_map))
+ {
+ td->arg = _dl_make_tlsdesc_dynamic
+ (sym_map, sym->st_value + reloc->r_addend);
+ td->entry = _dl_tlsdesc_dynamic;
+ }
+ else
+# endif
+# endif
+ {
+ td->arg = (void*)(sym->st_value - sym_map->l_tls_offset
+ + reloc->r_addend);
+ td->entry = _dl_tlsdesc_return;
+ }
+ }
+ break;
+ }
case R_X86_64_TPOFF64:
/* The offset is negative, forward from the thread pointer. */
# ifndef RTLD_BOOTSTRAP
@@ -435,6 +477,15 @@ elf_machine_lazy_rel (struct link_map *m
map->l_mach.plt
+ (((Elf64_Addr) reloc_addr) - map->l_mach.gotplt) * 2;
}
+ else if (__builtin_expect (r_type == R_X86_64_TLSDESC, 1))
+ {
+ struct tlsdesc volatile * __attribute__((__unused__)) td =
+ (struct tlsdesc volatile *)reloc_addr;
+
+ td->arg = (void*)reloc;
+ td->entry = (void*)(D_PTR (map, l_info[ADDRIDX (DT_TLSDESC_PLT)])
+ + map->l_addr);
+ }
else
_dl_reloc_bad_type (map, r_type, 1);
}
Index: sysdeps/x86_64/dl-tls.h
===================================================================
--- sysdeps/x86_64/dl-tls.h.orig
+++ sysdeps/x86_64/dl-tls.h
@@ -1,5 +1,5 @@
/* Thread-local storage handling in the ELF dynamic linker. x86-64 version.
- Copyright (C) 2002 Free Software Foundation, Inc.
+ Copyright (C) 2002, 2005 Free Software Foundation, Inc.
This file is part of the GNU C Library.
The GNU C Library is free software; you can redistribute it and/or
@@ -19,11 +19,13 @@
/* Type used for the representation of TLS information in the GOT. */
-typedef struct
+typedef struct dl_tls_index
{
unsigned long int ti_module;
unsigned long int ti_offset;
} tls_index;
+#ifdef SHARED
extern void *__tls_get_addr (tls_index *ti);
+#endif
Index: sysdeps/x86_64/dl-tlsdesc.S
===================================================================
--- /dev/null
+++ sysdeps/x86_64/dl-tlsdesc.S
@@ -0,0 +1,208 @@
+/* Thread-local storage handling in the ELF dynamic linker. x86_64 version.
+ Copyright (C) 2004, 2005 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+ 02111-1307 USA. */
+
+#include <sysdep.h>
+#include <tls.h>
+#include "tlsdesc.h"
+
+ .text
+#ifdef USE_TLS
+ .hidden _dl_tlsdesc_return
+ .global _dl_tlsdesc_return
+ .type _dl_tlsdesc_return,@function
+ cfi_startproc
+ .align 16
+_dl_tlsdesc_return:
+ movq 8(%rax), %rax
+ ret
+ cfi_endproc
+ .size _dl_tlsdesc_return, .-_dl_tlsdesc_return
+
+ .hidden _dl_tlsdesc_undefweak
+ .global _dl_tlsdesc_undefweak
+ .type _dl_tlsdesc_undefweak,@function
+ cfi_startproc
+ .align 16
+_dl_tlsdesc_undefweak:
+ movq 8(%rax), %rax
+ subq %fs:0, %rax
+ ret
+ cfi_endproc
+ .size _dl_tlsdesc_undefweak, .-_dl_tlsdesc_undefweak
+
+#ifdef SHARED
+ .hidden _dl_tlsdesc_dynamic
+ .global _dl_tlsdesc_dynamic
+ .type _dl_tlsdesc_dynamic,@function
+
+ /* %rax points to the TLS descriptor, such that 0(%rax) points to
+ _dl_tlsdesc_dynamic itself, and 8(%rax) points to a struct
+ tlsdesc_dynamic_arg object. It must return in %rax the offset
+ between the thread pointer and the object denoted by the
+ argument, without clobbering any registers.
+
+ The assembly code that follows is a rendition of the following
+ C code, hand-optimized a little bit.
+
+ptrdiff_t
+_dl_tlsdesc_dynamic (register struct tlsdesc *tdp asm ("%rax"))
+{
+ struct tlsdesc_dynamic_arg *td = tdp->arg;
+ dtv_t *dtv = *(dtv_t **)((char *)__thread_pointer + DTV_OFFSET);
+ if (__builtin_expect (td->gen_count <= dtv[0].counter
+ && (dtv[td->tlsinfo.ti_module].pointer.val
+ != TLS_DTV_UNALLOCATED),
+ 1))
+ return dtv[td->tlsinfo.ti_module].pointer.val + td->tlsinfo.ti_offset
+ - __thread_pointer;
+
+ return __tls_get_addr_internal (&td->tlsinfo) - __thread_pointer;
+}
+*/
+ cfi_startproc
+ .align 16
+_dl_tlsdesc_dynamic:
+ /* Preserve call-clobbered registers that we modify.
+ We need two scratch regs anyway. */
+ pushq %rsi
+ cfi_adjust_cfa_offset (8)
+ movq %fs:DTV_OFFSET, %rsi
+ pushq %rdi
+ cfi_adjust_cfa_offset (8)
+ movq TLSDESC_ARG(%rax), %rdi
+ movq (%rsi), %rax
+ cmpq %rax, TLSDESC_GEN_COUNT(%rdi)
+ ja .Lslow
+ movq TLSDESC_MODID(%rdi), %rax
+ salq $4, %rax
+ movq (%rax,%rsi), %rax
+ cmpq $-1, %rax
+ je .Lslow
+ addq TLSDESC_MODOFF(%rdi), %rax
+.Lret:
+ popq %rdi
+ cfi_adjust_cfa_offset (-8)
+ subq %fs:0, %rax
+ popq %rsi
+ cfi_adjust_cfa_offset (-8)
+ ret
+.Lslow:
+ /* Besides rdi and rsi, saved above, save rdx, rcx, r8, r9,
+ r10 and r11. Also, align the stack, that's off by 8 bytes. */
+ cfi_adjust_cfa_offset (16)
+ subq $56, %rsp
+ cfi_adjust_cfa_offset (56)
+ movq %rdx, 8(%rsp)
+ movq %rcx, 16(%rsp)
+ movq %r8, 24(%rsp)
+ movq %r9, 32(%rsp)
+ movq %r10, 40(%rsp)
+ movq %r11, 48(%rsp)
+ /* %rdi already points to the tlsinfo data structure. */
+ call __tls_get_addr@PLT
+ movq 8(%rsp), %rdx
+ movq 16(%rsp), %rcx
+ movq 24(%rsp), %r8
+ movq 32(%rsp), %r9
+ movq 40(%rsp), %r10
+ movq 48(%rsp), %r11
+ addq $56, %rsp
+ cfi_adjust_cfa_offset (-56)
+ jmp .Lret
+ cfi_endproc
+ .size _dl_tlsdesc_dynamic, .-_dl_tlsdesc_dynamic
+#endif /* SHARED */
+
+ .hidden _dl_tlsdesc_resolve_rela
+ .global _dl_tlsdesc_resolve_rela
+ .type _dl_tlsdesc_resolve_rela,@function
+ cfi_startproc
+ .align 16
+ /* The PLT entry will have pushed the link_map pointer. */
+ cfi_adjust_cfa_offset (8)
+_dl_tlsdesc_resolve_rela:
+ /* Save all call-clobbered registers. */
+ subq $72, %rsp
+ cfi_adjust_cfa_offset (72)
+ movq %rax, (%rsp)
+ movq %rdi, 8(%rsp)
+ movq %rax, %rdi /* Pass tlsdesc* in %rdi. */
+ movq %rsi, 16(%rsp)
+ movq 72(%rsp), %rsi /* Pass link_map* in %rsi. */
+ movq %r8, 24(%rsp)
+ movq %r9, 32(%rsp)
+ movq %r10, 40(%rsp)
+ movq %r11, 48(%rsp)
+ movq %rdx, 56(%rsp)
+ movq %rcx, 64(%rsp)
+ call _dl_tlsdesc_resolve_rela_fixup
+ movq (%rsp), %rax
+ movq 8(%rsp), %rdi
+ movq 16(%rsp), %rsi
+ movq 24(%rsp), %r8
+ movq 32(%rsp), %r9
+ movq 40(%rsp), %r10
+ movq 48(%rsp), %r11
+ movq 56(%rsp), %rdx
+ movq 64(%rsp), %rcx
+ addq $80, %rsp
+ cfi_adjust_cfa_offset (-80)
+ jmp *(%rax)
+ cfi_endproc
+ .size _dl_tlsdesc_resolve_rela, .-_dl_tlsdesc_resolve_rela
+
+ .hidden _dl_tlsdesc_resolve_hold
+ .global _dl_tlsdesc_resolve_hold
+ .type _dl_tlsdesc_resolve_hold,@function
+ cfi_startproc
+ .align 16
+_dl_tlsdesc_resolve_hold:
+0:
+ /* Save all call-clobbered registers. */
+ subq $72, %rsp
+ cfi_adjust_cfa_offset (72)
+ movq %rax, (%rsp)
+ movq %rdi, 8(%rsp)
+ movq %rax, %rdi /* Pass tlsdesc* in %rdi. */
+ movq %rsi, 16(%rsp)
+ movq $1f - 0b, %rsi /* Pass return address offset in %rsi. */
+ movq %r8, 24(%rsp)
+ movq %r9, 32(%rsp)
+ movq %r10, 40(%rsp)
+ movq %r11, 48(%rsp)
+ movq %rdx, 56(%rsp)
+ movq %rcx, 64(%rsp)
+ call _dl_tlsdesc_resolve_hold_fixup
+1:
+ movq (%rsp), %rax
+ movq 8(%rsp), %rdi
+ movq 16(%rsp), %rsi
+ movq 24(%rsp), %r8
+ movq 32(%rsp), %r9
+ movq 40(%rsp), %r10
+ movq 48(%rsp), %r11
+ movq 56(%rsp), %rdx
+ movq 64(%rsp), %rcx
+ addq $72, %rsp
+ cfi_adjust_cfa_offset (-72)
+ jmp *(%rax)
+ cfi_endproc
+ .size _dl_tlsdesc_resolve_hold, .-_dl_tlsdesc_resolve_hold
+
+#endif /* USE_TLS */
Index: sysdeps/x86_64/dl-tlsdesc.h
===================================================================
--- /dev/null
+++ sysdeps/x86_64/dl-tlsdesc.h
@@ -0,0 +1,63 @@
+/* Thread-local storage descriptor handling in the ELF dynamic linker.
+ x86_64 version.
+ Copyright (C) 2005 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+ 02111-1307 USA. */
+
+#ifndef _X86_64_DL_TLSDESC_H
+# define _X86_64_DL_TLSDESC_H 1
+
+/* Use this to access DT_TLSDESC_PLT and DT_TLSDESC_GOT. */
+#ifndef ADDRIDX
+# define ADDRIDX(tag) (DT_NUM + DT_THISPROCNUM + DT_VERSIONTAGNUM \
+ + DT_EXTRANUM + DT_VALNUM + DT_ADDRTAGIDX (tag))
+#endif
+
+/* Type used to represent a TLS descriptor in the GOT. */
+struct tlsdesc
+{
+ ptrdiff_t (*entry)(struct tlsdesc *on_rax);
+ void *arg;
+};
+
+typedef struct dl_tls_index
+{
+ unsigned long int ti_module;
+ unsigned long int ti_offset;
+} tls_index;
+
+/* Type used as the argument in a TLS descriptor for a symbol that
+ needs dynamic TLS offsets. */
+struct tlsdesc_dynamic_arg
+{
+ tls_index tlsinfo;
+ size_t gen_count;
+};
+
+extern ptrdiff_t attribute_hidden
+ _dl_tlsdesc_return(struct tlsdesc *on_rax),
+ _dl_tlsdesc_undefweak(struct tlsdesc *on_rax),
+ _dl_tlsdesc_resolve_rela(struct tlsdesc *on_rax),
+ _dl_tlsdesc_resolve_hold(struct tlsdesc *on_rax);
+
+# ifdef SHARED
+extern void *_dl_make_tlsdesc_dynamic (struct link_map *map, size_t ti_offset);
+
+extern ptrdiff_t attribute_hidden _dl_tlsdesc_dynamic(struct tlsdesc *);
+# endif
+
+#endif
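To make the descriptor protocol declared in dl-tlsdesc.h concrete, here is a rough C model of how a descriptor is used and how lazy resolution rewrites it. This is only a sketch under simplifying assumptions: the real entry points receive the descriptor in %rax and must preserve other registers, which plain C cannot express, and the demo_* names, the counter and the fixed offset of 24 are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct tlsdesc.  */
struct demo_tlsdesc
{
  ptrdiff_t (*entry) (struct demo_tlsdesc *);
  void *arg;
};

/* Model of _dl_tlsdesc_return: the argument word already holds the
   offset from the thread pointer, so just hand it back.  */
static ptrdiff_t
demo_return (struct demo_tlsdesc *td)
{
  return (ptrdiff_t) td->arg;
}

/* Hypothetical result of symbol lookup, and a counter so the example
   can show that resolution runs only once.  */
static ptrdiff_t demo_resolved_offset = 24;
static int demo_resolve_calls;

/* Model of a lazy resolver: compute the final argument, store it,
   switch the entry point, then continue through the new entry.  As in
   the fixup functions, arg is written before entry.  */
static ptrdiff_t
demo_lazy (struct demo_tlsdesc *td)
{
  demo_resolve_calls++;
  td->arg = (void *) demo_resolved_offset;
  td->entry = demo_return;
  return td->entry (td);
}

/* What a call site does: call through the descriptor and add the
   returned offset to the thread pointer.  */
static char *
demo_tls_address (char *thread_pointer, struct demo_tlsdesc *td)
{
  assert (td->entry != NULL);
  return thread_pointer + td->entry (td);
}
```

The first call through a descriptor initialized with demo_lazy pays for resolution; later calls go straight to demo_return. The sketch ignores concurrency entirely, which is what _dl_tlsdesc_resolve_hold and dl_load_lock take care of in the patch.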
Index: sysdeps/x86_64/tlsdesc.c
===================================================================
--- /dev/null
+++ sysdeps/x86_64/tlsdesc.c
@@ -0,0 +1,556 @@
+/* Manage TLS descriptors. x86_64 version.
+ Copyright (C) 2005 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, write to the Free
+ Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+ 02111-1307 USA. */
+
+#include <link.h>
+#include <ldsodefs.h>
+#include <elf/dynamic-link.h>
+#include <tls.h>
+#include <dl-tlsdesc.h>
+
+#ifdef USE_TLS
+# ifdef SHARED
+
+extern void weak_function free (void *ptr);
+
+/* The hashcode handling code below is heavily inspired by libiberty's
+ hashtab code, but with most adaptation points and support for
+ deleting elements removed.
+
+ Copyright (C) 1999, 2000, 2001, 2002, 2003 Free Software Foundation, Inc.
+ Contributed by Vladimir Makarov (vmakarov@cygnus.com). */
+
+inline static unsigned long
+higher_prime_number (unsigned long n)
+{
+ /* These are primes that are near, but slightly smaller than, a
+ power of two. */
+ static const unsigned long primes[] = {
+ (unsigned long) 7,
+ (unsigned long) 13,
+ (unsigned long) 31,
+ (unsigned long) 61,
+ (unsigned long) 127,
+ (unsigned long) 251,
+ (unsigned long) 509,
+ (unsigned long) 1021,
+ (unsigned long) 2039,
+ (unsigned long) 4093,
+ (unsigned long) 8191,
+ (unsigned long) 16381,
+ (unsigned long) 32749,
+ (unsigned long) 65521,
+ (unsigned long) 131071,
+ (unsigned long) 262139,
+ (unsigned long) 524287,
+ (unsigned long) 1048573,
+ (unsigned long) 2097143,
+ (unsigned long) 4194301,
+ (unsigned long) 8388593,
+ (unsigned long) 16777213,
+ (unsigned long) 33554393,
+ (unsigned long) 67108859,
+ (unsigned long) 134217689,
+ (unsigned long) 268435399,
+ (unsigned long) 536870909,
+ (unsigned long) 1073741789,
+ (unsigned long) 2147483647,
+ /* 4294967291L */
+ ((unsigned long) 2147483647) + ((unsigned long) 2147483644),
+ };
+
+ const unsigned long *low = &primes[0];
+ const unsigned long *high = &primes[sizeof(primes) / sizeof(primes[0])];
+
+ while (low != high)
+ {
+ const unsigned long *mid = low + (high - low) / 2;
+ if (n > *mid)
+ low = mid + 1;
+ else
+ high = mid;
+ }
+
+#if 0
+ /* If we've run out of primes, abort. */
+ if (n > *low)
+ {
+ fprintf (stderr, "Cannot find prime bigger than %lu\n", n);
+ abort ();
+ }
+#endif
+
+ return *low;
+}
+
+struct hashtab
+{
+ /* Table itself. */
+ void **entries;
+
+ /* Current size (in entries) of the hash table */
+ size_t size;
+
+ /* Current number of elements. */
+ size_t n_elements;
+};
+
+inline static struct hashtab *
+htab_create (void)
+{
+ struct hashtab *ht = malloc (sizeof (struct hashtab));
+
+ if (! ht)
+ return NULL;
+ ht->size = 3;
+ ht->entries = malloc (sizeof (void *) * ht->size);
+ if (! ht->entries)
+ return NULL;
+
+ ht->n_elements = 0;
+
+ memset (ht->entries, 0, sizeof (void *) * ht->size);
+
+ return ht;
+}
+
+/* This is only called from _dl_unmap, so it's safe to call
+ free(). See the discussion below. */
+inline static void
+htab_delete (struct hashtab *htab)
+{
+ int i;
+
+ for (i = htab->size - 1; i >= 0; i--)
+ if (htab->entries[i])
+ free (htab->entries[i]);
+
+ free (htab->entries);
+ free (htab);
+}
+
+/* Similar to htab_find_slot, but without several unwanted side effects:
+ - Does not call htab->eq_f when it finds an existing entry.
+ - Does not change the count of elements/searches/collisions in the
+ hash table.
+ This function also assumes there are no deleted entries in the table.
+ HASH is the hash value for the element to be inserted. */
+
+inline static void **
+find_empty_slot_for_expand (struct hashtab *htab, int hash)
+{
+ size_t size = htab->size;
+ unsigned int index = hash % size;
+ void **slot = htab->entries + index;
+ int hash2;
+
+ if (! *slot)
+ return slot;
+
+ hash2 = 1 + hash % (size - 2);
+ for (;;)
+ {
+ index += hash2;
+ if (index >= size)
+ index -= size;
+
+ slot = htab->entries + index;
+ if (! *slot)
+ return slot;
+ }
+}
+
+/* The following function changes size of memory allocated for the
+ entries and repeatedly inserts the table elements. The occupancy
+ of the table after the call will be about 50%. Naturally the hash
+ table must already exist. Remember also that the place of the
+ table entries is changed. If memory allocation failures are allowed,
+ this function will return zero, indicating that the table could not be
+ expanded. If all goes well, it will return a non-zero value. */
+
+inline static int
+htab_expand (struct hashtab *htab, int (*hash_fn)(void *))
+{
+ void **oentries;
+ void **olimit;
+ void **p;
+ void **nentries;
+ size_t nsize;
+
+ oentries = htab->entries;
+ olimit = oentries + htab->size;
+
+ /* Resize only when table after removal of unused elements is either
+ too full or too empty. */
+ if (htab->n_elements * 2 > htab->size)
+ nsize = higher_prime_number (htab->n_elements * 2);
+ else
+ nsize = htab->size;
+
+ nentries = malloc (sizeof (void *) * nsize);
+ if (nentries == NULL)
+ return 0;
+ memset (nentries, 0, sizeof (void *) * nsize);
+ htab->entries = nentries;
+ htab->size = nsize;
+
+ p = oentries;
+ do
+ {
+ if (*p)
+ *find_empty_slot_for_expand (htab, hash_fn (*p))
+ = *p;
+
+ p++;
+ }
+ while (p < olimit);
+
+#if 0 /* We can't tell whether this was allocated by the malloc()
+ built into ld.so or the one in the main executable or libc,
+ and calling free() for something that wasn't malloc()ed could
+ do Very Bad Things (TM). Take the conservative approach
+ here, potentially wasting as much memory as actually used by
+ the hash table, even if multiple growths occur. That's not
+ so bad as to require some overengineered solution that would
+ enable us to keep track of how it was allocated. */
+ free (oentries);
+#endif
+ return 1;
+}
+
+/* This function searches for a hash table slot containing an entry
+ equal to the given element. To delete an entry, call this with
+ INSERT = 0, then call htab_clear_slot on the slot returned (possibly
+ after doing some checks). To insert an entry, call this with
+ INSERT = 1, then write the value you want into the returned slot.
+ When inserting an entry, NULL may be returned if memory allocation
+ fails. */
+
+inline static void **
+htab_find_slot (struct hashtab *htab, void *ptr, int insert,
+ int (*hash_fn)(void *), int (*eq_fn)(void *, void *))
+{
+ unsigned int index;
+ int hash, hash2;
+ size_t size;
+ void **entry;
+
+ if (htab->size * 3 <= htab->n_elements * 4
+ && htab_expand (htab, hash_fn) == 0)
+ return NULL;
+
+ hash = hash_fn (ptr);
+
+ size = htab->size;
+ index = hash % size;
+
+ entry = &htab->entries[index];
+ if (!*entry)
+ goto empty_entry;
+ else if (eq_fn (*entry, ptr))
+ return entry;
+
+ hash2 = 1 + hash % (size - 2);
+ for (;;)
+ {
+ index += hash2;
+ if (index >= size)
+ index -= size;
+
+ entry = &htab->entries[index];
+ if (!*entry)
+ goto empty_entry;
+ else if (eq_fn (*entry, ptr))
+ return entry;
+ }
+
+ empty_entry:
+ if (!insert)
+ return NULL;
+
+ htab->n_elements++;
+ return entry;
+}
+
+inline static int
+hash_tlsdesc(void *p)
+{
+ struct tlsdesc_dynamic_arg *td = p;
+
+ /* We know all entries are for the same module, so ti_offset is the
+ only distinguishing entry. */
+ return td->tlsinfo.ti_offset;
+}
+
+inline static int
+eq_tlsdesc(void *p, void *q)
+{
+ struct tlsdesc_dynamic_arg *tdp = p, *tdq = q;
+
+ return tdp->tlsinfo.ti_offset == tdq->tlsinfo.ti_offset;
+}
+
+inline static int
+map_generation (struct link_map *map)
+{
+ size_t idx = map->l_tls_modid;
+ struct dtv_slotinfo_list *listp = GL(dl_tls_dtv_slotinfo_list);
+
+ /* Find the place in the dtv slotinfo list. */
+ do
+ {
+ /* Does it fit in the array of this list element? */
+ if (idx < listp->len)
+ {
+ /* We should never get here for a module in static TLS, so
+ we can assume that, if the generation count is zero, we
+ still haven't determined the generation count for this
+ module. */
+ if (listp->slotinfo[idx].gen)
+ return listp->slotinfo[idx].gen;
+ else
+ break;
+ }
+ idx -= listp->len;
+ listp = listp->next;
+ }
+ while (listp != NULL);
+
+ /* If we get to this point, the module still hasn't been assigned an
+ entry in the dtv slotinfo data structures, and it will when we're
+ done with relocations. At that point, the module will get a
+ generation number that is one past the current generation, so
+ return exactly that. */
+ return GL(dl_tls_generation) + 1;
+}
+
+void *
+_dl_make_tlsdesc_dynamic (struct link_map *map, size_t ti_offset)
+{
+ struct hashtab *ht;
+ void **entry;
+ struct tlsdesc_dynamic_arg *td, test;
+
+ /* FIXME: We could use a per-map lock here, but is it worth it? */
+ __rtld_lock_lock_recursive (GL(dl_load_lock));
+
+ ht = map->l_mach.tlsdesc_table;
+ if (! ht)
+ {
+ ht = htab_create ();
+ if (! ht)
+ {
+ __rtld_lock_unlock_recursive (GL(dl_load_lock));
+ return 0;
+ }
+ map->l_mach.tlsdesc_table = ht;
+ }
+
+ test.tlsinfo.ti_module = map->l_tls_modid;
+ test.tlsinfo.ti_offset = ti_offset;
+ entry = htab_find_slot (ht, &test, 1, hash_tlsdesc, eq_tlsdesc);
+ if (! entry || *entry)
+ {
+ td = entry ? *entry : NULL;
+ __rtld_lock_unlock_recursive (GL(dl_load_lock));
+ return td;
+ }
+
+ *entry = td = malloc (sizeof (struct tlsdesc_dynamic_arg));
+ /* This may be higher than the map's generation, but it doesn't
+ matter much. Worst case, we'll have one extra DTV update per
+ thread. */
+ td->gen_count = map_generation (map);
+ td->tlsinfo = test.tlsinfo;
+
+ __rtld_lock_unlock_recursive (GL(dl_load_lock));
+ return td;
+}
+
+# endif /* SHARED */
+
+/* The idea of the following two functions is to stop multiple threads
+ from attempting to resolve the same TLS descriptor without busy
+ waiting. Ideally, we should be able to release the lock right
+ after changing td->entry, and then using say a condition variable
+ or a futex wake to wake up any waiting threads, but let's try to
+ avoid introducing such dependencies. */
+
+inline static int
+_dl_tlsdesc_resolve_early_return_p (struct tlsdesc volatile *td, void *caller)
+{
+ if (caller != td->entry)
+ return 1;
+
+ __rtld_lock_lock_recursive (GL(dl_load_lock));
+ if (caller != td->entry)
+ {
+ __rtld_lock_unlock_recursive (GL(dl_load_lock));
+ return 1;
+ }
+
+ td->entry = _dl_tlsdesc_resolve_hold;
+
+ return 0;
+}
+
+inline static void
+_dl_tlsdesc_wake_up_held_fixups (void)
+{
+ __rtld_lock_unlock_recursive (GL(dl_load_lock));
+}
+
+/* The following 2 functions take an entry_check_offset argument.
+ It's computed by the caller as an offset between its entry point
+ and the call site, such that by adding the built-in return address
+ that is implicitly passed to the function with this offset, we can
+ easily obtain the caller's entry point to compare with the entry
+ point given in the TLS descriptor. If it's changed, we want to
+ return immediately. */
+
+/* These macros are copied from elf/dl-reloc.c */
+
+#define CHECK_STATIC_TLS(map, sym_map) \
+ do { \
+ if (__builtin_expect ((sym_map)->l_tls_offset == NO_TLS_OFFSET \
+ || ((sym_map)->l_tls_offset \
+ == FORCED_DYNAMIC_TLS_OFFSET), 0)) \
+ _dl_allocate_static_tls (sym_map); \
+ } while (0)
+
+#define TRY_STATIC_TLS(map, sym_map) \
+ (__builtin_expect ((sym_map)->l_tls_offset \
+ != FORCED_DYNAMIC_TLS_OFFSET, 1) \
+ && (__builtin_expect ((sym_map)->l_tls_offset != NO_TLS_OFFSET, 1) \
+ || _dl_try_allocate_static_tls (sym_map) == 0))
+
+int internal_function _dl_try_allocate_static_tls (struct link_map *map);
+
+/* This function is used to lazily resolve TLS_DESC RELA relocations.
+ The argument location is used to hold a pointer to the relocation. */
+
+void
+attribute_hidden
+_dl_tlsdesc_resolve_rela_fixup (struct tlsdesc volatile *td,
+ struct link_map *l)
+{
+ const ElfW(Rela) *reloc = td->arg;
+
+ if (_dl_tlsdesc_resolve_early_return_p
+ (td, (void*)(D_PTR (l, l_info[ADDRIDX (DT_TLSDESC_PLT)]) + l->l_addr)))
+ return;
+
+ /* The code below was borrowed from _dl_fixup(). */
+ const ElfW(Sym) *const symtab
+ = (const void *) D_PTR (l, l_info[DT_SYMTAB]);
+ const char *strtab = (const void *) D_PTR (l, l_info[DT_STRTAB]);
+ const ElfW(Sym) *sym = &symtab[ELFW(R_SYM) (reloc->r_info)];
+ lookup_t result;
+
+ /* Look up the target symbol. If the normal lookup rules are not
+ used don't look in the global scope. */
+ if (ELFW(ST_BIND) (sym->st_info) != STB_LOCAL
+ && __builtin_expect (ELFW(ST_VISIBILITY) (sym->st_other), 0) == 0)
+ {
+ const struct r_found_version *version = NULL;
+
+ if (l->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
+ {
+ const ElfW(Half) *vernum =
+ (const void *) D_PTR (l, l_info[VERSYMIDX (DT_VERSYM)]);
+ ElfW(Half) ndx = vernum[ELFW(R_SYM) (reloc->r_info)] & 0x7fff;
+ version = &l->l_versions[ndx];
+ if (version->hash == 0)
+ version = NULL;
+ }
+
+ result = _dl_lookup_symbol_x (strtab + sym->st_name, l, &sym,
+ l->l_scope, version, ELF_RTYPE_CLASS_PLT,
+ DL_LOOKUP_ADD_DEPENDENCY, NULL);
+ }
+ else
+ {
+ /* We already found the symbol. The module (and therefore its load
+ address) is also known. */
+ result = l;
+ }
+
+ if (! sym)
+ {
+ td->arg = (void*)reloc->r_addend;
+ td->entry = _dl_tlsdesc_undefweak;
+ }
+ else
+ {
+# ifndef SHARED
+ CHECK_STATIC_TLS (l, result);
+# else
+ if (!TRY_STATIC_TLS (l, result))
+ {
+ td->arg = _dl_make_tlsdesc_dynamic (result, sym->st_value
+ + reloc->r_addend);
+ td->entry = _dl_tlsdesc_dynamic;
+ }
+ else
+# endif
+ {
+ td->arg = (void*)(sym->st_value - result->l_tls_offset
+ + reloc->r_addend);
+ td->entry = _dl_tlsdesc_return;
+ }
+ }
+
+ _dl_tlsdesc_wake_up_held_fixups ();
+}
+
+void
+attribute_hidden
+_dl_tlsdesc_resolve_hold_fixup (struct tlsdesc volatile *td,
+ ptrdiff_t entry_check_offset)
+{
+ /* Maybe we're lucky and can return early. */
+ if (__builtin_return_address (0) - entry_check_offset != td->entry)
+ return;
+
+ /* Locking here will stop execution until the running resolver runs
+ _dl_tlsdesc_wake_up_held_fixups(), releasing the lock.
+
+ FIXME: We'd be better off waiting on a condition variable, such
+ that we didn't have to hold the lock throughout the relocation
+ processing. */
+ __rtld_lock_lock_recursive (GL(dl_load_lock));
+ __rtld_lock_unlock_recursive (GL(dl_load_lock));
+}
+
+#endif /* USE_TLS */
+
+void
+_dl_unmap (struct link_map *map)
+{
+ __munmap ((void *) (map)->l_map_start,
+ (map)->l_map_end - (map)->l_map_start);
+
+#if USE_TLS && SHARED
+ /* _dl_unmap is only called for dlopen()ed libraries, for which
+ calling free() is safe, or before we've completed the initial
+ relocation, in which case calling free() is probably pointless,
+ but still safe. */
+ if (map->l_mach.tlsdesc_table)
+ htab_delete (map->l_mach.tlsdesc_table);
+#endif
+}
Index: sysdeps/x86_64/tlsdesc.sym
===================================================================
--- /dev/null
+++ sysdeps/x86_64/tlsdesc.sym
@@ -0,0 +1,20 @@
+#include <stddef.h>
+#include <sysdep.h>
+#include <tls.h>
+#include <link.h>
+#include <dl-tlsdesc.h>
+
+--
+
+-- Abuse tls.h macros to derive offsets relative to the thread register.
+#if defined USE_TLS
+
+DTV_OFFSET offsetof(struct pthread, header.dtv)
+
+TLSDESC_ARG offsetof(struct tlsdesc, arg)
+
+TLSDESC_GEN_COUNT offsetof(struct tlsdesc_dynamic_arg, gen_count)
+TLSDESC_MODID offsetof(struct tlsdesc_dynamic_arg, tlsinfo.ti_module)
+TLSDESC_MODOFF offsetof(struct tlsdesc_dynamic_arg, tlsinfo.ti_offset)
+
+#endif
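
For anyone reading the htab_find_slot code in the patch above, here is a
minimal standalone sketch of the same probing scheme: open addressing
with double hashing, where the second hash gives a step of
1 + hash % (size - 2) and the table size is prime so every step visits
all slots. This is not part of the patch. The names demo_tab,
demo_find_slot, demo_hash and demo_eq, and the fixed size of 7, are
made up for illustration, and unlike the real code the sketch never
expands the table, so the probe loop is bounded instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed-size table for illustration; the patch uses a
   growable struct hashtab instead.  The size is prime, so any probe
   step in 1..DEMO_SIZE-2 is coprime with it.  */
#define DEMO_SIZE 7

struct demo_tab
{
  void *entries[DEMO_SIZE];
  size_t n_elements;
};

/* Illustrative hash and equality callbacks keyed on an unsigned int.  */
static unsigned int
demo_hash (void *p)
{
  return *(unsigned int *) p;
}

static int
demo_eq (void *p, void *q)
{
  return *(unsigned int *) p == *(unsigned int *) q;
}

/* Return the slot holding an entry equal to PTR.  If there is none,
   return the first empty slot on PTR's probe sequence when INSERT is
   nonzero (the caller must then store into it), or NULL otherwise.
   NULL is also returned if the table is full and PTR is absent.  */
static void **
demo_find_slot (struct demo_tab *t, void *ptr, int insert,
                unsigned int (*hash_fn) (void *),
                int (*eq_fn) (void *, void *))
{
  size_t size = DEMO_SIZE;
  unsigned int hash = hash_fn (ptr);
  size_t index = hash % size;
  size_t step = 1 + hash % (size - 2);   /* second hash: 1..size-2 */
  size_t probes;

  assert (t != NULL);

  for (probes = 0; probes < size; probes++)
    {
      void **entry = &t->entries[index];

      if (*entry == NULL)
        {
          if (!insert)
            return NULL;
          t->n_elements++;
          return entry;
        }
      if (eq_fn (*entry, ptr))
        return entry;

      /* Advance by the second hash, wrapping around the table.  */
      index += step;
      if (index >= size)
        index -= size;
    }

  return NULL;
}
```

As in the patch, an insertion is two steps: ask for the slot with
insert set, then write the pointer into the returned slot.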
--
Alexandre Oliva http://www.lsd.ic.unicamp.br/~oliva/
Red Hat Compiler Engineer aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist oliva@{lsd.ic.unicamp.br, gnu.org}