This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: RFC: TLS improvements for IA32 and AMD64/EM64T


On Oct 10, 2006, Alexandre Oliva <aoliva@redhat.com> wrote:

> On Feb  6, 2006, Alexandre Oliva <aoliva@redhat.com> wrote:
>> On Sep 22, 2005, Alexandre Oliva <aoliva@redhat.com> wrote:
>>> On Sep 17, 2005, Alexandre Oliva <aoliva@redhat.com> wrote:
>>>> On Sep 16, 2005, Alexandre Oliva <aoliva@redhat.com> wrote:
>>>>> On Sep 16, 2005, Alexandre Oliva <aoliva@redhat.com> wrote:
>>>>>> Over the past few months, I've been working on porting to IA32 and
>>>>>> AMD64/EM64T the interesting bits of the TLS design I came up with for
>>>>>> FR-V, achieving some impressive speedups along with slight code size
>>>>>> reductions in the most common cases.

>>>>>> Although the design is not set in stone yet, it's fully implemented
>>>>>> and functional with patches I'm about to post for binutils, gcc and
>>>>>> glibc mainline, as follow-ups to this message, except that the GCC
>>>>>> patch will go to gcc-patches, as expected.

>>>>> This is the glibc portion of the implementation.  I've built it for
>>>>> amd64-linux-gnu and i686-pc-linux-gnu, with both the current TLS
>>>>> dialect and the new -mtls-dialect=gnu2, without regressions.

>>>> Revised patch using new relocation and dynamic entry numbers.  Tested
>>>> again on all 4 combinations mentioned above.

Updated arranging for more code to be shared between x86 and x86_64
ports, and adding more comments.  Re-tested using a GCC 4.3-ish
compiler and binutils as in Fedora 8.

for ChangeLog
2008-01-30  Alexandre Oliva  <aoliva@redhat.com>

	Introduce TLS descriptors for i386 and x86_64.
	* include/inline-hashtab.h: New file, copied from 2005's
	libiberty, with fix for memory leak imported afterwards by
	Glauber de Oliveira Costa.
	* elf/tlsdeschtab.h: New file.
	* elf/dl-reloc.c (_dl_try_allocate_static_tls): Extract from...
	(_dl_allocate_static_tls): ... here.  Rearrange failure path.
	(CHECK_STATIC_TLS): Move to...
	* elf/dynamic-link.h: ... this file.
	(TRY_STATIC_TLS): New macro.
	* elf/dl-conflict.c (CHECK_STATIC_TLS, TRY_STATIC_TLS): Override.
	* elf/elf.h (R_386_TLS_GOTDESC, R_386_TLS_DESC_CALL,
	R_386_TLS_DESC): Define.
	(R_X86_64_PC64, R_X86_GOTOFF64, R_X86_64_GOTPC32): Merge from
	binutils.
	(R_X86_64_GOTPC32_TLSDESC, R_X86_64_TLSDESC_CALL,
	R_X86_64_TLSDESC): Define.
	(R_386_NUM, R_X86_64_NUM): Adjust.
	* sysdeps/i386/Makefile (sysdep-dl-routines, sysdep_routines,
	systep-rtld-routines): Add tlsdesc and dl-tlsdesc for elf subdir.
	(gen-as-const-headers): Add tlsdesc.sym to csu subdir.
	* sysdeps/i386/dl-lookupcfg.h: New file.  Introduce _dl_unmap to
	release tlsdesc_table.
	* sysdeps/i386/dl-machine.h: Include dl-tlsdesc.h.
	(elf_machine_type_class): Mark R_386_TLS_DESC as PLT class.
	(elf_machine_rel): Handle R_386_TLS_DESC.
	(elf_machine_rela): Likewise.
	(elf_machine_lazy_rel): Likewise.
	(elf_machine_lazy_rela): Likewise.
	* sysdeps/i386/dl-tls.h (struct dl_tls_index): Name it.
	* sysdeps/i386/dl-tlsdesc.S: New file.
	* sysdeps/i386/dl-tlsdesc.h: New file.
	* sysdeps/i386/tlsdesc.c: New file.
	* sysdeps/i386/tlsdesc.sym: New file.
	* sysdeps/i386/bits/linkmap.h (struct link_map_machine): Add
	tlsdesc_table.
	* sysdeps/x86_64/Makefile (sysdep-dl-routines, sysdep_routines,
	systep-rtld-routines): Add tlsdesc and dl-tlsdesc for elf subdir.
	(gen-as-const-headers): Add tlsdesc.sym to csu subdir.
	* sysdeps/x86_64/dl-lookupcfg.h: New file.  Introduce _dl_unmap to
	release tlsdesc_table.
	* sysdeps/x86_64/dl-machine.h: Include dl-tlsdesc.h.
	(elf_machine_runtime_setup): Set up lazy TLSDESC GOT entry.
	(elf_machine_type_class): Mark R_X86_64_TLSDESC as PLT class.
	(elf_machine_rel): Handle R_X86_64_TLSDESC.
	(elf_machine_rela): Likewise.
	(elf_machine_lazy_rel): Likewise.
	* sysdeps/x86_64/dl-tls.h (struct dl_tls_index): Name it.
	(__tls_get_addr): Do not declare for non-shared compiles.
	* sysdeps/x86_64/dl-tlsdesc.S: New file.
	* sysdeps/x86_64/dl-tlsdesc.h: New file.
	* sysdeps/x86_64/tlsdesc.c: New file.
	* sysdeps/x86_64/tlsdesc.sym: New file.
	* sysdeps/x86_64/bits/linkmap.h (struct link_map_machine): Add
	tlsdesc_table for both 32- and 64-bit structs.

Index: elf/elf.h
===================================================================
--- elf/elf.h.orig	2008-01-31 03:37:39.000000000 -0200
+++ elf/elf.h	2008-01-31 03:37:42.000000000 -0200
@@ -1157,8 +1157,17 @@ typedef struct
 #define R_386_TLS_DTPMOD32 35		/* ID of module containing symbol */
 #define R_386_TLS_DTPOFF32 36		/* Offset in TLS block */
 #define R_386_TLS_TPOFF32  37		/* Negated offset in static TLS block */
+/* 38? */
+#define R_386_TLS_GOTDESC  39		/* GOT offset for TLS descriptor.  */
+#define R_386_TLS_DESC_CALL 40		/* Marker of call through TLS
+					   descriptor for
+					   relaxation.  */
+#define R_386_TLS_DESC     41		/* TLS descriptor containing
+					   pointer to code and to
+					   argument, returning the TLS
+					   offset for the symbol.  */
 /* Keep this the last entry.  */
-#define R_386_NUM	   38
+#define R_386_NUM	   42
 
 /* SUN SPARC specific definitions.  */
 
@@ -2555,8 +2564,17 @@ typedef Elf32_Addr Elf32_Conflict;
 #define R_X86_64_GOTTPOFF	22	/* 32 bit signed PC relative offset
 					   to GOT entry for IE symbol */
 #define R_X86_64_TPOFF32	23	/* Offset in initial TLS block */
+#define R_X86_64_PC64		24	/* PC relative 64 bit */
+#define R_X86_64_GOTOFF64	25	/* 64 bit offset to GOT */
+#define R_X86_64_GOTPC32	26	/* 32 bit signed pc relative
+					   offset to GOT */
+/* 27 .. 33 */
+#define R_X86_64_GOTPC32_TLSDESC 34	/* GOT offset for TLS descriptor.  */
+#define R_X86_64_TLSDESC_CALL   35	/* Marker for call through TLS
+					   descriptor.  */
+#define R_X86_64_TLSDESC        36	/* TLS descriptor.  */
 
-#define R_X86_64_NUM		24
+#define R_X86_64_NUM		37
 
 
 /* AM33 relocations.  */
Index: sysdeps/i386/Makefile
===================================================================
--- sysdeps/i386/Makefile.orig	2008-01-31 03:37:39.000000000 -0200
+++ sysdeps/i386/Makefile	2008-01-31 03:37:42.000000000 -0200
@@ -65,3 +65,13 @@ endif
 ifneq (,$(filter -mno-tls-direct-seg-refs,$(CFLAGS)))
 defines += -DNO_TLS_DIRECT_SEG_REFS
 endif
+
+ifeq ($(subdir),elf)
+sysdep-dl-routines += tlsdesc dl-tlsdesc
+sysdep_routines += tlsdesc dl-tlsdesc
+sysdep-rtld-routines += tlsdesc dl-tlsdesc
+endif
+
+ifeq ($(subdir),csu)
+gen-as-const-headers += tlsdesc.sym
+endif
Index: sysdeps/i386/bits/linkmap.h
===================================================================
--- sysdeps/i386/bits/linkmap.h.orig	2008-01-31 03:37:39.000000000 -0200
+++ sysdeps/i386/bits/linkmap.h	2008-01-31 03:37:42.000000000 -0200
@@ -2,4 +2,5 @@ struct link_map_machine
   {
     Elf32_Addr plt; /* Address of .plt + 0x16 */
     Elf32_Addr gotplt; /* Address of .got + 0x0c */
+    void *tlsdesc_table; /* Address of TLS descriptor hash table.  */
   };
Index: sysdeps/i386/dl-lookupcfg.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ sysdeps/i386/dl-lookupcfg.h	2008-01-31 03:37:42.000000000 -0200
@@ -0,0 +1,28 @@
+/* Configuration of lookup functions.
+   Copyright (C) 2005, 2008 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#define DL_UNMAP_IS_SPECIAL
+
+#include_next <dl-lookupcfg.h>
+
+struct link_map;
+
+extern void internal_function _dl_unmap (struct link_map *map);
+
+#define DL_UNMAP(map) _dl_unmap (map)
Index: sysdeps/i386/dl-machine.h
===================================================================
--- sysdeps/i386/dl-machine.h.orig	2008-01-31 03:37:39.000000000 -0200
+++ sysdeps/i386/dl-machine.h	2008-01-31 03:37:42.000000000 -0200
@@ -25,6 +25,7 @@
 #include <sys/param.h>
 #include <sysdep.h>
 #include <tls.h>
+#include <dl-tlsdesc.h>
 
 /* Return nonzero iff ELF header is compatible with the running host.  */
 static inline int __attribute__ ((unused))
@@ -246,7 +247,7 @@ _dl_start_user:\n\
 # define elf_machine_type_class(type) \
   ((((type) == R_386_JMP_SLOT || (type) == R_386_TLS_DTPMOD32		      \
      || (type) == R_386_TLS_DTPOFF32 || (type) == R_386_TLS_TPOFF32	      \
-     || (type) == R_386_TLS_TPOFF)					      \
+     || (type) == R_386_TLS_TPOFF || (type) == R_386_TLS_DESC)		      \
     * ELF_RTYPE_CLASS_PLT)						      \
    | (((type) == R_386_COPY) * ELF_RTYPE_CLASS_COPY))
 #else
@@ -373,6 +374,38 @@ elf_machine_rel (struct link_map *map, c
 	    *reloc_addr = sym->st_value;
 # endif
 	  break;
+	case R_386_TLS_DESC:
+	  {
+	    struct tlsdesc volatile *td =
+	      (struct tlsdesc volatile *)reloc_addr;
+
+# ifndef RTLD_BOOTSTRAP
+	    if (! sym)
+	      td->entry = _dl_tlsdesc_undefweak;
+	    else
+# endif
+	      {
+# ifndef RTLD_BOOTSTRAP
+#  ifndef SHARED
+		CHECK_STATIC_TLS (map, sym_map);
+#  else
+		if (!TRY_STATIC_TLS (map, sym_map))
+		  {
+		    td->arg = _dl_make_tlsdesc_dynamic
+		      (sym_map, sym->st_value + (ElfW(Word))td->arg);
+		    td->entry = _dl_tlsdesc_dynamic;
+		  }
+		else
+#  endif
+# endif
+		  {
+		    td->arg = (void*)(sym->st_value - sym_map->l_tls_offset
+				      + (ElfW(Word))td->arg);
+		    td->entry = _dl_tlsdesc_return;
+		  }
+	      }
+	    break;
+	  }
 	case R_386_TLS_TPOFF32:
 	  /* The offset is positive, backward from the thread pointer.  */
 # ifdef RTLD_BOOTSTRAP
@@ -485,6 +518,41 @@ elf_machine_rela (struct link_map *map, 
 	     Therefore the offset is already correct.  */
 	  *reloc_addr = (sym == NULL ? 0 : sym->st_value) + reloc->r_addend;
 	  break;
+	case R_386_TLS_DESC:
+	  {
+	    struct tlsdesc volatile *td =
+	      (struct tlsdesc volatile *)reloc_addr;
+
+# ifndef RTLD_BOOTSTRAP
+	    if (!sym)
+	      {
+		td->arg = (void*)reloc->r_addend;
+		td->entry = _dl_tlsdesc_undefweak;
+	      }
+	    else
+# endif
+	      {
+# ifndef RTLD_BOOTSTRAP
+#  ifndef SHARED
+		CHECK_STATIC_TLS (map, sym_map);
+#  else
+		if (!TRY_STATIC_TLS (map, sym_map))
+		  {
+		    td->arg = _dl_make_tlsdesc_dynamic
+		      (sym_map, sym->st_value + reloc->r_addend);
+		    td->entry = _dl_tlsdesc_dynamic;
+		  }
+		else
+#  endif
+# endif
+		  {
+		    td->arg = (void*)(sym->st_value - sym_map->l_tls_offset
+				      + reloc->r_addend);
+		    td->entry = _dl_tlsdesc_return;
+		  }
+	      }
+	  }
+	  break;
 	case R_386_TLS_TPOFF32:
 	  /* The offset is positive, backward from the thread pointer.  */
 	  /* We know the offset of object the symbol is contained in.
@@ -578,6 +646,53 @@ elf_machine_lazy_rel (struct link_map *m
 	*reloc_addr = (map->l_mach.plt
 		       + (((Elf32_Addr) reloc_addr) - map->l_mach.gotplt) * 4);
     }
+  else if (__builtin_expect (r_type == R_386_TLS_DESC, 1))
+    {
+      struct tlsdesc volatile * __attribute__((__unused__)) td =
+	(struct tlsdesc volatile *)reloc_addr;
+
+      /* Handle relocations that reference the local *ABS* in a simple
+	 way, so as to preserve a potential addend.  */
+      if (ELF32_R_SYM (reloc->r_info) == 0)
+	td->entry = _dl_tlsdesc_resolve_abs_plus_addend;
+      /* Given a known-zero addend, we can store a pointer to the
+	 reloc in the arg position.  */
+      else if (td->arg == 0)
+	{
+	  td->arg = (void*)reloc;
+	  td->entry = _dl_tlsdesc_resolve_rel;
+	}
+      else
+	{
+	  /* We could handle non-*ABS* relocations with non-zero addends
+	     by allocating dynamically an arg to hold a pointer to the
+	     reloc, but that sounds pointless.  */
+	  const Elf32_Rel *const r = reloc;
+	  /* The code below was borrowed from elf_dynamic_do_rel().  */
+	  const ElfW(Sym) *const symtab =
+	    (const void *) D_PTR (map, l_info[DT_SYMTAB]);
+
+#ifdef RTLD_BOOTSTRAP
+	  /* The dynamic linker always uses versioning.  */
+	  assert (map->l_info[VERSYMIDX (DT_VERSYM)] != NULL);
+#else
+	  if (map->l_info[VERSYMIDX (DT_VERSYM)])
+#endif
+	    {
+	      const ElfW(Half) *const version =
+		(const void *) D_PTR (map, l_info[VERSYMIDX (DT_VERSYM)]);
+	      ElfW(Half) ndx = version[ELFW(R_SYM) (r->r_info)] & 0x7fff;
+	      elf_machine_rel (map, r, &symtab[ELFW(R_SYM) (r->r_info)],
+			       &map->l_versions[ndx],
+			       (void *) (l_addr + r->r_offset));
+	    }
+#ifndef RTLD_BOOTSTRAP
+	  else
+	    elf_machine_rel (map, r, &symtab[ELFW(R_SYM) (r->r_info)], NULL,
+			     (void *) (l_addr + r->r_offset));
+#endif
+	}
+    }
   else
     _dl_reloc_bad_type (map, r_type, 1);
 }
@@ -589,6 +704,20 @@ __attribute__ ((always_inline))
 elf_machine_lazy_rela (struct link_map *map,
 		       Elf32_Addr l_addr, const Elf32_Rela *reloc)
 {
+  Elf32_Addr *const reloc_addr = (void *) (l_addr + reloc->r_offset);
+  const unsigned int r_type = ELF32_R_TYPE (reloc->r_info);
+  if (__builtin_expect (r_type == R_386_JMP_SLOT, 1))
+    ;
+  else if (__builtin_expect (r_type == R_386_TLS_DESC, 1))
+    {
+      struct tlsdesc volatile * __attribute__((__unused__)) td =
+	(struct tlsdesc volatile *)reloc_addr;
+
+      td->arg = (void*)reloc;
+      td->entry = _dl_tlsdesc_resolve_rela;
+    }
+  else
+    _dl_reloc_bad_type (map, r_type, 1);
 }
 
 #endif	/* !RTLD_BOOTSTRAP */
Index: sysdeps/i386/dl-tls.h
===================================================================
--- sysdeps/i386/dl-tls.h.orig	2008-01-31 03:37:39.000000000 -0200
+++ sysdeps/i386/dl-tls.h	2008-01-31 03:37:42.000000000 -0200
@@ -19,7 +19,7 @@
 
 
 /* Type used for the representation of TLS information in the GOT.  */
-typedef struct
+typedef struct dl_tls_index
 {
   unsigned long int ti_module;
   unsigned long int ti_offset;
Index: sysdeps/i386/dl-tlsdesc.S
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ sysdeps/i386/dl-tlsdesc.S	2008-01-31 04:16:19.000000000 -0200
@@ -0,0 +1,290 @@
+/* Thread-local storage handling in the ELF dynamic linker.  i386 version.
+   Copyright (C) 2004, 2005, 2008 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#include <sysdep.h>
+#include <tls.h>
+#include "tlsdesc.h"
+
+	.text
+
+     /* This function is used to compute the TP offset for symbols in
+	Static TLS, i.e., whose TP offset is the same for all
+	threads.
+
+	The incoming %eax points to the TLS descriptor, such that
+	0(%eax) points to _dl_tlsdesc_return itself, and 4(%eax) holds
+	the TP offset of the symbol corresponding to the object
+	denoted by the argument.  */
+
+	.hidden _dl_tlsdesc_return
+	.global	_dl_tlsdesc_return
+	.type	_dl_tlsdesc_return,@function
+	cfi_startproc
+	.align 16
+_dl_tlsdesc_return:
+	movl	4(%eax), %eax
+	ret
+	cfi_endproc
+	.size	_dl_tlsdesc_return, .-_dl_tlsdesc_return
+
+     /* This function is used for undefined weak TLS symbols, for
+	which the base address (i.e., disregarding any addend) should
+	resolve to NULL.
+
+	%eax points to the TLS descriptor, such that 0(%eax) points to
+	_dl_tlsdesc_undefweak itself, and 4(%eax) holds the addend.
+	We return the addend minus the TP, such that, when the caller
+	adds TP, it gets the addend back.  If that's zero, as usual,
+	that's most likely a NULL pointer.  */
+
+	.hidden _dl_tlsdesc_undefweak
+	.global	_dl_tlsdesc_undefweak
+	.type	_dl_tlsdesc_undefweak,@function
+	cfi_startproc
+	.align 16
+_dl_tlsdesc_undefweak:
+	movl	4(%eax), %eax
+	subl	%gs:0, %eax
+	ret
+	cfi_endproc
+	.size	_dl_tlsdesc_undefweak, .-_dl_tlsdesc_undefweak
+
+#ifdef SHARED
+	.hidden _dl_tlsdesc_dynamic
+	.global	_dl_tlsdesc_dynamic
+	.type	_dl_tlsdesc_dynamic,@function
+
+     /* This function is used for symbols that need dynamic TLS.
+
+	%eax points to the TLS descriptor, such that 0(%eax) points to
+	_dl_tlsdesc_dynamic itself, and 4(%eax) points to a struct
+	tlsdesc_dynamic_arg object.  It must return in %eax the offset
+	between the thread pointer and the object denoted by the
+	argument, without clobbering any registers.
+
+	The assembly code that follows is a rendition of the following
+	C code, hand-optimized a little bit.
+
+ptrdiff_t
+__attribute__ ((__regparm__ (1)))
+_dl_tlsdesc_dynamic (struct tlsdesc *tdp)
+{
+  struct tlsdesc_dynamic_arg *td = tdp->arg;
+  dtv_t *dtv = *(dtv_t **)((char *)__thread_pointer + DTV_OFFSET);
+  if (__builtin_expect (td->gen_count <= dtv[0].counter
+			&& (dtv[td->tlsinfo.ti_module].pointer.val
+			    != TLS_DTV_UNALLOCATED),
+			1))
+    return dtv[td->tlsinfo.ti_module].pointer.val + td->tlsinfo.ti_offset
+      - __thread_pointer;
+
+  return ___tls_get_addr (&td->tlsinfo) - __thread_pointer;
+}
+*/
+	cfi_startproc
+	.align 16
+_dl_tlsdesc_dynamic:
+	/* Like all TLS resolvers, preserve call-clobbered registers.
+	   We need two scratch regs anyway.  */
+	subl	$28, %esp
+	cfi_adjust_cfa_offset (28)
+	movl	%ecx, 20(%esp)
+	movl	%edx, 24(%esp)
+	movl	TLSDESC_ARG(%eax), %eax
+	movl	%gs:DTV_OFFSET, %edx
+	movl	TLSDESC_GEN_COUNT(%eax), %ecx
+	cmpl	(%edx), %ecx
+	ja	.Lslow
+	movl	TLSDESC_MODID(%eax), %ecx
+	movl	(%edx,%ecx,8), %edx
+	cmpl	$-1, %edx
+	je	.Lslow
+	movl	TLSDESC_MODOFF(%eax), %eax
+	addl	%edx, %eax
+.Lret:
+	movl	20(%esp), %ecx
+	subl	%gs:0, %eax
+	movl	24(%esp), %edx
+	addl	$28, %esp
+	cfi_adjust_cfa_offset (-28)
+	ret
+	.p2align 4,,7
+.Lslow:
+	cfi_adjust_cfa_offset (28)
+	movl	%ebx, 16(%esp)
+	call	__i686.get_pc_thunk.bx
+	addl	$_GLOBAL_OFFSET_TABLE_, %ebx
+	call	___tls_get_addr@PLT
+	movl	16(%esp), %ebx
+	jmp	.Lret
+	cfi_endproc
+	.size	_dl_tlsdesc_dynamic, .-_dl_tlsdesc_dynamic
+#endif /* SHARED */
+
+     /* This function is a wrapper for a lazy resolver for TLS_DESC
+	REL relocations that reference the *ABS* segment in their own
+	link maps.  %ebx points to the caller's GOT.  %eax points to a
+	TLS descriptor, such that 0(%eax) holds the address of the
+	resolver wrapper itself (unless some other thread beat us to
+	it) and 4(%eax) holds the addend in the relocation.
+
+	When the actual resolver returns, it will have adjusted the
+	TLS descriptor such that we can tail-call it for it to return
+	the TP offset of the symbol.  */
+
+	.hidden _dl_tlsdesc_resolve_abs_plus_addend
+	.global	_dl_tlsdesc_resolve_abs_plus_addend
+	.type	_dl_tlsdesc_resolve_abs_plus_addend,@function
+	cfi_startproc
+	.align 16
+_dl_tlsdesc_resolve_abs_plus_addend:
+0:
+	pushl	%eax
+	cfi_adjust_cfa_offset (4)
+	pushl	%ecx
+	cfi_adjust_cfa_offset (4)
+	pushl	%edx
+	cfi_adjust_cfa_offset (4)
+	movl	$1f - 0b, %ecx
+	movl	4(%ebx), %edx
+	call	_dl_tlsdesc_resolve_abs_plus_addend_fixup
+1:
+	popl	%edx
+	cfi_adjust_cfa_offset (-4)
+	popl	%ecx
+	cfi_adjust_cfa_offset (-4)
+	popl	%eax
+	cfi_adjust_cfa_offset (-4)
+	jmp	*(%eax)
+	cfi_endproc
+	.size	_dl_tlsdesc_resolve_abs_plus_addend, .-_dl_tlsdesc_resolve_abs_plus_addend
+
+     /* This function is a wrapper for a lazy resolver for TLS_DESC
+	REL relocations that had zero addends.  %ebx points to the
+	caller's GOT.  %eax points to a TLS descriptor, such that
+	0(%eax) holds the address of the resolver wrapper itself
+	(unless some other thread beat us to it) and 4(%eax) holds a
+	pointer to the relocation.
+
+	When the actual resolver returns, it will have adjusted the
+	TLS descriptor such that we can tail-call it for it to return
+	the TP offset of the symbol.  */
+
+	.hidden _dl_tlsdesc_resolve_rel
+	.global	_dl_tlsdesc_resolve_rel
+	.type	_dl_tlsdesc_resolve_rel,@function
+	cfi_startproc
+	.align 16
+_dl_tlsdesc_resolve_rel:
+0:
+	pushl	%eax
+	cfi_adjust_cfa_offset (4)
+	pushl	%ecx
+	cfi_adjust_cfa_offset (4)
+	pushl	%edx
+	cfi_adjust_cfa_offset (4)
+	movl	$1f - 0b, %ecx
+	movl	4(%ebx), %edx
+	call	_dl_tlsdesc_resolve_rel_fixup
+1:
+	popl	%edx
+	cfi_adjust_cfa_offset (-4)
+	popl	%ecx
+	cfi_adjust_cfa_offset (-4)
+	popl	%eax
+	cfi_adjust_cfa_offset (-4)
+	jmp	*(%eax)
+	cfi_endproc
+	.size	_dl_tlsdesc_resolve_rel, .-_dl_tlsdesc_resolve_rel
+
+     /* This function is a wrapper for a lazy resolver for TLS_DESC
+	RELA relocations.  %ebx points to the caller's GOT.  %eax
+	points to a TLS descriptor, such that 0(%eax) holds the
+	address of the resolver wrapper itself (unless some other
+	thread beat us to it) and 4(%eax) holds a pointer to the
+	relocation.
+
+	When the actual resolver returns, it will have adjusted the
+	TLS descriptor such that we can tail-call it for it to return
+	the TP offset of the symbol.  */
+
+	.hidden _dl_tlsdesc_resolve_rela
+	.global	_dl_tlsdesc_resolve_rela
+	.type	_dl_tlsdesc_resolve_rela,@function
+	cfi_startproc
+	.align 16
+_dl_tlsdesc_resolve_rela:
+0:
+	pushl	%eax
+	cfi_adjust_cfa_offset (4)
+	pushl	%ecx
+	cfi_adjust_cfa_offset (4)
+	pushl	%edx
+	cfi_adjust_cfa_offset (4)
+	movl	$1f - 0b, %ecx
+	movl	4(%ebx), %edx
+	call	_dl_tlsdesc_resolve_rela_fixup
+1:
+	popl	%edx
+	cfi_adjust_cfa_offset (-4)
+	popl	%ecx
+	cfi_adjust_cfa_offset (-4)
+	popl	%eax
+	cfi_adjust_cfa_offset (-4)
+	jmp	*(%eax)
+	cfi_endproc
+	.size	_dl_tlsdesc_resolve_rela, .-_dl_tlsdesc_resolve_rela
+
+     /* This function is a placeholder for lazy resolving of TLS
+	relocations.  Once some thread starts resolving a TLS
+	relocation, it sets up the TLS descriptor to use this
+	resolver, such that other threads that would attempt to
+	resolve it concurrently may skip the call to the original lazy
+	resolver and go straight to a condition wait.
+
+	When the actual resolver returns, it will have adjusted the
+	TLS descriptor such that we can tail-call it for it to return
+	the TP offset of the symbol.  */
+
+	.hidden _dl_tlsdesc_resolve_hold
+	.global	_dl_tlsdesc_resolve_hold
+	.type	_dl_tlsdesc_resolve_hold,@function
+	cfi_startproc
+	.align 16
+_dl_tlsdesc_resolve_hold:
+0:
+	pushl	%eax
+	cfi_adjust_cfa_offset (4)
+	pushl	%ecx
+	cfi_adjust_cfa_offset (4)
+	pushl	%edx
+	cfi_adjust_cfa_offset (4)
+	movl	$1f - 0b, %ecx
+	movl	4(%ebx), %edx
+	call	_dl_tlsdesc_resolve_hold_fixup
+1:
+	popl	%edx
+	cfi_adjust_cfa_offset (-4)
+	popl	%ecx
+	cfi_adjust_cfa_offset (-4)
+	popl	%eax
+	cfi_adjust_cfa_offset (-4)
+	jmp	*(%eax)
+	cfi_endproc
+	.size	_dl_tlsdesc_resolve_hold, .-_dl_tlsdesc_resolve_hold
Index: sysdeps/i386/dl-tlsdesc.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ sysdeps/i386/dl-tlsdesc.h	2008-01-31 03:37:42.000000000 -0200
@@ -0,0 +1,61 @@
+/* Thread-local storage descriptor handling in the ELF dynamic linker.
+   i386 version.
+   Copyright (C) 2005, 2008 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#ifndef _I386_DL_TLSDESC_H
+# define _I386_DL_TLSDESC_H 1
+
+/* Type used to represent a TLS descriptor in the GOT.  */
+struct tlsdesc
+{
+  ptrdiff_t __attribute__((regparm(1))) (*entry)(struct tlsdesc *);
+  void *arg;
+};
+
+typedef struct dl_tls_index
+{
+  unsigned long int ti_module;
+  unsigned long int ti_offset;
+} tls_index;
+
+/* Type used as the argument in a TLS descriptor for a symbol that
+   needs dynamic TLS offsets.  */
+struct tlsdesc_dynamic_arg
+{
+  tls_index tlsinfo;
+  size_t gen_count;
+};
+
+extern ptrdiff_t attribute_hidden __attribute__((regparm(1)))
+  _dl_tlsdesc_return(struct tlsdesc *),
+  _dl_tlsdesc_undefweak(struct tlsdesc *),
+  _dl_tlsdesc_resolve_abs_plus_addend(struct tlsdesc *),
+  _dl_tlsdesc_resolve_rel(struct tlsdesc *),
+  _dl_tlsdesc_resolve_rela(struct tlsdesc *),
+  _dl_tlsdesc_resolve_hold(struct tlsdesc *);
+
+# ifdef SHARED
+extern void *internal_function _dl_make_tlsdesc_dynamic (struct link_map *map,
+							 size_t ti_offset);
+
+extern ptrdiff_t attribute_hidden __attribute__((regparm(1)))
+  _dl_tlsdesc_dynamic(struct tlsdesc *);
+# endif
+
+#endif
Index: sysdeps/i386/tlsdesc.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ sysdeps/i386/tlsdesc.c	2008-01-31 04:06:43.000000000 -0200
@@ -0,0 +1,269 @@
+/* Manage TLS descriptors.  i386 version.
+   Copyright (C) 2005, 2008 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#include <link.h>
+#include <ldsodefs.h>
+#include <elf/dynamic-link.h>
+#include <tls.h>
+#include <dl-tlsdesc.h>
+#include <tlsdeschtab.h>
+
+/* The following 4 functions take an entry_check_offset argument.
+   It's computed by the caller as an offset between its entry point
+   and the call site, such that by adding the built-in return address
+   that is implicitly passed to the function with this offset, we can
+   easily obtain the caller's entry point to compare with the entry
+   point given in the TLS descriptor.  If it's changed, we want to
+   return immediately.  */
+
+/* This function is used to lazily resolve TLS_DESC REL relocations
+   that reference the *ABS* segment in their own link maps.  The
+   argument is the addend originally stored there.  */
+
+void
+__attribute__ ((regparm (3))) attribute_hidden
+_dl_tlsdesc_resolve_abs_plus_addend_fixup (struct tlsdesc volatile *td,
+					   struct link_map *l,
+					   ptrdiff_t entry_check_offset)
+{
+  ptrdiff_t addend = (ptrdiff_t) td->arg;
+
+  if (_dl_tlsdesc_resolve_early_return_p (td, __builtin_return_address (0)
+					  - entry_check_offset))
+    return;
+
+#ifndef SHARED
+  CHECK_STATIC_TLS (l, l);
+#else
+  if (!TRY_STATIC_TLS (l, l))
+    {
+      td->arg = _dl_make_tlsdesc_dynamic (l, addend);
+      td->entry = _dl_tlsdesc_dynamic;
+    }
+  else
+#endif
+    {
+      td->arg = (void*)(addend - l->l_tls_offset);
+      td->entry = _dl_tlsdesc_return;
+    }
+
+  _dl_tlsdesc_wake_up_held_fixups ();
+}
+
+/* This function is used to lazily resolve TLS_DESC REL relocations
+   that originally had zero addends.  The argument location, that
+   originally held the addend, is used to hold a pointer to the
+   relocation, but it has to be restored before we call the function
+   that applies relocations.  */
+
+void
+__attribute__ ((regparm (3))) attribute_hidden
+_dl_tlsdesc_resolve_rel_fixup (struct tlsdesc volatile *td,
+			       struct link_map *l,
+			       ptrdiff_t entry_check_offset)
+{
+  const ElfW(Rel) *reloc = td->arg;
+
+  if (_dl_tlsdesc_resolve_early_return_p (td, __builtin_return_address (0)
+					  - entry_check_offset))
+    return;
+
+  /* The code below was borrowed from _dl_fixup(),
+     except for checking for STB_LOCAL.  */
+  const ElfW(Sym) *const symtab
+    = (const void *) D_PTR (l, l_info[DT_SYMTAB]);
+  const char *strtab = (const void *) D_PTR (l, l_info[DT_STRTAB]);
+  const ElfW(Sym) *sym = &symtab[ELFW(R_SYM) (reloc->r_info)];
+  lookup_t result;
+
+   /* Look up the target symbol.  If the normal lookup rules are not
+      used don't look in the global scope.  */
+  if (ELFW(ST_BIND) (sym->st_info) != STB_LOCAL
+      && __builtin_expect (ELFW(ST_VISIBILITY) (sym->st_other), 0) == 0)
+    {
+      const struct r_found_version *version = NULL;
+
+      if (l->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
+	{
+	  const ElfW(Half) *vernum =
+	    (const void *) D_PTR (l, l_info[VERSYMIDX (DT_VERSYM)]);
+	  ElfW(Half) ndx = vernum[ELFW(R_SYM) (reloc->r_info)] & 0x7fff;
+	  version = &l->l_versions[ndx];
+	  if (version->hash == 0)
+	    version = NULL;
+	}
+
+      result = _dl_lookup_symbol_x (strtab + sym->st_name, l, &sym,
+				    l->l_scope, version, ELF_RTYPE_CLASS_PLT,
+				    DL_LOOKUP_ADD_DEPENDENCY, NULL);
+    }
+  else
+    {
+      /* We already found the symbol.  The module (and therefore its load
+	 address) is also known.  */
+      result = l;
+    }
+
+  if (!sym)
+    {
+      td->arg = 0;
+      td->entry = _dl_tlsdesc_undefweak;
+    }
+  else
+    {
+#  ifndef SHARED
+      CHECK_STATIC_TLS (l, result);
+#  else
+      if (!TRY_STATIC_TLS (l, result))
+	{
+	  td->arg = _dl_make_tlsdesc_dynamic (result, sym->st_value);
+	  td->entry = _dl_tlsdesc_dynamic;
+	}
+      else
+#  endif
+	{
+	  td->arg = (void*)(sym->st_value - result->l_tls_offset);
+	  td->entry = _dl_tlsdesc_return;
+	}
+    }
+
+  _dl_tlsdesc_wake_up_held_fixups ();
+}
+
+/* This function is used to lazily resolve TLS_DESC RELA relocations.
+   The argument location is used to hold a pointer to the relocation.  */
+
+void
+__attribute__ ((regparm (3))) attribute_hidden
+_dl_tlsdesc_resolve_rela_fixup (struct tlsdesc volatile *td,
+				struct link_map *l,
+				ptrdiff_t entry_check_offset)
+{
+  const ElfW(Rela) *reloc = td->arg;
+
+  if (_dl_tlsdesc_resolve_early_return_p (td, __builtin_return_address (0)
+					  - entry_check_offset))
+    return;
+
+  /* The code below was borrowed from _dl_fixup(),
+     except for checking for STB_LOCAL.  */
+  const ElfW(Sym) *const symtab
+    = (const void *) D_PTR (l, l_info[DT_SYMTAB]);
+  const char *strtab = (const void *) D_PTR (l, l_info[DT_STRTAB]);
+  const ElfW(Sym) *sym = &symtab[ELFW(R_SYM) (reloc->r_info)];
+  lookup_t result;
+
+   /* Look up the target symbol.  If the normal lookup rules are not
+      used don't look in the global scope.  */
+  if (ELFW(ST_BIND) (sym->st_info) != STB_LOCAL
+      && __builtin_expect (ELFW(ST_VISIBILITY) (sym->st_other), 0) == 0)
+    {
+      const struct r_found_version *version = NULL;
+
+      if (l->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
+	{
+	  const ElfW(Half) *vernum =
+	    (const void *) D_PTR (l, l_info[VERSYMIDX (DT_VERSYM)]);
+	  ElfW(Half) ndx = vernum[ELFW(R_SYM) (reloc->r_info)] & 0x7fff;
+	  version = &l->l_versions[ndx];
+	  if (version->hash == 0)
+	    version = NULL;
+	}
+
+      result = _dl_lookup_symbol_x (strtab + sym->st_name, l, &sym,
+				    l->l_scope, version, ELF_RTYPE_CLASS_PLT,
+				    DL_LOOKUP_ADD_DEPENDENCY, NULL);
+    }
+  else
+    {
+      /* We already found the symbol.  The module (and therefore its load
+	 address) is also known.  */
+      result = l;
+    }
+
+  if (!sym)
+    {
+      td->arg = (void*)reloc->r_addend;
+      td->entry = _dl_tlsdesc_undefweak;
+    }
+  else
+    {
+#  ifndef SHARED
+      CHECK_STATIC_TLS (l, result);
+#  else
+      if (!TRY_STATIC_TLS (l, result))
+	{
+	  td->arg = _dl_make_tlsdesc_dynamic (result, sym->st_value
+					      + reloc->r_addend);
+	  td->entry = _dl_tlsdesc_dynamic;
+	}
+      else
+#  endif
+	{
+	  td->arg = (void*)(sym->st_value - result->l_tls_offset
+			    + reloc->r_addend);
+	  td->entry = _dl_tlsdesc_return;
+	}
+    }
+
+  _dl_tlsdesc_wake_up_held_fixups ();
+}
+
+/* This function is used to avoid busy waiting for other threads to
+   complete the lazy relocation.  Once another thread wins the race to
+   relocate a TLS descriptor, it sets the descriptor up such that this
+   function is called to wait until the resolver releases the
+   lock.  */
+
+void
+__attribute__ ((regparm (3))) attribute_hidden
+_dl_tlsdesc_resolve_hold_fixup (struct tlsdesc volatile *td,
+				struct link_map *l __attribute__((__unused__)),
+				ptrdiff_t entry_check_offset)
+{
+  /* Maybe we're lucky and can return early.  */
+  if (__builtin_return_address (0) - entry_check_offset != td->entry)
+    return;
+
+  /* Locking here will stop execution until the running resolver runs
+     _dl_tlsdesc_wake_up_held_fixups(), releasing the lock.
+
+     FIXME: We'd be better off waiting on a condition variable, such
+     that we didn't have to hold the lock throughout the relocation
+     processing.  */
+  __rtld_lock_lock_recursive (GL(dl_load_lock));
+  __rtld_lock_unlock_recursive (GL(dl_load_lock));
+}
+
+
+/* Unmap the dynamic object, but also release its TLS descriptor table
+   if there is one.  */
+
+void
+internal_function
+_dl_unmap (struct link_map *map)
+{
+  __munmap ((void *) (map)->l_map_start,
+	    (map)->l_map_end - (map)->l_map_start);
+
+#if SHARED
+  if (map->l_mach.tlsdesc_table)
+    htab_delete (map->l_mach.tlsdesc_table);
+#endif
+}
Index: sysdeps/i386/tlsdesc.sym
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ sysdeps/i386/tlsdesc.sym	2008-01-31 03:37:42.000000000 -0200
@@ -0,0 +1,17 @@
+#include <stddef.h>
+#include <sysdep.h>
+#include <tls.h>
+#include <link.h>
+#include <dl-tlsdesc.h>
+
+--
+
+-- Abuse tls.h macros to derive offsets relative to the thread register.
+
+DTV_OFFSET			offsetof(struct pthread, header.dtv)
+
+TLSDESC_ARG			offsetof(struct tlsdesc, arg)
+
+TLSDESC_GEN_COUNT		offsetof(struct tlsdesc_dynamic_arg, gen_count)
+TLSDESC_MODID			offsetof(struct tlsdesc_dynamic_arg, tlsinfo.ti_module)
+TLSDESC_MODOFF			offsetof(struct tlsdesc_dynamic_arg, tlsinfo.ti_offset)
Index: sysdeps/x86_64/Makefile
===================================================================
--- sysdeps/x86_64/Makefile.orig	2008-01-31 03:37:39.000000000 -0200
+++ sysdeps/x86_64/Makefile	2008-01-31 03:37:42.000000000 -0200
@@ -13,3 +13,13 @@ endif
 ifeq ($(subdir),string)
 sysdep_routines += cacheinfo
 endif
+
+ifeq ($(subdir),elf)
+sysdep-dl-routines += tlsdesc dl-tlsdesc
+sysdep_routines += tlsdesc dl-tlsdesc
+sysdep-rtld-routines += tlsdesc dl-tlsdesc
+endif
+
+ifeq ($(subdir),csu)
+gen-as-const-headers += tlsdesc.sym
+endif
Index: sysdeps/x86_64/bits/linkmap.h
===================================================================
--- sysdeps/x86_64/bits/linkmap.h.orig	2008-01-31 03:37:39.000000000 -0200
+++ sysdeps/x86_64/bits/linkmap.h	2008-01-31 03:37:42.000000000 -0200
@@ -3,6 +3,7 @@ struct link_map_machine
   {
     Elf64_Addr plt; /* Address of .plt + 0x16 */
     Elf64_Addr gotplt; /* Address of .got + 0x18 */
+    void *tlsdesc_table; /* Address of TLS descriptor hash table.  */
   };
 
 #else
@@ -10,5 +11,6 @@ struct link_map_machine
   {
     Elf32_Addr plt; /* Address of .plt + 0x16 */
     Elf32_Addr gotplt; /* Address of .got + 0x0c */
+    void *tlsdesc_table; /* Address of TLS descriptor hash table.  */
   };
 #endif
Index: sysdeps/x86_64/dl-lookupcfg.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ sysdeps/x86_64/dl-lookupcfg.h	2008-01-31 03:37:42.000000000 -0200
@@ -0,0 +1,28 @@
+/* Configuration of lookup functions.
+   Copyright (C) 2005, 2008 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#define DL_UNMAP_IS_SPECIAL
+
+#include_next <dl-lookupcfg.h>
+
+struct link_map;
+
+extern void internal_function _dl_unmap (struct link_map *map);
+
+#define DL_UNMAP(map) _dl_unmap (map)
Index: sysdeps/x86_64/dl-machine.h
===================================================================
--- sysdeps/x86_64/dl-machine.h.orig	2008-01-31 03:37:39.000000000 -0200
+++ sysdeps/x86_64/dl-machine.h	2008-01-31 03:37:42.000000000 -0200
@@ -26,6 +26,7 @@
 #include <sys/param.h>
 #include <sysdep.h>
 #include <tls.h>
+#include <dl-tlsdesc.h>
 
 /* Return nonzero iff ELF header is compatible with the running host.  */
 static inline int __attribute__ ((unused))
@@ -131,6 +132,10 @@ elf_machine_runtime_setup (struct link_m
 	got[2] = (Elf64_Addr) &_dl_runtime_resolve;
     }
 
+  if (l->l_info[ADDRIDX (DT_TLSDESC_GOT)] && lazy)
+    *(Elf64_Addr*)(D_PTR (l, l_info[ADDRIDX (DT_TLSDESC_GOT)]) + l->l_addr)
+      = (Elf64_Addr) &_dl_tlsdesc_resolve_rela;
+
   return lazy;
 }
 
@@ -194,7 +199,9 @@ _dl_start_user:\n\
 # define elf_machine_type_class(type)					      \
   ((((type) == R_X86_64_JUMP_SLOT					      \
      || (type) == R_X86_64_DTPMOD64					      \
-     || (type) == R_X86_64_DTPOFF64 || (type) == R_X86_64_TPOFF64)	      \
+     || (type) == R_X86_64_DTPOFF64					      \
+     || (type) == R_X86_64_TPOFF64					      \
+     || (type) == R_X86_64_TLSDESC)					      \
     * ELF_RTYPE_CLASS_PLT)						      \
    | (((type) == R_X86_64_COPY) * ELF_RTYPE_CLASS_COPY))
 #else
@@ -323,6 +330,41 @@ elf_machine_rela (struct link_map *map, 
 	    *reloc_addr = sym->st_value + reloc->r_addend;
 # endif
 	  break;
+	case R_X86_64_TLSDESC:
+	  {
+	    struct tlsdesc volatile *td =
+	      (struct tlsdesc volatile *)reloc_addr;
+
+# ifndef RTLD_BOOTSTRAP
+	    if (! sym)
+	      {
+		td->arg = (void*)reloc->r_addend;
+		td->entry = _dl_tlsdesc_undefweak;
+	      }
+	    else
+# endif
+	      {
+# ifndef RTLD_BOOTSTRAP
+#  ifndef SHARED
+		CHECK_STATIC_TLS (map, sym_map);
+#  else
+		if (!TRY_STATIC_TLS (map, sym_map))
+		  {
+		    td->arg = _dl_make_tlsdesc_dynamic
+		      (sym_map, sym->st_value + reloc->r_addend);
+		    td->entry = _dl_tlsdesc_dynamic;
+		  }
+		else
+#  endif
+# endif
+		  {
+		    td->arg = (void*)(sym->st_value - sym_map->l_tls_offset
+				      + reloc->r_addend);
+		    td->entry = _dl_tlsdesc_return;
+		  }
+	      }
+	    break;
+	  }
 	case R_X86_64_TPOFF64:
 	  /* The offset is negative, forward from the thread pointer.  */
 # ifndef RTLD_BOOTSTRAP
@@ -435,6 +477,15 @@ elf_machine_lazy_rel (struct link_map *m
 	  map->l_mach.plt
 	  + (((Elf64_Addr) reloc_addr) - map->l_mach.gotplt) * 2;
     }
+  else if (__builtin_expect (r_type == R_X86_64_TLSDESC, 1))
+    {
+      struct tlsdesc volatile * __attribute__((__unused__)) td =
+	(struct tlsdesc volatile *)reloc_addr;
+
+      td->arg = (void*)reloc;
+      td->entry = (void*)(D_PTR (map, l_info[ADDRIDX (DT_TLSDESC_PLT)])
+			  + map->l_addr);
+    }
   else
     _dl_reloc_bad_type (map, r_type, 1);
 }
Index: sysdeps/x86_64/dl-tls.h
===================================================================
--- sysdeps/x86_64/dl-tls.h.orig	2008-01-31 03:37:39.000000000 -0200
+++ sysdeps/x86_64/dl-tls.h	2008-01-31 03:37:42.000000000 -0200
@@ -1,5 +1,5 @@
 /* Thread-local storage handling in the ELF dynamic linker.  x86-64 version.
-   Copyright (C) 2002 Free Software Foundation, Inc.
+   Copyright (C) 2002, 2005 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
    The GNU C Library is free software; you can redistribute it and/or
@@ -19,7 +19,7 @@
 
 
 /* Type used for the representation of TLS information in the GOT.  */
-typedef struct
+typedef struct dl_tls_index
 {
   unsigned long int ti_module;
   unsigned long int ti_offset;
Index: sysdeps/x86_64/dl-tlsdesc.S
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ sysdeps/x86_64/dl-tlsdesc.S	2008-01-31 04:16:22.000000000 -0200
@@ -0,0 +1,245 @@
+/* Thread-local storage handling in the ELF dynamic linker.  x86_64 version.
+   Copyright (C) 2004, 2005, 2008 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#include <sysdep.h>
+#include <tls.h>
+#include "tlsdesc.h"
+
+	.text
+
+     /* This function is used to compute the TP offset for symbols in
+	Static TLS, i.e., whose TP offset is the same for all
+	threads.
+
+	The incoming %rax points to the TLS descriptor, such that
+	0(%rax) points to _dl_tlsdesc_return itself, and 8(%rax) holds
+	the TP offset of the symbol corresponding to the object
+	denoted by the argument.  */
+
+	.hidden _dl_tlsdesc_return
+	.global	_dl_tlsdesc_return
+	.type	_dl_tlsdesc_return,@function
+	cfi_startproc
+	.align 16
+_dl_tlsdesc_return:
+	movq	8(%rax), %rax
+	ret
+	cfi_endproc
+	.size	_dl_tlsdesc_return, .-_dl_tlsdesc_return
+
+     /* This function is used for undefined weak TLS symbols, for
+	which the base address (i.e., disregarding any addend) should
+	resolve to NULL.
+
+	%rax points to the TLS descriptor, such that 0(%rax) points to
+	_dl_tlsdesc_undefweak itself, and 8(%rax) holds the addend.
+	We return the addend minus the TP, such that, when the caller
+	adds TP, it gets the addend back.  If that's zero, as usual,
+	that's most likely a NULL pointer.  */
+
+	.hidden _dl_tlsdesc_undefweak
+	.global	_dl_tlsdesc_undefweak
+	.type	_dl_tlsdesc_undefweak,@function
+	cfi_startproc
+	.align 16
+_dl_tlsdesc_undefweak:
+	movq	8(%rax), %rax
+	subq	%fs:0, %rax
+	ret
+	cfi_endproc
+	.size	_dl_tlsdesc_undefweak, .-_dl_tlsdesc_undefweak
+
+#ifdef SHARED
+	.hidden _dl_tlsdesc_dynamic
+	.global	_dl_tlsdesc_dynamic
+	.type	_dl_tlsdesc_dynamic,@function
+
+     /* %rax points to the TLS descriptor, such that 0(%rax) points to
+	_dl_tlsdesc_dynamic itself, and 8(%rax) points to a struct
+	tlsdesc_dynamic_arg object.  It must return in %rax the offset
+	between the thread pointer and the object denoted by the
+	argument, without clobbering any registers.
+
+	The assembly code that follows is a rendition of the following
+	C code, hand-optimized a little bit.
+
+ptrdiff_t
+_dl_tlsdesc_dynamic (register struct tlsdesc *tdp asm ("%rax"))
+{
+  struct tlsdesc_dynamic_arg *td = tdp->arg;
+  dtv_t *dtv = *(dtv_t **)((char *)__thread_pointer + DTV_OFFSET);
+  if (__builtin_expect (td->gen_count <= dtv[0].counter
+			&& (dtv[td->tlsinfo.ti_module].pointer.val
+			    != TLS_DTV_UNALLOCATED),
+			1))
+    return dtv[td->tlsinfo.ti_module].pointer.val + td->tlsinfo.ti_offset
+      - __thread_pointer;
+
+  return __tls_get_addr_internal (&td->tlsinfo) - __thread_pointer;
+}
+*/
+	cfi_startproc
+	.align 16
+_dl_tlsdesc_dynamic:
+	/* Preserve call-clobbered registers that we modify.
+	   We need two scratch regs anyway.  */
+	movq	%rsi, -16(%rsp)
+	movq	%fs:DTV_OFFSET, %rsi
+	movq	%rdi, -8(%rsp)
+	movq	TLSDESC_ARG(%rax), %rdi
+	movq	(%rsi), %rax
+	cmpq	%rax, TLSDESC_GEN_COUNT(%rdi)
+	ja	.Lslow
+	movq	TLSDESC_MODID(%rdi), %rax
+	salq	$4, %rax
+	movq	(%rax,%rsi), %rax
+	cmpq	$-1, %rax
+	je	.Lslow
+	addq	TLSDESC_MODOFF(%rdi), %rax
+.Lret:
+	movq	-16(%rsp), %rsi
+	subq	%fs:0, %rax
+	movq	-8(%rsp), %rdi
+	ret
+.Lslow:
+	/* Besides rdi and rsi, saved above, save rdx, rcx, r8, r9,
+	   r10 and r11.  Also, align the stack, that's off by 8 bytes.	*/
+	subq	$72, %rsp
+	cfi_adjust_cfa_offset (72)
+	movq	%rdx, 8(%rsp)
+	movq	%rcx, 16(%rsp)
+	movq	%r8, 24(%rsp)
+	movq	%r9, 32(%rsp)
+	movq	%r10, 40(%rsp)
+	movq	%r11, 48(%rsp)
+	/* %rdi already points to the tlsinfo data structure.  */
+	call	__tls_get_addr@PLT
+	movq	8(%rsp), %rdx
+	movq	16(%rsp), %rcx
+	movq	24(%rsp), %r8
+	movq	32(%rsp), %r9
+	movq	40(%rsp), %r10
+	movq	48(%rsp), %r11
+	addq	$72, %rsp
+	cfi_adjust_cfa_offset (-72)
+	jmp	.Lret
+	cfi_endproc
+	.size	_dl_tlsdesc_dynamic, .-_dl_tlsdesc_dynamic
+#endif /* SHARED */
+
+     /* This function is a wrapper for a lazy resolver for TLS_DESC
+	RELA relocations.  The incoming 0(%rsp) points to the caller's
+	link map, pushed by the dynamic object's internal lazy TLS
+	resolver front-end before tail-calling us.  We need to pop it
+	ourselves.  %rax points to a TLS descriptor, such that 0(%rax)
+	holds the address of the internal resolver front-end (unless
+	some other thread beat us to resolving it) and 8(%rax) holds a
+	pointer to the relocation.
+
+	When the actual resolver returns, it will have adjusted the
+	TLS descriptor such that we can tail-call it for it to return
+	the TP offset of the symbol.  */
+
+	.hidden _dl_tlsdesc_resolve_rela
+	.global	_dl_tlsdesc_resolve_rela
+	.type	_dl_tlsdesc_resolve_rela,@function
+	cfi_startproc
+	.align 16
+	/* The PLT entry will have pushed the link_map pointer.  */
+_dl_tlsdesc_resolve_rela:
+	cfi_adjust_cfa_offset (8)
+	/* Save all call-clobbered registers.  */
+	subq	$72, %rsp
+	cfi_adjust_cfa_offset (72)
+	movq	%rax, (%rsp)
+	movq	%rdi, 8(%rsp)
+	movq	%rax, %rdi	/* Pass tlsdesc* in %rdi.  */
+	movq	%rsi, 16(%rsp)
+	movq	72(%rsp), %rsi	/* Pass link_map* in %rsi.  */
+	movq	%r8, 24(%rsp)
+	movq	%r9, 32(%rsp)
+	movq	%r10, 40(%rsp)
+	movq	%r11, 48(%rsp)
+	movq	%rdx, 56(%rsp)
+	movq	%rcx, 64(%rsp)
+	call	_dl_tlsdesc_resolve_rela_fixup
+	movq	(%rsp), %rax
+	movq	8(%rsp), %rdi
+	movq	16(%rsp), %rsi
+	movq	24(%rsp), %r8
+	movq	32(%rsp), %r9
+	movq	40(%rsp), %r10
+	movq	48(%rsp), %r11
+	movq	56(%rsp), %rdx
+	movq	64(%rsp), %rcx
+	addq	$80, %rsp
+	cfi_adjust_cfa_offset (-80)
+	jmp	*(%rax)
+	cfi_endproc
+	.size	_dl_tlsdesc_resolve_rela, .-_dl_tlsdesc_resolve_rela
+
+     /* This function is a placeholder for lazy resolving of TLS
+	relocations.  Once some thread starts resolving a TLS
+	relocation, it sets up the TLS descriptor to use this
+	resolver, such that other threads that would attempt to
+	resolve it concurrently may skip the call to the original lazy
+	resolver and go straight to a condition wait.
+
+	When the actual resolver returns, it will have adjusted the
+	TLS descriptor such that we can tail-call it for it to return
+	the TP offset of the symbol.  */
+
+	.hidden _dl_tlsdesc_resolve_hold
+	.global	_dl_tlsdesc_resolve_hold
+	.type	_dl_tlsdesc_resolve_hold,@function
+	cfi_startproc
+	.align 16
+_dl_tlsdesc_resolve_hold:
+0:
+	/* Save all call-clobbered registers.  */
+	subq	$72, %rsp
+	cfi_adjust_cfa_offset (72)
+	movq	%rax, (%rsp)
+	movq	%rdi, 8(%rsp)
+	movq	%rax, %rdi	/* Pass tlsdesc* in %rdi.  */
+	movq	%rsi, 16(%rsp)
+	/* Pass _dl_tlsdesc_resolve_hold's address in %rsi.  */
+	leaq	. - _dl_tlsdesc_resolve_hold(%rip), %rsi
+	movq	%r8, 24(%rsp)
+	movq	%r9, 32(%rsp)
+	movq	%r10, 40(%rsp)
+	movq	%r11, 48(%rsp)
+	movq	%rdx, 56(%rsp)
+	movq	%rcx, 64(%rsp)
+	call	_dl_tlsdesc_resolve_hold_fixup
+1:
+	movq	(%rsp), %rax
+	movq	8(%rsp), %rdi
+	movq	16(%rsp), %rsi
+	movq	24(%rsp), %r8
+	movq	32(%rsp), %r9
+	movq	40(%rsp), %r10
+	movq	48(%rsp), %r11
+	movq	56(%rsp), %rdx
+	movq	64(%rsp), %rcx
+	addq	$72, %rsp
+	cfi_adjust_cfa_offset (-72)
+	jmp	*(%eax)
+	cfi_endproc
+	.size	_dl_tlsdesc_resolve_hold, .-_dl_tlsdesc_resolve_hold
Index: sysdeps/x86_64/dl-tlsdesc.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ sysdeps/x86_64/dl-tlsdesc.h	2008-01-31 03:37:42.000000000 -0200
@@ -0,0 +1,64 @@
+/* Thread-local storage descriptor handling in the ELF dynamic linker.
+   x86_64 version.
+   Copyright (C) 2005, 2008 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#ifndef _X86_64_DL_TLSDESC_H
+# define _X86_64_DL_TLSDESC_H 1
+
+/* Use this to access DT_TLSDESC_PLT and DT_TLSDESC_GOT.  */
+#ifndef ADDRIDX
+# define ADDRIDX(tag) (DT_NUM + DT_THISPROCNUM + DT_VERSIONTAGNUM \
+		       + DT_EXTRANUM + DT_VALNUM + DT_ADDRTAGIDX (tag))
+#endif
+
+/* Type used to represent a TLS descriptor in the GOT.  */
+struct tlsdesc
+{
+  ptrdiff_t (*entry)(struct tlsdesc *on_rax);
+  void *arg;
+};
+
+typedef struct dl_tls_index
+{
+  unsigned long int ti_module;
+  unsigned long int ti_offset;
+} tls_index;
+
+/* Type used as the argument in a TLS descriptor for a symbol that
+   needs dynamic TLS offsets.  */
+struct tlsdesc_dynamic_arg
+{
+  tls_index tlsinfo;
+  size_t gen_count;
+};
+
+extern ptrdiff_t attribute_hidden
+  _dl_tlsdesc_return(struct tlsdesc *on_rax),
+  _dl_tlsdesc_undefweak(struct tlsdesc *on_rax),
+  _dl_tlsdesc_resolve_rela(struct tlsdesc *on_rax),
+  _dl_tlsdesc_resolve_hold(struct tlsdesc *on_rax);
+
+# ifdef SHARED
+extern void *internal_function _dl_make_tlsdesc_dynamic (struct link_map *map,
+							 size_t ti_offset);
+
+extern ptrdiff_t attribute_hidden _dl_tlsdesc_dynamic(struct tlsdesc *);
+# endif
+
+#endif
Index: sysdeps/x86_64/tlsdesc.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ sysdeps/x86_64/tlsdesc.c	2008-01-31 04:07:30.000000000 -0200
@@ -0,0 +1,151 @@
+/* Manage TLS descriptors.  x86_64 version.
+   Copyright (C) 2005, 2008 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#include <link.h>
+#include <ldsodefs.h>
+#include <elf/dynamic-link.h>
+#include <tls.h>
+#include <dl-tlsdesc.h>
+#include <tlsdeschtab.h>
+
+/* The following 2 functions take a caller argument, that contains the
+   address expected to be in the TLS descriptor.  If it's changed, we
+   want to return immediately.  */
+
+/* This function is used to lazily resolve TLS_DESC RELA relocations.
+   The argument location is used to hold a pointer to the relocation.  */
+
+void
+attribute_hidden
+_dl_tlsdesc_resolve_rela_fixup (struct tlsdesc volatile *td,
+				struct link_map *l)
+{
+  const ElfW(Rela) *reloc = td->arg;
+
+  if (_dl_tlsdesc_resolve_early_return_p
+      (td, (void*)(D_PTR (l, l_info[ADDRIDX (DT_TLSDESC_PLT)]) + l->l_addr)))
+    return;
+
+  /* The code below was borrowed from _dl_fixup().  */
+  const ElfW(Sym) *const symtab
+    = (const void *) D_PTR (l, l_info[DT_SYMTAB]);
+  const char *strtab = (const void *) D_PTR (l, l_info[DT_STRTAB]);
+  const ElfW(Sym) *sym = &symtab[ELFW(R_SYM) (reloc->r_info)];
+  lookup_t result;
+
+   /* Look up the target symbol.  If the normal lookup rules are not
+      used don't look in the global scope.  */
+  if (ELFW(ST_BIND) (sym->st_info) != STB_LOCAL
+      && __builtin_expect (ELFW(ST_VISIBILITY) (sym->st_other), 0) == 0)
+    {
+      const struct r_found_version *version = NULL;
+
+      if (l->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
+	{
+	  const ElfW(Half) *vernum =
+	    (const void *) D_PTR (l, l_info[VERSYMIDX (DT_VERSYM)]);
+	  ElfW(Half) ndx = vernum[ELFW(R_SYM) (reloc->r_info)] & 0x7fff;
+	  version = &l->l_versions[ndx];
+	  if (version->hash == 0)
+	    version = NULL;
+	}
+
+      result = _dl_lookup_symbol_x (strtab + sym->st_name, l, &sym,
+				    l->l_scope, version, ELF_RTYPE_CLASS_PLT,
+				    DL_LOOKUP_ADD_DEPENDENCY, NULL);
+    }
+  else
+    {
+      /* We already found the symbol.  The module (and therefore its load
+	 address) is also known.  */
+      result = l;
+    }
+
+  if (! sym)
+    {
+      td->arg = (void*)reloc->r_addend;
+      td->entry = _dl_tlsdesc_undefweak;
+    }
+  else
+    {
+#  ifndef SHARED
+      CHECK_STATIC_TLS (l, result);
+#  else
+      if (!TRY_STATIC_TLS (l, result))
+	{
+	  td->arg = _dl_make_tlsdesc_dynamic (result, sym->st_value
+					      + reloc->r_addend);
+	  td->entry = _dl_tlsdesc_dynamic;
+	}
+      else
+#  endif
+	{
+	  td->arg = (void*)(sym->st_value - result->l_tls_offset
+			    + reloc->r_addend);
+	  td->entry = _dl_tlsdesc_return;
+	}
+    }
+
+  _dl_tlsdesc_wake_up_held_fixups ();
+}
+
+/* This function is used to avoid busy waiting for other threads to
+   complete the lazy relocation.  Once another thread wins the race to
+   relocate a TLS descriptor, it sets the descriptor up such that this
+   function is called to wait until the resolver releases the
+   lock.  */
+
+void
+attribute_hidden
+_dl_tlsdesc_resolve_hold_fixup (struct tlsdesc volatile *td,
+				void *caller)
+{
+  /* Maybe we're lucky and can return early.  */
+  if (caller != td->entry)
+    return;
+
+  /* Locking here will stop execution until the running resolver runs
+     _dl_tlsdesc_wake_up_held_fixups(), releasing the lock.
+
+     FIXME: We'd be better off waiting on a condition variable, such
+     that we didn't have to hold the lock throughout the relocation
+     processing.  */
+  __rtld_lock_lock_recursive (GL(dl_load_lock));
+  __rtld_lock_unlock_recursive (GL(dl_load_lock));
+}
+
+/* Unmap the dynamic object, but also release its TLS descriptor table
+   if there is one.  */
+
+void
+internal_function
+_dl_unmap (struct link_map *map)
+{
+  __munmap ((void *) (map)->l_map_start,
+	    (map)->l_map_end - (map)->l_map_start);
+
+#if SHARED
+  /* _dl_unmap is only called for dlopen()ed libraries, for which
+     calling free() is safe, or before we've completed the initial
+     relocation, in which case calling free() is probably pointless,
+     but still safe.  */
+  if (map->l_mach.tlsdesc_table)
+    htab_delete (map->l_mach.tlsdesc_table);
+#endif
+}
Index: sysdeps/x86_64/tlsdesc.sym
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ sysdeps/x86_64/tlsdesc.sym	2008-01-31 03:37:42.000000000 -0200
@@ -0,0 +1,17 @@
+#include <stddef.h>
+#include <sysdep.h>
+#include <tls.h>
+#include <link.h>
+#include <dl-tlsdesc.h>
+
+--
+
+-- Abuse tls.h macros to derive offsets relative to the thread register.
+
+DTV_OFFSET			offsetof(struct pthread, header.dtv)
+
+TLSDESC_ARG			offsetof(struct tlsdesc, arg)
+
+TLSDESC_GEN_COUNT		offsetof(struct tlsdesc_dynamic_arg, gen_count)
+TLSDESC_MODID			offsetof(struct tlsdesc_dynamic_arg, tlsinfo.ti_module)
+TLSDESC_MODOFF			offsetof(struct tlsdesc_dynamic_arg, tlsinfo.ti_offset)
Index: elf/tlsdeschtab.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ elf/tlsdeschtab.h	2008-01-31 03:37:42.000000000 -0200
@@ -0,0 +1,157 @@
+/* Hash table for TLS descriptors.
+   Copyright (C) 2005, 2008 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+   Contributed by Alexandre Oliva  <aoliva@redhat.com>
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#ifndef TLSDESCHTAB_H
+# define TLSDESCHTAB_H 1
+
+# ifdef SHARED
+
+#  include <inline-hashtab.h>
+
+inline static int
+hash_tlsdesc(void *p)
+{
+  struct tlsdesc_dynamic_arg *td = p;
+
+  /* We know all entries are for the same module, so ti_offset is the
+     only distinguishing entry.  */
+  return td->tlsinfo.ti_offset;
+}
+
+inline static int
+eq_tlsdesc(void *p, void *q)
+{
+  struct tlsdesc_dynamic_arg *tdp = p, *tdq = q;
+
+  return tdp->tlsinfo.ti_offset == tdq->tlsinfo.ti_offset;
+}
+
+inline static int
+map_generation (struct link_map *map)
+{
+  size_t idx = map->l_tls_modid;
+  struct dtv_slotinfo_list *listp = GL(dl_tls_dtv_slotinfo_list);
+
+  /* Find the place in the dtv slotinfo list.  */
+  do
+    {
+      /* Does it fit in the array of this list element?  */
+      if (idx < listp->len)
+	{
+	  /* We should never get here for a module in static TLS, so
+	     we can assume that, if the generation count is zero, we
+	     still haven't determined the generation count for this
+	     module.  */
+	  if (listp->slotinfo[idx].gen)
+	    return listp->slotinfo[idx].gen;
+	  else
+	    break;
+	}
+      idx -= listp->len;
+      listp = listp->next;
+    }
+  while (listp != NULL);
+
+  /* If we get to this point, the module still hasn't been assigned an
+     entry in the dtv slotinfo data structures, and it will when we're
+     done with relocations.  At that point, the module will get a
+     generation number that is one past the current generation, so
+     return exactly that.  */
+  return GL(dl_tls_generation) + 1;
+}
+
+void *
+internal_function
+_dl_make_tlsdesc_dynamic (struct link_map *map, size_t ti_offset)
+{
+  struct hashtab *ht;
+  void **entry;
+  struct tlsdesc_dynamic_arg *td, test;
+
+  /* FIXME: We could use a per-map lock here, but is it worth it?  */
+  __rtld_lock_lock_recursive (GL(dl_load_lock));
+
+  ht = map->l_mach.tlsdesc_table;
+  if (! ht)
+    {
+      ht = htab_create ();
+      if (! ht)
+	{
+	  __rtld_lock_unlock_recursive (GL(dl_load_lock));
+	  return 0;
+	}
+      map->l_mach.tlsdesc_table = ht;
+    }
+
+  test.tlsinfo.ti_module = map->l_tls_modid;
+  test.tlsinfo.ti_offset = ti_offset;
+  entry = htab_find_slot (ht, &test, 1, hash_tlsdesc, eq_tlsdesc);
+  if (*entry)
+    {
+      td = *entry;
+      __rtld_lock_unlock_recursive (GL(dl_load_lock));
+      return td;
+    }
+
+  *entry = td = malloc (sizeof (struct tlsdesc_dynamic_arg));
+  /* This may be higher than the map's generation, but it doesn't
+     matter much.  Worst case, we'll have one extra DTV update per
+     thread.  */
+  td->gen_count = map_generation (map);
+  td->tlsinfo = test.tlsinfo;
+
+  __rtld_lock_unlock_recursive (GL(dl_load_lock));
+  return td;
+}
+
+# endif /* SHARED */
+
+/* The idea of the following two functions is to stop multiple threads
+   from attempting to resolve the same TLS descriptor without busy
+   waiting.  Ideally, we should be able to release the lock right
+   after changing td->entry, and then using say a condition variable
+   or a futex wake to wake up any waiting threads, but let's try to
+   avoid introducing such dependencies.  */
+
+inline static int
+_dl_tlsdesc_resolve_early_return_p (struct tlsdesc volatile *td, void *caller)
+{
+  if (caller != td->entry)
+    return 1;
+
+  __rtld_lock_lock_recursive (GL(dl_load_lock));
+  if (caller != td->entry)
+    {
+      __rtld_lock_unlock_recursive (GL(dl_load_lock));
+      return 1;
+    }
+
+  td->entry = _dl_tlsdesc_resolve_hold;
+
+  return 0;
+}
+
+inline static void
+_dl_tlsdesc_wake_up_held_fixups (void)
+{
+  __rtld_lock_unlock_recursive (GL(dl_load_lock));
+}
+
+#endif
Index: include/inline-hashtab.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ include/inline-hashtab.h	2008-01-31 03:37:42.000000000 -0200
@@ -0,0 +1,302 @@
+/* Fully-inline hash table, used mainly for managing TLS descriptors.
+
+   Copyright (C) 1999, 2000, 2001, 2002, 2003, 2005, 2008
+     Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+   Contributed by Alexandre Oliva  <aoliva@redhat.com>
+
+   This file is derived from a 2003's version of libiberty's
+   hashtab.c, contributed by Vladimir Makarov (vmakarov@cygnus.com),
+   but with most adaptation points and support for deleting elements
+   removed.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, write to the Free
+   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
+   02111-1307 USA.  */
+
+#ifndef INLINE_HASHTAB_H
+# define INLINE_HASHTAB_H 1
+
+extern void weak_function free (void *ptr);
+
+inline static unsigned long
+higher_prime_number (unsigned long n)
+{
+  /* These are primes that are near, but slightly smaller than, a
+     power of two.  */
+  static const unsigned long primes[] = {
+    (unsigned long) 7,
+    (unsigned long) 13,
+    (unsigned long) 31,
+    (unsigned long) 61,
+    (unsigned long) 127,
+    (unsigned long) 251,
+    (unsigned long) 509,
+    (unsigned long) 1021,
+    (unsigned long) 2039,
+    (unsigned long) 4093,
+    (unsigned long) 8191,
+    (unsigned long) 16381,
+    (unsigned long) 32749,
+    (unsigned long) 65521,
+    (unsigned long) 131071,
+    (unsigned long) 262139,
+    (unsigned long) 524287,
+    (unsigned long) 1048573,
+    (unsigned long) 2097143,
+    (unsigned long) 4194301,
+    (unsigned long) 8388593,
+    (unsigned long) 16777213,
+    (unsigned long) 33554393,
+    (unsigned long) 67108859,
+    (unsigned long) 134217689,
+    (unsigned long) 268435399,
+    (unsigned long) 536870909,
+    (unsigned long) 1073741789,
+    (unsigned long) 2147483647,
+					/* 4294967291L */
+    ((unsigned long) 2147483647) + ((unsigned long) 2147483644),
+  };
+
+  const unsigned long *low = &primes[0];
+  const unsigned long *high = &primes[sizeof(primes) / sizeof(primes[0])];
+
+  while (low != high)
+    {
+      const unsigned long *mid = low + (high - low) / 2;
+      if (n > *mid)
+	low = mid + 1;
+      else
+	high = mid;
+    }
+
+#if 0
+  /* If we've run out of primes, abort.  */
+  if (n > *low)
+    {
+      fprintf (stderr, "Cannot find prime bigger than %lu\n", n);
+      abort ();
+    }
+#endif
+
+  return *low;
+}
+
+struct hashtab
+{
+  /* Table itself.  */
+  void **entries;
+
+  /* Current size (in entries) of the hash table */
+  size_t size;
+
+  /* Current number of elements.  */
+  size_t n_elements;
+
+  /* Free function for the entries array.  This may vary depending on
+     how early the array was allocated.  If it is NULL, then the array
+     can't be freed.  */
+  void (*free) (void *ptr);
+};
+
+inline static struct hashtab *
+htab_create (void)
+{
+  struct hashtab *ht = malloc (sizeof (struct hashtab));
+
+  if (! ht)
+    return NULL;
+  ht->size = 3;
+  ht->entries = malloc (sizeof (void *) * ht->size);
+  ht->free = free;
+  if (! ht->entries)
+    {
+      if (ht->free)
+	ht->free (ht);
+      return NULL;
+    }
+
+  ht->n_elements = 0;
+
+  memset (ht->entries, 0, sizeof (void *) * ht->size);
+
+  return ht;
+}
+
+/* This is only called from _dl_unmap, so it's safe to call
+   free().  */
+inline static void
+htab_delete (struct hashtab *htab)
+{
+  int i;
+
+  for (i = htab->size - 1; i >= 0; i--)
+    if (htab->entries[i])
+      free (htab->entries[i]);
+
+  if (htab->free)
+    htab->free (htab->entries);
+  free (htab);
+}
+
+/* Similar to htab_find_slot, but without several unwanted side effects:
+    - Does not call htab->eq_f when it finds an existing entry.
+    - Does not change the count of elements/searches/collisions in the
+      hash table.
+   This function also assumes there are no deleted entries in the table.
+   HASH is the hash value for the element to be inserted.  */
+
+inline static void **
+find_empty_slot_for_expand (struct hashtab *htab, int hash)
+{
+  size_t size = htab->size;
+  unsigned int index = hash % size;
+  void **slot = htab->entries + index;
+  int hash2;
+
+  if (! *slot)
+    return slot;
+
+  hash2 = 1 + hash % (size - 2);
+  for (;;)
+    {
+      index += hash2;
+      if (index >= size)
+	index -= size;
+
+      slot = htab->entries + index;
+      if (! *slot)
+	return slot;
+    }
+}
+
+/* The following function changes size of memory allocated for the
+   entries and repeatedly inserts the table elements.  The occupancy
+   of the table after the call will be about 50%.  Naturally the hash
+   table must already exist.  Remember also that the place of the
+   table entries is changed.  If memory allocation failures are allowed,
+   this function will return zero, indicating that the table could not be
+   expanded.  If all goes well, it will return a non-zero value.  */
+
+inline static int
+htab_expand (struct hashtab *htab, int (*hash_fn)(void *))
+{
+  void **oentries;
+  void **olimit;
+  void **p;
+  void **nentries;
+  size_t nsize;
+
+  oentries = htab->entries;
+  olimit = oentries + htab->size;
+
+  /* Resize only when table after removal of unused elements is either
+     too full or too empty.  */
+  if (htab->n_elements * 2 > htab->size)
+    nsize = higher_prime_number (htab->n_elements * 2);
+  else
+    nsize = htab->size;
+
+  nentries = malloc (sizeof (void *) * nsize);
+  memset (nentries, 0, sizeof (void *) * nsize);
+  if (nentries == NULL)
+    return 0;
+  htab->entries = nentries;
+  htab->size = nsize;
+
+  p = oentries;
+  do
+    {
+      if (*p)
+	*find_empty_slot_for_expand (htab, hash_fn (*p))
+	  = *p;
+
+      p++;
+    }
+  while (p < olimit);
+
+  /* Without recording the free corresponding to the malloc used to
+     allocate the table, we couldn't tell whether this was allocated
+     by the malloc() built into ld.so or the one in the main
+     executable or libc.  Calling free() for something that was
+     allocated by the early malloc(), rather than the final run-time
+     malloc() could do Very Bad Things (TM).  We will waste memory
+     allocated early as long as there's no corresponding free(), but
+     this isn't so much memory as to be significant.  */
+
+  if (htab->free)
+    htab->free (oentries);
+
+  /* Use the free() corresponding to the malloc() above to free this
+     up.  */
+  htab->free = free;
+
+  return 1;
+}
+
+/* This function searches for a hash table slot containing an entry
+   equal to the given element.  To delete an entry, call this with
+   INSERT = 0, then call htab_clear_slot on the slot returned (possibly
+   after doing some checks).  To insert an entry, call this with
+   INSERT = 1, then write the value you want into the returned slot.
+   When inserting an entry, NULL may be returned if memory allocation
+   fails.  */
+
+inline static void **
+htab_find_slot (struct hashtab *htab, void *ptr, int insert,
+		int (*hash_fn)(void *), int (*eq_fn)(void *, void *))
+{
+  unsigned int index;
+  int hash, hash2;
+  size_t size;
+  void **entry;
+
+  if (htab->size * 3 <= htab->n_elements * 4
+      && htab_expand (htab, hash_fn) == 0)
+    return NULL;
+
+  hash = hash_fn (ptr);
+
+  size = htab->size;
+  index = hash % size;
+
+  entry = &htab->entries[index];
+  if (!*entry)
+    goto empty_entry;
+  else if (eq_fn (*entry, ptr))
+    return entry;
+
+  hash2 = 1 + hash % (size - 2);
+  for (;;)
+    {
+      index += hash2;
+      if (index >= size)
+	index -= size;
+
+      entry = &htab->entries[index];
+      if (!*entry)
+	goto empty_entry;
+      else if (eq_fn (*entry, ptr))
+	return entry;
+    }
+
+ empty_entry:
+  if (!insert)
+    return NULL;
+
+  htab->n_elements++;
+  return entry;
+}
+
+#endif /* INLINE_HASHTAB_H */
Index: elf/dl-conflict.c
===================================================================
--- elf/dl-conflict.c.orig	2008-01-31 03:37:39.000000000 -0200
+++ elf/dl-conflict.c	2008-01-31 03:37:42.000000000 -0200
@@ -1,5 +1,5 @@
 /* Resolve conflicts against already prelinked libraries.
-   Copyright (C) 2001, 2002, 2003, 2004, 2005 Free Software Foundation, Inc.
+   Copyright (C) 2001, 2002, 2003, 2004, 2005, 2008 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
    Contributed by Jakub Jelinek <jakub@redhat.com>, 2001.
 
@@ -44,7 +44,6 @@ _dl_resolve_conflicts (struct link_map *
     /* This macro is used as a callback from the ELF_DYNAMIC_RELOCATE code.  */
 #define RESOLVE_MAP(ref, version, flags) (*ref = NULL, NULL)
 #define RESOLVE(ref, version, flags) (*ref = NULL, 0)
-#define CHECK_STATIC_TLS(ref_map, sym_map) ((void) 0)
 #define RESOLVE_CONFLICT_FIND_MAP(map, r_offset) \
   do {									      \
     while ((resolve_conflict_map->l_map_end < (ElfW(Addr)) (r_offset))	      \
@@ -61,6 +60,12 @@ _dl_resolve_conflicts (struct link_map *
 
 #include "dynamic-link.h"
 
+    /* Override these, defined in dynamic-link.h.  */
+#undef CHECK_STATIC_TLS
+#define CHECK_STATIC_TLS(ref_map, sym_map) ((void) 0)
+#undef TRY_STATIC_TLS
+#define TRY_STATIC_TLS(ref_map, sym_map) (0)
+
     GL(dl_num_cache_relocations) += conflictend - conflict;
 
     for (; conflict < conflictend; ++conflict)
Index: elf/dl-reloc.c
===================================================================
--- elf/dl-reloc.c.orig	2008-01-31 03:37:39.000000000 -0200
+++ elf/dl-reloc.c	2008-01-31 03:37:42.000000000 -0200
@@ -1,5 +1,5 @@
 /* Relocate a shared object and resolve its references to other loaded objects.
-   Copyright (C) 1995-2004, 2005, 2006 Free Software Foundation, Inc.
+   Copyright (C) 1995-2004, 2005, 2006, 2008 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
    The GNU C Library is free software; you can redistribute it and/or
@@ -43,9 +43,9 @@
    This function intentionally does not return any value but signals error
    directly, as static TLS should be rare and code handling it should
    not be inlined as much as possible.  */
-void
-internal_function __attribute_noinline__
-_dl_allocate_static_tls (struct link_map *map)
+int
+internal_function
+_dl_try_allocate_static_tls (struct link_map *map)
 {
   /* If we've already used the variable with dynamic access, or if the
      alignment requirements are too high, fail.  */
@@ -53,8 +53,7 @@ _dl_allocate_static_tls (struct link_map
       || map->l_tls_align > GL(dl_tls_static_align))
     {
     fail:
-      _dl_signal_error (0, map->l_name, NULL, N_("\
-cannot allocate memory in static TLS block"));
+      return -1;
     }
 
 #if TLS_TCB_AT_TP
@@ -108,6 +107,20 @@ cannot allocate memory in static TLS blo
     }
   else
     map->l_need_tls_init = 1;
+
+  return 0;
+}
+
+void
+internal_function __attribute_noinline__
+_dl_allocate_static_tls (struct link_map *map)
+{
+  if (map->l_tls_offset == FORCED_DYNAMIC_TLS_OFFSET
+      || _dl_try_allocate_static_tls (map))
+    {
+      _dl_signal_error (0, map->l_name, NULL, N_("\
+cannot allocate memory in static TLS block"));
+    }
 }
 
 /* Initialize static TLS area and DTV for current (only) thread.
@@ -248,23 +261,6 @@ _dl_relocate_object (struct link_map *l,
 	     l->l_lookup_cache.value = _lr; }))				      \
      : l)
 
-    /* This macro is used as a callback from elf_machine_rel{a,} when a
-       static TLS reloc is about to be performed.  Since (in dl-load.c) we
-       permit dynamic loading of objects that might use such relocs, we
-       have to check whether each use is actually doable.  If the object
-       whose TLS segment the reference resolves to was allocated space in
-       the static TLS block at startup, then it's ok.  Otherwise, we make
-       an attempt to allocate it in surplus space on the fly.  If that
-       can't be done, we fall back to the error that DF_STATIC_TLS is
-       intended to produce.  */
-#define CHECK_STATIC_TLS(map, sym_map)					\
-    do {								\
-      if (__builtin_expect ((sym_map)->l_tls_offset == NO_TLS_OFFSET	\
-			    || ((sym_map)->l_tls_offset			\
-				== FORCED_DYNAMIC_TLS_OFFSET), 0))	\
-	_dl_allocate_static_tls (sym_map);				\
-    } while (0)
-
 #include "dynamic-link.h"
 
     ELF_DYNAMIC_RELOCATE (l, lazy, consider_profiling);
Index: elf/dynamic-link.h
===================================================================
--- elf/dynamic-link.h.orig	2008-01-31 03:37:39.000000000 -0200
+++ elf/dynamic-link.h	2008-01-31 03:37:42.000000000 -0200
@@ -1,5 +1,5 @@
 /* Inline functions for dynamic linking.
-   Copyright (C) 1995-2005, 2006 Free Software Foundation, Inc.
+   Copyright (C) 1995-2005, 2006, 2008 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
    The GNU C Library is free software; you can redistribute it and/or
@@ -17,6 +17,31 @@
    Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
    02111-1307 USA.  */
 
+/* This macro is used as a callback from elf_machine_rel{a,} when a
+   static TLS reloc is about to be performed.  Since (in dl-load.c) we
+   permit dynamic loading of objects that might use such relocs, we
+   have to check whether each use is actually doable.  If the object
+   whose TLS segment the reference resolves to was allocated space in
+   the static TLS block at startup, then it's ok.  Otherwise, we make
+   an attempt to allocate it in surplus space on the fly.  If that
+   can't be done, we fall back to the error that DF_STATIC_TLS is
+   intended to produce.  */
+#define CHECK_STATIC_TLS(map, sym_map)					\
+    do {								\
+      if (__builtin_expect ((sym_map)->l_tls_offset == NO_TLS_OFFSET	\
+			    || ((sym_map)->l_tls_offset			\
+				== FORCED_DYNAMIC_TLS_OFFSET), 0))	\
+	_dl_allocate_static_tls (sym_map);				\
+    } while (0)
+
+#define TRY_STATIC_TLS(map, sym_map)					\
+    (__builtin_expect ((sym_map)->l_tls_offset				\
+		       != FORCED_DYNAMIC_TLS_OFFSET, 1)			\
+     && (__builtin_expect ((sym_map)->l_tls_offset != NO_TLS_OFFSET, 1)	\
+	 || _dl_try_allocate_static_tls (sym_map) == 0))
+
+int internal_function _dl_try_allocate_static_tls (struct link_map *map);
+
 #include <elf.h>
 #include <assert.h>
 
-- 
Alexandre Oliva         http://www.lsd.ic.unicamp.br/~oliva/
FSF Latin America Board Member         http://www.fsfla.org/
Red Hat Compiler Engineer   aoliva@{redhat.com, gcc.gnu.org}
Free Software Evangelist  oliva@{lsd.ic.unicamp.br, gnu.org}

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]