[ created from http://gcc.gnu.org/PR68802 ] The following program fails on ARM when statically linked: #include <cstdio> #include <thread> __thread int __attribute__ ((tls_model ("global-dynamic"))) x = 10; class Thread { public: void operator()(){ fprintf(stderr, "testing (%i) ...\n", x); } }; int main(void){ Thread t; std::thread th(std::ref(t)); th.join(); return 0; } Compiling: $ arm-unknown-linux-gnueabihf-g++ -o test_static -Wall -Wl,--whole-archive -lpthread -Wl,--no-whole-archive -static tsimple.c when run on a raspberry pi: $ ./test_static Segmentation fault
f8aeae347377f3dfa8cbadde057adf1827fb1d44 is the first bad commit commit f8aeae347377f3dfa8cbadde057adf1827fb1d44 Author: Alexandre Oliva <aoliva@redhat.com> Date: Tue Mar 17 01:14:11 2015 -0300 Fix DTV race, assert, DTV_SURPLUS Static TLS limit, and nptl_db garbage for ChangeLog [BZ #17090] [BZ #17620] [BZ #17621] [BZ #17628]
Plain C testcase: #include <stdio.h> #include <pthread.h> __thread int __attribute__ ((tls_model ("global-dynamic"))) x = 10; void * th (void *p) { printf ("x = %d\n", x); return 0; } int main (void) { printf ("x = %d\n", x); pthread_t t; pthread_create (&t, 0, th, 0); pthread_join (t, 0); }
Thanks for the report, I'm looking into it. The relevant question to ask IMHO is why linker relaxations are not transforming the GD accesses into LE as expected. AFAIK GD is not expected to be functional in a fully-static environment; because of relaxations, it is only functional after dlopen and, even then, maybe only from the loaded modules themselves, because all TLS accesses in the executable are supposed to be relaxed to LE. Anyway, I'll see whether we can remove this assumption and make it work as it used to. Unfortunately, my access to the ARM platform is somewhat limited; can you confirm that adding an explicit successful dlopen call to the program before the GD access fixes it? Perhaps any internal dlopening (nss, locale) would suffice.
How do you even get references to __tls_get_addr in a static program? I don't see that the symbol is defined in any of the static libraries! I wonder if it's ARM-specific, or whether the call is relaxed, after all, just not correctly.
On Fri, 16 Sep 2016, aoliva at sourceware dot org wrote: > How do you even get references to __tls_get_addr in a static program? I don't > see that the symbol is defined in any of the static libraries! I wonder if > it's ARM-specific, or whether the call is relaxed, after all, just not > correctly. See sysdeps/arm/libc-tls.c. I don't know if this bug is the same problem as nptl/tst-cancel24-static failing, but that was traced back to the same commit some time ago <https://sourceware.org/ml/libc-alpha/2015-11/msg00529.html>, and appears on lots of architectures as can be seen in the release test results on the wiki.
Thanks, Joseph, I see that a number of architectures that use this such a custom tls_get_addr that doesn't update the DTV. I can see how this would not work well after I removed the DTV initialization code from both elf/dl-reloc.c and nptl/allocatestack.c, that didn't seem to be needed any more, and that in some cases could cause data races. Oops. Fortunately, the potential race seems to no longer be present. At least I don't immediately see how the early initialization of the static entries of the DTV would cause a race, with the new structure of the DTV and the way it is updated, but I haven't thought it through yet. This patch restores the early initialization, which should make static TLS work again on the affected platforms. The alternative would be to bring DTV-updating abilities to every one of the custom versions of tls_get_addr, which is IMHO undesirable. Could anyone with easy access to ARM or other affected platforms please give this a spin and let me know how it goes? I've only regression-tested it on x86_64-linux-gnu so far, and at least there this logical reversal (because the data structure changed, it's not literal) doesn't cause any further regressions. Obviously, it doesn't prove the problem is fixed either, since x86_64 didn't trigger it to begin with. Thanks in advance, diff --git a/elf/dl-reloc.c b/elf/dl-reloc.c index 42bddc1..dcab666 100644 --- a/elf/dl-reloc.c +++ b/elf/dl-reloc.c @@ -137,6 +137,12 @@ _dl_nothread_init_static_tls (struct link_map *map) # error "Either TLS_TCB_AT_TP or TLS_DTV_AT_TP must be defined" #endif + /* Fill in the DTV slot so that a later LD/GD access will find it. */ + dtv_t *dtv = THREAD_DTV (); + assert (map->l_tls_modid <= dtv[-1].counter); + dtv[map->l_tls_modid].pointer.to_free = NULL; + dtv[map->l_tls_modid].pointer.val = dest; + /* Initialize the memory. */ memset (__mempcpy (dest, map->l_tls_initimage, map->l_tls_initimage_size), '\0', map->l_tls_blocksize - map->l_tls_initimage_size); diff --git a/nptl/allocatestack.c b/nptl/allocatestack.c index 60b34dc..e49005c 100644 --- a/nptl/allocatestack.c +++ b/nptl/allocatestack.c @@ -1199,6 +1199,7 @@ __nptl_setxid (struct xid_command *cmdp) static inline void __attribute__((always_inline)) init_one_static_tls (struct pthread *curp, struct link_map *map) { + dtv_t *dtv = GET_DTV (TLS_TPADJ (curp)); # if TLS_TCB_AT_TP void *dest = (char *) curp - map->l_tls_offset; # elif TLS_DTV_AT_TP @@ -1207,9 +1208,14 @@ init_one_static_tls (struct pthread *curp, struct link_map *map) # error "Either TLS_TCB_AT_TP or TLS_DTV_AT_TP must be defined" # endif - /* We cannot delay the initialization of the Static TLS area, since - it can be accessed with LE or IE, but since the DTV is only used - by GD and LD, we can delay its update to avoid a race. */ + /* Fill in the DTV slot so that a later LD/GD access will find it. + A number of platforms require at least the main executable's DTV + to be set up for static TLS to work, because they don't require + linker relaxations. */ + dtv[map->l_tls_modid].pointer.to_free = NULL; + dtv[map->l_tls_modid].pointer.val = dest; + + /* Initialize the memory. */ memset (__mempcpy (dest, map->l_tls_initimage, map->l_tls_initimage_size), '\0', map->l_tls_blocksize - map->l_tls_initimage_size); }
That patch doesn't help.
Neither of the two functions is actually called in the test case. Any fixes need to be made to _dl_allocate_tls_init.
Err... I don't see where _dl_allocate_tls_init is called either. AFAICT it is __libc_setup_tls (csu/libc-tls.c) that sets up the DTV in static programs. The weirdness I see is that it appears to set up DTV[2] with the main program's TLS static block pointer, but then __tls_get_addr appears to access DTV[1]. I'm still trying to create a static binary linked with current glibc that will run on either my phone (Neo FreeRunner) or my BBB, but I'm not having much luck so far. Maybe I'll resort to the Yeeloong: it seems like the same problem occurs on MIPS.
You can also run it on a aarch64 system, there are a few on the GCC compile farm. #0 _dl_allocate_tls_init (result=0xf77ff7c0) at dl-tls.c:476 #1 0x000129fc in allocate_stack (stack=<synthetic pointer>, pdp=<synthetic pointer>, attr=<optimized out>) at allocatestack.c:584 #2 __pthread_create_2_1 (newthread=<optimized out>, attr=<optimized out>, start_routine=<optimized out>, arg=<optimized out>) at pthread_create.c:539 #3 0x00010518 in main () at tls.c:18 sysdeps/arm/nptl/tls.h: /* Install the dtv pointer. The pointer passed is to the element with index -1 which contain the length. */ # define INSTALL_DTV(tcbp, dtvp) \ (((tcbhead_t *) (tcbp))->dtv = (dtvp) + 1)
aarch64 was the first thing I tried, but the compile farm hosts won't let me in. It seems like the old key types with which I registered are no longer acceptable to either my local ssh client or their ssh server, and my newer keys are not there, so I can't use them yet. I sent an update request last week, but I haven't got a response yet. As for _dl_tls_allocate_init, thanks for the backtrace. I was expecting it a lot earlier than that. The report didn't make it clear the problem was in the separate thread, rather than in the main one.
Given all the information I got so far, it looks like this will fix it: diff --git a/elf/dl-tls.c b/elf/dl-tls.c index 17567ad..60f4c1d 100644 --- a/elf/dl-tls.c +++ b/elf/dl-tls.c @@ -538,6 +538,10 @@ _dl_allocate_tls_init (void *result) # error "Either TLS_TCB_AT_TP or TLS_DTV_AT_TP must be defined" #endif + /* Set up the DTV entry. The simplified __tls_get_addr that + some platforms use in static programs requires it. */ + dtv[map->l_tls_modid].pointer.val = dest; + /* Copy the initialization image and clear the BSS part. */ memset (__mempcpy (dest, map->l_tls_initimage, map->l_tls_initimage_size), '\0',
That fixes the issue.
This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "GNU C Library master sources". The branch, master has been updated via 17af5da98cd2c9ec958421ae2108f877e0945451 (commit) from 444eacba82f675d4657ad55da67b355536be90ab (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=17af5da98cd2c9ec958421ae2108f877e0945451 commit 17af5da98cd2c9ec958421ae2108f877e0945451 Author: Alexandre Oliva <aoliva@redhat.com> Date: Wed Sep 21 22:01:16 2016 -0300 [PR19826] fix non-LE TLS in static programs An earlier fix for TLS dropped early initialization of DTV entries for modules using static TLS, leaving it for __tls_get_addr to set them up. That worked on platforms that require the GD access model to be relaxed to LE in the main executable, but it caused a regression on platforms that allow GD in the main executable, particularly in statically-linked programs: they use a custom __tls_get_addr that does not update the DTV, which fails when the DTV early initialization is not performed. In static programs, __libc_setup_tls performs the DTV initialization for the main thread, but the DTV of other threads is set up in _dl_allocate_tls_init, so that's the fix that matters. Restoring the initialization in the remaining functions modified by this patch was just for uniformity. It's not clear that it is ever needed: even on platforms that allow GD in the main executable, the dynamically-linked version of __tls_get_addr would set up the DTV entries, even for static TLS modules, while updating the DTV counter. for ChangeLog [BZ #19826] * elf/dl-tls.c (_dl_allocate_tls_init): Restore DTV early initialization of static TLS entries. * elf/dl-reloc.c (_dl_nothread_init_static_tls): Likewise. * nptl/allocatestack.c (init_one_static_tls): Likewise. ----------------------------------------------------------------------- Summary of changes: ChangeLog | 8 ++++++++ elf/dl-reloc.c | 6 ++++++ elf/dl-tls.c | 4 ++++ nptl/allocatestack.c | 9 ++++++--- 4 files changed, 24 insertions(+), 3 deletions(-)
Fixed
Please set the target milestone (to the first mainline version with the fix) when marking a bug as fixed.
Thanks for the reminder. It would be nice of bugzilla helped enforce this requirement somehow; it seems like the sort of thing that it would be desirable to automate in the interest of reducing mistakes.
*** Bug 19303 has been marked as a duplicate of this bug. ***