This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Re: [PR18457] Don't require rtld lock to compute DTV addr for static TLS


On Wed, Jun 03, 2015 at 03:44:58AM -0300, Alexandre Oliva wrote:
> We used to store static TLS addrs in the DTV at module load time, but
> this required one thread to modify another thread's DTV.  Now that we
> defer the DTV update to the first use in the thread, we should do so
> without taking the rtld lock if the module is already assigned to static
> TLS.  Taking the lock introduces deadlocks where there weren't any
> before.
> 
> This patch fixes the deadlock caused by tls_get_addr's unnecessarily
> taking the rtld lock to initialize the DTV entry for tls_dtor_list
> within __call_dtors_list, which deadlocks with a dlclose of a module
> whose finalizer joins with that thread.  The patch does not, however,
> attempt to fix other potential sources of similar deadlocks, such as
> the explicit rtld locks taken by call_dtors_list, when the dtor list
> is not empty; lazy relocation of the reference to tls_dtor_list, when
> TLS Descriptors are in use; when tls dtors call functions through the
> PLT and lazy relocation needs to be performed, or any such called
> functions interact with the dynamic loader in ways that require its
> lock to be taken.
> 
> Ok to install?

It's not good enough; in fact, it is probably just dancing around the
problem.  With the simple patch to the test case below applied, the
test case deadlocks.  Andreas' reproducer can be fixed by simply marking
the TLS variables in cxa_thread_atexit as initial-exec; I've got a
patch for it that I'll post shortly.  That would leave two other
problems:

1. All of the lock taking and NODELETE flag clearing in
   cxa_thread_atexit.  Not only can it cause a deadlock, clearing the
   flag like that may actually be wrong.  We may be better off not
   unloading the DSO at all, but I'll see if there's another way out.

2. The lock taking in tls_get_addr_tail.  That has to go and we need
   to figure out another way to wait for another dlopen to complete.
   I haven't wrapped my head around this bit of the code properly yet
   and you may be better placed to debug this.  If you don't have
   enough time, I could run with any kind of help/guidance you may
   provide.
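
For reference, a minimal sketch of what marking a TLS variable as
initial-exec looks like with GCC.  This is not the glibc patch itself;
the names counter, worker, run_worker and main_thread_value are
hypothetical and exist only for illustration.  With this model the
variable is reached through a fixed offset from the thread pointer, so
an access does not go through __tls_get_addr.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread variable.  The tls_model attribute asks the
   compiler to use the initial-exec access model for it.  */
static __thread int counter __attribute__ ((tls_model ("initial-exec")));

/* Thread body: writes this thread's copy and returns its value.  */
static void *
worker (void *arg)
{
  (void) arg;
  counter = 42;
  return (void *) (intptr_t) counter;
}

/* Sets the caller's copy to 7, runs worker in a new thread and returns
   the value the worker stored in its own copy.  */
int
run_worker (void)
{
  pthread_t th;
  void *ret = NULL;
  int rc;

  counter = 7;
  rc = pthread_create (&th, NULL, worker, NULL);
  assert (rc == 0);
  rc = pthread_join (th, &ret);
  assert (rc == 0);
  return (int) (intptr_t) ret;
}

/* Returns the calling thread's copy, which the worker never touches.  */
int
main_thread_value (void)
{
  return counter;
}
```

Calling run_worker () and then main_thread_value () from the same
thread shows that each thread has its own copy.  The trade-off is that
a shared object using initial-exec needs room in the static TLS block,
so a dlopen of such an object can fail when that space is exhausted.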

Siddhesh

diff --git a/nptl/tst-join7mod.c b/nptl/tst-join7mod.c
index a8c7bc0..a066a1f 100644
--- a/nptl/tst-join7mod.c
+++ b/nptl/tst-join7mod.c
@@ -4,12 +4,15 @@
 static pthread_t th;
 static int running = 1;
 
+static __thread int foo;
+
 static void *
 test_run (void *p)
 {
   while (running)
     fprintf (stderr, "XXX test_run\n");
   fprintf (stderr, "XXX test_run FINISHED\n");
+  foo = 42;
   return NULL;
 }
 
