This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[Bug nptl/11291] New: potential deadlock in sem_*wait and sem_post for MIPS architectures


I've encountered the following issue while heavily using glibc's
sem_wait and sem_post simultaneously in a lot of threads: it seems
that the sem_post and sem_wait functions can enter an endless loop in
some corner cases. After diving into the code, it seems that (for the
case of sem_post, nptl/sysdeps/unix/sysv/linux/sem_post.c) the
following code is causing the problem:

int
__new_sem_post (sem_t *sem)
{
 struct new_sem *isem = (struct new_sem *) sem;

 __typeof (isem->value) cur;
 do
   {
     cur = isem->value;
     if (isem->value == SEM_VALUE_MAX)
       {
         __set_errno (EOVERFLOW);
         return -1;
       }
   }
 while (atomic_compare_and_exchange_bool_acq (&isem->value, cur + 1, cur));


The problem occurs because gcc is optimizing this piece of code, and
puts 'isem->value' in a register before the while loop actually
begins. The atomic_compare_and_exchange_bool_acq macro then compares
'cur' (from the register, which is never updated in the loop) with
'isem->value' (from memory, which could be updated by another thread).
When we have multiple threads running and using the same semaphore,
the isem->value might be updated by another thread after the register
is filled with the value from memory, and then we run into an endless
loop...

I was able to fix this (and more issues of the same kind in the other
semaphore functions) by adding a 'volatile' keyword:

int
__new_sem_post (sem_t *sem)
{
 struct new_sem volatile *isem = (struct new_sem *) sem;

===8<===8<===8<=== Additional info
glibc 2.9, but still present in git

kernel Linux 2.6.28.10  Wed Feb 17 13:32:34 CET 2010 mips unknown

$ mipsel-linux-gcc -v
Using built-in specs.
Target: mipsel-linux
Configured with: --with-gnu-ld --enable-shared --enable-target-optspace --
enable-languages=c,c++,objc --enable-threads=posix --enable-multilib --enable-
c99 --enable-long-long --enable-symvers=gnu --enable-libstdcxx-pch --program-
prefix=mipsel-linux- --enable-libssp --disable-bootstrap --enable-libgomp --
disable-libmudflap --disable-libunwind-exceptions --enable-libssp --enable-
libgomp --disable-libmudflap --enable-__cxa_atexit
Thread model: posix
gcc version 4.2.4

$ mipsel-linux-ld -v
GNU ld (Linux/GNU Binutils) 2.18.50.0.7.20080502

===8<===8<===8<=== Patch that solved the problem for me:

diff --git a/nptl/sysdeps/unix/sysv/linux/sem_post.c 
b/nptl/sysdeps/unix/sysv/linux/sem_post.c
index 58b226f..0d4a6d6 100644
--- a/nptl/sysdeps/unix/sysv/linux/sem_post.c
+++ b/nptl/sysdeps/unix/sysv/linux/sem_post.c
@@ -29,7 +29,7 @@
 int
 __new_sem_post (sem_t *sem)
 {
-  struct new_sem *isem = (struct new_sem *) sem;
+  struct new_sem volatile *isem = (struct new_sem *) sem;
 
   __typeof (isem->value) cur;
   do
@@ -64,7 +64,7 @@ int
 attribute_compat_text_section
 __old_sem_post (sem_t *sem)
 {
-  int *futex = (int *) sem;
+  int volatile *futex = (int *) sem;
 
   int nr = atomic_increment_val (futex);
   /* We always have to assume it is a shared semaphore.  */
diff --git a/nptl/sysdeps/unix/sysv/linux/sem_timedwait.c 
b/nptl/sysdeps/unix/sysv/linux/sem_timedwait.c
index fdf0d74..a241ad6 100644
--- a/nptl/sysdeps/unix/sysv/linux/sem_timedwait.c
+++ b/nptl/sysdeps/unix/sysv/linux/sem_timedwait.c
@@ -34,7 +34,7 @@ extern void __sem_wait_cleanup (void *arg) attribute_hidden;
 int
 sem_timedwait (sem_t *sem, const struct timespec *abstime)
 {
-  struct new_sem *isem = (struct new_sem *) sem;
+  struct new_sem volatile *isem = (struct new_sem *) sem;
   int err;
 
   if (atomic_decrement_if_positive (&isem->value) > 0)
diff --git a/nptl/sysdeps/unix/sysv/linux/sem_trywait.c 
b/nptl/sysdeps/unix/sysv/linux/sem_trywait.c
index f500361..74e1170 100644
--- a/nptl/sysdeps/unix/sysv/linux/sem_trywait.c
+++ b/nptl/sysdeps/unix/sysv/linux/sem_trywait.c
@@ -30,7 +30,7 @@
 int
 __new_sem_trywait (sem_t *sem)
 {
-  int *futex = (int *) sem;
+  int volatile *futex = (int *) sem;
   int val;
 
   if (*futex > 0)
diff --git a/nptl/sysdeps/unix/sysv/linux/sem_wait.c 
b/nptl/sysdeps/unix/sysv/linux/sem_wait.c
index 20e2b48..5601e1a 100644
--- a/nptl/sysdeps/unix/sysv/linux/sem_wait.c
+++ b/nptl/sysdeps/unix/sysv/linux/sem_wait.c
@@ -32,7 +32,7 @@ void
 attribute_hidden
 __sem_wait_cleanup (void *arg)
 {
-  struct new_sem *isem = (struct new_sem *) arg;
+  struct new_sem volatile *isem = (struct new_sem *) arg;
 
   atomic_decrement (&isem->nwaiters);
 }
@@ -41,7 +41,7 @@ __sem_wait_cleanup (void *arg)
 int
 __new_sem_wait (sem_t *sem)
 {
-  struct new_sem *isem = (struct new_sem *) sem;
+  struct new_sem volatile *isem = (struct new_sem *) sem;
   int err;
 
   if (atomic_decrement_if_positive (&isem->value) > 0)
@@ -90,7 +90,7 @@ int
 attribute_compat_text_section
 __old_sem_wait (sem_t *sem)
 {
-  int *futex = (int *) sem;
+  int volatile *futex = (int *) sem;
   int err;
 
   do

-- 
           Summary: potential deadlock in sem_*wait and sem_post for MIPS
                    architectures
           Product: glibc
           Version: 2.9
            Status: NEW
          Severity: normal
          Priority: P2
         Component: nptl
        AssignedTo: drepper at redhat dot com
        ReportedBy: mischa dot jonker at viragelogic dot com
                CC: glibc-bugs at sources dot redhat dot com,mischa dot
                    jonker at viragelogic dot com
 GCC build triplet: i386-linux-gnu
  GCC host triplet: mipsel-linux-gnu
GCC target triplet: mipsel-linux-gnu


http://sourceware.org/bugzilla/show_bug.cgi?id=11291

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]