This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[Bug nptl/18463] pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM


https://sourceware.org/bugzilla/show_bug.cgi?id=18463

--- Comment #6 from Marc Kleine-Budde <mkl at pengutronix dot de> ---
We had another, deeper look at the glibc and kernel code.

For PI and robust mutexes there is runtime detection, but no
fallback code. But that's not possible without kernel support. Instead a
proper error code is returned:

For example PI mutex:

> int
> __pthread_mutex_init (mutex, mutexattr)
>      pthread_mutex_t *mutex;
>      const pthread_mutexattr_t *mutexattr;
> {
...
>     case PTHREAD_PRIO_INHERIT << PTHREAD_MUTEXATTR_PROTOCOL_SHIFT:
>       if (__glibc_unlikely (prio_inherit_missing ()))
>         return ENOTSUP;
>       break;

and prio_inherit_missing() is:

> static bool
> prio_inherit_missing (void)
> {
> #ifdef __NR_futex
> # ifndef __ASSUME_FUTEX_LOCK_PI
>   static int tpi_supported;
>   if (__glibc_unlikely (tpi_supported == 0))
>     {
>       int lock = 0;
>       INTERNAL_SYSCALL_DECL (err);
>       int ret = INTERNAL_SYSCALL (futex, err, 4, &lock, FUTEX_UNLOCK_PI, 0, 0);

This is the runtime detection, if the syscall return ENOSYS, then the kernel
has no support for PI mutexes.

>       assert (INTERNAL_SYSCALL_ERROR_P (ret, err));
>       tpi_supported = INTERNAL_SYSCALL_ERRNO (ret, err) == ENOSYS ? -1 : 1;
>     }
>   return __glibc_unlikely (tpi_supported < 0);
> # endif
>   return false;
> #endif
>   return true;
> }

So far so good. We have runtime detection for FUTEX_LOCK_PI and
SET_ROBUST_LIST (implemented likewise), but the problem is REQUEUE_PI.

The way PI mutexes are implemented you _have_ to use REQUEUE_PI, quoting
the Kernel's ./Documentation/futex-requeue-pi.txt:

> In order to ensure the rt_mutex has an owner if it has waiters, it
> is necessary for both the requeue code, as well as the waiting code,
> to be able to acquire the rt_mutex before returning to user space.
> The requeue code cannot simply wake the waiter and leave it to
> acquire the rt_mutex as it would open a race window between the
                                          ^^^^^^^^^^^
> requeue call returning to user space and the waiter waking and
> starting to run.  This is especially true in the uncontended case.
> 
> The solution involves two new rt_mutex helper routines,
> rt_mutex_start_proxy_lock() and rt_mutex_finish_proxy_lock(), which
> allow the requeue code to acquire an uncontended rt_mutex on behalf
> of the waiter and to enqueue the waiter on a contended rt_mutex.
> Two new system calls provide the kernel<->user interface to
> requeue_pi: FUTEX_WAIT_REQUEUE_PI and FUTEX_CMP_REQUEUE_PI.
              ^^^^^^^^^^^^^^^^^^^^^
> 
> FUTEX_WAIT_REQUEUE_PI is called by the waiter (pthread_cond_wait()
  ^^^^^^^^^^^^^^^^^^^^^
> and pthread_cond_timedwait()) to block on the initial futex and wait
> to be requeued to a PI-aware futex.  The implementation is the
> result of a high-speed collision between futex_wait() and
> futex_lock_pi(), with some extra logic to check for the additional
> wake-up scenarios.

Looking at pthread_cond_wait:

> int
> __pthread_cond_wait (pthread_cond_t *cond, pthread_mutex_t *mutex)
> {
[...]
> #if (defined lll_futex_wait_requeue_pi \
>      && defined __ASSUME_REQUEUE_PI)
                  ^^^^^^^^^^^^^^^^^^^
>       /* If pi_flag remained 1 then it means that we had the lock and the mutex                                                                                                                                    
>          but a spurious waker raced ahead of us.  Give back the mutex before                                                                                                                                       
>          going into wait again.  */
>       if (pi_flag)
>         {
>           __pthread_mutex_cond_lock_adjust (mutex);
>           __pthread_mutex_unlock_usercnt (mutex, 0);
>         }
>       pi_flag = USE_REQUEUE_PI (mutex);
> 
>       if (pi_flag)
>         {
>           err = lll_futex_wait_requeue_pi (&cond->__data.__futex,
>                                            futex_val, &mutex->__data.__lock,
>                                            pshared);

Here's the requeue_pi syscall, but no runtime check for it. See
__ASSUME_REQUEUE_PI above. If __ASSUME_REQUEUE_PI is not present this
code is not compiled in, resulting in the breakage we're seeing.

> 
>           pi_flag = (err == 0);
>         }
>       else
> #endif
>           /* Wait until woken by signal or broadcast.  */
>         lll_futex_wait (&cond->__data.__futex, futex_val, pshared);


Looking a the kernel's kernel/futex.c:

> long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout,
>                 u32 __user *uaddr2, u32 val2, u32 val3)
> {
...
>         switch (cmd) {
>         case FUTEX_LOCK_PI:
>         case FUTEX_UNLOCK_PI:
>         case FUTEX_TRYLOCK_PI:
>         case FUTEX_WAIT_REQUEUE_PI:
>         case FUTEX_CMP_REQUEUE_PI:
>                 if (!futex_cmpxchg_enabled)
>                         return -ENOSYS;
>         }

This is the corresponding code for the runtime checks in the glibc. The
runtime check for:

>         case FUTEX_WAIT_REQUEUE_PI:
>         case FUTEX_CMP_REQUEUE_PI:

has been added in v3.4-rc1~195^2~1. The glibc however only uses
lll_futex_wait_requeue_pi on PI mutexes, which are runtime checked
during pthread_mutex_init. So from my point of view the correct fix in
the glibc is to get rid of __ASSUME_REQUEUE_PI completely - or - for ARM
remove the undef __ASSUME_REQUEUE_PI.


=======================================

diff --git a/sysdeps/unix/sysv/linux/arm/kernel-features.h
b/sysdeps/unix/sysv/linux/arm/kernel-features.h
index cb407dbc300c..65046b16974b 100644

--- a/sysdeps/unix/sysv/linux/arm/kernel-features.h

+++ b/sysdeps/unix/sysv/linux/arm/kernel-features.h

@@ -39,6 +39,6 @@

    configuration.  */ 
 #if __LINUX_KERNEL_VERSION < 0x030E03 
 # undef __ASSUME_FUTEX_LOCK_PI 
-# undef __ASSUME_REQUEUE_PI 
+//# undef __ASSUME_REQUEUE_PI 
 # undef __ASSUME_SET_ROBUST_LIST 
 #endif 

Marc

-- 
You are receiving this mail because:
You are on the CC list for the bug.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]