18463 – pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM

Bug 18463 - pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on ARM

Summary: pthread_cond_broadcast issue when surrounded by PTHREAD_PRIO_INHERIT mutex on...

Status:	RESOLVED FIXED

Alias:	None

Product:	glibc
Classification:	Unclassified
Component:	nptl (show other bugs)
Version:	2.20

Importance:	P2 normal
Target Milestone:	2.25
Assignee:	Not yet assigned to anyone

URL:
Keywords:

Duplicates (1):	19941 (view as bug list)
Depends on:
Blocks:

Reported:	2015-05-29 11:09 UTC by mpa
Modified:	2017-01-11 18:06 UTC (History)
CC List:	6 users (show)

See Also:
Host:
Target:
Build:
Last reconfirmed:

Flags:	fweimer: security-

Attachments
test program that triggers this issue (508 bytes, text/x-csrc) 2015-05-29 11:09 UTC, mpa	Details
Proposed fix for ARM (565 bytes, patch) 2015-06-26 16:17 UTC, Marc Kleine-Budde	Details \| Diff
signature.asc (374 bytes, application/pgp-signature) 2015-07-17 10:17 UTC, Marc Kleine-Budde	Details
View All Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description mpa 2015-05-29 11:09:38 UTC

Created attachment 8335 [details]
test program that triggers this issue

The attached program uses pthread_cond_broadcast() to wake up two threads
waiting on a condition variable which is surrounded by a mutex with
PTHREAD_PRIO_INHERIT set. The pthread_mutex_unlock() after
pthread_cond_broadcast() returns EPERM in the main thread.

This does not happen when using pthread_cond_signal(). Also it does not happen
when using just one thread that is waiting or when PTHREAD_PRIO_INHERIT is not
set.

This issue is reproducable with the attached program on the following setups:
 - Kernel 4.0 with OSELAS Toolchain 2014.12.0 for cortex a8, glibc-2.20
 - Kernel 4.0 with OSELAS Toolchain 2014.12.0 for armv7a without NEON, glibc-2.20
 - Kernel 4.0.0-1-armmp with native Debian experimental toolchain for armhf, glibc-2.21, gcc-5
 - Kernel 3.18.0-trunk-kirkwood native Debian experimental toolchain for armel, glibc-2.21, gcc-5

This issue does not exist on:
 - Kernel 3.16 with Debian toolchain for x86_64 , glibc-2.19, gcc-4.9.2
 - Kernel 4.0.4 (vanilla, not debian) with Debian experimental toolchain for x86_64, glibc-2.21, gcc-5
 - Kernel 4.0.0-1-armmp with native Debian testing toolchain for armhf, glibc-2.19, gcc-4.9.2
 - Kernel 4.0.0-1-armmp crosscompile Debian unstable toolchain for armhf, glibc-2.19, gcc-4.9.2
 - Kernel 3.18.0-trunk-kirkwood native Debian testing toolchain for armel, glibc-2.19, gcc-4.9.2

I found bug 17013 which seems related. But 17013 is marked as fixed in
glibc-2.20 which is where the issue described above was introduced.

We tried reverting
        8f630cca5c36941db1cb48726016bbed80ec1041 Fix lll_unlock twice in pthread_cond_broadcast
it does not solve the issue.

Comment 1 Marc Kleine-Budde 2015-06-13 17:06:14 UTC

I bisected the problem to the following comit:

    47c5adebd2c8 Correct robust mutex / PI futex kernel assumptions (bug 9894).

Applying this HACK to glibc-2.20 works around to problem:

index e755741de60b..a56290ce4ca4 100644
--- a/sysdeps/unix/sysv/linux/arm/kernel-features.h
+++ b/sysdeps/unix/sysv/linux/arm/kernel-features.h
@@ -37,6 +37,6 @@
 /* The ARM kernel may or may not support
    futex_atomic_cmpxchg_inatomic, depending on kernel
    configuration.  */
-#undef __ASSUME_FUTEX_LOCK_PI
-#undef __ASSUME_REQUEUE_PI
-#undef __ASSUME_SET_ROBUST_LIST
+//#undef __ASSUME_FUTEX_LOCK_PI
+//#undef __ASSUME_REQUEUE_PI
+//#undef __ASSUME_SET_ROBUST_LIST

The glibc-2.20 is build with "--enable-kernel=3.0" using linux-3.16 header generated by "make headers_install".

Marc

-------->8-------->8-------->8-------->8-------->8-------->8--------

47c5adebd2c864a098c3af66e61e1147dc3cf0b4 is the first bad commit
commit 47c5adebd2c864a098c3af66e61e1147dc3cf0b4
Author: Joseph Myers <joseph@codesourcery.com>
Date:   Mon Mar 31 12:55:18 2014 +0000

    Correct robust mutex / PI futex kernel assumptions (bug 9894).
    
    This patch continues fixing __ASSUME_* issues in preparation for
    moving to a 2.6.32 minimum kernel version by addressing assumptions on
    robust mutex and PI futex support availability.  Those assumptions are
    bug 9894, but to be clear this patch does not address all the issues
    from that bug about wrong version assumptions, only those still
    applicable for --enable-kernel=2.6.32 or later (with the expectation
    that the move to that minimum kernel will obsolete the other parts of
    the bug).  The patch is independent of
    <https://sourceware.org/ml/libc-alpha/2014-03/msg00585.html>, my other
    pending-review patch preparing for the kernel version change; the two
    together complete all the changes I believe are needed in preparation
    regarding any macro in sysdeps/unix/sysv/linux/kernel-features.h that
    would be affected by such a change.  (I have not checked the
    correctness of macros whose conditions are unaffected by such a
    change, or macros only defined in other kernel-features.h files.)
    
    As discussed in that bug, robust mutexes and PI futexes need
    futex_atomic_cmpxchg_inatomic to be implemented, in addition to
    certain syscalls needed for robust mutexes (and
    architecture-independent kernel pieces for all the features in
    question).  That is, as I understand it, they need
    futex_atomic_cmpxchg_inatomic to *work* (not return an ENOSYS error).
    
    The issues identified in my analysis relate to ARM, M68K, MicroBlaze,
    MIPS and SPARC.
    
    On ARM, whether futex_atomic_cmpxchg_inatomic works depends on the
    kernel configuration.  As of 3.13, the condition for *not* working is
    CONFIG_CPU_USE_DOMAINS && CONFIG_SMP.  As of 2.6.32 it was simply
    CONFIG_SMP that meant the feature was not implemented.  I don't know
    if there are any circumstances in which we can say "we can assume a
    userspace glibc binary built with these options will never run on a
    kernel with the problematic configuration", but at least for now I'm
    just undefining the relevant __ASSUME_* macros for ARM.
    
    On M68K, two of the three macros are undefined for kernels before
    3.10, but as far as I can see __ASSUME_FUTEX_LOCK_PI is in the same
    group needing futex_atomic_cmpxchg_inatomic support and so should be
    undefined as well.
    
    On MicroBlaze the required support was added in 2.6.33.
    
    On MIPS, the support depends on cpu_has_llsc in the kernel - that is,
    actual hardware LL/SC support (GCC and glibc for MIPS GNU/Linux rely
    on the instructions being supported in some way, but it may be kernel
    emulation; futex_atomic_cmpxchg_inatomic doesn't work with that
    emulation).  The same condition as in GCC for indicating LL/SC support
    may not be available is used for undefining the macros in glibc,
    __mips == 1 || defined _MIPS_ARCH_R5900.  (Maybe we could in fact
    desupport MIPS processors without the hardware support in glibc.)
    
    On SPARC, 32-bit kernels don't support futex_atomic_cmpxchg_inatomic;
    __arch64__ || __sparc_v9__ is used as the condition for binaries that
    won't run on 32-bit kernels.
    
    This patch is not tested beyond the sanity check of an x86_64 build.
        [BZ #9894]
        * sysdeps/unix/sysv/linux/kernel-features.h
        [__sparc__ && !__arch64__ && !__sparc_v9__]
        (__ASSUME_SET_ROBUST_LIST): Do not define.
        [__sparc__ && !__arch64__ && !__sparc_v9__]
        (__ASSUME_FUTEX_LOCK_PI): Likewise.
        [__sparc__ && !__arch64__ && !__sparc_v9__] (__ASSUME_REQUEUE_PI):
        Likewise.
        * sysdeps/unix/sysv/linux/arm/kernel-features.h
        (__ASSUME_FUTEX_LOCK_PI): Undefine.
        (__ASSUME_REQUEUE_PI): Likewise.
        (__ASSUME_SET_ROBUST_LIST): Likewise.
        * sysdeps/unix/sysv/linux/m68k/kernel-features.h
        [__LINUX_KERNEL_VERSION < 0x030a00] (__ASSUME_FUTEX_LOCK_PI):
        Undefine.
        * sysdeps/unix/sysv/linux/microblaze/kernel-features.h
        [__LINUX_KERNEL_VERSION < 0x020621] (__ASSUME_FUTEX_LOCK_PI):
        Likewise.
        [__LINUX_KERNEL_VERSION < 0x020621] (__ASSUME_REQUEUE_PI):
        Likewise.
        [__LINUX_KERNEL_VERSION < 0x020621] (__ASSUME_SET_ROBUST_LIST):
        Likewise.
        * sysdeps/unix/sysv/linux/mips/kernel-features.h
        [__mips == 1 || _MIPS_ARCH_R5900] (__ASSUME_FUTEX_LOCK_PI):
        Undefine.
        [__mips == 1 || _MIPS_ARCH_R5900] (__ASSUME_REQUEUE_PI): Likewise.
        [__mips == 1 || _MIPS_ARCH_R5900] (__ASSUME_SET_ROBUST_LIST):
        Likewise.

:100644 100644 f33dd23ea0b30c71c4b96993b011a0ac02f174cf ee4ede675ac9f30aeda61917312114242f174c9e M      ChangeLog
:040000 040000 08bd195d28b8e094591eafc08616c99a707eef25 030ca0573138769c7de36870b31c431a925cc7c3 M      sysdeps

Comment 2 Marc Kleine-Budde 2015-06-13 22:09:38 UTC

Just checked, the problem exists on current glibc master (3164bf09f529 Fix syslog fputs_unlocked namespace (bug 18530).), too.

Comment 3 jsm-csl@polyomino.org.uk 2015-06-14 14:07:38 UTC

Since pthread_cond_* are being rewritten, it might be worth seeing if 
Torvald's new implementation 
<https://sourceware.org/ml/libc-alpha/2015-02/msg00578.html> helps.

Comment 4 Marc Kleine-Budde 2015-06-14 15:30:43 UTC

I dug a bit deeper:

Due to commit:

    03d41216fe09 arm: Re-enable PI futex support for ARM kernels >= 3.14.3

(a.k.a. glibc-2.21~456) and compiling glibc with --enable-kernel=3.15 it PI futex support is working again.

This means PI futex support is deliberately broken^wdisabled for _all_ ARMs on glibc-2.20.

Marc

Comment 5 Marc Kleine-Budde 2015-06-14 16:31:52 UTC

(In reply to joseph@codesourcery.com from comment #3)
> Since pthread_cond_* are being rewritten, it might be worth seeing if 
> Torvald's new implementation 
> <https://sourceware.org/ml/libc-alpha/2015-02/msg00578.html> helps.

With Torvald Riegel's patch ported to current master, the test case works for both pre and post 3.14.3 kernel versions supplied to --enable-kernel=. I've tested --enable-kernel=3.0 and --enable-kernel=3.15.

This makes sense, as AFAICS all the __ASSUME defines in question[1] have been removed from the pthread_cond_ code.

Marc

[1] __ASSUME_FUTEX_LOCK_PI, __ASSUME_REQUEUE_PI, __ASSUME_SET_ROBUST_LIST

Comment 6 Marc Kleine-Budde 2015-06-20 16:12:24 UTC

We had another, deeper look at the glibc and kernel code.

For PI and robust mutexes there is runtime detection, but no
fallback code. But that's not possible without kernel support. Instead a
proper error code is returned:

For example PI mutex:

> int
> __pthread_mutex_init (mutex, mutexattr)
>      pthread_mutex_t *mutex;
>      const pthread_mutexattr_t *mutexattr;
> {
...
>     case PTHREAD_PRIO_INHERIT << PTHREAD_MUTEXATTR_PROTOCOL_SHIFT:
>       if (__glibc_unlikely (prio_inherit_missing ()))
>         return ENOTSUP;
>       break;

and prio_inherit_missing() is:

> static bool
> prio_inherit_missing (void)
> {
> #ifdef __NR_futex
> # ifndef __ASSUME_FUTEX_LOCK_PI
>   static int tpi_supported;
>   if (__glibc_unlikely (tpi_supported == 0))
>     {
>       int lock = 0;
>       INTERNAL_SYSCALL_DECL (err);
>       int ret = INTERNAL_SYSCALL (futex, err, 4, &lock, FUTEX_UNLOCK_PI, 0, 0);

This is the runtime detection, if the syscall return ENOSYS, then the kernel
has no support for PI mutexes.

>       assert (INTERNAL_SYSCALL_ERROR_P (ret, err));
>       tpi_supported = INTERNAL_SYSCALL_ERRNO (ret, err) == ENOSYS ? -1 : 1;
>     }
>   return __glibc_unlikely (tpi_supported < 0);
> # endif
>   return false;
> #endif
>   return true;
> }

So far so good. We have runtime detection for FUTEX_LOCK_PI and
SET_ROBUST_LIST (implemented likewise), but the problem is REQUEUE_PI.

The way PI mutexes are implemented you _have_ to use REQUEUE_PI, quoting
the Kernel's ./Documentation/futex-requeue-pi.txt:

> In order to ensure the rt_mutex has an owner if it has waiters, it
> is necessary for both the requeue code, as well as the waiting code,
> to be able to acquire the rt_mutex before returning to user space.
> The requeue code cannot simply wake the waiter and leave it to
> acquire the rt_mutex as it would open a race window between the
                                          ^^^^^^^^^^^
> requeue call returning to user space and the waiter waking and
> starting to run.  This is especially true in the uncontended case.
> 
> The solution involves two new rt_mutex helper routines,
> rt_mutex_start_proxy_lock() and rt_mutex_finish_proxy_lock(), which
> allow the requeue code to acquire an uncontended rt_mutex on behalf
> of the waiter and to enqueue the waiter on a contended rt_mutex.
> Two new system calls provide the kernel<->user interface to
> requeue_pi: FUTEX_WAIT_REQUEUE_PI and FUTEX_CMP_REQUEUE_PI.
              ^^^^^^^^^^^^^^^^^^^^^
> 
> FUTEX_WAIT_REQUEUE_PI is called by the waiter (pthread_cond_wait()
  ^^^^^^^^^^^^^^^^^^^^^
> and pthread_cond_timedwait()) to block on the initial futex and wait
> to be requeued to a PI-aware futex.  The implementation is the
> result of a high-speed collision between futex_wait() and
> futex_lock_pi(), with some extra logic to check for the additional
> wake-up scenarios.

Looking at pthread_cond_wait:

> int
> __pthread_cond_wait (pthread_cond_t *cond, pthread_mutex_t *mutex)
> {
[...]
> #if (defined lll_futex_wait_requeue_pi \
>      && defined __ASSUME_REQUEUE_PI)
                  ^^^^^^^^^^^^^^^^^^^
>       /* If pi_flag remained 1 then it means that we had the lock and the mutex                                                                                                                                    
>          but a spurious waker raced ahead of us.  Give back the mutex before                                                                                                                                       
>          going into wait again.  */
>       if (pi_flag)
>         {
>           __pthread_mutex_cond_lock_adjust (mutex);
>           __pthread_mutex_unlock_usercnt (mutex, 0);
>         }
>       pi_flag = USE_REQUEUE_PI (mutex);
> 
>       if (pi_flag)
>         {
>           err = lll_futex_wait_requeue_pi (&cond->__data.__futex,
>                                            futex_val, &mutex->__data.__lock,
>                                            pshared);

Here's the requeue_pi syscall, but no runtime check for it. See
__ASSUME_REQUEUE_PI above. If __ASSUME_REQUEUE_PI is not present this
code is not compiled in, resulting in the breakage we're seeing.

> 
>           pi_flag = (err == 0);
>         }
>       else
> #endif
>           /* Wait until woken by signal or broadcast.  */
>         lll_futex_wait (&cond->__data.__futex, futex_val, pshared);


Looking a the kernel's kernel/futex.c:

> long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout,
>                 u32 __user *uaddr2, u32 val2, u32 val3)
> {
...
>         switch (cmd) {
>         case FUTEX_LOCK_PI:
>         case FUTEX_UNLOCK_PI:
>         case FUTEX_TRYLOCK_PI:
>         case FUTEX_WAIT_REQUEUE_PI:
>         case FUTEX_CMP_REQUEUE_PI:
>                 if (!futex_cmpxchg_enabled)
>                         return -ENOSYS;
>         }

This is the corresponding code for the runtime checks in the glibc. The
runtime check for:

>         case FUTEX_WAIT_REQUEUE_PI:
>         case FUTEX_CMP_REQUEUE_PI:

has been added in v3.4-rc1~195^2~1. The glibc however only uses
lll_futex_wait_requeue_pi on PI mutexes, which are runtime checked
during pthread_mutex_init. So from my point of view the correct fix in
the glibc is to get rid of __ASSUME_REQUEUE_PI completely - or - for ARM
remove the undef __ASSUME_REQUEUE_PI.


=======================================

diff --git a/sysdeps/unix/sysv/linux/arm/kernel-features.h
b/sysdeps/unix/sysv/linux/arm/kernel-features.h
index cb407dbc300c..65046b16974b 100644

--- a/sysdeps/unix/sysv/linux/arm/kernel-features.h

+++ b/sysdeps/unix/sysv/linux/arm/kernel-features.h

@@ -39,6 +39,6 @@

    configuration.  */ 
 #if __LINUX_KERNEL_VERSION < 0x030E03 
 # undef __ASSUME_FUTEX_LOCK_PI 
-# undef __ASSUME_REQUEUE_PI 
+//# undef __ASSUME_REQUEUE_PI 
 # undef __ASSUME_SET_ROBUST_LIST 
 #endif 

Marc

Comment 7 Marc Kleine-Budde 2015-06-26 16:17:05 UTC

Created attachment 8400 [details]
Proposed fix for ARM

Comment 8 Marc Kleine-Budde 2015-07-17 10:17:04 UTC

Created attachment 8446 [details]
signature.asc

On 06/14/2015 04:07 PM, joseph at codesourcery dot com wrote:
> https://sourceware.org/bugzilla/show_bug.cgi?id=18463
> 
> --- Comment #3 from joseph at codesourcery dot com <joseph at codesourcery dot com> ---
> Since pthread_cond_* are being rewritten, it might be worth seeing if 
> Torvald's new implementation 
> <https://sourceware.org/ml/libc-alpha/2015-02/msg00578.html> helps.

I've created a detailed analysis of the problem, created a patch and
attached it to the bugreport. Can someone have a look at the issue.

Marc

Comment 9 Adhemerval Zanella 2016-05-13 17:44:06 UTC

I think changing the behavior only for ARM is not the correct fix, but rather remove the __ASSUME_REQUEUE_PI as you suggested.

Now that minimum kernel version support is 3.2 we can assume that all PI futuxes operations as covered with cmpxchg enabled check (29bfcea07745737f385b0d092e71527051c29029 [1]), so calling futexes with FUTEX_WAIT_REQUEUE_PI or FUTEX_CMP_REQUEUE_PI will result in ENOSYS if cmpxchg is not enabled.

Now for older glibc releases we can assume this kernel behavior, since for GLIBC from 2.20 to 2.24 the minimum kernel version is 2.6.32 and this version does not have this fix.

I will work on this.

[1] git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git

Comment 10 Adhemerval Zanella 2016-05-13 17:46:55 UTC

In fact from GLIBC 2.20 to GLIBC 2.24 we can check if the enabled kernel is higher than 3.2 * and thus enable __ASSUME_REQUEUE_PI for all ports.

* I am assuming 3.2 because although 3.4 is first release where the fix has landed, 3.2 is LTS version.

Comment 11 Adhemerval Zanella 2016-05-13 18:24:05 UTC

Scratch my previous comments, checking with other GLIBC mantainers we currently only supports the tagged version in Linus' tree, not the  LTS 3.X.Y or 4.X.Y releases.

So both the fix to just __ASSUME_REQUEUE_PI for all architectures or just for ARM are incorrect because 3.2 and 3.3 kernel does not have futex_atomic_cmpxchg_inatomic fix, resulting in wrong futexes kernel operations depending of kernel configuration.

Comment 12 Marc Kleine-Budde 2016-05-14 18:41:18 UTC

(In reply to Adhemerval Zanella from comment #9)
> Now that minimum kernel version support is 3.2 we can assume that all PI
> futuxes operations as covered with cmpxchg enabled check
> (29bfcea07745737f385b0d092e71527051c29029 [1]), so calling futexes with
> FUTEX_WAIT_REQUEUE_PI or FUTEX_CMP_REQUEUE_PI will result in ENOSYS if cmpxchg
> is not enabled.

A mutex with PTHREAD_PRIO_INHERIT setup will fail on kernel if futex_cmpxchg_enabled is false - as it calls FUTEX_UNLOCK_PI to test if the kernel supports it. So a FUTEX_WAIT_REQUEUE_PI or FUTEX_CMP_REQUEUE_PI will _not_ be used later, because the mutex cannot be initialed in the first place.

Comment 13 Adhemerval Zanella 2016-05-16 13:18:24 UTC

For __ASSUME_FUTEX_LOCK_PI, which will use FUTEX_{LOCK,UNLOCK}_PI, glibc can assume it will return -ENOSYS in all the cases since 'do_futex' will first check if 'futex_cmpxchg_enabled' is defined by the architecture. And as far I could check, the missing one that does not really support futex_atomic_cmpxchg_inatomic in 3.2 (arm, m68k, mips, sparc) will either return if it supports or use the default futex.h implementation that returns -ENOSYS.

But we can not assume the same thing for __ASSUME_REQUEUE_PI which will use the FUTEX_WAIT_REQUEUE_PI and FUTEX_CMP_REQUEUE_PI operation, since for linux v3.2 there is no runtime check in kernel/futex.c.  This is only done by 59263b513c11398cd66a52d4c5b2b118ce1e0359 which only appears on v3.4.

The possible cleanup I see is to remove __ASSUME_FUTEX_LOCK_PI, since kernel will return -ENOSYS for all supported arch if futex_atomic_cmpxchg_inatomic is not supported in the particular kernel configuration; and change the __ASSUME_REQUEUE_PI minimum kernel for arm to enable it for v3.4+.

Comment 14 Adhemerval Zanella 2017-01-10 18:17:28 UTC

Based on comment #5 and now that the new condvar implementation is upstream (ed19993b5b0d) I am closing this issue.

Comment 15 Torvald Riegel 2017-01-11 17:09:09 UTC

*** Bug 19941 has been marked as a duplicate of this bug. ***

Comment 16 Torvald Riegel 2017-01-11 17:23:00 UTC

The new condvar doesn't use futex-requeue anymore, so closing this is okay.

Adhemerval, it might still be good to fix the _ASSUME* problem that you described though.  I currently don't see a need for requeue elsewhere, but I wouldn't like to forget about this problem either.  If you agree, could you open a separate bug for it?

Comment 17 Adhemerval Zanella 2017-01-11 17:44:55 UTC

I am not sure if we really need to keep track of the __ASSUME_{REQUEUE_PI,FUTEX_LOCK_PI}.  The __ASSUME_FUTEX_LOCK_PI was just removed and __ASSUME_REQUEUE_PI is not really used anymore and I think we can just remove its definition for every architecture.  It is used only for USE_REQUEUE_PI in nptl/pthreadP.h and the macro is not use in any place.

Comment 18 Torvald Riegel 2017-01-11 18:06:20 UTC

(In reply to Adhemerval Zanella from comment #17)
> I am not sure if we really need to keep track of the
> __ASSUME_{REQUEUE_PI,FUTEX_LOCK_PI}.  The __ASSUME_FUTEX_LOCK_PI was just
> removed and __ASSUME_REQUEUE_PI is not really used anymore and I think we
> can just remove its definition for every architecture.  It is used only for
> USE_REQUEUE_PI in nptl/pthreadP.h and the macro is not use in any place.

You're right.  Just removing is the easiest option.