Created attachment 8335 [details] test program that triggers this issue The attached program uses pthread_cond_broadcast() to wake up two threads waiting on a condition variable which is surrounded by a mutex with PTHREAD_PRIO_INHERIT set. The pthread_mutex_unlock() after pthread_cond_broadcast() returns EPERM in the main thread. This does not happen when using pthread_cond_signal(). Also it does not happen when using just one thread that is waiting or when PTHREAD_PRIO_INHERIT is not set. This issue is reproducable with the attached program on the following setups: - Kernel 4.0 with OSELAS Toolchain 2014.12.0 for cortex a8, glibc-2.20 - Kernel 4.0 with OSELAS Toolchain 2014.12.0 for armv7a without NEON, glibc-2.20 - Kernel 4.0.0-1-armmp with native Debian experimental toolchain for armhf, glibc-2.21, gcc-5 - Kernel 3.18.0-trunk-kirkwood native Debian experimental toolchain for armel, glibc-2.21, gcc-5 This issue does not exist on: - Kernel 3.16 with Debian toolchain for x86_64 , glibc-2.19, gcc-4.9.2 - Kernel 4.0.4 (vanilla, not debian) with Debian experimental toolchain for x86_64, glibc-2.21, gcc-5 - Kernel 4.0.0-1-armmp with native Debian testing toolchain for armhf, glibc-2.19, gcc-4.9.2 - Kernel 4.0.0-1-armmp crosscompile Debian unstable toolchain for armhf, glibc-2.19, gcc-4.9.2 - Kernel 3.18.0-trunk-kirkwood native Debian testing toolchain for armel, glibc-2.19, gcc-4.9.2 I found bug 17013 which seems related. But 17013 is marked as fixed in glibc-2.20 which is where the issue described above was introduced. We tried reverting 8f630cca5c36941db1cb48726016bbed80ec1041 Fix lll_unlock twice in pthread_cond_broadcast it does not solve the issue.
I bisected the problem to the following comit: 47c5adebd2c8 Correct robust mutex / PI futex kernel assumptions (bug 9894). Applying this HACK to glibc-2.20 works around to problem: index e755741de60b..a56290ce4ca4 100644 --- a/sysdeps/unix/sysv/linux/arm/kernel-features.h +++ b/sysdeps/unix/sysv/linux/arm/kernel-features.h @@ -37,6 +37,6 @@ /* The ARM kernel may or may not support futex_atomic_cmpxchg_inatomic, depending on kernel configuration. */ -#undef __ASSUME_FUTEX_LOCK_PI -#undef __ASSUME_REQUEUE_PI -#undef __ASSUME_SET_ROBUST_LIST +//#undef __ASSUME_FUTEX_LOCK_PI +//#undef __ASSUME_REQUEUE_PI +//#undef __ASSUME_SET_ROBUST_LIST The glibc-2.20 is build with "--enable-kernel=3.0" using linux-3.16 header generated by "make headers_install". Marc -------->8-------->8-------->8-------->8-------->8-------->8-------- 47c5adebd2c864a098c3af66e61e1147dc3cf0b4 is the first bad commit commit 47c5adebd2c864a098c3af66e61e1147dc3cf0b4 Author: Joseph Myers <joseph@codesourcery.com> Date: Mon Mar 31 12:55:18 2014 +0000 Correct robust mutex / PI futex kernel assumptions (bug 9894). This patch continues fixing __ASSUME_* issues in preparation for moving to a 2.6.32 minimum kernel version by addressing assumptions on robust mutex and PI futex support availability. Those assumptions are bug 9894, but to be clear this patch does not address all the issues from that bug about wrong version assumptions, only those still applicable for --enable-kernel=2.6.32 or later (with the expectation that the move to that minimum kernel will obsolete the other parts of the bug). The patch is independent of <https://sourceware.org/ml/libc-alpha/2014-03/msg00585.html>, my other pending-review patch preparing for the kernel version change; the two together complete all the changes I believe are needed in preparation regarding any macro in sysdeps/unix/sysv/linux/kernel-features.h that would be affected by such a change. (I have not checked the correctness of macros whose conditions are unaffected by such a change, or macros only defined in other kernel-features.h files.) As discussed in that bug, robust mutexes and PI futexes need futex_atomic_cmpxchg_inatomic to be implemented, in addition to certain syscalls needed for robust mutexes (and architecture-independent kernel pieces for all the features in question). That is, as I understand it, they need futex_atomic_cmpxchg_inatomic to *work* (not return an ENOSYS error). The issues identified in my analysis relate to ARM, M68K, MicroBlaze, MIPS and SPARC. On ARM, whether futex_atomic_cmpxchg_inatomic works depends on the kernel configuration. As of 3.13, the condition for *not* working is CONFIG_CPU_USE_DOMAINS && CONFIG_SMP. As of 2.6.32 it was simply CONFIG_SMP that meant the feature was not implemented. I don't know if there are any circumstances in which we can say "we can assume a userspace glibc binary built with these options will never run on a kernel with the problematic configuration", but at least for now I'm just undefining the relevant __ASSUME_* macros for ARM. On M68K, two of the three macros are undefined for kernels before 3.10, but as far as I can see __ASSUME_FUTEX_LOCK_PI is in the same group needing futex_atomic_cmpxchg_inatomic support and so should be undefined as well. On MicroBlaze the required support was added in 2.6.33. On MIPS, the support depends on cpu_has_llsc in the kernel - that is, actual hardware LL/SC support (GCC and glibc for MIPS GNU/Linux rely on the instructions being supported in some way, but it may be kernel emulation; futex_atomic_cmpxchg_inatomic doesn't work with that emulation). The same condition as in GCC for indicating LL/SC support may not be available is used for undefining the macros in glibc, __mips == 1 || defined _MIPS_ARCH_R5900. (Maybe we could in fact desupport MIPS processors without the hardware support in glibc.) On SPARC, 32-bit kernels don't support futex_atomic_cmpxchg_inatomic; __arch64__ || __sparc_v9__ is used as the condition for binaries that won't run on 32-bit kernels. This patch is not tested beyond the sanity check of an x86_64 build. [BZ #9894] * sysdeps/unix/sysv/linux/kernel-features.h [__sparc__ && !__arch64__ && !__sparc_v9__] (__ASSUME_SET_ROBUST_LIST): Do not define. [__sparc__ && !__arch64__ && !__sparc_v9__] (__ASSUME_FUTEX_LOCK_PI): Likewise. [__sparc__ && !__arch64__ && !__sparc_v9__] (__ASSUME_REQUEUE_PI): Likewise. * sysdeps/unix/sysv/linux/arm/kernel-features.h (__ASSUME_FUTEX_LOCK_PI): Undefine. (__ASSUME_REQUEUE_PI): Likewise. (__ASSUME_SET_ROBUST_LIST): Likewise. * sysdeps/unix/sysv/linux/m68k/kernel-features.h [__LINUX_KERNEL_VERSION < 0x030a00] (__ASSUME_FUTEX_LOCK_PI): Undefine. * sysdeps/unix/sysv/linux/microblaze/kernel-features.h [__LINUX_KERNEL_VERSION < 0x020621] (__ASSUME_FUTEX_LOCK_PI): Likewise. [__LINUX_KERNEL_VERSION < 0x020621] (__ASSUME_REQUEUE_PI): Likewise. [__LINUX_KERNEL_VERSION < 0x020621] (__ASSUME_SET_ROBUST_LIST): Likewise. * sysdeps/unix/sysv/linux/mips/kernel-features.h [__mips == 1 || _MIPS_ARCH_R5900] (__ASSUME_FUTEX_LOCK_PI): Undefine. [__mips == 1 || _MIPS_ARCH_R5900] (__ASSUME_REQUEUE_PI): Likewise. [__mips == 1 || _MIPS_ARCH_R5900] (__ASSUME_SET_ROBUST_LIST): Likewise. :100644 100644 f33dd23ea0b30c71c4b96993b011a0ac02f174cf ee4ede675ac9f30aeda61917312114242f174c9e M ChangeLog :040000 040000 08bd195d28b8e094591eafc08616c99a707eef25 030ca0573138769c7de36870b31c431a925cc7c3 M sysdeps
Just checked, the problem exists on current glibc master (3164bf09f529 Fix syslog fputs_unlocked namespace (bug 18530).), too.
Since pthread_cond_* are being rewritten, it might be worth seeing if Torvald's new implementation <https://sourceware.org/ml/libc-alpha/2015-02/msg00578.html> helps.
I dug a bit deeper: Due to commit: 03d41216fe09 arm: Re-enable PI futex support for ARM kernels >= 3.14.3 (a.k.a. glibc-2.21~456) and compiling glibc with --enable-kernel=3.15 it PI futex support is working again. This means PI futex support is deliberately broken^wdisabled for _all_ ARMs on glibc-2.20. Marc
(In reply to joseph@codesourcery.com from comment #3) > Since pthread_cond_* are being rewritten, it might be worth seeing if > Torvald's new implementation > <https://sourceware.org/ml/libc-alpha/2015-02/msg00578.html> helps. With Torvald Riegel's patch ported to current master, the test case works for both pre and post 3.14.3 kernel versions supplied to --enable-kernel=. I've tested --enable-kernel=3.0 and --enable-kernel=3.15. This makes sense, as AFAICS all the __ASSUME defines in question[1] have been removed from the pthread_cond_ code. Marc [1] __ASSUME_FUTEX_LOCK_PI, __ASSUME_REQUEUE_PI, __ASSUME_SET_ROBUST_LIST
We had another, deeper look at the glibc and kernel code. For PI and robust mutexes there is runtime detection, but no fallback code. But that's not possible without kernel support. Instead a proper error code is returned: For example PI mutex: > int > __pthread_mutex_init (mutex, mutexattr) > pthread_mutex_t *mutex; > const pthread_mutexattr_t *mutexattr; > { ... > case PTHREAD_PRIO_INHERIT << PTHREAD_MUTEXATTR_PROTOCOL_SHIFT: > if (__glibc_unlikely (prio_inherit_missing ())) > return ENOTSUP; > break; and prio_inherit_missing() is: > static bool > prio_inherit_missing (void) > { > #ifdef __NR_futex > # ifndef __ASSUME_FUTEX_LOCK_PI > static int tpi_supported; > if (__glibc_unlikely (tpi_supported == 0)) > { > int lock = 0; > INTERNAL_SYSCALL_DECL (err); > int ret = INTERNAL_SYSCALL (futex, err, 4, &lock, FUTEX_UNLOCK_PI, 0, 0); This is the runtime detection, if the syscall return ENOSYS, then the kernel has no support for PI mutexes. > assert (INTERNAL_SYSCALL_ERROR_P (ret, err)); > tpi_supported = INTERNAL_SYSCALL_ERRNO (ret, err) == ENOSYS ? -1 : 1; > } > return __glibc_unlikely (tpi_supported < 0); > # endif > return false; > #endif > return true; > } So far so good. We have runtime detection for FUTEX_LOCK_PI and SET_ROBUST_LIST (implemented likewise), but the problem is REQUEUE_PI. The way PI mutexes are implemented you _have_ to use REQUEUE_PI, quoting the Kernel's ./Documentation/futex-requeue-pi.txt: > In order to ensure the rt_mutex has an owner if it has waiters, it > is necessary for both the requeue code, as well as the waiting code, > to be able to acquire the rt_mutex before returning to user space. > The requeue code cannot simply wake the waiter and leave it to > acquire the rt_mutex as it would open a race window between the ^^^^^^^^^^^ > requeue call returning to user space and the waiter waking and > starting to run. This is especially true in the uncontended case. > > The solution involves two new rt_mutex helper routines, > rt_mutex_start_proxy_lock() and rt_mutex_finish_proxy_lock(), which > allow the requeue code to acquire an uncontended rt_mutex on behalf > of the waiter and to enqueue the waiter on a contended rt_mutex. > Two new system calls provide the kernel<->user interface to > requeue_pi: FUTEX_WAIT_REQUEUE_PI and FUTEX_CMP_REQUEUE_PI. ^^^^^^^^^^^^^^^^^^^^^ > > FUTEX_WAIT_REQUEUE_PI is called by the waiter (pthread_cond_wait() ^^^^^^^^^^^^^^^^^^^^^ > and pthread_cond_timedwait()) to block on the initial futex and wait > to be requeued to a PI-aware futex. The implementation is the > result of a high-speed collision between futex_wait() and > futex_lock_pi(), with some extra logic to check for the additional > wake-up scenarios. Looking at pthread_cond_wait: > int > __pthread_cond_wait (pthread_cond_t *cond, pthread_mutex_t *mutex) > { [...] > #if (defined lll_futex_wait_requeue_pi \ > && defined __ASSUME_REQUEUE_PI) ^^^^^^^^^^^^^^^^^^^ > /* If pi_flag remained 1 then it means that we had the lock and the mutex > but a spurious waker raced ahead of us. Give back the mutex before > going into wait again. */ > if (pi_flag) > { > __pthread_mutex_cond_lock_adjust (mutex); > __pthread_mutex_unlock_usercnt (mutex, 0); > } > pi_flag = USE_REQUEUE_PI (mutex); > > if (pi_flag) > { > err = lll_futex_wait_requeue_pi (&cond->__data.__futex, > futex_val, &mutex->__data.__lock, > pshared); Here's the requeue_pi syscall, but no runtime check for it. See __ASSUME_REQUEUE_PI above. If __ASSUME_REQUEUE_PI is not present this code is not compiled in, resulting in the breakage we're seeing. > > pi_flag = (err == 0); > } > else > #endif > /* Wait until woken by signal or broadcast. */ > lll_futex_wait (&cond->__data.__futex, futex_val, pshared); Looking a the kernel's kernel/futex.c: > long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout, > u32 __user *uaddr2, u32 val2, u32 val3) > { ... > switch (cmd) { > case FUTEX_LOCK_PI: > case FUTEX_UNLOCK_PI: > case FUTEX_TRYLOCK_PI: > case FUTEX_WAIT_REQUEUE_PI: > case FUTEX_CMP_REQUEUE_PI: > if (!futex_cmpxchg_enabled) > return -ENOSYS; > } This is the corresponding code for the runtime checks in the glibc. The runtime check for: > case FUTEX_WAIT_REQUEUE_PI: > case FUTEX_CMP_REQUEUE_PI: has been added in v3.4-rc1~195^2~1. The glibc however only uses lll_futex_wait_requeue_pi on PI mutexes, which are runtime checked during pthread_mutex_init. So from my point of view the correct fix in the glibc is to get rid of __ASSUME_REQUEUE_PI completely - or - for ARM remove the undef __ASSUME_REQUEUE_PI. ======================================= diff --git a/sysdeps/unix/sysv/linux/arm/kernel-features.h b/sysdeps/unix/sysv/linux/arm/kernel-features.h index cb407dbc300c..65046b16974b 100644 --- a/sysdeps/unix/sysv/linux/arm/kernel-features.h +++ b/sysdeps/unix/sysv/linux/arm/kernel-features.h @@ -39,6 +39,6 @@ configuration. */ #if __LINUX_KERNEL_VERSION < 0x030E03 # undef __ASSUME_FUTEX_LOCK_PI -# undef __ASSUME_REQUEUE_PI +//# undef __ASSUME_REQUEUE_PI # undef __ASSUME_SET_ROBUST_LIST #endif Marc
Created attachment 8400 [details] Proposed fix for ARM
Created attachment 8446 [details] signature.asc On 06/14/2015 04:07 PM, joseph at codesourcery dot com wrote: > https://sourceware.org/bugzilla/show_bug.cgi?id=18463 > > --- Comment #3 from joseph at codesourcery dot com <joseph at codesourcery dot com> --- > Since pthread_cond_* are being rewritten, it might be worth seeing if > Torvald's new implementation > <https://sourceware.org/ml/libc-alpha/2015-02/msg00578.html> helps. I've created a detailed analysis of the problem, created a patch and attached it to the bugreport. Can someone have a look at the issue. Marc
I think changing the behavior only for ARM is not the correct fix, but rather remove the __ASSUME_REQUEUE_PI as you suggested. Now that minimum kernel version support is 3.2 we can assume that all PI futuxes operations as covered with cmpxchg enabled check (29bfcea07745737f385b0d092e71527051c29029 [1]), so calling futexes with FUTEX_WAIT_REQUEUE_PI or FUTEX_CMP_REQUEUE_PI will result in ENOSYS if cmpxchg is not enabled. Now for older glibc releases we can assume this kernel behavior, since for GLIBC from 2.20 to 2.24 the minimum kernel version is 2.6.32 and this version does not have this fix. I will work on this. [1] git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
In fact from GLIBC 2.20 to GLIBC 2.24 we can check if the enabled kernel is higher than 3.2 * and thus enable __ASSUME_REQUEUE_PI for all ports. * I am assuming 3.2 because although 3.4 is first release where the fix has landed, 3.2 is LTS version.
Scratch my previous comments, checking with other GLIBC mantainers we currently only supports the tagged version in Linus' tree, not the LTS 3.X.Y or 4.X.Y releases. So both the fix to just __ASSUME_REQUEUE_PI for all architectures or just for ARM are incorrect because 3.2 and 3.3 kernel does not have futex_atomic_cmpxchg_inatomic fix, resulting in wrong futexes kernel operations depending of kernel configuration.
(In reply to Adhemerval Zanella from comment #9) > Now that minimum kernel version support is 3.2 we can assume that all PI > futuxes operations as covered with cmpxchg enabled check > (29bfcea07745737f385b0d092e71527051c29029 [1]), so calling futexes with > FUTEX_WAIT_REQUEUE_PI or FUTEX_CMP_REQUEUE_PI will result in ENOSYS if cmpxchg > is not enabled. A mutex with PTHREAD_PRIO_INHERIT setup will fail on kernel if futex_cmpxchg_enabled is false - as it calls FUTEX_UNLOCK_PI to test if the kernel supports it. So a FUTEX_WAIT_REQUEUE_PI or FUTEX_CMP_REQUEUE_PI will _not_ be used later, because the mutex cannot be initialed in the first place.
For __ASSUME_FUTEX_LOCK_PI, which will use FUTEX_{LOCK,UNLOCK}_PI, glibc can assume it will return -ENOSYS in all the cases since 'do_futex' will first check if 'futex_cmpxchg_enabled' is defined by the architecture. And as far I could check, the missing one that does not really support futex_atomic_cmpxchg_inatomic in 3.2 (arm, m68k, mips, sparc) will either return if it supports or use the default futex.h implementation that returns -ENOSYS. But we can not assume the same thing for __ASSUME_REQUEUE_PI which will use the FUTEX_WAIT_REQUEUE_PI and FUTEX_CMP_REQUEUE_PI operation, since for linux v3.2 there is no runtime check in kernel/futex.c. This is only done by 59263b513c11398cd66a52d4c5b2b118ce1e0359 which only appears on v3.4. The possible cleanup I see is to remove __ASSUME_FUTEX_LOCK_PI, since kernel will return -ENOSYS for all supported arch if futex_atomic_cmpxchg_inatomic is not supported in the particular kernel configuration; and change the __ASSUME_REQUEUE_PI minimum kernel for arm to enable it for v3.4+.
Based on comment #5 and now that the new condvar implementation is upstream (ed19993b5b0d) I am closing this issue.
*** Bug 19941 has been marked as a duplicate of this bug. ***
The new condvar doesn't use futex-requeue anymore, so closing this is okay. Adhemerval, it might still be good to fix the _ASSUME* problem that you described though. I currently don't see a need for requeue elsewhere, but I wouldn't like to forget about this problem either. If you agree, could you open a separate bug for it?
I am not sure if we really need to keep track of the __ASSUME_{REQUEUE_PI,FUTEX_LOCK_PI}. The __ASSUME_FUTEX_LOCK_PI was just removed and __ASSUME_REQUEUE_PI is not really used anymore and I think we can just remove its definition for every architecture. It is used only for USE_REQUEUE_PI in nptl/pthreadP.h and the macro is not use in any place.
(In reply to Adhemerval Zanella from comment #17) > I am not sure if we really need to keep track of the > __ASSUME_{REQUEUE_PI,FUTEX_LOCK_PI}. The __ASSUME_FUTEX_LOCK_PI was just > removed and __ASSUME_REQUEUE_PI is not really used anymore and I think we > can just remove its definition for every architecture. It is used only for > USE_REQUEUE_PI in nptl/pthreadP.h and the macro is not use in any place. You're right. Just removing is the easiest option.