This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH][BZ #20973] Robust mutexes: Fix lost wake-up.
On 12/19/2016 02:47 PM, Florian Weimer wrote:
> On 12/16/2016 11:13 PM, Torvald Riegel wrote:
>> On Fri, 2016-12-16 at 15:11 +0100, Florian Weimer wrote:
>>> On 12/15/2016 11:29 PM, Torvald Riegel wrote:
>>>> diff --git a/nptl/pthread_mutex_lock.c b/nptl/pthread_mutex_lock.c
>>>> index bdfa529..01ac75e 100644
>>>> --- a/nptl/pthread_mutex_lock.c
>>>> +++ b/nptl/pthread_mutex_lock.c
>>>> @@ -182,6 +182,11 @@ __pthread_mutex_lock_full (pthread_mutex_t *mutex)
>>>>                     &mutex->__data.__list.__next);
>>>>
>>>>        oldval = mutex->__data.__lock;
>>>> +      /* This is set to FUTEX_WAITERS iff we might have shared the
>>>
>>> “iff” doesn't seem to be correct here because it's not an exact
>>> equivalence, “if” is sufficient.
>>
>> No, I think the iff is correct. We do only set it if we may have
>> shared the flag.
>
> Then please change it to “This is set to FUTEX_WAITERS iff we have
> shared” (i.e. drop the “might”). Based on the source code, I'm still
> not sure if this is an exact equivalence.
OK. I agree; I'm also not sure it's an exact equivalence. I'd probably
also say "This _will_ be set to..." since, technically, we share
FUTEX_WAITERS in the inner __lll_robust_lock_wait before we come back and
set assume_other_futex_waiters.
> The part which confuses me is the unconditional assignment
> assume_other_futex_waiters = FUTEX_WAITERS further below.
It _must_ be assigned unconditionally exactly because of the problem
outlined in this issue (and expanded in my review). Because the value
of FUTEX_WAITERS is shared, and because of the recovery semantics of
the mutex, there is a path through __lll_robust_lock_wait which needs
to avoid the loss of FUTEX_WAITERS (and the required wakeup which happens
on unlock).
Because the unlock and wakeup by T0 clears the FUTEX_WAITERS flag, and
you don't know if other threads queued up after you, you must assume the
worst and do the wakeup also. I don't see a way to avoid the spurious wakeup.
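
To make the scenario concrete, here is a minimal single-threaded sketch of
the lock-word bookkeeping. It is not the glibc code: the model_* names, the
struct layouts, and the TID values are hypothetical, and the owner-died
recovery path is left out entirely. Only the value of the FUTEX_WAITERS bit
(0x80000000, as in <linux/futex.h>) is taken from the real interface. The
sketch assumes unlock issues a wake-up only when the bit was set in the old
lock value, and it compares a woken thread that re-acquires with and without
the bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same value as the kernel's FUTEX_WAITERS bit in <linux/futex.h>.  */
#define SKETCH_FUTEX_WAITERS 0x80000000u

/* Hypothetical model: the lock word holds the owner TID plus the waiters
   bit; sleepers counts threads currently blocked in futex_wait.  */
struct model { uint32_t lock; int sleepers; };
struct outcome { int still_asleep; bool last_unlock_woke; };

/* A thread finds the lock held, sets the waiters bit, and blocks.  */
static void model_block (struct model *m)
{
  assert (m->lock != 0);
  m->lock |= SKETCH_FUTEX_WAITERS;
  m->sleepers++;
}

/* Unlock clears the whole word.  A wake-up is issued only if the waiters
   bit was set in the old value.  Returns whether a wake-up was issued.  */
static bool model_unlock (struct model *m)
{
  uint32_t old = m->lock;
  m->lock = 0;
  bool wake_issued = (old & SKETCH_FUTEX_WAITERS) != 0;
  if (wake_issued && m->sleepers > 0)
    m->sleepers--;
  return wake_issued;
}

/* A woken thread takes the free lock, ORing `assume` into its ID.  */
static void model_acquire_after_wait (struct model *m, uint32_t tid,
                                      uint32_t assume)
{
  assert (m->lock == 0);
  m->lock = tid | assume;
}

/* T0 holds the lock, `waiters` threads block, T0 unlocks and wakes one of
   them, and that thread acquires the lock and unlocks in turn.  */
static struct outcome model_run (int waiters, uint32_t assume)
{
  assert (waiters >= 1);
  struct model m = { .lock = 1, .sleepers = 0 };  /* T0 has TID 1.  */
  for (int i = 0; i < waiters; i++)
    model_block (&m);
  (void) model_unlock (&m);                 /* T0 unlocks, wakes one.  */
  model_acquire_after_wait (&m, 2, assume); /* Woken thread, TID 2.  */
  bool woke = model_unlock (&m);
  return (struct outcome) { m.sleepers, woke };
}
```

In this model, model_run (2, 0) leaves one thread asleep with no wake-up
pending, which is the lost wake-up. model_run (2, SKETCH_FUTEX_WAITERS)
wakes it. model_run (1, SKETCH_FUTEX_WAITERS) issues a wake-up with nobody
left to receive it, which is the spurious wakeup mentioned above.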
> But I think lll_robust_lock returns 0 if we did not share
> FUTEX_WAITERS, and the code never retries with the assigned
> assume_other_futex_waiters value, ensuring the equivalence. I think
> it would be clearer if you switched from a do-while loop to a loop
> with an exit condition in the middle, right after the call to
> lll_robust_lock.
Yes, this is a case where assume_other_futex_waiters is zero and we have
potentially shared FUTEX_WAITERS via the inner __lll_robust_lock_wait code,
so it would appear this example contradicts using "iff".
I don't know about restructuring this loop. I'd have to see a concrete
example of the cleanup, and I think I'd like to avoid such a cleanup to
keep this patch minimal for now. Future cleanup could happen in a lot
of ways.
> Putting the FUTEX_WAITERS into the ID passed to lll_robust_lock is a
> violation of its precondition documented in
> sysdeps/nptl/lowlevellock.h, so please update the comment.
The precondition is a bit loose about this, but I agree the language
could be tightened up.
--
Cheers,
Carlos.