This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [RFC] mutex destruction (#13690): problem description and workarounds


On 12/01/2014 10:44 AM, Rich Felker wrote:
> On Wed, Nov 26, 2014 at 02:08:04PM -0500, Carlos O'Donell wrote:
>>> === Workaround 2: New FUTEX_UNLOCK operation that makes resetting the
>>> futex var and unblocking other threads atomic wrt. other FUTEX_WAIT ops
>>>
>>> This is like UNLOCK_PI, except for not doing PI.  Variations of this new
>>> futex op could store user-supplied values, or do a compare-and-set (or
>>> similar) read-modify-write operation.  FUTEX_WAKE_OP could be used as
>>> well, but we don't need to wake another futex (unnecessary overhead).
>>> (I haven't checked the kernel's FUTEX_WAKE_OP implementation, and there
>>> might be reasons why it can't be used as is, e.g., due to how things are
>>> locked regarding the second futex.)
>>>
>>> The biggest performance drawback that I see is a potential increase in
>>> the latency of unlocking any thread (blocked *or* spinning) when any
>>> thread is blocked.  This is because we'll have to ask the kernel to
>>> reset the futex var (instead of, like now, userspace doing it), which
>>> means that we'll have to enter the kernel first before a spinning thread
>>> can get the okay to acquire the lock.  This could decrease lock
>>> scalability for short critical sections in particular because those
>>> effectively get longer.
>>> I don't think it's sufficient to merely count the number of blocked
>>> waiters in the futex var to get around pending FUTEX_WAKE calls.
>>> If there is *any* potentially blocked waiter, we'll have to use the
>>> kernel to reset the futex var.
>>> Perhaps this could be mitigated if we did a lot more spinning in the
>>> futex-based code, so that it's unlikely to slow down spinning waiters
>>> just because there's some blocked thread.  For blocked threads, the
>>> slowdown should be less because if a waiter is going to block, there's
>>> just a small time window where the FUTEX_WAIT will actually fail
>>> (EWOULDBLOCK) due to the futex var changing concurrently.
>>>
>>> Pros:
>>> * Correct futex uses will need no changes.
>>> Cons:
>>> * glibc implementation will have to change (mutexes, semaphores,
>>> barriers, perhaps condvars).
>>> * Needs a new futex operation (or might use FUTEX_WAKE_OP with some
>>> performance penalty).
>>> * Potential decrease in lock scalability unless it can be mitigated by
>>> more aggressive spinning.
>>
>> This is a "strict semantic" workaround, and the kernel does the work.
>>
>> I don't like this solution. The performance impact is not worth it given
>> the other workarounds. Without any performance measurements I don't know
>> exactly how bad it would be, but entering the kernel is costly enough
>> that we don't want to consider it.
>> [...]
>>> === Summary / RFC
>>>
>>> IMO, workarounds 1, 1a, or 2 might be the best ones, although 4 might
>>> also be good.
>>
>> I vote 2. I don't want to relax the semantics of the original operations
>> to support spurious wakes. Such spurious wakes might be fixed in future
>> kernels and maybe other applications can start depending on them even if
>> glibc doesn't.
> 
> You said you don't like 2, but voted for 2? I'm confused.

That was a mistake.

I did not notice Torvald numbered it 1a instead of 2.

I vote 1a :}

Cheers,
Carlos.

