This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH 0/6][BZ #11588] pi-condvars: add priority inheritance for pthread_cond_* internal lock


On Mon, Aug 18, 2014 at 11:50:24PM -0400, Rich Felker wrote:
> Note that the kernel has a FUTEX_WAKE_OP command that lets you perform
> writes after acquiring the futex key and hash bucket, but there's no
> similar command for doing writes after getting ready to wait. What
> would be ideal would be a FUTEX_WAIT_OP command that:
> 
> 1. Acquires futex key and hash buckets for uaddr and uaddr2.
> 
> 2. Performs an atomic operation on uaddr2.
> 
> 3. Performs the comparison on the old value from uaddr2 and possibly
>    performs a futex wake on the already-acquired uaddr2 futex.
> 
> 4. Waits on the already-acquired uaddr futex, but without
>    restartability of the syscall.
> 
> This is basically a full WAIT analogue for the current FUTEX_WAKE_OP
> command. It would allow the cond var implementation to unlock the
> mutex atomically with waiting on the cond var futex, and would make it
> possible to achieve signal and broadcast with trivial use of
> FUTEX_WAKE. Most importantly, it would solve both the sequence number
> issue AND the self-synchronized destruction issue (destroying or
> unmapping the cond var immediately after the last waiter is unblocked)
> since the associated implementation of process-shared cond vars would
> never access the cond var object at all (except to read the pshared
> flag and other attributes, before performing the operation); the
> entire wait operation takes place simply using the address as a futex
> key, without ever reading from or writing to it.
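For reference, here is a minimal sketch of how the existing
FUTEX_WAKE_OP command is invoked, since the proposed FUTEX_WAIT_OP is
modeled on it. The wrapper name wake_and_store and the choice of
operation (store a new value, wake on uaddr2 if the old value was 1)
are mine, purely for illustration; FUTEX_WAIT_OP itself does not exist
and is not shown.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical wrapper: atomically store newval into *uaddr2, wake up
   to one waiter on uaddr, and wake up to one waiter on uaddr2 if the
   old value of *uaddr2 was 1.  Returns the number of waiters woken,
   or -1 with errno set.  newval must fit in 12 bits. */
static long wake_and_store(uint32_t *uaddr, uint32_t *uaddr2, int newval)
{
    int op = FUTEX_OP(FUTEX_OP_SET, newval, FUTEX_OP_CMP_EQ, 1);
    /* The fourth argument carries the wake count for uaddr2. */
    return syscall(SYS_futex, uaddr, FUTEX_WAKE_OP_PRIVATE,
                   1, (unsigned long)1, uaddr2, op);
}
```

With no waiters on either address, the call returns 0 and the only
visible effect is the store to *uaddr2.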

Note that the above implementation is already semi-possible without
help from the kernel if you have a helper thread to requeue you. The
way it goes is basically:

1. Send command to helper thread asking to wait on cond var.

2. Wait on a futex on your own stack with initial value zero.

3. Helper thread writes a 1 to your futex and requeues you to the
   actual cond var futex.

4. Helper thread unlocks the mutex (using knowledge of implementation
   internals to unlock a mutex owned by a different thread) and writes
   a 2 to your futex.

5. After waking, you check the value of your futex and wait if it's
   not yet 2. This is so that you don't pull your stack out from under
   the helper thread that's requeueing from it.
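The state changes on the stack futex in steps 2 through 5 can be
sketched roughly as follows. This is only an illustration under my own
assumptions: the function names are made up, the requeue and the
cross-thread mutex unlock are elided as comments, and only the
0 -> 1 -> 2 handshake is shown.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

static long futex(uint32_t *uaddr, int op, uint32_t val)
{
    return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

/* Helper side, steps 3 and 4 (hypothetical name). */
static void helper_release(uint32_t *word)
{
    __atomic_store_n(word, 1, __ATOMIC_RELEASE);
    /* ... requeue the waiter from word to the cond var futex ... */
    /* ... unlock the mutex on the waiter's behalf ... */
    __atomic_store_n(word, 2, __ATOMIC_RELEASE);
    /* Wake the waiter in case it is already blocked in step 5. */
    futex(word, FUTEX_WAKE_PRIVATE, 1);
}

/* Waiter side, step 5 (hypothetical name): do not return, and so do
   not release the stack slot, until the helper has written 2. */
static uint32_t wait_until_released(uint32_t *word)
{
    uint32_t v;
    while ((v = __atomic_load_n(word, __ATOMIC_ACQUIRE)) != 2)
        futex(word, FUTEX_WAIT_PRIVATE, v); /* EAGAIN/EINTR: re-check */
    return v;
}
```

In a real implementation the two functions run in different threads,
with word living on the waiter's stack.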

The magic is that even after requeue, if you're interrupted by a
signal, upon restarting the syscall you'll return to waiting on the
original address (on your stack) where the value has changed, and thus
error out immediately with EAGAIN.
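The EAGAIN half of that is easy to see in isolation. The sketch below
does not reproduce the signal-restart path; it only shows that a
FUTEX_WAIT expecting 0 on a word that already holds 1 fails
immediately. The function names are mine.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* glibc exports no futex() function, so go through syscall(2). */
static long futex_wait(uint32_t *uaddr, uint32_t expected)
{
    return syscall(SYS_futex, uaddr, FUTEX_WAIT_PRIVATE, expected,
                   NULL, NULL, 0);
}

/* Returns the errno from a wait that expects 0 on a word the helper
   has already changed to 1, or 0 if the wait did not fail. */
static int wait_on_changed_word(void)
{
    uint32_t word = 1;
    if (futex_wait(&word, 0) == -1)
        return errno;
    return 0;
}
```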

Some extra work I've hand-waved away is required for points 4/5 unless
you want to spin waiting for it to change, but it's easy to do with
futex wait/wakes since you control the lifetime of the futex object
(on your stack). What's uglier is step 3: requeue will fail if you
haven't yet waited on the futex, and then the helper has to retry the
requeue until it succeeds (spinning). I see no way to make this clean;
the closest thing to a solution is exponential-backoff sleeps. Solving
it right requires basically the exact thing we're trying to emulate: a
futex operation that modifies and wakes one address while waiting on
another.
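A rough sketch of the exponential-backoff retry for step 3, assuming
FUTEX_CMP_REQUEUE is used so that the return value counts requeued
waiters. The function name, the starting delay, the 1 ms cap, and the
max_tries limit are arbitrary choices of mine, not anything tuned.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Try to move one waiter from 'from' to 'to' without waking it.
   'expected' is the value *from must currently hold.
   Returns 1 if a waiter was requeued, 0 if no waiter showed up within
   max_tries attempts, -1 on syscall error (errno set). */
static int requeue_with_backoff(uint32_t *from, uint32_t *to,
                                uint32_t expected, int max_tries)
{
    struct timespec delay = { 0, 1000 };      /* start at 1 us */
    for (int i = 0; i < max_tries; i++) {
        /* Wake 0 waiters, requeue at most 1. */
        long n = syscall(SYS_futex, from, FUTEX_CMP_REQUEUE_PRIVATE,
                         0, (unsigned long)1, to, expected);
        if (n > 0)
            return 1;
        if (n < 0)
            return -1;
        /* Nobody is waiting yet: sleep, then double the delay. */
        nanosleep(&delay, NULL);
        if (delay.tv_nsec < 1000000)          /* cap around 1 ms */
            delay.tv_nsec *= 2;
    }
    return 0;
}
```

This still amounts to polling; it only bounds how much CPU the helper
burns while the waiter has not yet reached its futex wait.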

Rich

