This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: Futex error handling
- From: Rich Felker <dalias at libc dot org>
- To: Torvald Riegel <triegel at redhat dot com>
- Cc: GLIBC Devel <libc-alpha at sourceware dot org>, Darren Hart <dvhart at infradead dot org>
- Date: Thu, 18 Sep 2014 13:39:53 -0400
- Subject: Re: Futex error handling
- Authentication-results: sourceware.org; auth=none
- References: <1410881785 dot 4967 dot 292 dot camel at triegel dot csb> <20140916165607 dot GZ23797 at brightrain dot aerifal dot cx> <1410891158 dot 4967 dot 303 dot camel at triegel dot csb> <20140916185457 dot GA23797 at brightrain dot aerifal dot cx> <1411043195 dot 27838 dot 32 dot camel at triegel dot csb>
On Thu, Sep 18, 2014 at 02:26:35PM +0200, Torvald Riegel wrote:
> > The EFAULT case with
> > FUTEX_WAKE, and which I claim FUTEX_WAKE_OP avoids, is when the atomic
> > operation on the futex int that's associated with the wake allows
> > another thread to synchronize and determine that it may legally
> > destroy the object before the actual wake is sent. FUTEX_WAKE_OP can
> > fully avoid this by performing the atomic operation after looking up
> > and locking the futex hash bucket, so that there's no further access
> > after the atomic and thus no opportunity for fault.
>
> Agreed; that's like what UNLOCK_PI does. However, and that's something
> I've only thought about recently, it would be good to know which
> guarantees the kernel gives in this case; in particular, what happens
> (and which error code results) if there is destruction and potential
> unmapping etc. of the futex variable concurrently with WAKE_OP or
> UNLOCK_PI being in flight.
I've RTFS'd and my understanding is that no such problems are
possible. The futex hashing (note: there are two futex address
arguments and both are hashed, even if they're equal; this should be
optimized on the kernel side to make FUTEX_WAKE_OP practical) and
locking of the resulting hash buckets both happen before the atomic
operation is performed. After the atomic operation, the bucket is
walked and matching waiters are woken.
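
For illustration, here is a minimal sketch of how userspace might issue
an unlock as a single FUTEX_WAKE_OP call, so that the store to the futex
word happens inside the kernel rather than before a separate FUTEX_WAKE.
The helper name unlock_and_wake and the lock-word convention (0 =
unlocked, positive = locked) are assumptions made for this example; they
are not taken from glibc or from this thread.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: set *word to 0 and wake at most one waiter on it,
 * in one FUTEX_WAKE_OP call. The same address is passed as both futex
 * arguments, which is the case discussed above. */
static long unlock_and_wake(uint32_t *word)
{
    assert(word != NULL);

    /* Encoded operation: *uaddr2 = 0. The comparison (oldval < 0) is
     * chosen to be false for the non-negative lock values assumed here,
     * so the conditional second wake on uaddr2 is not requested. */
    int op = FUTEX_OP(FUTEX_OP_SET, 0, FUTEX_OP_CMP_LT, 0);

    /* Arguments: uaddr, futex op, nr to wake on uaddr, nr to wake on
     * uaddr2 (passed in the timeout slot), uaddr2, encoded operation.
     * Returns the number of waiters woken, or -1 with errno set. */
    return syscall(SYS_futex, word, FUTEX_WAKE_OP | FUTEX_PRIVATE_FLAG,
                   1, (unsigned long)0, word, op);
}
```

A caller would use unlock_and_wake(&lock_word) in place of an atomic
store followed by FUTEX_WAKE. Whether this is worthwhile in practice
depends on the double-hashing cost noted above.
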
In theory it's possible that, as soon as the atomic operation is
performed, the backing (file/anon/whatever) is destroyed and its
underlying id (e.g. inode number) is reused, so that by the time the
bucket is walked, the id recorded for the original futex address
refers to a different object. However, that's not a problem: a new
waiter can't arrive while the hash bucket is still locked, so no such
new waiter can be woken.
Rich