This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: futex(3) man page, final draft for pre-release review
- From: Torvald Riegel <triegel at redhat dot com>
- To: Darren Hart <dvhart at infradead dot org>
- Cc: "Michael Kerrisk (man-pages)" <mtk dot manpages at gmail dot com>, Thomas Gleixner <tglx at linutronix dot de>, lkml <linux-kernel at vger dot kernel dot org>, libc-alpha <libc-alpha at sourceware dot org>, linux-man <linux-man at vger dot kernel dot org>, "Carlos O'Donell" <carlos at redhat dot com>, Roland McGrath <roland at hack dot frob dot com>, Davidlohr Bueso <dave at stgolabs dot net>, Jakub Jelinek <jakub at redhat dot com>, Ingo Molnar <mingo at elte dot hu>, bill o gallmeister <bgallmeister at gmail dot com>, bert hubert <bert dot hubert at netherlabs dot nl>, Jan Kiszka <jan dot kiszka at siemens dot com>, Eric Dumazet <edumazet at google dot com>, Arnd Bergmann <arnd at arndb dot de>, Rusty Russell <rusty at rustcorp dot com dot au>, Heinrich Schuchardt <xypron dot glpk at gmx dot de>, Andy Lutomirski <luto at amacapital dot net>, Daniel Wagner <wagi at monom dot org>, Anton Blanchard <anton at samba dot org>, Steven Rostedt <rostedt at goodmis dot org>, Rich Felker <dalias at libc dot org>, Jonathan Wakely <jwakely at redhat dot com>, Mike Frysinger <vapier at gentoo dot org>
- Date: Fri, 18 Dec 2015 12:21:54 +0100
- Subject: Re: futex(3) man page, final draft for pre-release review
- References: <56701916 dot 4090203 at gmail dot com> <20151215211816 dot GR11972 at malice dot jf dot intel dot com>
On Tue, 2015-12-15 at 13:18 -0800, Darren Hart wrote:
> On Tue, Dec 15, 2015 at 02:43:50PM +0100, Michael Kerrisk (man-pages) wrote:
> >
> > When executing a futex operation that requests to block a thread,
> > the kernel will block only if the futex word has the value that
> > the calling thread supplied (as one of the arguments of the
> > futex() call) as the expected value of the futex word. The
> > loading of the futex word's value, the comparison of that value
> > with the expected value, and the actual blocking will happen
> >
> > FIXME: for next line, it would be good to have an explanation of
> > "totally ordered" somewhere around here.
> >
> > atomically and totally ordered with respect to concurrently executing
>
> Totally ordered with respect to futex operations refers to semantics of the
> ACQUIRE/RELEASE operations and how they impact ordering of memory reads and
> writes. The kernel futex operations are protected by spinlocks, which ensure
> that all operations are serialized with respect to one another.
>
> This is a lot to attempt to define in this document. Perhaps a reference to
> linux/Documentation/memory-barriers.txt as a footnote would be sufficient? Or
> perhaps for this manual, "serialized" would be sufficient, with a footnote
> regarding "totally ordered" and a pointer to the memory-barrier documentation?
I'd strongly prefer to document the semantics for users here. And I
don't think users use the kernel's memory model -- instead, if we assume
that most users will call futex ops from C or C++, then the best we have
is the C11 / C++11 memory model. Therefore, if we want to expand that,
we should specify semantics in terms of as-if equivalence to C11 pseudo
code. I had proposed that in the past but, IIRC, Michael didn't want to
add a C11 "dependency" in the semantics back then, at least for the
initial release.
Here's what I wrote back then (atomic_*_relaxed() is like C11
atomic_*(..., memory_order_relaxed), lock/unlock have normal C11 mutex
semantics):
========================
For example, we could say that futex_wait is, in terms of
synchronization semantics, *as if* we'd execute a piece of C11 code.
Here's a part of the docs for a glibc-internal futex wrapper that I'm
working on; this is futex_wait ... :
/* Atomically wrt other futex operations, this blocks iff the value at
   *FUTEX matches the expected value.  This is semantically equivalent to:
     l = <get lock associated with futex> (FUTEX);
     wait_flag = <get wait_flag associated with futex> (FUTEX);
     lock (l);
     val = atomic_load_relaxed (FUTEX);
     if (val != expected) { unlock (l); return EAGAIN; }
     atomic_store_relaxed (wait_flag, 1);
     unlock (l);
     // Now block; can time out in futex_time_wait (see below)
     while (atomic_load_relaxed (wait_flag));
   Note that no guarantee of a happens-before relation between a woken
   futex_wait and a futex_wake is documented; however, this does not matter
   in practice because we have to consider spurious wake-ups (see below),
   and thus would not be able to reason which futex_wake woke us anyway.
... and this is futex_wake:
/* Atomically wrt other futex operations, this unblocks the specified
   number of processes, or all processes blocked on this futex if there are
   fewer than the specified number.  Semantically, this is equivalent to:
     l = <get lock associated with futex> (futex);
     lock (l);
     for (res = 0; processes_to_wake > 0; processes_to_wake--, res++) {
       if (<no process blocked on futex>) break;
       wf = <get wait_flag of a process blocked on futex> (futex);
       // No happens-before guarantee with woken futex_wait (see above)
       atomic_store_relaxed (wf, 0);
     }
     return res;
This allows a programmer to really infer the guarantees he/she can get
from a futex in terms of synchronization, without the docs having to use
prose to describe that. This should also not constrain the kernel in
terms of how to implement it, because it is a conceptual as-if relation
(e.g., the kernel won't spin-wait the whole time, and we might want to
make this clear for the PI case).
Of course, there are several as-if representations we could use, and we
might want to be a bit more pseudo-code-ish to make this also easy to
understand for people not familiar with C11 (e.g., using mutex + condvar
with some relaxation of condvar guarantees).
=========================
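For readers who want to experiment with the as-if description quoted
above, here is a rough user-space sketch of it in compilable C11. It is
only a model under several assumptions and not how the kernel or glibc
implements futexes: the names model_futex, model_waiter,
model_futex_wait, model_futex_wake and model_demo are made up for
illustration, a pthread mutex stands in for "the lock associated with
the futex", the set of blocked threads is a simple linked list, the
spin loop calls sched_yield(), and an unlock before the return in the
wake path is added so that the code is usable.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical model types; nothing here is a kernel or glibc interface. */
struct model_waiter {
  atomic_int wait_flag;            /* 1 while the owning thread is blocked */
  struct model_waiter *next;
};

struct model_futex {
  atomic_int word;                 /* the futex word */
  pthread_mutex_t lock;            /* "lock associated with futex" */
  struct model_waiter *head;       /* threads currently blocked */
};

static void model_futex_init(struct model_futex *f, int initial) {
  atomic_init(&f->word, initial);
  pthread_mutex_init(&f->lock, NULL);
  f->head = NULL;
}

/* Blocks iff the word equals EXPECTED; returns EAGAIN otherwise. */
static int model_futex_wait(struct model_futex *f, int expected) {
  struct model_waiter self;
  atomic_init(&self.wait_flag, 0);
  pthread_mutex_lock(&f->lock);
  int val = atomic_load_explicit(&f->word, memory_order_relaxed);
  if (val != expected) { pthread_mutex_unlock(&f->lock); return EAGAIN; }
  atomic_store_explicit(&self.wait_flag, 1, memory_order_relaxed);
  self.next = f->head;             /* register as blocked, under the lock */
  f->head = &self;
  pthread_mutex_unlock(&f->lock);
  /* Conceptual blocking: spin until a waker clears our flag. */
  while (atomic_load_explicit(&self.wait_flag, memory_order_relaxed))
    sched_yield();
  return 0;
}

/* Unblocks up to PROCESSES_TO_WAKE waiters; returns how many were woken. */
static int model_futex_wake(struct model_futex *f, int processes_to_wake) {
  int res = 0;
  assert(processes_to_wake >= 0);
  pthread_mutex_lock(&f->lock);
  for (; processes_to_wake > 0; processes_to_wake--, res++) {
    struct model_waiter *w = f->head;
    if (w == NULL) break;
    f->head = w->next;             /* unlink before the waiter can return */
    /* Relaxed store: no happens-before with the woken waiter is implied. */
    atomic_store_explicit(&w->wait_flag, 0, memory_order_relaxed);
  }
  pthread_mutex_unlock(&f->lock);
  return res;
}

static void *demo_waiter(void *arg) {
  struct model_futex *f = arg;
  /* Callers re-check the word in a loop, as with real futexes. */
  while (atomic_load_explicit(&f->word, memory_order_acquire) == 0)
    model_futex_wait(f, 0);
  return NULL;
}

/* Starts one waiter, publishes word = 1, wakes it; returns the final word. */
static int model_demo(void) {
  struct model_futex f;
  pthread_t t;
  model_futex_init(&f, 0);
  if (pthread_create(&t, NULL, demo_waiter, &f) != 0) return -1;
  atomic_store_explicit(&f.word, 1, memory_order_release);
  model_futex_wake(&f, 1);
  pthread_join(t, NULL);
  pthread_mutex_destroy(&f.lock);
  return atomic_load_explicit(&f.word, memory_order_relaxed);
}
```

In this sketch the demo cannot lose the wake-up: either the waiter
registers itself before the waker takes the lock and is then unblocked,
or it takes the lock afterwards, observes the changed word and gets
EAGAIN. The synchronization that the waiter relies on comes from the
release store and acquire load on the word in the caller, not from the
relaxed accesses to wait_flag.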
I will go through the discussion pointed out by Davidlohr next.