This is the mail archive of the
libc-help@sourceware.org
mailing list for the glibc project.
Re: pthread_once on ARM SMP
- From: Jan Klötzke <jan at kloetzke dot net>
- To: "Carlos O'Donell" <carlos at systemhalted dot org>
- Cc: "libc-help at sourceware dot org" <libc-help at sourceware dot org>
- Date: Tue, 15 Oct 2013 20:49:44 +0200
- Subject: Re: pthread_once on ARM SMP
- Authentication-results: sourceware.org; auth=none
- References: <201310112059 dot 01618 dot jan at kloetzke dot net> <CAE2sS1i+t4rR1hPZ2Wqw01L=0xO_DRmzi_5vc7XbMJU6ZLFZQA at mail dot gmail dot com>
On Tuesday 15 October 2013, "Carlos O'Donell" <carlos@systemhalted.org> wrote:
> On Fri, Oct 11, 2013 at 2:59 PM, Jan Klötzke <jan@kloetzke.net> wrote:
> > Hi,
> >
> > While debugging something else I stumbled across the pthread_once
> > implementation of glibc. Looking at it I have the feeling that it is
> > actually not 100% safe on SMP. Quoting the relevant fast path code from
> > ports/sysdeps/unix/sysv/linux/arm/nptl/pthread_once.c:
> >
> > int
> > __pthread_once (pthread_once_t *once_control, void (*init_routine)
> > (void)) {
> >
> > for (;;)
> >
> > {
> >
> > int oldval;
> > int newval;
> > [...]
> > do
> >
> > {
> >
> > newval = __fork_generation | 1;
> > oldval = *once_control;
> > if (oldval & 2)
> >
> > break;
> >
> > } while (atomic_compare_and_exchange_val_acq (once_control,
> > newval,
> >
> > oldval) != oldval);
> >
> > /* Check if the initializer has already been done. */
> > if ((oldval & 2) != 0)
> >
> > return 0;
> >
> > [...]
> >
> > }
> >
> > [...]
> > init_routine ();
> > pthread_cleanup_pop (0);
> > *once_control = __fork_generation | 2;
> > [...]
> >
> > }
> >
> > In the fast path ((oldval & 2) != 0) I don't actually see a memory
> > barrier which would prevent the processor from reordering reads. In
> > particular the CPU might reorder reads that should happen after the
> > once_control check.
>
> Why does it need one?
>
> Where would the reads rorder to which would be wrong?
The CPU might read what was done in init_routine() before once_control.
> The atomic operation should be a sufficient barrier.
But the atomic operation is only executed if (oldval & 2) == 0. Otherwise the
loop is left immediately.
> Remember that ARM does not have a weakly ordered memory
> model like Power, where it really is important to get all the
> barriers in the right places.
AFAIU ARMv7 (like Cortex A9 or A15) actually has a weakly ordered memory
model. The CPU might reorder reads and writes (in the absence of memory
barriers) and might do speculative reads too. I don't know the Power memory
model by heart but AFAIK the ARMv7 memory model is quite close. [1]
> See the Power implementation for a "always correct" implementation.
>
> Either way Torvald Riegel is rewriting all of the pthread_once
> implementations and unifying them:
>
> https://sourceware.org/ml/libc-alpha/2013-10/msg00257.html
>
> https://sourceware.org/ml/libc-alpha/2013-10/msg00416.html
>
> https://sourceware.org/ml/libc-alpha/2013-10/msg00415.html
This looks good. The unified implementation is what I would have expected on
ARM too.
> > Likewise, shouldn't the assignment of once_control at the end involve
> > some kind of barrier too? Otherwise the once_control assignment might be
> > observable on another CPU before the side effects of init_routine(),
> > right? Do I overlook something fundamentally here?
>
> No, it's just that on a non-weakly-ordered system it doesn't really matter
> much.
I agree that its not needed for a strongly ordered memory model.
> We're going to fix this though and it will help ARM.
Great. :)
Thank,
Jan
[1] http://en.wikipedia.org/w/index.php?title=Memory_ordering&oldid=564107383