This is the mail archive of the libc-help@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: pthread_once on ARM SMP


On Tuesday 15 October 2013, "Carlos O'Donell" <carlos@systemhalted.org> wrote:
> On Fri, Oct 11, 2013 at 2:59 PM, Jan Klötzke <jan@kloetzke.net> wrote:
> > Hi,
> > 
> > While debugging something else I stumbled across the pthread_once
> > implementation of glibc. Looking at it I have the feeling that it is
> > actually not 100% safe on SMP. Quoting the relevant fast path code from
> > ports/sysdeps/unix/sysv/linux/arm/nptl/pthread_once.c:
> > 
> > int
> > __pthread_once (pthread_once_t *once_control, void (*init_routine)
> > (void)) {
> > 
> >   for (;;)
> >   
> >     {
> >     
> >       int oldval;
> >       int newval;
> >       [...]
> >       do
> >       
> >         {
> >         
> >           newval = __fork_generation | 1;
> >           oldval = *once_control;
> >           if (oldval & 2)
> >           
> >             break;
> >         
> >         } while (atomic_compare_and_exchange_val_acq (once_control,
> >         newval,
> > 
> > oldval) != oldval);
> > 
> >       /* Check if the initializer has already been done.  */
> >       if ((oldval & 2) != 0)
> >       
> >         return 0;
> >       
> >       [...]
> >     
> >     }
> >   
> >   [...]
> >   init_routine ();
> >   pthread_cleanup_pop (0);
> >   *once_control = __fork_generation | 2;
> >   [...]
> > 
> > }
> > 
> > In the fast path ((oldval & 2) != 0) I don't actually see a memory
> > barrier which would prevent the processor from reordering reads. In
> > particular the CPU might reorder reads that should happen after the
> > once_control check.
> 
> Why does it need one?
> 
> Where would the reads rorder to which would be wrong?

The CPU might read what was done in init_routine() before once_control.

> The atomic operation should be a sufficient barrier.

But the atomic operation is only executed if (oldval & 2) == 0. Otherwise the 
loop is left immediately.

> Remember that ARM does not have a weakly ordered memory
> model like Power, where it really is important to get all the
> barriers in the right places.

AFAIU ARMv7 (like Cortex A9 or A15) actually has a weakly ordered memory 
model. The CPU might reorder reads and writes (in the absence of memory 
barriers) and might do speculative reads too. I don't know the Power memory 
model by heart but AFAIK the ARMv7 memory model is quite close. [1]

> See the Power implementation for a "always correct" implementation.
> 
> Either way Torvald Riegel is rewriting all of the pthread_once
> implementations and unifying them:
> 
> https://sourceware.org/ml/libc-alpha/2013-10/msg00257.html
> 
> https://sourceware.org/ml/libc-alpha/2013-10/msg00416.html
> 
> https://sourceware.org/ml/libc-alpha/2013-10/msg00415.html

This looks good. The unified implementation is what I would have expected on 
ARM too.

> > Likewise, shouldn't the assignment of once_control at the end involve
> > some kind of barrier too? Otherwise the once_control assignment might be
> > observable on another CPU before the side effects of init_routine(),
> > right? Do I overlook something fundamentally here?
> 
> No, it's just that on a non-weakly-ordered system it doesn't really matter
> much.

I agree that its not needed for a strongly ordered memory model.

> We're going to fix this though and it will help ARM.

Great. :)

Thank,
Jan

[1] http://en.wikipedia.org/w/index.php?title=Memory_ordering&oldid=564107383


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]