This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: Synchronizing auxiliary mutex data


On Wed, 21 Jun 2017, Andreas Schwab wrote:

> On Jun 20 2017, Alexander Monakov <amonakov@ispras.ru> wrote:
> 
> > Plain accesses to fields like __data.__owner are fine as long as they all are
> > within critical sections set up by LLL_MUTEX_{LOCK,UNLOCK}, but there are some
> > outside of them. So e.g. in nptl/pthread_mutex_lock:
> >
> >   95   else if (__builtin_expect (PTHREAD_MUTEX_TYPE (mutex)
> >   96                              == PTHREAD_MUTEX_RECURSIVE_NP, 1))
> >   97     {
> >   98       /* Recursive mutex.  */
> >   99       pid_t id = THREAD_GETMEM (THREAD_SELF, tid);
> >  100 
> >  101       /* Check whether we already hold the mutex.  */
> >  102       if (mutex->__data.__owner == id)
> >  103         {
> >  104           /* Just bump the counter.  */
> >  105           if (__glibc_unlikely (mutex->__data.__count + 1 == 0))
> >  106             /* Overflow of the counter.  */
> >  107             return EAGAIN;
> >  108 
> >  109           ++mutex->__data.__count;
> >  110 
> >  111           return 0;
> >  112         }
> >  113 
> >  114       /* We have to get the mutex.  */
> >  115       LLL_MUTEX_LOCK (mutex);
> >  116 
> >  117       assert (mutex->__data.__owner == 0);
> 
> What keeps the cpu from using its cached value of __owner here?

Inside LLL_MUTEX_LOCK there is an atomic operation with acquire memory ordering.
The compiler and the hardware are jointly responsible for ensuring proper
ordering: the compiler may not move the load of __owner up before that atomic
operation, and it must emit machine code that causes the CPU to keep the
ordering at runtime (on some architectures, e.g. ARM, this means emitting
memory barrier instructions; on x86 the atomic operation will be a
lock-prefixed memory operation, which enforces the proper ordering on its own).
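
To make this concrete, here is a minimal sketch of an acquire-ordered lock
using C11 atomics. The names are hypothetical; this is not the actual glibc
implementation, which uses its own atomics layer and futexes:

    #include <stdatomic.h>

    struct toy_mutex {
      atomic_int lock;   /* stands in for __data.__lock */
      int owner;         /* stands in for __data.__owner; a plain field */
    };

    void toy_lock (struct toy_mutex *m)
    {
      int expected = 0;
      /* Spin until lock goes from 0 to 1.  The acquire ordering on
         success forbids the compiler and the CPU from moving later
         loads (such as of m->owner) before this operation.  */
      while (!atomic_compare_exchange_weak_explicit (&m->lock, &expected, 1,
                                                     memory_order_acquire,
                                                     memory_order_relaxed))
        expected = 0;
    }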

So the compiler is forbidden to reorder those operations, and it must also
signal the CPU not to reorder them; thus when we acquire the mutex in
LLL_MUTEX_LOCK, the value of __owner we see after that (whether from our
cache or not) must be the value stored by the thread that previously unlocked
the mutex, i.e. 0. If we see a non-zero value, then either the modification
of the mutex's __lock was propagated out of order with respect to that of
__owner, or our load was executed (or served from the cache) out of order
with respect to the atomic operation.
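
The release side completes the picture; here is a matching unlock sketch,
using the same hypothetical struct as above:

    void toy_unlock (struct toy_mutex *m)
    {
      /* Plain store, still inside the critical section.  */
      m->owner = 0;
      /* Release ordering forbids moving the store to owner below this
         store, so the next thread's acquire in toy_lock observes
         owner == 0, which is exactly what the assert at line 117
         relies on.  */
      atomic_store_explicit (&m->lock, 0, memory_order_release);
    }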

If on a given platform the cache subsystem does not normally propagate
modifications in the same order they arrive, the CPU must make the cache
subsystem aware of the memory fences (and other ordering operations) it
executes. The same applies to loads.
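
As an illustration of what such ordering operations compile to, consider a
standalone C11 fence (a sketch; the exact instruction depends on the
architecture revision):

    #include <stdatomic.h>

    void acquire_fence (void)
    {
      /* On ARM this typically compiles to a "dmb" barrier instruction,
         which tells the memory subsystem to order earlier loads against
         later accesses; on x86 it compiles to no instruction at all and
         only restrains the compiler, because ordinary x86 loads and
         stores are already strongly ordered.  */
      atomic_thread_fence (memory_order_acquire);
    }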

> > afaict the access at line 102 can invoke undefined behavior due to a data race.
> 
> Is that the undefined behaviour here?

No, that's not directly related. The data race happens when one thread
performs this load while another thread owns the mutex and modifies __owner.
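
In sketch form (hypothetical names), the racy pair of accesses looks like
this:

    /* Executed concurrently by two threads, these conflicting plain
       accesses to the same non-atomic object constitute a data race
       under the C11 memory model, and hence undefined behavior.  */
    static int owner;                      /* plain, non-atomic field */

    void unlocking_thread (void) { owner = 0; }           /* the write */
    int checking_thread (int id) { return owner == id; }  /* the read */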

> > In practice I think it works fine because the compiler doesn't tear the load,
> 
> What does "tear the load" mean?

Decompose the load into separately executed memory accesses. In principle the
compiler could load the low and high 16 bits of __owner separately, and that
could cause the check at line 102 to return true even when we don't actually
own the mutex.
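
As a hedged sketch: if the field were accessed through a relaxed atomic load,
tearing would be ruled out, since such a load is guaranteed to read the whole
value in one indivisible access while imposing no additional ordering:

    #include <stdatomic.h>

    static atomic_int owner;   /* stands in for mutex->__data.__owner */

    int owner_is (int id)
    {
      /* A relaxed load adds no ordering constraints, but the compiler
         may not split it into separate narrower loads.  */
      return atomic_load_explicit (&owner, memory_order_relaxed) == id;
    }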

Alexander

