This is the mail archive of the mailing list for the glibc project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH v2] PowerPC: Fix a race condition when eliding a lock

"Carlos O'Donell" <> writes:

> On 08/20/2015 11:20 AM, Tulio Magno Quites Machado Filho wrote:
>> Changes since v1:
>>  - Improved commit message.
>>  - Added new comments in the source code alerting to the concurrency details
>>    intrinsic to this code.
>>  - Removed compiler barriers from this patch, which will be treated in another
>>    patch and will be synchronized with the GCC implementation.
> The architecture and idea of this change is correct, but there are still
> a few details we need to get right. It's almost there, probably v3 and
> we'll be done. See my comments below.


>> The previous code used to evaluate the preprocessor token is_lock_free to
>> a variable before starting a transaction.  This behavior can cause an
>> error if another thread got the lock (without using a transaction)
>> between the evaluation of the token and the beginning of the transaction.
>> This patch delays the evaluation of is_lock_free to inside a transaction
>> by moving this part of the code to the macro ELIDE_LOCK.
> This is looking better, but the comments talk only about the error
> you are fixing, and not the overall concurrency scheme. The latter
> is really what we want to document, and from that documentation it
> should flow naturally that is_lock_free must, out of a requirement,
> can be read within the transaction safely.

Just to be clear: when you say "the comments talk", are you referring to the
commit message or source code comments?

>> +
>> +   Always evaluate is_lock_free from inside a transaction in macros
>> +   Failing to do so, creates a race condition to the memory access specified
>> +   in is_lock_free which can be triggered with the following order of
>> +   events:
>> +   1. The lock accessed by is_lock_free is free.
>> +   2. Thread T1 evaluates is_lock_free and stores into register R1 that the
>> +      lock is free.
>> +   3. Thread T2 acquires the same lock used in is_lock_free.
>> +   4. T1 begins the transaction, creating a memory barrier where is_lock_free
>> +      is false, but R1 is true.
>> +   5. T1 reads R1 and doesn't abort the transaction.
>> +   6. T1 calls ELIDE_UNLOCK, which reads false from is_lock_free and decides
>> +      to unlock a lock acquired by T2, leading to undefined behavior.  */
> This talks about this particular problem, but should instead talk about
> why is_lock_free need not have any locks or atomic accesses.
> e.g.
>    The value of is_lock_free is read and possibly written to by multiple
>    threads, either during lock acquisition, or elision. In order to avoid
>    this evaluation from becoming a data race the access of is_lock_free
>    is placed *inside* the transaction. Within the transaction is_lock_free
>    can be accessed with relaxed ordering. We are assured by the transaction
>    that the value of is_lock_free is consistent with the state of the lock,
>    otherwise the hardware will abort the transaction.
>    ... */

Agreed.  The source code becomes much more useful with your comments.  ;-)

> Feel free to add back the notes on the invalid case if you want, but the
> above is the minimum I'm expecting. It's a definition of the intent of the
> current code.

Maybe the invalid case should only be available in the commit message.

> Does that make sense?


> Perhaps Torvald can review this too.
> I apologize in advance for making this take so long, but we all need
> to use the same language and calibrate our expectations regarding
> the level of detail that we need here and the language used.

No problem.  And thank you for spending your time on this review.

>> +
>> +/* Returns 0 if the lock defined by is_lock_free was elided.
>> +   ADAPT_COUNT is a per-lock state variable.  */
>> +# define ELIDE_LOCK(adapt_count, is_lock_free)				\
>> +  ({									\
>> +    int ret = 0;							\
>> +    if (adapt_count > 0)						\
>> +      (adapt_count)--;							\
>> +    else								\
>> +      for (int i = __elision_aconf.try_tbegin; i > 0; i--)		\
>> +	{								\
>> +	  if (__builtin_tbegin (0))					\
>> +	    {								\
>> +	      if (is_lock_free)						\
> This should use a relaxed MO atomic access to highlight that this data *is*
> shared between threads, but that there are no ordering guarantees. It should
> result in a normal load, but it makes it clear in the code that some other
> thread is also accessing this data, and that the transaction as a barrier
> is all that is protecting this from being a data-race (undefined behaviour).
> Torvlad made the same comment in his review, and I'm just repeating it here
> because I think it bears repeating.

Repeating it is a good idea because I was missing it.
But I still fail to understand how this will make the code clear.
In fact, I'm afraid it could make it more confusing, e.g. "if I have the
atomic access, why do I need this transaction?  Let's remove it."

Maybe, if I had an example I'd better understand what both of you want.


Tulio Magno

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]