Re: [PATCHv3] powerpc: Fix write-after-destroy in lock elision
- From: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
- To: Torvald Riegel <triegel at redhat dot com>
- Cc: libc-alpha at sourceware dot org
- Date: Mon, 2 Jan 2017 15:25:02 -0200
- Subject: Re: [PATCHv3] powerpc: Fix write-after-destroy in lock elision
- References: <20161212043646.GL10558@vapier.lan> <1482343384-28430-1-git-send-email-tuliom@linux.vnet.ibm.com> <afe9f3de-5d30-7df0-8af0-b120bb7bdad0@linaro.org> <1483377714.13143.99.camel@redhat.com>
On 02/01/2017 15:21, Torvald Riegel wrote:
> On Mon, 2017-01-02 at 15:03 -0200, Adhemerval Zanella wrote:
>> LGTM with a nit only on the comment about the adapt_count decrement in
>> unlock.
>>
>> On 21/12/2016 16:03, Tulio Magno Quites Machado Filho wrote:
>>> From: Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com>
>>>
>>> Changes since v2:
>>> - Fixed coding style.
>>>
>>> Changes since v1:
>>> - Replaced references to a data race with the actual error: write-after-destroy.
>>> - Enclosed adapt_count accesses in C11 atomics.
>>>
>>> --- 8< ---
>>>
>>> The update of *adapt_count after the release of the lock causes a race
>>> condition when thread A unlocks, thread B continues and destroys the
>>> mutex, and thread A writes to *adapt_count.
>>
>>
>>>
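To make the failure mode concrete, the interleaving is roughly as below
(a sketch only, not code from the patch; the thread bodies, the timing
comments, and an elision-enabled build are assumptions for the example):

    #include <pthread.h>

    static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;

    static void *
    thread_a (void *arg)
    {
      pthread_mutex_lock (&m);
      /* With the old code, unlock first released the low-level lock
         and only afterwards stored to the adapt_count field of M.  */
      pthread_mutex_unlock (&m);
      return NULL;
    }

    static void *
    thread_b (void *arg)
    {
      /* Blocks until thread A's unlock releases the lock...  */
      pthread_mutex_lock (&m);
      pthread_mutex_unlock (&m);
      /* ...and POSIX then allows destroying the mutex.  Thread A's
         trailing adapt_count store now hits a destroyed mutex: the
         write-after-destroy of BZ 20822.  */
      pthread_mutex_destroy (&m);
      return NULL;
    }

    int
    main (void)
    {
      pthread_t a, b;
      pthread_create (&a, NULL, thread_a, NULL);
      pthread_create (&b, NULL, thread_b, NULL);
      pthread_join (a, NULL);
      pthread_join (b, NULL);
      return 0;
    }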
>>> diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
>>> index 43c5a67..0e0baf5 100644
>>> --- a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
>>> +++ b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
>>> @@ -28,13 +28,17 @@ __lll_unlock_elision (int *lock, short *adapt_count, int pshared)
>>>      __libc_tend (0);
>>>    else
>>>      {
>>> -      lll_unlock ((*lock), pshared);
>>> -
>>> -      /* Update the adapt count AFTER completing the critical section.
>>> -         Doing this here prevents unneeded stalling when entering
>>> -         a critical section.  Saving about 8% runtime on P8.  */
>>> +      /* Update adapt_count in the critical section to prevent a
>>> +         write-after-destroy error as mentioned in BZ 20822.  The
>>> +         following update of adapt_count is clearly contained within
>>> +         the critical region of the fall-back lock as it immediately
>>> +         precedes the unlock.  Attempting to use C11 atomics to access
>>> +         adapt_count would be (at minimum) misleading and (at worst) a
>>> +         serious error forcing a larx-hit-stcx flush.  */
>>>        if (*adapt_count > 0)
>>>          (*adapt_count)--;
>>
>> I think the C11 atomics comment should outline that using
>> atomic_fetch_add_relaxed generates suboptimal code, not that it is
>> misleading, since it generates an atomic sequence using lharx/sthcx.
>> inside a transactional region.  Something like:
>>
>>    /* Update adapt_count in the critical section to prevent a
>>       write-after-destroy error as mentioned in BZ 20822.  The
>>       following update of adapt_count is clearly contained within
>>       the critical region of the fall-back lock as it immediately
>>       precedes the unlock.
>>       Using C11 atomics like atomic_fetch_add_relaxed generates
>>       suboptimal code (a lharx/sthcx. sequence) where a plain load,
>>       update, and store suffices inside an HTM region (the whole
>>       idea of using this construction).  */
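As a concrete contrast, the two alternatives under discussion look
roughly like this (a sketch, not part of the patch; the codegen notes
restate the claims above and the glibc-internal atomic_fetch_add_relaxed
wrapper is assumed):

    /* Plain accesses: inside the fall-back lock's critical section the
       compiler can emit a simple load/update/store sequence.  */
    if (*adapt_count > 0)
      (*adapt_count)--;

    /* An atomic read-modify-write instead emits a lharx/sthcx. retry
       loop, which is suboptimal here (and, per the v3 comment, risks
       forcing a larx-hit-stcx flush), for no gain: the fall-back lock
       already serializes all writers of adapt_count.  */
    if (atomic_load_relaxed (adapt_count) > 0)
      atomic_fetch_add_relaxed (adapt_count, -1);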
>
> Yes.
>
> For the record, I'm still of the opinion that it should just use
> relaxed MO atomic loads and stores -- simply for consistency and to
> make a future transition to C11 atomic types easier.
>
I think it can be changed to:

    if (atomic_load_relaxed (adapt_count) > 0)
      (*adapt_count)--;

without performance issues, and the comment will outline why C11
atomics are not fully used in this specific snippet.
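Put together, the unlock path would then look roughly as follows (a
sketch only, reconstructed from the quoted hunk plus the
atomic_load_relaxed change; the exact comment wording is still being
settled):

    int
    __lll_unlock_elision (int *lock, short *adapt_count, int pshared)
    {
      /* When the lock was free we're in a transaction.  */
      if (*lock == 0)
        __libc_tend (0);
      else
        {
          /* Update adapt_count in the critical section to prevent the
             write-after-destroy error of BZ 20822.  A relaxed MO load
             plus a plain store is enough here: the fall-back lock is
             still held, and an atomic RMW would only emit a needless
             lharx/sthcx. sequence.  */
          if (atomic_load_relaxed (adapt_count) > 0)
            (*adapt_count)--;

          lll_unlock ((*lock), pshared);
        }
      return 0;
    }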