This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [PATCHv3] PowerPC: Fix a race condition when eliding a lock
- From: Torvald Riegel <triegel at redhat dot com>
- To: "Paul E. Murphy" <murphyp at linux dot vnet dot ibm dot com>
- Cc: libc-alpha at sourceware dot org, Adhemerval Zanella <adhemerval dot zanella at linaro dot org>
- Date: Thu, 08 Oct 2015 17:54:32 +0200
- Subject: Re: [PATCHv3] PowerPC: Fix a race condition when eliding a lock
- References: <55D742D3 dot 9050600 at redhat dot com> <1440439895-11812-1-git-send-email-tuliom at linux dot vnet dot ibm dot com> <1441136302 dot 5089 dot 182 dot camel at otta> <55E60E88 dot 50104 at linaro dot org> <55E61799 dot 6010707 at linux dot vnet dot ibm dot com>
On Tue, 2015-09-01 at 16:24 -0500, Paul E. Murphy wrote:
>
> On 09/01/2015 03:46 PM, Adhemerval Zanella wrote:
> > Indeed the 'odd' comment does not make sense and we should just remove it
> > (I misread the texasr definition). My initial idea was to define some codes
> > that indicate persistent failures and some that do not. I think the best
> > approach would be:
> >
> > /* tabort will set TEXASR(0:31) = ((_ABORT_LOCK_BUSY & 0xff) << 24) | 0x1
> > and the TEXASR persistent bit is bit 25 (32-7). Only the syscall
> > code means a persistent error that should trigger a default lock
> > acquisition. */
> > #define _ABORT_SYSCALL 0x1 /* Syscall issued. */
> > #define _ABORT_LOCK_BUSY 0x2 /* Lock already used. */
> > #define _ABORT_NESTED_TRYLOCK 0x4 /* Write operation in trylock. */
>
> The kernel defines several abort codes; we'll want to work with them, or
> recycle them as needed.

Agreed in general. The abort codes should be part of the ABI, or we'll
get into trouble interpreting them, given that transactions can span
different layers. I've asked Andi Kleen whether he can make this
happen for x86 (e.g., through appropriate documentation somewhere). It
would be good if this could be clarified/specified on PowerPC too.

Can kernel-level abort codes actually reach userspace, or do we only
need to establish userspace consensus for these codes? For x86, it's
userspace only AFAIK.

Would it be likely that explicit aborts from userspace get interpreted by
the kernel? They can't be trusted any more than one would trust the program.
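
To make the encoding under discussion concrete, here is a minimal sketch
in plain C that models it without any HTM instructions. It assumes the
layout from the quoted comment: the 8-bit tabort code lands in the top
byte of TEXASR(0:31), the low bit of that word is set, and the persistent
bit is the low bit of the code. The model_* helpers and the ABORT_* macros
without the leading underscore are made up for illustration; they are not
glibc or kernel identifiers, and the values are only the proposed ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Codes as proposed in the quoted message (illustrative, not an ABI).  */
#define ABORT_SYSCALL        0x1  /* Syscall issued.  */
#define ABORT_LOCK_BUSY      0x2  /* Lock already used.  */
#define ABORT_NESTED_TRYLOCK 0x4  /* Write operation in trylock.  */

/* Hypothetical model of TEXASR(0:31) after an explicit abort, following
   the quoted comment: code in the top byte, low bit of the word set.  */
static uint32_t
model_texasr_upper (unsigned int code)
{
  return ((uint32_t) (code & 0xff) << 24) | 0x1;
}

/* Extract the 8-bit failure code from the modeled word.  */
static unsigned int
model_failure_code (uint32_t texasr_upper)
{
  return texasr_upper >> 24;
}

/* Assumed persistent bit: the low bit of the failure code, i.e. the
   25th bit of the word counted from the least significant end.  */
static bool
model_is_persistent (uint32_t texasr_upper)
{
  return (texasr_upper & (UINT32_C (1) << 24)) != 0;
}

/* Usage: with the proposed values only the syscall code is persistent.  */
static void
model_example (void)
{
  assert (model_failure_code (model_texasr_upper (ABORT_LOCK_BUSY))
          == ABORT_LOCK_BUSY);
  assert (model_is_persistent (model_texasr_upper (ABORT_SYSCALL)));
  assert (!model_is_persistent (model_texasr_upper (ABORT_LOCK_BUSY)));
}
```

Whatever values end up being chosen, the point of the sketch is that
every layer that inspects the failure code has to agree on this decoding,
which is why it belongs in the ABI.
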
> I'm not convinced any of the existing codes should be non-persistent:
>
> A pthread_mutex_trylock attempt within an elided pthread_mutex_lock is
> guaranteed to fail try_tbegin times if there is no contention on the lock.
> Aborts get increasingly expensive as you increase the amount of speculative
> execution.

Although this depends on the program (e.g., trylock might not be
called all the time), in this case I would also guess that it would be
better to consider it a persistent condition.
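
The cost argument can be illustrated with a small model of the retry
loop. This is only a sketch: the transaction is not executed, just
counted, and it assumes an abort cause that recurs on every attempt (such
as a trylock nested in an elided lock). try_tbegin is the retry budget
named in the quoted text; aborted_attempts and its flag are hypothetical
names.

```c
#include <assert.h>
#include <stdbool.h>

/* Count how many transactions are started and aborted before the lock
   is finally acquired normally, for an abort cause that recurs on every
   attempt.  try_tbegin is the retry budget; abort_is_persistent says
   whether the abort code carries the persistent bit.  */
static int
aborted_attempts (int try_tbegin, bool abort_is_persistent)
{
  assert (try_tbegin > 0);
  int aborted = 0;
  for (int i = 0; i < try_tbegin; i++)
    {
      aborted++;                /* This attempt aborts again.  */
      if (abort_is_persistent)
        break;                  /* Give up at once and take the lock.  */
    }
  return aborted;
}
```

With a budget of 3, aborted_attempts (3, false) yields 3 wasted
transactions, while aborted_attempts (3, true) yields 1.
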
> A busy lock likely indicates contention in the critical section which
> does not benefit from elision, I'd err on the side of a persistent
> failure.
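
I don't think I agree. Hitting an already-acquired lock is something that
becomes less likely the more often elision is used on this lock (think
about the "lemming effect": once one thread actually acquires the lock,
the others see it busy, abort, and fall back to acquiring it too).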
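
As a rough sketch of that alternative, the decision after an abort could
treat a busy lock as a transient condition and wait for the lock to
become free before retrying, instead of acquiring it. This is an
assumption-laden illustration, not the patch's code: the enum, the
after_abort function, and its parameters are all hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

enum elision_action
{
  ELISION_RETRY,            /* Start another transaction right away.  */
  ELISION_WAIT_THEN_RETRY,  /* Wait until the lock is free, then retry.  */
  ELISION_ACQUIRE_LOCK      /* Stop eliding and take the lock.  */
};

/* Decide what to do after an aborted transaction.  PERSISTENT is the
   persistent bit of the abort status, LOCK_WAS_BUSY says whether the
   abort was the explicit "lock busy" one, and TRIES_LEFT is what
   remains of the retry budget.  */
static enum elision_action
after_abort (bool persistent, bool lock_was_busy, int tries_left)
{
  assert (tries_left >= 0);
  if (persistent || tries_left == 0)
    return ELISION_ACQUIRE_LOCK;
  if (lock_was_busy)
    /* Acquiring the lock here would make every other eliding thread
       abort too; waiting keeps them in transactional mode.  */
    return ELISION_WAIT_THEN_RETRY;
  return ELISION_RETRY;
}
```

Only a persistent abort or an exhausted budget leads to a real
acquisition here; a busy lock alone does not.
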