This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
[PATCHv2] powerpc: Spinlock optimization and cleanup
- From: "Paul E. Murphy" <murphyp at linux dot vnet dot ibm dot com>
- To: "libc-alpha at sourceware dot org" <libc-alpha at sourceware dot org>, triegel at redhat dot com, rth at twiddle dot net, Tulio Magno Quites Machado Filho <tuliom at linux dot vnet dot ibm dot com>, rth at twiddle dot net
- Cc: Adhemerval Zanella <adhemerval dot zanella at linaro dot org>, Steve Munroe <sjmunroe at us dot ibm dot com>
- Date: Wed, 30 Sep 2015 11:28:22 -0500
- Subject: [PATCHv2] powerpc: Spinlock optimization and cleanup
- Authentication-results: sourceware.org; auth=none
Changes from V1:
* Use C macro for atomic_store_release as suggested in
comments.
* Run benchmarks to quantize the performance changes and
note in commit message.
---8<---
This patch optimizes powerpc spinlock implementation by:
* Current algorithm spin over a lwzx, but only after issuing a lwarx.
first. The optimization for such case is to avoid the first lwarx
in contention case (which is much more costly than normal loads).
* Use the correct EH hint bit on the larx for supported ISA. For lock
acquisition, the thread that acquired the lock with a successful stcx
do not want to give away the write ownership on the cacheline. The
idea is to make the load reservation "sticky" about retaining write
authority to the line. That way, the store that must inevitably come
to release the lock can succeed quickly and not contend with other
threads issuing lwarx. If another thread does a store to the line
(false sharing), the winning thread must give up write authority to
The proper value of EH for the larx for a lock acquisition is 1.
* Increase contented lock performance by up to 40%, and no measurable
impact on uncontended locks on P8.
It also adds some cleanup to use the defined acquire semantic
instructions and function prototype using default C style.
Thanks to Adhemerval Zanella who did most of the work. I've run some
tests, and addressed some minor feedback.
2015-09-25 Paul E. Murphy <murphyp@linux.vnet.ibm.com>
* sysdeps/powerpc/nptl/pthread_spin_lock.c (pthread_spin_lock):
Optimize first check for contention case and add lwarx hint.
* sysdeps/powerpc/nptl/pthread_spin_trylock.c (pthread_spin_trylock):
Use ANSI prototype.
* sysdep/unix/sysv/linux/powerpc/pthread_spin_unlock.c: Move to ...
* sysdeps/powerpc/nptl/pthread_spin_unlock.c: ... here, and
update to new atomic macros.
---
sysdeps/powerpc/nptl/pthread_spin_lock.c | 22 ++++++++-------
sysdeps/powerpc/nptl/pthread_spin_trylock.c | 3 +-
sysdeps/powerpc/nptl/pthread_spin_unlock.c | 27 +++++++++++++++++++
.../unix/sysv/linux/powerpc/pthread_spin_unlock.c | 28 --------------------
4 files changed, 40 insertions(+), 40 deletions(-)
create mode 100644 sysdeps/powerpc/nptl/pthread_spin_unlock.c
delete mode 100644 sysdeps/unix/sysv/linux/powerpc/pthread_spin_unlock.c
diff --git a/sysdeps/powerpc/nptl/pthread_spin_lock.c b/sysdeps/powerpc/nptl/pthread_spin_lock.c
index cc081f8..8a39da9 100644
--- a/sysdeps/powerpc/nptl/pthread_spin_lock.c
+++ b/sysdeps/powerpc/nptl/pthread_spin_lock.c
@@ -19,24 +19,26 @@
#include "pthreadP.h"
int
-pthread_spin_lock (lock)
- pthread_spinlock_t *lock;
+pthread_spin_lock (pthread_spinlock_t *lock)
{
unsigned int __tmp;
asm volatile (
- "1: lwarx %0,0,%1\n"
+ "0: lwzx %0,0,%1\n"
+ " cmpwi 0,%0,0\n"
+ " bne 0b\n"
+ "1: lwarx %0,0,%1" MUTEX_HINT_ACQ "\n"
" cmpwi 0,%0,0\n"
" bne- 2f\n"
" stwcx. %2,0,%1\n"
" bne- 2f\n"
- " isync\n"
- " .subsection 1\n"
- "2: lwzx %0,0,%1\n"
- " cmpwi 0,%0,0\n"
- " bne 2b\n"
- " b 1b\n"
- " .previous"
+ __ARCH_ACQ_INSTR "\n"
+ " .subsection 1\n"
+ "2: lwzx %0,0,%1\n"
+ " cmpwi 0,%0,0\n"
+ " bne 2b\n"
+ " b 1b\n"
+ " .previous"
: "=&r" (__tmp)
: "r" (lock), "r" (1)
: "cr0", "memory");
diff --git a/sysdeps/powerpc/nptl/pthread_spin_trylock.c b/sysdeps/powerpc/nptl/pthread_spin_trylock.c
index 77a4615..c485aa4 100644
--- a/sysdeps/powerpc/nptl/pthread_spin_trylock.c
+++ b/sysdeps/powerpc/nptl/pthread_spin_trylock.c
@@ -20,8 +20,7 @@
#include "pthreadP.h"
int
-pthread_spin_trylock (lock)
- pthread_spinlock_t *lock;
+pthread_spin_trylock (pthread_spinlock_t *lock)
{
unsigned int old;
int err = EBUSY;
diff --git a/sysdeps/powerpc/nptl/pthread_spin_unlock.c b/sysdeps/powerpc/nptl/pthread_spin_unlock.c
new file mode 100644
index 0000000..f830ad2
--- /dev/null
+++ b/sysdeps/powerpc/nptl/pthread_spin_unlock.c
@@ -0,0 +1,27 @@
+/* pthread_spin_unlock -- unlock a spin lock. PowerPC version.
+ Copyright (C) 2007-2015 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <http://www.gnu.org/licenses/>. */
+
+#include "pthreadP.h"
+#include <lowlevellock.h>
+
+int
+pthread_spin_unlock (pthread_spinlock_t *lock)
+{
+ atomic_store_release (lock, 0);
+ return 0;
+}
diff --git a/sysdeps/unix/sysv/linux/powerpc/pthread_spin_unlock.c b/sysdeps/unix/sysv/linux/powerpc/pthread_spin_unlock.c
deleted file mode 100644
index 7af694f..0000000
--- a/sysdeps/unix/sysv/linux/powerpc/pthread_spin_unlock.c
+++ /dev/null
@@ -1,28 +0,0 @@
-/* pthread_spin_unlock -- unlock a spin lock. PowerPC version.
- Copyright (C) 2007-2015 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <http://www.gnu.org/licenses/>. */
-
-#include "pthreadP.h"
-#include <lowlevellock.h>
-
-int
-pthread_spin_unlock (pthread_spinlock_t *lock)
-{
- __asm __volatile (__ARCH_REL_INSTR ::: "memory");
- *lock = 0;
- return 0;
-}
--
1.7.1