This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Re: [PATCH v2 0/5] Add a new mutex type PTHREAD_MUTEX_QUEUESPINNER_NP


Hi, gentle maintainers,
  Could we get this patch series reviewed next? Thanks.

https://sourceware.org/ml/libc-alpha/2018-07/msg00354.html
https://sourceware.org/ml/libc-alpha/2018-07/msg00357.html
https://sourceware.org/ml/libc-alpha/2018-07/msg00355.html
https://sourceware.org/ml/libc-alpha/2018-07/msg00358.html
https://sourceware.org/ml/libc-alpha/2018-07/msg00356.html
https://sourceware.org/ml/libc-alpha/2018-07/msg00359.html

On 2018-07-13 14:52, Kemi Wang wrote:
> The pthread adaptive mutex is designed around the observation that, in
> most cases, the lock will probably be released after only a few CPU
> cycles. This is especially true when the application is well-behaved and
> protects a short, highly contended critical section. In that situation,
> spinning on the lock for a while instead of calling into the kernel to
> block immediately can improve system performance.
> 
> However, the current adaptive mutex has two main problems. The first is
> fairness: multiple spinners contend for the lock simultaneously, and there
> is no guarantee that a spinner will ever acquire the lock, no matter how
> long it has been waiting. The second is heavy cache line bouncing. The
> cache line containing the lock is shared among all of the spinners, so
> whenever the lock is released each spinner tries to acquire it with a
> cmpxchg instruction, flooding the system with "read-for-ownership"
> requests. As a result, there is a lot of cache line bouncing on a large
> system with many CPUs.
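> 
> For illustration only, a minimal sketch (not the glibc code) of the kind
> of spin loop that causes this bouncing is shown below: every waiter
> retries a compare-and-swap on the same lock word, so each attempt pulls
> the cache line over in exclusive state.
> 
>   #include <stdatomic.h>
> 
>   static atomic_int lock_word;  /* 0 = free, 1 = held.  */
> 
>   static void
>   naive_spin_lock (void)
>   {
>     int expected;
> 
>     do
>       {
>         /* Each failed CAS still issues a read-for-ownership request,
>            bouncing the cache line between all spinning CPUs.  */
>         expected = 0;
>       }
>     while (!atomic_compare_exchange_weak (&lock_word, &expected, 1));
>   }
> 
>   static void
>   naive_spin_unlock (void)
>   {
>     atomic_store (&lock_word, 0);
>   }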
> 
> To address the problems above, this series queues the mutex spinners with
> an MCS lock [1], following the mutex implementation in the kernel [2], and
> introduces a new mutex type PTHREAD_MUTEX_QUEUESPINNER_NP. Compared to the
> adaptive mutex (which only reads while spinning), the test results on a
> two-socket Intel Skylake platform show a significant performance
> improvement (see the first patch for details).
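> 
> The core of the MCS scheme is that each waiter enqueues a per-thread node
> and spins only on a flag in its own node, so the lock handoff touches one
> waiter's cache line instead of bouncing a shared one. As a rough,
> self-contained sketch in C11 atomics (simplified for illustration, not the
> code in these patches):
> 
>   #include <stdatomic.h>
>   #include <stdbool.h>
>   #include <stddef.h>
> 
>   struct mcs_node
>   {
>     struct mcs_node *_Atomic next;
>     atomic_bool locked;
>   };
> 
>   /* Tail of the spinner queue; NULL means nobody is spinning.  */
>   static struct mcs_node *_Atomic mcs_tail;
> 
>   static void
>   mcs_lock (struct mcs_node *me)
>   {
>     atomic_store_explicit (&me->next, NULL, memory_order_relaxed);
>     atomic_store_explicit (&me->locked, false, memory_order_relaxed);
> 
>     /* Append ourselves to the end of the queue.  */
>     struct mcs_node *prev
>       = atomic_exchange_explicit (&mcs_tail, me, memory_order_acq_rel);
>     if (prev != NULL)
>       {
>         /* Link in behind the predecessor, then spin on our own node
>            only; no shared cache line is hammered while waiting.  */
>         atomic_store_explicit (&prev->next, me, memory_order_release);
>         while (!atomic_load_explicit (&me->locked, memory_order_acquire))
>           ;
>       }
>   }
> 
>   static void
>   mcs_unlock (struct mcs_node *me)
>   {
>     struct mcs_node *next
>       = atomic_load_explicit (&me->next, memory_order_acquire);
> 
>     if (next == NULL)
>       {
>         struct mcs_node *expected = me;
> 
>         /* No visible successor: try to mark the queue empty.  */
>         if (atomic_compare_exchange_strong_explicit
>               (&mcs_tail, &expected, NULL,
>                memory_order_release, memory_order_relaxed))
>           return;
>         /* A successor is still enqueueing itself; wait for its link.  */
>         while ((next = atomic_load_explicit (&me->next,
>                                              memory_order_acquire)) == NULL)
>           ;
>       }
>     /* Hand the spin slot over to the next waiter, in FIFO order.  */
>     atomic_store_explicit (&next->locked, true, memory_order_release);
>   }
> 
> As described above, such a queue only disciplines the spinning phase;
> waiters still fall back to futex_wait once spinning stops paying off.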
> 
> Although queuing spinners with an MCS lock improves the performance of the
> adaptive mutex when multiple threads contend for a lock, people probably
> also want to know how the queue spinner mutex performs compared to other
> widely used lock disciplines, namely the pthread spin lock and the plain
> pthread mutex. The details of the test results are in the first patch; in
> summary:
> a) With little lock contention, the spin lock performs best, the queue
> spinner mutex performs similarly to the adaptive mutex, and both perform a
> little better than the plain pthread mutex.
> b) With severe lock contention on a large number of CPUs and a small
> critical section (less than 1000 ns), most lock acquisitions are obtained
> by spinning, and the queue spinner mutex performs much better than the
> spin lock and the adaptive mutex, because the overhead of heavy cache line
> bouncing dominates lock performance.
> c) As the critical section grows, the performance advantage of the queue
> spinner mutex shrinks gradually, because cache line bouncing is no longer
> the bottleneck; instead, the overhead of futex_wait and futex_wake
> dominates. Once the critical section reaches 1 ms, even the latency of the
> futex syscall becomes negligible compared to the total time of a lock
> acquisition.
> 
> As shown above, the queue spinner mutex performs well across these kinds
> of workloads, but using it carries a potential risk: lock ownership may be
> handed to the next spinner in the queue while that spinner is not running
> (its CPU has been scheduled to run another task). The remaining spinners
> then have to keep waiting in the queue, which can collapse lock
> performance. To emulate this case, we run two identical processes
> simultaneously; each process has 28 threads, and each thread sets its CPU
> affinity to an individual CPU according to its thread id, so CPUs 0-27 are
> each subscribed by two threads (see the sketch below). In the worst case
> (s=1000ns, t=6000ns), lock performance drops by 58.1% (2205245->924263).
> Therefore, the queue spinner mutex should be used with care, by
> applications that want fairness and performance without oversubscribing
> CPU resources, e.g. containers in public cloud infrastructure.
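> 
> The per-thread pinning used in that emulation can be done with
> pthread_setaffinity_np; a minimal sketch (illustrative only, not the
> benchmark code, and the helper name is made up):
> 
>   #define _GNU_SOURCE
>   #include <pthread.h>
>   #include <sched.h>
> 
>   /* Pin the calling thread to one CPU, e.g. pin thread i of each of the
>      two 28-thread processes to CPU i, so CPUs 0-27 each end up hosting
>      one spinning thread from each process.  */
>   static int
>   pin_self_to_cpu (int cpu)
>   {
>     cpu_set_t set;
> 
>     CPU_ZERO (&set);
>     CPU_SET (cpu, &set);
>     return pthread_setaffinity_np (pthread_self (), sizeof (set), &set);
>   }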
> 
> To overcome this performance collapse of the legacy MCS lock when CPUs are
> oversubscribed, the second patch proposes an enhanced MCS lock algorithm
> that allows a spinner to spin on the mutex lock itself without having to
> be woken up by its predecessor. With that change, this new mutex type can
> be used widely and safely in all kinds of usage scenarios.
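> 
> Conceptually (this is only a sketch of the idea, not the algorithm in the
> second patch; the queue cleanup needed when a waiter gives up early is
> omitted, and the names are illustrative), an enhanced waiter watches both
> its own MCS node and the mutex word, so a descheduled predecessor cannot
> strand the whole queue:
> 
>   /* Reuses struct mcs_node from the sketch above.  'lock_word' stands
>      for the mutex's lock word; 'max_spins' bounds the spinning before
>      the caller falls back to futex_wait.  */
>   static int
>   mcs_wait_or_observe_unlock (struct mcs_node *me, atomic_int *lock_word,
>                               int max_spins)
>   {
>     for (int i = 0; i < max_spins; i++)
>       {
>         /* Legacy path: the predecessor handed the spin slot to us.  */
>         if (atomic_load_explicit (&me->locked, memory_order_acquire))
>           return 1;
>         /* Enhanced path: the mutex itself became free, so there is no
>            need to keep waiting for a predecessor that may have been
>            descheduled.  */
>         if (atomic_load_explicit (lock_word, memory_order_relaxed) == 0)
>           return 0;
>       }
>     return 0;  /* Spin budget exhausted; block in the kernel instead.  */
>   }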
> 
> ChangeLog:
>    V1->V2
>    a) Propose an enhanced MCS lock algorithm to avoid the potential lock
>    performance collapse.
>    b) Add a link to the paper that introduces the original MCS lock.
>    c) Add the commit id of the mutex implementation with MCS lock in the
>    kernel.
> 
> [1] Mellor-Crummey, John M., and Michael L. Scott. "Algorithms for scalable
> synchronization on shared-memory multiprocessors." ACM Transactions on
> Computer Systems (TOCS) 9.1 (1991): 21-65.
> https://www.cos.ufrj.br/~ines/courses/progpar01/papers/p21-mellor-crummey.pdf
> [2] Commit id for mutex with MCS lock in kernel:
> 2bd2c92cf07cc4a373bf316c75b78ac465fefd35
> 
> Kemi Wang (5):
>   Mutex: Queue spinners to reduce cache line bouncing and ensure
>     fairness
>   MCS lock: Enhance legacy MCS lock algorithm
>   Mutex: add unit tests for new type
>   BENCHMARK: add a benchmark for testing new type of mutex
>   Manual: Add manual for pthread mutex
> 
>  benchtests/Makefile                          |  4 +-
>  benchtests/bench-mutex-adaptive-thread.c     |  8 ++-
>  benchtests/bench-mutex-queuespinner-thread.c | 21 +++++++
>  manual/Makefile                              |  2 +-
>  manual/mutex.texi                            | 68 +++++++++++++++++++++
>  nptl/Makefile                                |  8 +--
>  nptl/allocatestack.c                         |  2 +-
>  nptl/descr.h                                 | 26 ++++----
>  nptl/mcs_lock.c                              | 88 ++++++++++++++++++++++++++++
>  nptl/mcs_lock.h                              | 21 +++++++
>  nptl/nptl-init.c                             |  2 +-
>  nptl/pthreadP.h                              |  2 +-
>  nptl/pthread_mutex_init.c                    |  3 +-
>  nptl/pthread_mutex_lock.c                    | 40 ++++++++++++-
>  nptl/pthread_mutex_timedlock.c               | 35 ++++++++++-
>  nptl/pthread_mutex_trylock.c                 |  5 +-
>  nptl/pthread_mutex_unlock.c                  |  7 ++-
>  nptl/pthread_mutexattr_settype.c             |  2 +-
>  nptl/tst-initializers1.c                     | 11 ++--
>  nptl/tst-mutex5b.c                           |  2 +
>  nptl/tst-mutex7b.c                           |  2 +
>  sysdeps/nptl/bits/thread-shared-types.h      | 22 +++++--
>  sysdeps/nptl/pthread.h                       | 15 +++--
>  sysdeps/unix/sysv/linux/hppa/pthread.h       |  4 ++
>  24 files changed, 350 insertions(+), 50 deletions(-)
>  create mode 100644 benchtests/bench-mutex-queuespinner-thread.c
>  create mode 100644 manual/mutex.texi
>  create mode 100644 nptl/mcs_lock.c
>  create mode 100644 nptl/mcs_lock.h
>  create mode 100644 nptl/tst-mutex5b.c
>  create mode 100644 nptl/tst-mutex7b.c
> 

