This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [PATCH 9/9] Add manual for lock elision
- From: "Carlos O'Donell" <carlos at redhat dot com>
- To: Andi Kleen <andi at firstfloor dot org>
- Cc: libc-alpha at sourceware dot org, Andi Kleen <ak at linux dot intel dot com>
- Date: Tue, 14 May 2013 02:39:39 -0400
- Subject: Re: [PATCH 9/9] Add manual for lock elision
- References: <1368225725-14283-1-git-send-email-andi at firstfloor dot org> <1368225725-14283-10-git-send-email-andi at firstfloor dot org>
On 05/10/2013 06:42 PM, Andi Kleen wrote:
> From: Andi Kleen <ak@linux.intel.com>
>
> pthreads are not described in the documentation, but I decided to document
> lock elision there at least.
Thank you very much for the manual.
This will really help our users understand elision.
> 2013-05-02 Andi Kleen <ak@linux.intel.com>
>
> * manual/Makefile: Add elision.texi.
> * manual/threads.texi: Link to elision.
> * manual/elision.texi: New file.
> * manual/intro.texi: Link to elision.
> * manual/lang.texi: dito.
> ---
> manual/Makefile | 2 +-
> manual/elision.texi | 299 +++++++++++++++++++++++++++++++++++++++++++++++++++
> manual/intro.texi | 3 +
> manual/lang.texi | 2 +-
> manual/threads.texi | 2 +-
> 5 files changed, 305 insertions(+), 3 deletions(-)
> create mode 100644 manual/elision.texi
>
> diff --git a/manual/Makefile b/manual/Makefile
> index 44c0fd4..5d78761 100644
> --- a/manual/Makefile
> +++ b/manual/Makefile
> @@ -42,7 +42,7 @@ chapters = $(addsuffix .texi, \
> message search pattern io stdio llio filesys \
> pipe socket terminal syslog math arith time \
> resource setjmp signal startup process job nss \
> - users sysinfo conf crypt debug threads)
> + users sysinfo conf crypt debug threads elision)
> add-chapters = $(wildcard $(foreach d, $(add-ons), ../$d/$d.texi))
> appendices = lang.texi header.texi install.texi maint.texi platform.texi \
> contrib.texi
> diff --git a/manual/elision.texi b/manual/elision.texi
> new file mode 100644
> index 0000000..40c3bbb
> --- /dev/null
> +++ b/manual/elision.texi
> @@ -0,0 +1,299 @@
> +@node Lock elision, Language Features, Debugging Support, Top
> +@c %MENU% Lock elision
> +@chapter Lock elision
> +
> +@c create the bizarre situation that lock elision is documented, but pthreads isn't
Please bear with us as we expand the threads.texi file :-)
We can't even use the old threads.texi file from Linuxthreads because it was
never contributed to the FSF :-(
> +
> +This chapter describes the lock implementation implementation for pthread
s/implementation implementation/implementation/g
s/pthread/POSIX thread/g
> +locks.
> +
> +@menu
> +* Lock elision introduction:: What is lock elision?
> +* Semantic differences of elided locks::
> +* Tuning lock elision::
> +* Setting elision for individual @code{pthread_mutex_t}::
> +* Setting @code{pthread_mutex_t} elision using environment variables::
> +* Setting elision for individual @code{pthread_rwlock_t}::
> +* Setting @code{pthread_rwlock_t} elision using environment variables::
> +@end menu
> +
> +@node Lock elision introduction
> +@section Lock elision introduction
> +
> +Lock elision is a technique to improve lock scaling. It runs
> +lock regions in parallel using hardware support for a transactional execution
> +mode. The lock region is executed speculatively, and as long
> +as there is no conflict or other reason for transaction abort the lock
> +will executed in parallel. If an transaction abort occurs, any
> +side effect of the speculative execution is undone, the lock is taken
> +for real and the lock region re-executed. This improves scalability
> +of the program because locks do not need to wait for each other.
> +
> +The standard @code{pthread_mutex_t} mutexes and @code{pthread_rwlock_t} rwlocks
> +can be transparently elided by the C library.
s/the C library/@theglibc/g.
> +
> +Lock elision may lower performance if transaction aborts occur too frequently.
> +In this case it is recommended to use a PMU profiler to find the causes for
> +the aborts first and try to eliminate them. If that is not possible
> +elision can be disabled for a specific lock or for the whole program.
> +Alternatively elision can be disabled completely, and only enabled for
> +specific locks that are known to be elision friendly.
> +
> +The defaults locks are adaptive. The lock library decides whether elision
> +is profitable based on the abort rates, and automatically disables
> +elision for a lock when it aborts too often. After some time elision
> +is retried, in case the workload changed.
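A small example might help readers picture the adaptive behavior described above. The following is only a toy model, assuming a simple "skip N acquisitions after an abort" policy; the struct, the function names, and the policy itself are hypothetical and are not taken from the patch or from glibc internals.

```c
#include <stdbool.h>

/* Toy model only: names and policy are hypothetical, not glibc internals. */
struct adaptive_state {
    int skip_remaining;   /* acquisitions left before elision is retried */
    int skip_after_abort; /* acquisitions to skip after an abort is seen */
};

/* Decide whether the next lock acquisition should attempt elision. */
bool should_try_elision(struct adaptive_state *st)
{
    if (st->skip_remaining > 0) {
        st->skip_remaining--;
        return false;     /* take the lock for real this time */
    }
    return true;          /* skip period is over, try elision again */
}

/* Record that an elided attempt aborted: stop eliding for a while. */
void note_abort(struct adaptive_state *st)
{
    st->skip_remaining = st->skip_after_abort;
}
```

With skip_after_abort set to 3, one abort makes the next three acquisitions take the real lock, after which elision is attempted again in case the workload changed.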
> +
> +Lock elision is currently supported for default (timed) mutexes and for
> +adaptive mutexes. Other lock types do not elide. Condition variables
> +also do not elide. This may change in future versions.
> +
> +@node Semantic differences of elided locks
> +@section Semantic differences of elided locks
> +
> +Elided locks have some semantic differences to classic locks. These differences
> +are only visible when the lock is successfully elided. Since elision may always
> +fail a program cannot rely on any of these semantics.
> +
> +@itemize
> +@item
> +Elided locks always behave like read-write locks.
> +
> +@item
> +Mutexes and write rwlocks can be locked recursively inside the lock region.
> +This behavior is visible through @code{pthread_mutex_trylock}. This
> +behavior is not enabled by default for default timed locks, only
> +for locks that have been explicitely marked for elision with
> +@code{PTHREAD_MUTEX_ELISION_NP}. The default locks will abort
> +elision for nested trylocks.
> +
> +@smallexample
> +pthread_mutex_lock (&lock);
> +if (pthread_mutex_trylock (&lock) == 0)
> + /* with elision we come here */
> +else
> + /* with no elision we always come here */
> +@end smallexample
> +
> +And also through @code{pthread_mutex_timedlock}. This behavior is unconditional
> +for elided locks.
> +
> +@smallexample
> +pthread_mutex_lock (&lock);
> +if (pthread_mutex_timedlock (&lock, &timeout) == 0)
> + /* With elision we always come here */
> +else
> + /* With no elision we always come here because timeout happens. */
> +@end smallexample
> +
> +Similar semantic changes apply to @code{pthread_rwlock_trywrlock} and
> +@code{pthread_rwlock_timedwrlock}.
> +
> +@item
> +@code{pthread_mutex_destroy} does not return an error when the lock is locked
> +and will clear the lock state.
> +
> +@item
> +@code{pthread_mutex_t} and @code{pthread_rwlock_t} appear free from other threads.
> +
> +This can be visible through trylock or timedlock.
> +In most cases checking this is a existing latent race in the program, but there may
> +be rare cases when it is not.
s/rare//g.
> +
> +@item
> +@code{EAGAIN} and @code{EDEADLK} in rwlocks will not happen under elision.
> +
> +@item
> +@code{pthread_mutex_unlock} does not return an error when unlocking a free lock.
> +
> +@item
> +Elision changes timing because locks now run in parallel.
> +Timing differences may expose latent race bugs in the program. Programs using time based synchronization
> +(as opposed to using data dependencies) may change behavior.
> +
> +@end itemize
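For the nested trylock item in the list above, a complete test function may be easier to follow than the fragments. This is a minimal sketch using only standard POSIX calls; the function names are made up for illustration. It assumes a default mutex, so per the text above the trylock is expected to fail with EBUSY unless the lock was explicitly marked for elision.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Lock a default mutex, then try to lock it again from the same thread.
   Returns the result of the nested pthread_mutex_trylock call
   (or the error from the initial lock, if that failed).  */
int nested_trylock_result(void)
{
    pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    int rc = pthread_mutex_lock(&lock);
    if (rc != 0)
        return rc;

    rc = pthread_mutex_trylock(&lock);
    if (rc == 0)
        pthread_mutex_unlock(&lock);  /* release the nested acquisition */

    pthread_mutex_unlock(&lock);
    pthread_mutex_destroy(&lock);
    return rc;
}

/* True when the nested trylock saw the mutex as held. */
bool trylock_saw_held_lock(void)
{
    return nested_trylock_result() == EBUSY;
}
```

A program that depends on trylock_saw_held_lock returning true is relying on exactly the behavior this section says can differ under elision.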
> +
> +@node Tuning lock elision
> +@section Tuning lock elision
> +
> +Critical regions may need some tuning to get the benefit of lock elision.
> +This is based on the abort rates, which can be determined by a PMU profiler
> +(e.g. perf on GNU/Linux systems). When the abort rate is too high lock
s/GNU\/Linux systems/@gnulinuxsystems/g.
> +scaling will not improve. Generally lock elision feedback should be done
> +only based on profile feedback.
> +
> +Most of these optimizations will improve performance even without lock elision
> +because they will minimize cache line bouncing between threads or make
> +lock regions smaller.
> +
> +Common causes of transactional aborts:
> +
> +@itemize
> +@item
> +Not elidable operations like system calls, IO, CPU exceptions.
> +
> +Try to move out of the critical section when common. Note that these often happen at program startup only.
> +@item
> +Global statistic counts
> +
> +Global statistic variables tend to cause conflicts. Either disable, or make per thread or as a last resort sample
> +(not update every operation)
> +@item
> +False sharing of variables or data structures causing conflicts with other threads
> +
> +Add padding as needed.
> +@item
> +Other conflicts on the same cache lines with other threads
> +
> +Minimize conflicts with other threads. This may require changes to the data structures.
> +@item
> +Capacity overflow
> +
> +The memory transaction used for lock elision has a limited capacity. Make the critical region smaller
> +or move operations that do not need to be protected by the lock outside.
> +
> +@item
> +Rewriting already set flags
> +
> +Setting flags or variables in shared objects that are already set may cause conflicts. Add a check
> +to only write when the value changed.
> +@end itemize
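Some of the remedies in the list above are short enough to show in code. The following is a hedged sketch, not a recommendation for any particular program: it assumes a 64-byte cache line, a fixed number of counter slots with one writer thread per slot, and hypothetical names throughout. It illustrates per-thread statistics with padding and the check before rewriting a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CACHE_LINE_SIZE 64  /* assumption: typical x86 cache line size */
#define NUM_SLOTS 8         /* hypothetical: one slot per worker thread */

/* Each counter is aligned to, and therefore occupies, a full cache line,
   so updates from different threads do not conflict.  */
struct padded_counter {
    _Alignas(CACHE_LINE_SIZE) unsigned long value;
};
static_assert(sizeof(struct padded_counter) == CACHE_LINE_SIZE,
              "one counter per cache line");

static struct padded_counter event_counts[NUM_SLOTS];

/* Called by the thread that owns SLOT; touches only that slot's line. */
void count_event(size_t slot)
{
    event_counts[slot % NUM_SLOTS].value++;
}

/* Approximate total; meant to be sampled occasionally, outside hot paths. */
unsigned long total_events(void)
{
    unsigned long sum = 0;
    for (size_t i = 0; i < NUM_SLOTS; i++)
        sum += event_counts[i].value;
    return sum;
}

/* Write the flag only when its value changes, so an already set flag
   is only read inside the critical region.  Returns true if it wrote.  */
bool set_flag_if_clear(bool *flag)
{
    if (*flag)
        return false;
    *flag = true;
    return true;
}
```

The read-only check in set_flag_if_clear keeps the cache line in a shared state when the flag is already set, which is the case the last item describes.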
> +
> +@node Setting elision for individual @code{pthread_mutex_t}
> +@section Setting elision for individual @code{pthread_mutex_t}
> +
> +Elision can be explicitly disabled or enabled for each @code{pthread_mutex_t} in the program.
> +This overrides any other defaults set by environment variables for this lock.
> +
> +@code{pthrex_mutex_t} Initializers for using in variable initializations.
> +
> +@itemize
> +@item
> +PTHREAD_MUTEX_INIT_NP(PTHREAD_MUTEX_TIMED_NP|PTHREAD_MUTEX_ELISION_NP)
> +Force lock elision for a (default) timed mutex.
> +
> +@item
> +PTHREAD_MUTEX_INIT_NP(PTHREAD_MUTEX_TIMED_NP|PTHREAD_MUTEX_NO_ELISION_NP)
> +Force no lock elision for a (default) timed mutex.
> +
> +@item
> +PTHREAD_MUTEX_INIT_NP(PTHREAD_MUTEX_ADAPTIVE_NP|PTHREAD_MUTEX_ELISION_NP)
> +Force lock elision for an adaptive mutex.
> +
> +@item
> +PTHREAD_MUTEX_INIT_NP(PTHREAD_MUTEX_ADAPTIVE_NP|PTHREAD_MUTEX_NO_ELISION_NP)
> +Force no lock elision for an adaptive mutex.
> +@end itemize
> +
> +@smallexample
> +/* Disable lock elision for mylock */
> +pthread_mutex_t mylock = PTHREAD_MUTEX_INIT_NP(PTHREAD_MUTEX_TIMED_NP|PTHREAD_MUTEX_ELISION_NP);
s/PTHREAD_MUTEX_ELISION_NP/PTHREAD_MUTEX_NO_ELISION_NP/g.
> +@end smallexample
> +
> +The lock type can also be set at runtime using @code{pthread_mutexattr_settype} and @code{pthread_mutex_init}.
> +
> +@smallexample
> +/* Force lock elision for a dynamically allocated mutex */
> +pthread_mutexattr_t attr;
> +pthread_mutexattr_init (&attr);
> +pthread_mutexattr_settype (&attr, PTHREAD_MUTEX_TIMED_NP|PTHREAD_MUTEX_ELISION_NP);
> +pthread_mutex_init (&object->mylock, &attr);
> +@end smallexample
> +
> +@code{pthread_mutex_gettype} will return additional flags too.
> +
> +@node Setting @code{pthread_mutex_t} elision using environment variables
> +@section Setting @code{pthread_mutex_t} elision using environment variables
> +The elision of @code{pthread_mutex_t} mutexes can be configured at runtime with the @code{GLIBC_MUTEX}
> +environment variable. This will force a specific lock type for all
> +mutexes in the program that do not have another type set explicitly.
> +An explicitly set lock type will override the environment variable.
> +
> +@smallexample
> +# run myprogram with no elision
> +GLIBC_MUTEX=none myprogram
> +@end smallexample
> +
> +The default depends on the C library build configuration and whether the hardware
s/the C library/@theglibc/g
> +supports lock elision.
> +
> +@itemize
> +@item
> +@code{GLIBC_MUTEX=elision}
> +Use elided mutexes, unless explicitely disabled in the program.
> +
> +@item
> +@code{GLIBC_MUTEX=none}
> +Don't use elide mutexes, unless explicitly enable in the program.
> +@end itemize
> +
> +In addition additional tunables can be configured through the environment variable,
s/In addition additional/Additional/g
> +like this:
> +@code{GLIBC_MUTEX=adaptive:retry_lock_busy=10,retry_lock_internal_abort=20}
> +Note these parameters do not consistitute an ABI and may change or disappear
s/consistitute/constitute/g
> +at any time as the lock elision algorithm evolves.
> +
> +Currently supported parameters are:
> +
> +@itemize
> +@item
> +retry_lock_busy
> +How often to not attempt a transaction when the lock is seen as busy.
Units?
> +
> +@item
> +retry_lock_internal_abort
> +How often to not attempt a transaction after an internal abort is seen.
Units?
> +
> +@item
> +retry_try_xbegin
> +How often to retry the transaction on external aborts.
Units?
> +
> +@item
> +retry_trylock_internal_abort
> +How often to retry the transaction on internal aborts during trylock.
> +This setting is also used for adaptive locks.
> +
Units?
> +@end itemize
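To make the syntax of the variable unambiguous, it may help to spell out how a string such as adaptive:retry_lock_busy=10,retry_lock_internal_abort=20 breaks down. The parser below is only an illustration of that syntax as I read it from the example above (a type, an optional colon, then comma-separated name=value pairs). It is not the parser from the patch, and all names and limits in it are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_PARAMS 8  /* hypothetical limit for this sketch */

struct elision_param {
    char name[48];
    long value;
};

struct elision_config {
    char type[16];    /* e.g. "elision", "none", "adaptive" */
    size_t nparams;
    struct elision_param params[MAX_PARAMS];
};

/* Parse "type[:name=value[,name=value...]]".
   Returns 0 on success, -1 on malformed input.  */
int parse_elision_setting(const char *s, struct elision_config *cfg)
{
    memset(cfg, 0, sizeof *cfg);  /* also NUL-terminates the strings */

    size_t type_len = strcspn(s, ":");
    if (type_len == 0 || type_len >= sizeof cfg->type)
        return -1;
    memcpy(cfg->type, s, type_len);
    s += type_len;
    if (*s == '\0')
        return 0;                 /* no tunables given */
    s++;                          /* skip ':' */

    while (*s != '\0') {
        if (cfg->nparams == MAX_PARAMS)
            return -1;
        struct elision_param *p = &cfg->params[cfg->nparams];

        size_t name_len = strcspn(s, "=,");
        if (name_len == 0 || name_len >= sizeof p->name
            || s[name_len] != '=')
            return -1;
        memcpy(p->name, s, name_len);
        s += name_len + 1;        /* skip name and '=' */

        char *end;
        p->value = strtol(s, &end, 10);
        if (end == s)
            return -1;            /* no digits after '=' */
        cfg->nparams++;

        s = end;
        if (*s == ',')
            s++;
        else if (*s != '\0')
            return -1;
    }
    return 0;
}
```

Since the text says the parameters are not an ABI, a real parser would presumably also need a policy for unknown names; this sketch simply stores whatever it is given.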
> +
> +@node Setting elision for individual @code{pthread_rwlock_t}
> +@section Setting elision for individual @code{pthread_rwlock_t}
> +
> +Elision can be explicitly disabled or enabled for each @code{pthread_rwlock_t} in the program.
> +This overrides any other defaults set by environment variables for this lock.
> +
> +Valid flags are @code{PTHREAD_RWLOCK_ELISION_NP} to force elision and @code{PTHREAD_RWLOCK_NO_ELISION_NP}
> +to disable elision. These can be ored with other rwlock types.
> +
> +@smallexample
> +/* Force no lock elision for a dynamically allocated rwlock */
> +pthread_rwlockattr_t rwattr;
> +pthread_rwlockattr_init (&rwattr);
> +pthread_rwlockattr_settype (&rwattr, PTHREAD_RWLOCK_NO_ELISION_NP);
> +pthread_rwlock_init (&object->myrwlock, &rwattr);
> +@end smallexample
> +
> +@node Setting @code{pthread_rwlock_t} elision using environment variables
> +@section Setting @code{pthread_rwlock_t} elision using environment variables
> +The elision of @code{pthread_rwlock_t} rwlockes can be configured at
> +runtime with the @code{GLIBC_RWLOCK} environment variable.
> +This will force a specific lock type for all
> +rwlockes in the program that do not have another type set explicitly.
> +An explicitly set lock type will override the environment variable.
> +
> +@smallexample
> +# run myprogram with no elision
> +GLIBC_RWLOCK=none myprogram
> +@end smallexample
> +
> +The default depends on the C library build configuration and whether the hardware
s/the C library/@theglibc/g
> +supports lock elision.
> +
> +@itemize
> +@item
> +@code{GLIBC_RWLOCK=elision}
> +Use elided rwlockes, unless explicitely disabled in the program.
s/rwlockes/rwlocks/g.
s/explicitely/explicitly/g.
> +
> +@item
> +@code{GLIBC_RWLOCK=none}
> +Don't use elided rwlocks, unless explicitely enabled in the program.
s/explicitely/explicitly/g.
> +@end itemize
> diff --git a/manual/intro.texi b/manual/intro.texi
> index deaf089..3af44c6 100644
> --- a/manual/intro.texi
> +++ b/manual/intro.texi
> @@ -703,6 +703,9 @@ information about the hardware and software configuration your program
> is executing under.
>
> @item
> +@ref{Lock elision} describes elided locks in pthreads.
s/pthreads/POSIX threads/g.
> +
> +@item
> @ref{System Configuration}, tells you how you can get information about
> various operating system limits. Most of these parameters are provided for
> compatibility with POSIX.
> diff --git a/manual/lang.texi b/manual/lang.texi
> index ee04e23..72e06b0 100644
> --- a/manual/lang.texi
> +++ b/manual/lang.texi
> @@ -1,6 +1,6 @@
> @c This node must have no pointers.
> @node Language Features
> -@c @node Language Features, Library Summary, , Top
> +@c @node Language Features, Library Summary, Lock elision, Top
> @c %MENU% C language features provided by the library
> @appendix C Language Facilities in the Library
>
> diff --git a/manual/threads.texi b/manual/threads.texi
> index 9a1df1a..06afdd3 100644
> --- a/manual/threads.texi
> +++ b/manual/threads.texi
> @@ -1,5 +1,5 @@
> @node POSIX Threads
> -@c @node POSIX Threads, , Cryptographic Functions, Top
> +@c @node POSIX Threads, Lock elision, Cryptographic Functions, Top
> @chapter POSIX Threads
> @c %MENU% POSIX Threads
> @cindex pthreads
>
OK with those changes.
Please repost after I review the other patches.
Cheers,
Carlos.