This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: TSX lock elision for glibc v5
- From: Dominik Vogt <vogt at linux dot vnet dot ibm dot com>
- To: libc-alpha at sourceware dot org
- Date: Mon, 6 May 2013 10:29:49 +0200
- Subject: Re: TSX lock elision for glibc v5
- References: <1367537252-30831-1-git-send-email-andi at firstfloor dot org>
- Reply-to: vogt at linux dot vnet dot ibm dot com
On Thu, May 02, 2013 at 04:27:23PM -0700, Andi Kleen wrote:
> v5: (= rtm-devel7)
> Rebased to current master.
> Use GLIBC_* prefixes for environment variables.
> Merge environment scan with dynamic linker
> Fix CPUID id that broke earlier.
> Minor cleanups.
Maybe it's just me, but I find it very difficult to analyse the
changes between v4 and v5. For example, patch 4 looks totally
different from the one in v4, and the list above gives no clue as
to why that is.
Comments on the patches:
* Patches 2, 4 and 6 can and should each be split into a part that
adds the architecture independent infrastructure for lock
elision and a part that adds the architecture dependent
implementation for Intel. With the current patches it is more
difficult than necessary to identify the code that has to be
rewritten when porting lock elision to other platforms.
* The terms rtm and xtest appear in patch 7 in global code, but
they are really Intel specific. It is possible and desirable to
generalize the lock elision interfaces so that other CPUs can
plug in their own implementation.
* The elision-conf.[ch] code is mostly architecture independent
and can be moved into the common nptl code almost completely.
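To illustrate the kind of split meant in the comments above, here is a
minimal sketch of an architecture independent entry point that calls
into a per-architecture hook table. All names here (struct elision_ops,
elision_try_lock, elision_none, elision_sim) are hypothetical and do
not appear in the patch series, and the simulated backend only stands
in for real transactional instructions so the example can run anywhere.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-architecture hooks; these names are illustrative
   and are not taken from the patches.  */
struct elision_ops
{
  bool (*tx_begin) (void);   /* Try to start a transaction.  */
  void (*tx_end) (void);     /* Commit the running transaction.  */
  bool (*tx_active) (void);  /* Is a transaction running?  */
};

/* Fallback backend for CPUs without transactional memory.  */
static bool none_begin (void) { return false; }
static void none_end (void) { }
static bool none_active (void) { return false; }

static const struct elision_ops elision_none =
  { none_begin, none_end, none_active };

/* Simulated backend: pretends every transaction starts successfully.
   A real port would wrap the CPU's own instructions here.  */
static bool sim_in_tx;
static bool sim_begin (void) { sim_in_tx = true; return true; }
static void sim_end (void) { sim_in_tx = false; }
static bool sim_active (void) { return sim_in_tx; }

static const struct elision_ops elision_sim =
  { sim_begin, sim_end, sim_active };

/* Architecture independent part.  Returns true if the lock was
   elided, false if the caller has to take the real lock.  */
static bool
elision_try_lock (const struct elision_ops *ops, const int *lock_word)
{
  assert (ops != NULL && lock_word != NULL);
  if (!ops->tx_begin ())
    return false;
  if (*lock_word == 0)
    return true;     /* Lock is free: stay inside the transaction.  */
  ops->tx_end ();    /* Lock is busy: leave the transaction.  */
  return false;
}
```

With a layout along these lines, the common nptl code would only refer
to the hook table, and names such as rtm or xtest would stay inside the
Intel backend.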
Are there any performance measurements on these patches? One
observation I made on a different platform that was adapted to the
lock elision patches is that lock elision performs rather poorly
on locks that protect short-lived, read-only critical sections:
  for (i = 0; i < 10000000; i++)
    {
      pthread_mutex_lock(&m);
      x = shared_var;
      pthread_mutex_unlock(&m);
    }
In my tests this code snippet, run by just a single thread, takes
2.5 times as long with elision as with a glibc without the lock
elision patches, although there are no aborts at all. I assume
this is due to the longer code path and the execution time of the
instructions that begin and end transactions.
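For anyone who wants to repeat the comparison, a minimal sketch of a
timing wrapper around the loop could look like the following. The
function name time_lock_loop and the use of clock_gettime with
CLOCK_MONOTONIC are assumptions made for this example; this is not
necessarily how the numbers above were obtained.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <time.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static volatile int shared_var;

/* Run the lock/read/unlock loop ITERATIONS times in the calling
   thread and return the elapsed wall-clock time in seconds.  */
static double
time_lock_loop (long iterations)
{
  struct timespec start, end;
  volatile int x = 0;  /* volatile so the read is not optimized away */
  long i;

  clock_gettime (CLOCK_MONOTONIC, &start);
  for (i = 0; i < iterations; i++)
    {
      pthread_mutex_lock (&m);
      x = shared_var;
      pthread_mutex_unlock (&m);
    }
  clock_gettime (CLOCK_MONOTONIC, &end);
  (void) x;

  return (double) (end.tv_sec - start.tv_sec)
         + (double) (end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Calling time_lock_loop (10000000) once against each glibc build and
dividing the two results gives the slowdown factor for the
uncontended, single-threaded case.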
Ciao
Dominik ^_^ ^_^
--
Dominik Vogt
IBM Germany