This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Unify pthread_once (bug 15215)


On Thu, Apr 10, 2014 at 09:57:40PM +0200, Torvald Riegel wrote:
> On Mon, 2014-04-07 at 17:17 -0300, Adhemerval Zanella wrote:
> > On 28-03-2014 20:29, Carlos O'Donell wrote:
> > > David, Marcus, Joseph, Mike, Andreas, Steve, Chris,
> > >
> > > We would like to unify all C-based pthread_once implmentations
> > > per the plan in bug 15215 for glibc 2.20.
> > >
> > > Your machines are on the list of C-based pthread_once implementations.
> > >
> > > See this for the intial discussions on the unified pthread_once:
> > > https://sourceware.org/ml/libc-alpha/2013-05/msg00210.html
> > >
> > > The goal is to provide a single and correct C implementation of 
> > > pthread_once. Architectures can then build on that if they need more 
> > > optimal implementations, but I don't encourage that and I'd rather
> > > see deep discussions on how to make one unified solution where
> > > possible.
> > >
> > > I've also just reviewed Torvald's new pthread_once microbenchmark which
> > > you can use to compare your previous C implementation with the new
> > > standard C implementation (measures pthread_once latency). The primary
> > > use of this test is to help provide objective proof for or against the
> > > i386 and x86_64 assembly implementations.
> > >
> > > We are not presently converting any of the machines with custom
> > > implementations, but that will be a next step after testing with the
> > > help of the maintainers for sh, i386, x86_64, powerpc, s390 and alpha.
> > >
> > > If we don't hear any objections we will go forward with this change
> > > in one week and unify ia64, hppa, mips, tile, sparc, m68k, arm
> > > and aarch64 on a single pthread_once implementation based on sparc's C
> > > implementation.
> > >
> > > Any objections to this cleanup for 2.20?
> > >
> > I tested the patch and benchmark on PowerPC (POWER7) and the results looks good:
> > 
> > * Current code:
> > "duration": 5.08322e+09, "iterations": 2.2037e+08, "max": 244.863, "min": 22.08, "mean": 23.0668
> > "duration": 5.08316e+09, "iterations": 2.20479e+08, "max": 771.08, "min": 21.984, "mean": 23.0551
> > "duration": 5.08306e+09, "iterations": 2.21093e+08, "max": 53.966, "min": 22.052, "mean": 22.9906
> > "duration": 5.0833e+09, "iterations": 2.20062e+08, "max": 347.895, "min": 22.15, "mean": 23.0994
> > "duration": 5.08277e+09, "iterations": 2.20699e+08, "max": 632.479, "min": 21.997, "mean": 23.0303
> > 
> > * Optimization:
> > "duration": 4.97694e+09, "iterations": 8.42834e+08, "max": 134.181, "min": 5.405, "mean": 5.90501
> > "duration": 4.9758e+09, "iterations": 8.66952e+08, "max": 29.163, "min": 5.405, "mean": 5.73941
> > "duration": 4.9778e+09, "iterations": 8.51788e+08, "max": 40.819, "min": 5.405, "mean": 5.84394
> > "duration": 4.97413e+09, "iterations": 8.52432e+08, "max": 37.089, "min": 5.488, "mean": 5.83523
> > "duration": 4.97795e+09, "iterations": 8.43376e+08, "max": 163.703, "min": 5.405, "mean": 5.90241
> > 
> > I am running on a 18 cores machine, so I guess the 'max' is due a timing issue from os jitter.
> 
> There's no spinning in the algorithm currently (or most of glibc AFAIK,
> except the rather simplistic attempt in adaptive mutexes), so even small
> initialization routines may go straight to blocking via futexes.  (And
> AFAIK, futexes don't spin before actually trying to block.)
> 
In pthread_once context spinning is unlikely to help, you would need to
hit oppurtinity window when other thread does brief initialization and
this could happen only once each core.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]