This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [RFC] pthread_once: Use unified variant instead of custom x86_64/i386
- From: Rich Felker <dalias at libc dot org>
- To: libc-alpha at sourceware dot org
- Date: Sun, 19 Oct 2014 23:41:21 -0400
- Subject: Re: [RFC] pthread_once: Use unified variant instead of custom x86_64/i386
- References: <1381523328 dot 18547 dot 3422 dot camel at triegel dot csb> <1396878469 dot 10643 dot 8959 dot camel at triegel dot csb> <1413751577 dot 8483 dot 27 dot camel at triegel dot csb> <20141020013717 dot 08E272C3A86 at topped-with-meat dot com>
On Sun, Oct 19, 2014 at 06:37:16PM -0700, Roland McGrath wrote:
> > +static int
> > +__attribute__((noinline))
>
> Space before paren.
>
> > +__pthread_once_slow (pthread_once_t *once_control, void (*init_routine) (void))
>
> This needs a comment explaining why it's separate. I would have expected
> the compiler to generate essentially the same code for the early bailout
> case marked with __glibc_likely.

Sadly gcc never does that, as far as I know. This optimization
shouldn't even be dependent on whether the code path is "likely" or
not; compilers should "shrink-wrap" code paths that don't need heavy
stack frame setup whenever they can do so without pessimizing the
other code paths, but they don't.

The method in this patch is the standard fix that I've used in various
places before to "manually shrink-wrap", and it seems appropriate here.
I just tried the same approach in musl for pthread_once (already using
it for mutexes) and it doubled performance in the fast path, so I
think it's a good idea.

Rich
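
For readers unfamiliar with the "manual shrink-wrap" pattern discussed above, here is a minimal sketch in C. It is not the code from the patch, nor from musl: every name here (my_once_t, MY_ONCE_INIT, my_once, my_once_slow, count_init, init_calls) is hypothetical, and a plain mutex stands in for the futex-based waiting a real implementation would likely use. It also ignores cancellation and fork handling. The point is only the shape: the public function does one acquire load and a branch, and everything that needs a real stack frame lives in a separate noinline function.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical once-control type; NOT glibc's or musl's pthread_once_t.  */
typedef struct
{
  atomic_int done;       /* 0 = not initialized yet, 1 = initialized.  */
  pthread_mutex_t lock;  /* Taken only on the slow path.  */
} my_once_t;

#define MY_ONCE_INIT { 0, PTHREAD_MUTEX_INITIALIZER }

/* Slow path.  Kept out of line so that its frame setup and register
   saves are not paid by callers that hit the fast path.  */
static int
__attribute__ ((noinline))
my_once_slow (my_once_t *once, void (*init_routine) (void))
{
  pthread_mutex_lock (&once->lock);
  /* Re-check under the lock: another thread may have finished first.  */
  if (atomic_load_explicit (&once->done, memory_order_relaxed) == 0)
    {
      init_routine ();
      /* Release store pairs with the acquire load in my_once.  */
      atomic_store_explicit (&once->done, 1, memory_order_release);
    }
  pthread_mutex_unlock (&once->lock);
  return 0;
}

/* Fast path: a single acquire load and a predictable branch.  */
int
my_once (my_once_t *once, void (*init_routine) (void))
{
  if (__builtin_expect (atomic_load_explicit (&once->done,
                                              memory_order_acquire) != 0, 1))
    return 0;
  return my_once_slow (once, init_routine);
}

/* Demo helper (hypothetical): counts how often initialization runs.  */
static int init_calls;

static void
count_init (void)
{
  init_calls++;
}
```

Calling my_once twice on the same control object with count_init should run the initializer only once; the second call returns from the fast path without entering my_once_slow. Whether splitting the function this way actually helps depends on the compiler and target, so it is worth checking the generated assembly rather than assuming.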