This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 1/6] Optimize number of accesses to *adapt_count in lock elision code.


On Mon, 2013-09-02 at 09:58 +0200, Dominik Vogt wrote:
> diff --git a/nptl/sysdeps/unix/sysv/linux/x86/elision-lock.c
> b/nptl/sysdeps/unix/sysv/linux/x86/elision-lock.c
> index 9d00983..45370ed 100644
> --- a/nptl/sysdeps/unix/sysv/linux/x86/elision-lock.c
> +++ b/nptl/sysdeps/unix/sysv/linux/x86/elision-lock.c
> @@ -44,7 +44,7 @@
>  int
>  __lll_lock_elision (int *futex, short *adapt_count, EXTRAARG int
> private)
>  {
> -  if (*adapt_count <= 0)
> +  if (*adapt_count == 0)

This is incorrect, adapt_count can become less than zero.  This check
here is not atomic with the increment further down, so you can have any
number of threads pending to decrease adapt_count.

If you check for == 0, we can easily get into cases where we have to
have 2^16 minus a few non-elision acquisitions before we can reach 0
again.

>      {
>        unsigned status;
>        int try_xbegin;
> @@ -71,13 +71,13 @@ __lll_lock_elision (int *futex, short
> *adapt_count, EXTRAARG int private)
>                 {
>                   /* Right now we skip here.  Better would be to wait
> a bit
>                      and retry.  This likely needs some spinning.  */
> -                 if (*adapt_count != aconf.skip_lock_busy)
> +                 if (aconf.skip_lock_busy > 0)
>                     *adapt_count = aconf.skip_lock_busy;

I believe the intent here is to avoid writes to the lock if they aren't
necessary.  From a correctness POV, setting adapt_count to some positive
value would always be correct here (and is always possible in the
current code anyway, because the check and the store aren't atomic).
Thus, the check is and must be a performance optimization, which the
sparse comment in the chunk below seems to indicate:

>                 }
>               /* Internal abort.  There is no chance for retry.
>                  Use the normal locking and next time use lock.
>                  Be careful to avoid writing to the lock.  */
> -             else if (*adapt_count != aconf.skip_lock_internal_abort)
> +             else if (aconf.skip_lock_internal_abort > 0)
>                 *adapt_count = aconf.skip_lock_internal_abort;
>               break;
>             }

(Andi: Obviously, this comment wasn't verbose enough.  Perhaps that's an
additional tiny bit helping to convince you that writing useful comments
is worthwhile...)

Dominik, if you are interested in optimizing the adaptation mechanism, I
suggest to start with a coarse description / design first, documenting
the performance-relevant effects we need to avoid:
* Bad transitions like if we had the == 0 check (see above).
* Cases where modifying adapt_count can lead to further aborts, and how
to avoid this.
* Things we haven't considered yet, and don't measure yet, but which
might happen in practice (eg, forms of convoying, ...)

All of this would be a good enhancement to the HLE guidelines wiki page.
I'm suggesting to describe the issues/design first because my gut
feeling is that we don't really understand this yet, and that just
trying to fiddle about the adaptation mechanism, this would be just code
churn but no clear path to handling this any better.  Of course, if you
can provide code changes that can be demonstrated to perform better, all
the better.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]