This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] Remove atomic operations from malloc.c


On Wed, 2015-02-11 at 11:47 -0200, Adhemerval Zanella wrote:
> On 11-02-2015 11:29, Leonhard Holz wrote:
> >> I did look into the changes themselves, but at least for powerpc (POWER8/16c/128T)
> >> I am not seeing improvements with the patch.  In fact it seems to increase
> >> contention:
> >>
> >>          time per iteration
> >> nths      master      patch
> >> 1         51.422     75.046
> >> 8         53.077     78.507
> >> 16        57.430     89.385
> >> 32        71.206    108.359
> >> 64       114.370    172.115
> >> 128      251.731    330.924
> >>
> >
> > Thank you for testing! Maybe the cost of a mutex_lock is higher on PowerPC than on i686? Anyway it looks like I have to take a different approach...

I don't think it's just that, but it could be part of it.  When you use a
futex-based lock such as our lowlevellock, lock release needs an atomic
RMW operation as well (to find out whether there is any waiter).  That's
something that the (broken) list removal code doesn't need.
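
To illustrate, here is a minimal sketch of such a release path, assuming a
Drepper-style 0 (unlocked) / 1 (locked) / 2 (locked, possible waiters) futex
encoding.  The names and encoding are illustrative only, not glibc's actual
lowlevellock code:

#include <stdatomic.h>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

/* 0 = unlocked, 1 = locked, 2 = locked with possible waiters.  */
static void
lock_release (atomic_int *futex)
{
  /* The release must be an atomic RMW: we need the old value to know
     whether anyone might be blocked and has to be woken.  */
  if (atomic_exchange_explicit (futex, 0, memory_order_release) == 2)
    syscall (SYS_futex, futex, FUTEX_WAKE, 1, NULL, NULL, 0);
}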

> PowerPC now uses the default implementation at sysdeps/nptl/lowlevellock.h,
> which basically translates to an acquire CAS followed by a futex operation in
> the contended case.  So I think the gain for powerpc (especially with high SMT)
> is that busy-waiting like a spinlock yields better performance than possibly
> issuing futex operations.
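
Continuing the illustrative sketch above (same includes and 0/1/2 encoding,
again not the actual lowlevellock code), that acquire path roughly looks like
an acquire CAS on the fast path plus a futex wait once the lock is observed to
be contended:

static void
lock_acquire (atomic_int *futex)
{
  int expected = 0;
  /* Fast path: uncontended acquire CAS from 0 (unlocked) to 1 (locked).  */
  if (atomic_compare_exchange_strong_explicit (futex, &expected, 1,
                                               memory_order_acquire,
                                               memory_order_relaxed))
    return;

  /* Contended path: mark the lock as having waiters (2) and block in the
     kernel until the owner releases it and wakes us.  */
  while (atomic_exchange_explicit (futex, 2, memory_order_acquire) != 0)
    syscall (SYS_futex, futex, FUTEX_WAIT, 2, NULL, NULL, 0);
}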

Using spin-waiting should help, but I would be cautious about relying on
spin-waiting alone.
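
One way to combine the two, sketched under the same illustrative encoding as
above: spin on the CAS a bounded number of times before falling back to
futex_wait, so uncontended and briefly contended acquisitions stay in user
space while long waits still block.  MAX_SPINS is a made-up tuning knob here,
not anything glibc defines:

#define MAX_SPINS 100  /* Illustrative tuning parameter.  */

static void
lock_acquire_spinning (atomic_int *futex)
{
  for (int i = 0; i < MAX_SPINS; i++)
    {
      int expected = 0;
      if (atomic_compare_exchange_weak_explicit (futex, &expected, 1,
                                                 memory_order_acquire,
                                                 memory_order_relaxed))
        return;
      /* A pause/priority hint for SMT siblings could go here.  */
    }

  /* Give up spinning and block as before.  */
  while (atomic_exchange_explicit (futex, 2, memory_order_acquire) != 0)
    syscall (SYS_futex, futex, FUTEX_WAIT, 2, NULL, NULL, 0);
}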


