This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Re: [PATCH] benchtests: Add malloc microbenchmark

From: Steven Munroe <munroesj at linux dot vnet dot ibm dot com>
To: Rich Felker <dalias at aerifal dot cx>, Will Newton <will dot newton at linaro dot org>
Cc: libc-alpha <libc-alpha at sourceware dot org>
Date: Tue, 15 Apr 2014 14:09:12 -0500
Subject: Re: [PATCH] benchtests: Add malloc microbenchmark
Authentication-results: sourceware.org; auth=none
References: <1397568941-4298-1-git-send-email-will dot newton at linaro dot org> <1397576171 dot 12247 dot 7 dot camel at spokane1 dot rchland dot ibm dot com> <CANu=Dmji4SC2C2U4ps9Ci1LQq7gZ1GMz7BZXZ6n+zygMH8g78g at mail dot gmail dot com> <20140415162746 dot GY26358 at brightrain dot aerifal dot cx>
Reply-to: munroesj at us dot ibm dot com

On Tue, 2014-04-15 at 12:27 -0400, Rich Felker wrote:
> On Tue, Apr 15, 2014 at 04:42:25PM +0100, Will Newton wrote:
> > On 15 April 2014 16:36, Steven Munroe <munroesj@linux.vnet.ibm.com> wrote:
> > > On Tue, 2014-04-15 at 14:35 +0100, Will Newton wrote:
> > >> Add a microbenchmark for measuring malloc and free performance. The
> > >> benchmark allocates and frees buffers of random sizes in a random
> > >> order and measures the overall execution time and RSS. Variants of the
> > >> benchmark are run with 8, 32 and 64 threads to measure the effect of
> > >> concurrency on allocator performance.
> > >>
> > >> The random block sizes used follow an inverse square distribution
> > >> which is intended to mimic the behaviour of real applications which
> > >> tend to allocate many more small blocks than large ones.
> > >>
> > >
> > > This test is more likely to measure the locking overhead of random than
> > > it is to measure malloc performance.
> > 
> > It uses rand_r so I don't think this is the case.
> 
> If you're using rand_r, you need to be careful how you use the output,
> as glibc's rand_r implementation has very poor statistical properties.
> See:
> 
> http://sourceware.org/bugzilla/show_bug.cgi?id=15615
> 
> snip
> 
> > The benchmark code spends roughly 80% of its time within malloc/free
> > and friends, which is good, but does leave some room for improvement.
> > Around 10% of the time is spent in dealing with random number
> > generation so maybe a simple inline random number generator would
> > improve things.
> 
I personally strive for 95-99% time in the software-under-test (SUT).
This is much harder than it looks but can and should be done.

The other issue to look out for is gettimeofday/clock_gettime overhead.
You need to run the SUT long enough that the cost of reading and
converting the clock is negligible relative to the measured interval.
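
To make the clock-overhead point concrete, here is a minimal sketch and
not code from the patch. It estimates the cost of one clock_gettime
read, then doubles the iteration count until the timed region is at
least a chosen minimum length. The names now_ns, clock_overhead_ns, sut
and time_sut, the 64-byte allocation, and the starting count of 1000 are
all assumptions made for illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Read the monotonic clock and return it as nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

/* Estimate the cost of one clock read by averaging back-to-back reads. */
static double clock_overhead_ns(unsigned reads)
{
    uint64_t start = now_ns();
    for (unsigned i = 0; i < reads; i++)
        (void)now_ns();
    return (double)(now_ns() - start) / reads;
}

/* Hypothetical software under test: one malloc/free pair per iteration.
   The volatile pointer keeps the compiler from eliding the pair. */
static void sut(unsigned long iters)
{
    for (unsigned long i = 0; i < iters; i++) {
        void *volatile p = malloc(64);
        free(p);
    }
}

/* Double the iteration count until the timed region lasts at least
   min_ns, then return the average nanoseconds per iteration.  The
   iteration cap is an arbitrary safety limit. */
static double time_sut(uint64_t min_ns, unsigned long *iters_out)
{
    unsigned long iters = 1000;
    for (;;) {
        uint64_t start = now_ns();
        sut(iters);
        uint64_t elapsed = now_ns() - start;
        if (elapsed >= min_ns || iters >= (1UL << 30)) {
            *iters_out = iters;
            return (double)elapsed / iters;
        }
        iters *= 2;
    }
}
```

A caller would compare clock_overhead_ns() against the min_ns it passes
to time_sut() and pick min_ns several orders of magnitude larger, so the
two clock reads around the loop disappear into the noise.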

> What about just pregenerating a large array of random numbers and
> accessing sequential slots of the array? This potentially has cache
> issues but it might be possible to simply use a small array and wrap
> back to the beginning, perhaps performing a trivial operation like
> adding the last output of the previous run onto the value in the
> array.
> 
This is generally a better design for a micro-benchmark.
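
A rough sketch of one reading of that suggestion follows. It is not
taken from the patch. The pool is filled once outside the timed region,
and each wrap-around adds the last output of the previous pass to every
value handed out. The names rand_pool, rand_pool_init and
rand_pool_next, the pool size of 1024, and the use of xorshift32 as a
stand-in generator are all assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pool size, small enough to stay resident in cache. */
#define POOL_SLOTS 1024u

struct rand_pool {
    uint32_t slot[POOL_SLOTS]; /* pregenerated values, read-only after init */
    size_t next;               /* index of the next slot to hand out */
    uint32_t carry;            /* last output of the previous pass */
};

/* Fill the pool once, before timing starts.  xorshift32 is only a
   stand-in; any generator with acceptable statistics would do. */
static void rand_pool_init(struct rand_pool *p, uint32_t seed)
{
    uint32_t x = seed ? seed : 1u; /* xorshift state must be nonzero */
    for (size_t i = 0; i < POOL_SLOTS; i++) {
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
        p->slot[i] = x;
    }
    p->next = 0;
    p->carry = 0;
}

/* Hand out the next value: one load, one add, one compare.  On
   wrap-around, remember the last output so the next pass differs. */
static inline uint32_t rand_pool_next(struct rand_pool *p)
{
    uint32_t v = p->slot[p->next] + p->carry;
    if (++p->next == POOL_SLOTS) {
        p->next = 0;
        p->carry = v;
    }
    return v;
}
```

Giving each benchmark thread its own struct rand_pool keeps the hot
path free of locks and of calls into the library's random number code,
at the price of a weaker sequence than a real generator would produce.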

References:
- [PATCH] benchtests: Add malloc microbenchmark
  - From: Will Newton
- Re: [PATCH] benchtests: Add malloc microbenchmark
  - From: Steven Munroe
- Re: [PATCH] benchtests: Add malloc microbenchmark
  - From: Will Newton
- Re: [PATCH] benchtests: Add malloc microbenchmark
  - From: Rich Felker
