This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Re: [PATCH v2] add concurrent malloc benchmark

From: Leonhard Holz <leonhard dot holz at web dot de>
To: Siddhesh Poyarekar <siddhesh at redhat dot com>
Cc: libc-alpha at sourceware dot org
Date: Thu, 26 Mar 2015 14:42:45 +0100
Subject: Re: [PATCH v2] add concurrent malloc benchmark
Authentication-results: sourceware.org; auth=none
References: <54FD6FAE dot 8050204 at web dot de> <20150325070034 dot GD5023 at spoyarek dot pnq dot redhat dot com>

On 25.03.2015 at 08:00, Siddhesh Poyarekar wrote:

On Mon, Mar 09, 2015 at 11:02:22AM +0100, Leonhard Holz wrote:

The current malloc benchtest does use multiple threads, but they work on
independent block lists so that every thread uses its own arena only. The
following patch adds a version of the benchmark with all threads working on
the same block list, so that it is likely that a thread free()s a block in
an arena of a different thread and thereby causes lock contention on that
arena. Therefore the performance of the malloc locking mechanism is included
in the measurement.

Unfortunately the access to an entry in the shared block list has to be
protected by a mutex. The time taken for acquiring the mutex is included in
the measured time per iteration, so that the iteration time of the
concurrent malloc benchmark is not directly comparable to the time of the
per-thread benchmark.


It might be a better idea to simulate producers and consumers for such
a benchmark.  Size variance is not an issue in this case since we're
not measuring overhead due to consolidation.  A simple model would be
to have one thread malloc blocks and pass it on to another thread,
which then frees the block.  You could scale this to a number of
producers and consumers, either working in pairs or on a couple of
common pools.
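
The simple model described above (one thread mallocs blocks and hands them
to another thread, which frees them) could be sketched roughly as follows.
This is only an illustration, not code from the patch or from the glibc
benchtests: the names mailbox, producer, consumer and run_pair are
hypothetical, the hand-over uses a one-slot mailbox guarded by a mutex and
a condition variable, and no timing is done.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical one-slot mailbox; not part of the glibc benchtests.  */
struct mailbox
{
  pthread_mutex_t lock;
  pthread_cond_t changed;
  void *block;        /* NULL while the slot is empty.  */
  int done;           /* Set once the producer has finished.  */
  size_t count;       /* Number of blocks to hand over.  */
  size_t block_size;  /* Size of each malloc request.  */
  size_t freed;       /* Blocks freed by the consumer.  */
};

static void *
producer (void *arg)
{
  struct mailbox *mb = arg;
  for (size_t i = 0; i < mb->count; i++)
    {
      void *p = malloc (mb->block_size);  /* Allocated in this thread.  */
      assert (p != NULL);
      pthread_mutex_lock (&mb->lock);
      while (mb->block != NULL)           /* Wait until the slot is empty.  */
        pthread_cond_wait (&mb->changed, &mb->lock);
      mb->block = p;
      pthread_cond_broadcast (&mb->changed);
      pthread_mutex_unlock (&mb->lock);
    }
  pthread_mutex_lock (&mb->lock);
  mb->done = 1;
  pthread_cond_broadcast (&mb->changed);
  pthread_mutex_unlock (&mb->lock);
  return NULL;
}

static void *
consumer (void *arg)
{
  struct mailbox *mb = arg;
  for (;;)
    {
      pthread_mutex_lock (&mb->lock);
      while (mb->block == NULL && !mb->done)
        pthread_cond_wait (&mb->changed, &mb->lock);
      void *p = mb->block;
      mb->block = NULL;
      pthread_cond_broadcast (&mb->changed);
      pthread_mutex_unlock (&mb->lock);
      if (p == NULL)
        break;                            /* Producer done, slot empty.  */
      free (p);                           /* Freed in the other thread.  */
      mb->freed++;
    }
  return NULL;
}

/* Run one producer/consumer pair; returns the number of blocks freed.  */
size_t
run_pair (size_t count, size_t block_size)
{
  struct mailbox mb = { .count = count, .block_size = block_size };
  pthread_t prod, cons;
  pthread_mutex_init (&mb.lock, NULL);
  pthread_cond_init (&mb.changed, NULL);
  int rc = pthread_create (&prod, NULL, producer, &mb);
  assert (rc == 0);
  rc = pthread_create (&cons, NULL, consumer, &mb);
  assert (rc == 0);
  (void) rc;
  pthread_join (prod, NULL);
  pthread_join (cons, NULL);
  pthread_cond_destroy (&mb.changed);
  pthread_mutex_destroy (&mb.lock);
  return mb.freed;
}
```

A call such as run_pair (1000, 64) hands 1000 blocks of 64 bytes from one
thread to the other. Note that the mailbox itself needs a lock, so the
hand-over cost would still be part of any time measured around it.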

Hmm.. the malloc/free code has different paths depending on the requested
block size, including different locking procedures, so I would like to keep
some kind of size variance. Also, the current implementation is not that
different from your proposed scheme: all threads act as producers and
consumers that are coupled via the round-robin block array, which acts as a
sort of queue. And a queue to connect the threads is needed anyway (?),
either with extra data structures (which implies malloc, which implies
locking) or with explicit locking as implemented.
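
To make the coupling via the round-robin block array more concrete, here is
a minimal sketch of the idea. It is not the code from the patch: SLOTS,
MAX_THREADS, pick_size, worker and run_shared are names made up for this
illustration, the request sizes are arbitrary, and timing is left out. Each
slot has its own mutex, and a thread that visits a slot frees whatever block
is stored there (possibly allocated by another thread) before storing a
fresh one.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define SLOTS 64        /* Hypothetical size of the shared block array.  */
#define MAX_THREADS 16  /* Hypothetical upper bound on worker threads.  */

struct slot
{
  pthread_mutex_t lock;  /* Protects this entry only.  */
  void *block;
};

static struct slot slots[SLOTS];

struct worker_args
{
  size_t start;  /* First slot index this thread visits.  */
  size_t iters;  /* Number of free/malloc iterations.  */
};

/* Vary the request size so that small and large requests are both
   exercised; the values are arbitrary assumptions.  */
static size_t
pick_size (size_t i)
{
  static const size_t sizes[] = { 16, 256, 4096, 256 * 1024 };
  return sizes[i % (sizeof sizes / sizeof sizes[0])];
}

static void *
worker (void *arg)
{
  struct worker_args *w = arg;
  for (size_t i = 0; i < w->iters; i++)
    {
      struct slot *s = &slots[(w->start + i) % SLOTS];
      pthread_mutex_lock (&s->lock);
      free (s->block);  /* May have been allocated by another thread.  */
      s->block = malloc (pick_size (w->start + i));
      pthread_mutex_unlock (&s->lock);
    }
  return NULL;
}

/* Run NTHREADS workers over the shared array and return the number of
   blocks that were still live (and are freed here) at the end.  */
size_t
run_shared (size_t nthreads, size_t iters)
{
  pthread_t tid[MAX_THREADS];
  struct worker_args w[MAX_THREADS];
  assert (nthreads >= 1 && nthreads <= MAX_THREADS);

  for (size_t i = 0; i < SLOTS; i++)
    {
      pthread_mutex_init (&slots[i].lock, NULL);
      slots[i].block = NULL;
    }
  for (size_t t = 0; t < nthreads; t++)
    {
      /* Spread the starting points so threads meet on the same slots.  */
      w[t].start = t * (SLOTS / nthreads);
      w[t].iters = iters;
      int rc = pthread_create (&tid[t], NULL, worker, &w[t]);
      assert (rc == 0);
      (void) rc;
    }
  for (size_t t = 0; t < nthreads; t++)
    pthread_join (tid[t], NULL);

  size_t live = 0;
  for (size_t i = 0; i < SLOTS; i++)
    {
      if (slots[i].block != NULL)
        {
          free (slots[i].block);
          live++;
        }
      pthread_mutex_destroy (&slots[i].lock);
    }
  return live;
}
```

With run_shared (4, 1000) every slot is visited many times by several
threads, so most free () calls hit a block that a different thread
allocated. The per-slot mutex is taken on every iteration, which is why its
cost ends up inside any per-iteration time measured around the loop.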

So I wouldn't say that the proposed version cannot be improved, but please
be a bit more specific about what should be achieved and how the proposed
approach achieves this.


Leonhard

Follow-Ups:
- Re: [PATCH v2] add concurrent malloc benchmark
  - From: Siddhesh Poyarekar

References:
- [PATCH v2] add concurrent malloc benchmark
  - From: Leonhard Holz
- Re: [PATCH v2] add concurrent malloc benchmark
  - From: Siddhesh Poyarekar
