This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Re: RFC: replace ptmalloc2

From: Will Newton <will dot newton at linaro dot org>
To: Eric Wong <normalperson at yhbt dot net>
Cc: Jörn Engel <joern at purestorage dot com>, Rich Felker <dalias at libc dot org>, Siddhesh Poyarekar <siddhesh dot poyarekar at gmail dot com>, GNU C Library <libc-alpha at sourceware dot org>
Date: Mon, 20 Oct 2014 15:17:56 +0100
Subject: Re: RFC: replace ptmalloc2
Authentication-results: sourceware.org; auth=none
References: <20141009215447 dot GD8583 at Sligo dot logfs dot org> <CAAHN_R0JDNQkx7oV0HS9Knv7nsPZiARLeFb4zpPa+rj7cNfECg at mail dot gmail dot com> <20141010010743 dot GA15146 at Sligo dot logfs dot org> <20141010012530 dot GX23797 at brightrain dot aerifal dot cx> <20141010013302 dot GC15146 at Sligo dot logfs dot org> <20141010020229 dot GY23797 at brightrain dot aerifal dot cx> <20141014233254 dot GA1860 at Sligo dot logfs dot org> <20141015040031 dot GR32028 at brightrain dot aerifal dot cx> <20141015045238 dot GA4528 at Sligo dot logfs dot org> <CANu=DmgNQm4A0ChTky+c4iSBwDjb5uAmsHd3H7MynQPjL3vecA at mail dot gmail dot com> <20141017090340 dot GA12253 at dcvr dot yhbt dot net>

On 17 October 2014 10:03, Eric Wong <normalperson@yhbt.net> wrote:
> Will Newton <will.newton@linaro.org> wrote:
>> I currently have some microbenchmarks and a script to run a few open
>> source applications inside a docker container and measure the
>> performance of the commonly used allocators. I could certainly use
>> more and more realistic workloads to add to it:
>>
>> https://git.linaro.org/toolchain/cortex-malloc.git
>
> Hi Will, great to see you have a producer-consumer test in there!  I
> noticed jemalloc seemed flawed with large remote frees in my own tests,
> too.
>
> Several months ago, I also started studying memory allocator
> implementations and working on some benchmarks and a dlmalloc-based one,
> but got discouraged and sidetracked by other things.
>
> * git://80x24.org/xtbench
>   - xthr.c is mine, also producer-consumer (uses URCU),
>   - t-test* from ptmalloc3
>   There's also ebizzy...
>
> * git://80x24.org/femalloc
>   - it is dlmalloc + wait-free queue (from URCU)
>     dlmalloc 2.8 has an API (mspace) for doing per-thread arenas.
>   - "fe" == "fool's errand", that's what working on a general-purpose
>     malloc feels like :x

Thanks for the links, it looks like an interesting approach. I'm not really
familiar with URCU so I guess I'll have to get my head around that
first.
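
For readers who have not seen the producer-consumer ("remote free")
pattern discussed above, here is a minimal sketch of it. It is not the
test from cortex-malloc or xthr.c: it swaps in a plain mutex and
condition variable queue instead of URCU's wait-free queue, and the
names run_remote_free, QCAP, struct queue and struct job are
hypothetical. One thread allocates every block and a different thread
frees every block, so each free() has to hand memory back to an arena
or cache owned by another thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define QCAP 64 /* hypothetical bound on blocks in flight */

struct queue {
    void *slot[QCAP];
    size_t head, tail, count;
    pthread_mutex_t mu;
    pthread_cond_t not_empty, not_full;
};

struct job {
    struct queue *q;
    size_t nblocks, size; /* read-only after the threads start */
    size_t freed;         /* written only by the consumer */
};

static void q_push(struct queue *q, void *p)
{
    pthread_mutex_lock(&q->mu);
    while (q->count == QCAP)
        pthread_cond_wait(&q->not_full, &q->mu);
    q->slot[q->tail] = p;
    q->tail = (q->tail + 1) % QCAP;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->mu);
}

static void *q_pop(struct queue *q)
{
    pthread_mutex_lock(&q->mu);
    while (q->count == 0)
        pthread_cond_wait(&q->not_empty, &q->mu);
    assert(q->count > 0 && q->count <= QCAP);
    void *p = q->slot[q->head];
    q->head = (q->head + 1) % QCAP;
    q->count--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->mu);
    return p;
}

/* Allocates in this thread; never frees. */
static void *producer(void *arg)
{
    struct job *j = arg;
    for (size_t i = 0; i < j->nblocks; i++) {
        void *p = malloc(j->size);
        if (p == NULL)
            abort();
        memset(p, 0xA5, j->size); /* touch the pages so they are really backed */
        q_push(j->q, p);
    }
    return NULL;
}

/* Frees blocks that were allocated by the producer thread. */
static void *consumer(void *arg)
{
    struct job *j = arg;
    for (size_t i = 0; i < j->nblocks; i++) {
        free(q_pop(j->q));
        j->freed++;
    }
    return NULL;
}

/* Runs one producer and one consumer; returns the number of remote frees. */
size_t run_remote_free(size_t nblocks, size_t size)
{
    struct queue q;
    struct job j = { &q, nblocks, size, 0 };
    pthread_t prod, cons;

    memset(&q, 0, sizeof(q));
    pthread_mutex_init(&q.mu, NULL);
    pthread_cond_init(&q.not_empty, NULL);
    pthread_cond_init(&q.not_full, NULL);

    if (pthread_create(&prod, NULL, producer, &j) != 0 ||
        pthread_create(&cons, NULL, consumer, &j) != 0)
        abort();
    pthread_join(prod, NULL);
    pthread_join(cons, NULL);

    pthread_cond_destroy(&q.not_full);
    pthread_cond_destroy(&q.not_empty);
    pthread_mutex_destroy(&q.mu);
    return j.freed;
}
```

Timing a call such as run_remote_free(100000, 1 << 20) under each
allocator (for example via LD_PRELOAD) would exercise the large
remote-free case. The mutex queue adds its own contention, so this
sketch is only useful for comparing allocators against each other.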

>> So far I would say that tcmalloc seems to have the best performance
>> but the highest space overhead.
>
> I found the locklessinc.com malloc is fast, too; but it uses lots of space
> and needs to be ported to non-x86.  I also like the use of a slab
> allocator in lockless for some small allocations and may do something
> similar if I continue with femalloc.  femalloc currently performs
> well for medium/larger allocations, but there are some glaring weaknesses
> I documented in the README[1].
>
>
> There's several other things I want to keep in mind for a malloc
> (but do not have good automated tests for, yet):
>
> * copy-on-write sharing on fork + swap behavior for large allocations.
>   The ptmalloc/dlmalloc layout seems bad for this because the boundary
>   tags end up touching extra pages, esp for bigger allocations.  femalloc
>   inherits this weakness, so that's part of the reason I've been working
>   on other things, instead.

I believe tcmalloc should behave better in this case but I do not have
a test for this scenario either.

> * Ability to take advantage of THP, but not inflate memory usage.
>   This is important for folks who run many threads w/o overcommit.
>   I think software like MySQL with 10s-100s of threads on a handful of
>   CPUs is here to stay.  Getting folks to remember knobs like
>   MALLOC_ARENA_MAX is annoying and tiring.

Do you have an idea in mind how malloc could integrate with THP?
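
For context, one mechanism Linux already exposes is
madvise(MADV_HUGEPAGE) on an anonymous mapping. The sketch below is
not a proposal from this thread: thp_region_alloc and HUGE_ALIGN are
hypothetical names, and the 2 MiB huge page size is an assumption that
holds on typical x86-64 configurations but not everywhere. It only
shows how an allocator could obtain a huge-page-aligned region and mark
it as a THP candidate.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS and MADV_HUGEPAGE on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define HUGE_ALIGN ((size_t)2 << 20) /* assumed huge page size: 2 MiB */

/*
 * Maps a region whose start and length are multiples of HUGE_ALIGN and
 * hints to the kernel that it may be backed by transparent huge pages.
 * Returns NULL on failure. The caller unmaps it with munmap() using the
 * length rounded up to HUGE_ALIGN.
 */
void *thp_region_alloc(size_t len)
{
    if (len == 0 || len > SIZE_MAX - 2 * HUGE_ALIGN)
        return NULL;

    /* Round the length up to a whole number of huge pages. */
    len = (len + HUGE_ALIGN - 1) & ~(HUGE_ALIGN - 1);

    /* Over-allocate by one huge page so an aligned start always fits. */
    size_t span = len + HUGE_ALIGN;
    char *raw = mmap(NULL, span, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED)
        return NULL;

    uintptr_t start = ((uintptr_t)raw + HUGE_ALIGN - 1)
                      & ~(uintptr_t)(HUGE_ALIGN - 1);
    size_t lead = start - (uintptr_t)raw;
    size_t trail = span - lead - len;
    assert(lead < HUGE_ALIGN);

    /* Trim the unaligned head and the unused tail. */
    if (lead != 0)
        munmap(raw, lead);
    if (trail != 0)
        munmap((char *)start + len, trail);

#ifdef MADV_HUGEPAGE
    /* Only a hint: it fails harmlessly if THP is not available. */
    (void)madvise((void *)start, len, MADV_HUGEPAGE);
#endif
    return (void *)start;
}
```

Rounding every region up to 2 MiB is exactly the kind of memory
inflation Eric is worried about, so a real allocator would presumably
apply this only to large, long-lived arenas and not to each thread's
small heap.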

-- 
Will Newton
Toolchain Working Group, Linaro

Follow-Ups:
- Re: RFC: replace ptmalloc2
  - From: Eric Wong

References:
- RFC: replace ptmalloc2
  - From: Jörn Engel
- Re: RFC: replace ptmalloc2
  - From: Jörn Engel
- Re: RFC: replace ptmalloc2
  - From: Rich Felker
- Re: RFC: replace ptmalloc2
  - From: Jörn Engel
- Re: RFC: replace ptmalloc2
  - From: Rich Felker
- Re: RFC: replace ptmalloc2
  - From: Jörn Engel
- Re: RFC: replace ptmalloc2
  - From: Rich Felker
- Re: RFC: replace ptmalloc2
  - From: Jörn Engel
- Re: RFC: replace ptmalloc2
  - From: Will Newton
- Re: RFC: replace ptmalloc2
  - From: Eric Wong
