This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [patch] malloc per-thread cache ready for review
- From: DJ Delorie <dj at redhat dot com>
- To: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>
- Cc: libc-alpha at sourceware dot org
- Date: Thu, 02 Feb 2017 09:43:36 -0500
Wilco Dijkstra <Wilco.Dijkstra@arm.com> writes:
> however I don't see any improvement on SPEC.
The case that's sped up is when you are malloc'ing a block size that's
the same as something you've recently free'd, and relatively small (1k
or less for 64-bit). The cache is not that big[*], so if your app doesn't
follow that pattern you won't see much speedup.
I was kinda surprised that SPEC showed improvements; the only thing I
can think of is that on some types of CPUs the memory locality is
affected in beneficial ways, but I have no way of measuring that. Use of
the tcache means that memory free'd by a thread tends to get re-used by
the same thread.
[*] 64 buckets of 7 chunks each
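
To make the shape of such a cache concrete, here is a toy model in C. It is
only a sketch and not the code from the dj/malloc-tcache branch: the names
(toy_malloc, toy_free, toy_bin) are hypothetical, the 16-byte bucket step is
an assumption chosen so that 64 buckets cover requests up to 1k, and unlike
the real free() it needs the caller to pass the block size back in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_BINS  64   /* "64 buckets" from the footnote */
#define TOY_COUNT 7    /* "7 chunks each" from the footnote */
#define TOY_STEP  16   /* assumed bucket granularity: 64 * 16 = 1024 bytes */

/* A cached block is reused as a singly linked list node. */
typedef struct toy_entry { struct toy_entry *next; } toy_entry;

typedef struct {
    toy_entry *head[TOY_BINS];       /* one free list per bucket */
    unsigned char count[TOY_BINS];   /* blocks currently held per bucket */
} toy_tcache;

/* One cache per thread, so the fast path needs no locking. */
static _Thread_local toy_tcache toy_cache;

/* Map a request size to a bucket index; TOY_BINS means "too big to cache". */
size_t toy_bin(size_t size)
{
    if (size == 0)
        size = 1;
    size_t idx = (size - 1) / TOY_STEP;
    return idx < TOY_BINS ? idx : TOY_BINS;
}

void *toy_malloc(size_t size)
{
    size_t idx = toy_bin(size);
    if (idx < TOY_BINS && toy_cache.head[idx] != NULL) {
        /* Fast path: pop a block this thread freed recently. */
        toy_entry *e = toy_cache.head[idx];
        toy_cache.head[idx] = e->next;
        toy_cache.count[idx]--;
        return e;
    }
    /* Slow path: round up to the bucket size so that any block in a
       bucket can satisfy any later request that maps to that bucket. */
    return malloc(idx < TOY_BINS ? (idx + 1) * TOY_STEP : size);
}

void toy_free(void *p, size_t size)
{
    if (p == NULL)
        return;
    size_t idx = toy_bin(size);
    if (idx < TOY_BINS && toy_cache.count[idx] < TOY_COUNT) {
        /* Bucket has room: push the block instead of releasing it. */
        toy_entry *e = p;
        e->next = toy_cache.head[idx];
        toy_cache.head[idx] = e;
        toy_cache.count[idx]++;
        return;
    }
    free(p);   /* bucket full or block too large: hand it back */
}
```

In this model, a toy_free() followed by a toy_malloc() of a size in the same
bucket returns the same pointer on the same thread, which is the reuse
pattern described above. An eighth free into a full bucket, or anything over
1k, falls through to the ordinary allocator.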
> Or is there some other config trick needed to turn it on? (I see the
> new fast path in the binaries so it is built correctly).
You can turn it *off* with tunables (glibc.malloc.tcache_count = 0) to
compare performance with/without. It defaults to enabled in the
dj/malloc-tcache branch; configure with --disable-experimental-malloc
to remove it.
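
For a quick with/without comparison, a micro-benchmark along these lines
exercises the pattern the cache is meant for (small blocks of one size,
freed and then immediately requested again). This is a rough sketch:
bench_same_size is a hypothetical helper, the batch size of 4 is arbitrary,
and clock() measures CPU time only, so treat the numbers as indicative.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

#define BATCH 4   /* arbitrary; kept below the 7 chunks per bucket */

/* Repeatedly allocate and free BATCH blocks of the same size.
   Returns elapsed CPU seconds, or -1.0 if an allocation fails. */
double bench_same_size(size_t size, long rounds)
{
    void *blocks[BATCH];

    assert(size > 0 && rounds >= 0);
    clock_t start = clock();
    for (long r = 0; r < rounds; r++) {
        for (int i = 0; i < BATCH; i++) {
            blocks[i] = malloc(size);
            if (blocks[i] == NULL) {
                while (i-- > 0)          /* release what we already hold */
                    free(blocks[i]);
                return -1.0;
            }
            /* Touch the block so the compiler cannot drop the allocation. */
            *(volatile char *)blocks[i] = (char)r;
        }
        for (int i = 0; i < BATCH; i++)
            free(blocks[i]);
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Call it from a small main(), for example bench_same_size(64, 10000000), and
run the binary twice against the same libc: once as-is, and once with the
tunable set to zero. Assuming a glibc built with tunables support, that is
done through the environment, i.e. GLIBC_TUNABLES=glibc.malloc.tcache_count=0.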