This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: malloc - cache locality - TLB
- From: Ondřej Bílka <neleai at seznam dot cz>
- To: Torvald Riegel <triegel at redhat dot com>
- Cc: Carlos O'Donell <carlos at redhat dot com>, GNU C Library <libc-alpha at sourceware dot org>, Roland McGrath <roland at hack dot frob dot com>, Andreas Jaeger <aj at suse dot com>, "Joseph S. Myers" <joseph at codesourcery dot com>, Andreas Schwab <schwab at suse dot de>, Siddhesh Poyarekar <siddhesh at redhat dot com>
- Date: Fri, 20 Dec 2013 17:09:15 +0100
- Subject: Re: malloc - cache locality - TLB
- Authentication-results: sourceware.org; auth=none
- References: <52A6A0DA dot 1080109 at redhat dot com> <1386688619 dot 23049 dot 3215 dot camel at triegel dot csb> <20131220022411 dot GA26981 at domone dot podge>
The TLB effects can mostly be interpreted as a cache locality problem
in which the cache lines are a page in size. The links that I gave
earlier also try to handle these effects.
On Fri, Dec 20, 2013 at 03:24:11AM +0100, Ondřej Bílka wrote:
>
> http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.43.6621&rep=rep1&type=pdf
> http://www.cs.technion.ac.il/~itai/Courses/Cache/pointers.pdf
> http://people.cs.umass.edu/~emery/pubs/p33-feng.pdf
>
A main idea there for minimizing TLB misses is to allocate chunks in
address order, which causes frequently allocated entries to reside
mostly at low addresses and eventually frees high addresses when they
are no longer needed. There is also a bitmap alternative where we
allocate from a single page while we can and then pick the page with
the smallest usage ratio (and one that is reasonably fresh, for cache
concerns).
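To make the address-order idea concrete, here is a minimal sketch. It is
not glibc code: the names page_map, alloc_lowest and free_slot, and the
choice of 64 equal-sized slots per page, are assumptions made only for
illustration. Each page has a bitmap, and an allocation always takes the
lowest free slot of the lowest-addressed page that still has room, so
live chunks drift toward low addresses.

```c
#include <stddef.h>
#include <stdint.h>

#define SLOTS_PER_PAGE 64   /* hypothetical: 64 equal-sized chunks per page */

struct page_map {
    uint64_t used;          /* bit i set => slot i of this page is allocated */
};

/* Take the lowest free slot in the lowest-addressed page that has room.
   Returns a global slot index, or -1 if every page is full. */
static long alloc_lowest(struct page_map *pages, size_t npages)
{
    for (size_t p = 0; p < npages; p++) {
        if (pages[p].used != UINT64_MAX) {
            /* ~used is nonzero here, so counting trailing zeros is defined
               (GCC/Clang builtin). */
            unsigned slot = (unsigned)__builtin_ctzll(~pages[p].used);
            pages[p].used |= (uint64_t)1 << slot;
            return (long)(p * SLOTS_PER_PAGE + slot);
        }
    }
    return -1;
}

/* Release a slot previously returned by alloc_lowest. */
static void free_slot(struct page_map *pages, long index)
{
    pages[index / SLOTS_PER_PAGE].used &=
        ~((uint64_t)1 << (index % SLOTS_PER_PAGE));
}
```

A freed low slot is reused before any higher one, so pages at high
addresses tend to empty out and can be returned to the system. The
usage-ratio variant would replace the linear scan with a search for the
page with the fewest set bits.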
However, technology has progressed somewhat, and the main problems are
now administrative rather than technological.
One prerequisite is that we introduce a per-application file where we
can store application-specific profile feedback.
A second requirement is using separate heaps for small and large
allocations.
As Linux has supported huge pages since 2003, we could try to use them.
One of the benefits of paging is that the system does not have to
allocate pages that are never used. However, when requests are smaller
than a page, a single write will occupy the whole page and the benefits
of virtual address space vanish.
If we placed these allocations into huge pages, we could considerably
improve TLB behavior, as small requests are the likeliest candidates
for hot entries.
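As a rough sketch of what backing a small-allocation heap with huge
pages could look like on Linux: the function name map_small_heap and the
fallback policy are assumptions for illustration, not a proposed glibc
interface. It uses mmap with MAP_HUGETLB (where the headers define it)
and falls back to normal pages when no huge pages are available.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Map len bytes for a hypothetical small-allocation heap. len should be
   a multiple of the huge page size (often 2 MiB on x86-64, but this is
   system dependent). *got_huge is set to 1 if huge pages back the
   mapping, 0 if we fell back to normal pages. Returns NULL on failure. */
static void *map_small_heap(size_t len, int *got_huge)
{
    void *p = MAP_FAILED;
#ifdef MAP_HUGETLB
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
#endif
    *got_huge = (p != MAP_FAILED);
    if (p == MAP_FAILED)
        /* No huge pages reserved (or not supported): use normal pages. */
        p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

The caller releases the region with munmap(p, len) using the same
length.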
A downside is that huge pages might be too big, so we could waste a
page while using only a small part of it, give pages to a process that
does not compute much and starve one that does, copy too much on fork,
etc.
These problems are solvable with profiling: into the file that I
mentioned earlier we could write statistics about the small heap size,
and allocate huge pages only when it is likely that we will use more
than a page, there was no fork, the process is CPU-intensive...
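The decision described above could be sketched roughly as follows.
Everything here is hypothetical: the heap_profile record, its fields,
and both thresholds are made-up stand-ins for whatever the
per-application profile file would actually store.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record read back from the per-application profile file. */
struct heap_profile {
    size_t peak_small_heap_bytes;  /* largest small-heap size seen */
    bool   forked;                 /* did earlier runs call fork? */
    double cpu_fraction;           /* CPU time / wall time, 0.0 to 1.0 */
};

/* Decide whether to back the small heap with huge pages.
   min_bytes and min_cpu are tunable assumptions, not measured values. */
static bool want_huge_pages(const struct heap_profile *prof,
                            size_t min_bytes, double min_cpu)
{
    if (prof->forked)
        return false;              /* avoid copying too much on fork */
    if (prof->peak_small_heap_bytes <= min_bytes)
        return false;              /* would waste most of the page */
    return prof->cpu_fraction >= min_cpu;  /* do not starve busy processes */
}
```

For example, want_huge_pages(&prof, 4096, 0.5) would ask for huge pages
only if earlier runs never forked, used more than one 4 KiB page of
small heap, and spent at least half their time on the CPU.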
Another hurdle is that on Linux the total number of available huge
pages in the system needs to be set by root, so we need to coordinate
with distributions about setting a default.
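For reference, a process can at least check how many huge pages the
administrator has configured. The sketch below assumes the usual Linux
location /proc/sys/vm/nr_hugepages; the helper name read_count is
hypothetical.

```c
#include <stdio.h>

/* Read a single non-negative integer from a file such as
   /proc/sys/vm/nr_hugepages. Returns -1 if the file cannot be
   opened or parsed. */
static long read_count(const char *path)
{
    FILE *f = fopen(path, "r");
    long n = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld", &n) != 1 || n < 0)
        n = -1;
    fclose(f);
    return n;
}
```

A result of 0 from read_count("/proc/sys/vm/nr_hugepages") means no
huge pages are reserved in the persistent pool, so an allocator would
have to fall back to normal pages unless the distribution ships a
different default.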