This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: pthread wastes memory with mlockall(MCL_FUTURE)
- From: Balazs Kezes <rlblaster at gmail dot com>
- To: Rich Felker <dalias at libc dot org>
- Cc: libc-alpha at sourceware dot org
- Date: Fri, 18 Sep 2015 20:29:52 +0100
- Subject: Re: pthread wastes memory with mlockall(MCL_FUTURE)
- References: <20150918102734 dot GA27881 at eper> <20150918143824 dot GB17773 at brightrain dot aerifal dot cx> <20150918163842 dot GB27881 at eper> <20150918170853 dot GC17773 at brightrain dot aerifal dot cx>
On 2015-09-18 13:08 -0400, Rich Felker wrote:
> I'm talking about new PROT_NONE pages.
That's not how pthread does the allocation: it mmaps read/write first,
and then does a mprotect(..., ..., PROT_NONE).
> The kernel certainly accounts for them differently as commit charge.
> New PROT_NONE pages consume no commit charge. Anonymous pages with
> data in them, which would become available again if you mprotect them
> readable, do consume commit charge. (For this reason, you have to mmap
> MAP_FIXED+PROT_NONE to uncommit memory rather than just using mprotect
> PROT_NONE, even if you already used madvise MADV_DONTNEED on it.)
So while working on the repro I've looked deeper and created a simple
app which demonstrates the mmap behavior:
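// gcc -Wall -Wextra -std=c99 mapping.c -o mapping
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    int r;

    // Lock current pages; also lock future mappings if envvar M is set.
    r = mlockall(MCL_CURRENT | (getenv("M") ? MCL_FUTURE : 0));
    assert(r == 0);

    // Map 8 GiB of anonymous memory without touching it.
    int flags = MAP_PRIVATE | MAP_ANONYMOUS;
    void *mem = mmap(NULL, 8LL << 30, PROT_WRITE, flags, -1, 0);
    assert(mem != MAP_FAILED);  // mmap reports failure as MAP_FAILED, not NULL

    // Keep the process alive so its RSS can be inspected.
    sleep(100);
    return 0;
}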
All it does is mmap some memory, and if the envvar M is set it also
does the mlocking part. When I run this application without mlocking,
it uses barely any RSS. However, when I set M, htop shows an RSS of
8 GB and "grep MemAvailable /proc/meminfo" shows 8 GB less memory.
And when I look at the number of minor page faults, I get this:
$ /usr/bin/time -f %R ./mapping
102
$ M=1 /usr/bin/time -f %R ./mapping
4709
So I think the kernel preallocates all the memory in this case.
However if I set the protection to PROT_NONE then the kernel doesn't do
the preallocation.
Interestingly, it does *not* preallocate even if I mmap with PROT_NONE
first and then do an mprotect(mem, 8LL<<30, PROT_WRITE). I do see the
page faults if I do a memset(mem, 0, 8LL<<30) afterwards, though.
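
A minimal sketch of the PROT_NONE-then-mprotect variant described above.
The function name map_none_then_enable is made up for illustration, and
this is not the exact code used for the measurements. It uses
PROT_READ | PROT_WRITE rather than plain PROT_WRITE so that reading the
memory back is portable. The caller is assumed to have already called
mlockall() as in the program above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: reserve `len` bytes with PROT_NONE, then make
 * them readable and writable with mprotect(). Returns NULL on failure.
 */
void *map_none_then_enable(size_t len)
{
    assert(len > 0);

    /* Step 1: reserve address space only; no access is allowed yet. */
    void *mem = mmap(NULL, len, PROT_NONE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return NULL;

    /* Step 2: open the range up for reading and writing. */
    if (mprotect(mem, len, PROT_READ | PROT_WRITE) != 0) {
        munmap(mem, len);
        return NULL;
    }
    return mem;
}
```

Replacing the single mmap() call in the test program with this helper
(and passing 8LL << 30 as the length) should give the second variant.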
So here's what I think pthreads should do: first mmap the whole region
with PROT_NONE, and only then mprotect the stack pages read/write.
Does that sound reasonable?
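
To make the proposal concrete, here is a rough sketch of the ordering,
assuming a downward-growing stack with the guard area at the low end of
the mapping. The names alloc_stack_sketch and page_size are
hypothetical, and this is not glibc's actual allocation code, only an
illustration of "PROT_NONE first, then mprotect the usable part".

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: system page size in bytes. */
size_t page_size(void)
{
    return (size_t)sysconf(_SC_PAGESIZE);
}

/*
 * Hypothetical sketch: map guard + stack as PROT_NONE, then enable
 * read/write only on the stack part. The guard area stays PROT_NONE.
 * Both sizes must be multiples of the page size.
 * Returns the base of the whole mapping, or NULL on failure.
 */
void *alloc_stack_sketch(size_t stack_size, size_t guard_size)
{
    assert(stack_size > 0);

    size_t total = guard_size + stack_size;
    char *base = mmap(NULL, total, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* Only the pages above the guard become accessible. */
    if (mprotect(base + guard_size, stack_size,
                 PROT_READ | PROT_WRITE) != 0) {
        munmap(base, total);
        return NULL;
    }
    return base;
}
```

Going by the observation above, a region set up in this order should
not be prefaulted under MCL_FUTURE, but I have only checked that with
the small test program, not with a patched pthread.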
Thanks!
--
Balazs