This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: pthread wastes memory with mlockall(MCL_FUTURE)
- From: Balazs Kezes <rlblaster at gmail dot com>
- To: Rich Felker <dalias at libc dot org>
- Cc: libc-alpha at sourceware dot org
- Date: Sun, 20 Sep 2015 14:27:12 +0100
- Subject: Re: pthread wastes memory with mlockall(MCL_FUTURE)
- Authentication-results: sourceware.org; auth=none
- References: <20150918102734 dot GA27881 at eper> <20150918143824 dot GB17773 at brightrain dot aerifal dot cx> <20150918163842 dot GB27881 at eper> <20150918170853 dot GC17773 at brightrain dot aerifal dot cx> <20150918192952 dot GC27881 at eper> <20150918194521 dot GD17773 at brightrain dot aerifal dot cx> <20150918201101 dot GD27881 at eper> <20150918232246 dot GF17773 at brightrain dot aerifal dot cx>
On 2015-09-18 19:22 -0400, Rich Felker wrote:
> If this works, I think it's only due to a kernel bug of failing to
> apply the lock after mprotect. It's also going to be considerably
> slower, I think. What I had in mind was switching around the existing
> mmap/mprotect order, not adding an extra mprotect.
Here's a better patch. It re-adds the permissions in two steps: first
only for the page holding this pthread structure at the end of the
allocation, then, at a later step, for the rest of the stack (I didn't
touch that part, which is why this patch is short).
I don't really like this two-step mprotect call, but unless the stack
grows down you can't avoid it (and I didn't special-case that option).
Please review carefully, because I don't know how to test this beyond
my little test application and make xcheck.
diff --git a/nptl/allocatestack.c b/nptl/allocatestack.c
index 753da61..2959816 100644
--- a/nptl/allocatestack.c
+++ b/nptl/allocatestack.c
@@ -499,11 +499,11 @@ allocate_stack (const struct pthread_attr *attr, struct pthread **pdp,
#if MULTI_PAGE_ALIASING != 0
if ((size % MULTI_PAGE_ALIASING) == 0)
size += pagesize_m1 + 1;
#endif
- mem = mmap (NULL, size, prot,
+ mem = mmap (NULL, size, PROT_NONE,
MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
if (__glibc_unlikely (mem == MAP_FAILED))
return errno;
@@ -542,13 +542,25 @@ allocate_stack (const struct pthread_attr *attr, struct pthread **pdp,
- __static_tls_size)
& ~__static_tls_align_m1)
- TLS_PRE_TCB_SIZE);
#endif
+ /* We allocated the memory without permissions so that the kernel does
+ not allocate the guard pages in case of mlockall (MCL_FUTURE).
+ Now re-add the permissions for pd's page. */
+ void *pdpage = (void*) ((uintptr_t) pd & ~pagesize_m1);
+ size_t pdsize = mem + size - pdpage;
+ if (__glibc_unlikely (mprotect (pdpage, pdsize, prot) != 0))
+ {
+ (void) munmap (mem, size);
+ return errno;
+ }
+
/* Remember the stack-related values. */
pd->stackblock = mem;
pd->stackblock_size = size;
+ pd->guardsize = size;
/* We allocated the first block thread-specific data array.
This address will not change for the lifetime of this
descriptor. */
pd->specific[0] = pd->specific_1stblock;
@@ -621,12 +633,11 @@ allocate_stack (const struct pthread_attr *attr, struct pthread **pdp,
/* Note that all of the stack and the thread descriptor is
zeroed. This means we do not have to initialize fields
with initial value zero. This is specifically true for
the 'tid' field which is always set back to zero once the
- stack is not used anymore and for the 'guardsize' field
- which will be read next. */
+ stack is not used anymore. */
}
/* Create or resize the guard area if necessary. */
if (__glibc_unlikely (guardsize > pd->guardsize))
{