This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [RFC] malloc: a question about arena_get2()


On 2015/12/3 17:12, Xishi Qiu wrote:

> On 2015/12/3 15:50, Zhangjian (Bamvor) wrote:
> 

ping

>> Hi,
>>
> 
> I tested the latest glibc, and it still has the problem.
> Ten transparent huge pages are allocated:
> cat /proc/xx/smaps |grep AnonHugePages
> 
> If I change "if (__builtin_expect (n <= narenas_limit - 1, 0))"
> to "if (__builtin_expect (n < narenas_limit, 0))", it seems OK:
> no transparent huge page is allocated.
> 
> Thanks,
> Xishi Qiu
> 
>>
>> On 2015/12/3 10:15, Xishi Qiu wrote:
>>> Hi,
>>>
>>> In function arena_get2(), I don't quite understand the comment
>>> about the following code. Why is the underflow OK?
>> A little background: we found 200-300M more memory consumption
>> compared with our old system (glibc 2.11). After debugging, we
>> found that each process we created got one THP instead of a 4k
>> page, which was introduced by commit 41b81892f11f
>> ("Handle ARENA_TEST correctly").
>>> "if (__builtin_expect (n <= narenas_limit - 1, 0))"
>> And this piece of code (glibc 2.17) is the same as in the latest glibc.
>>>
>>> I find that v2.11 has "if (narenas < narenas_limit)".
>>>
>>> I did a simple test and found that the child thread creates a new
>>> heap when it calls malloc(), whereas in v2.11 it does not.
>>>
>>> v2.17:
>>> __libc_malloc()->arena_lock()->arena_get2()->_int_new_arena()->new_heap()
>>> The address returned by malloc() is then aligned to 2M, which leads the
>>> kernel to allocate a transparent huge page. So it consumes more memory
>>> than v2.11 (only 4kb there), even if the user only calls malloc() and
>>> never reads or writes the area.
>> Indeed, there are hundreds of threads in our system, which means it takes
>> hundreds of megabytes of memory after malloc.
>>
>> Regards
>>
>> Bamvor
>>
>>> v2.11:
>>> public_mALLOc()->arena_lock()->arena_get2()->reused_arena()
>>> and public_mALLOc() finally calls sYSMALLOc()->mmap() syscall,
>>> so the returned address is only aligned to 4kb.
>>>
>>>
>>> Here is my test code. Please use "gcc test.c -lpthread" to
>>> compile it.
>>>
>>> Thanks,
>>> Xishi Qiu
>>>
>>> #include <stdio.h>
>>> #include <stdlib.h>
>>> #include <string.h>
>>> #include <sys/types.h>
>>> #include <sys/stat.h>
>>> #include <unistd.h>
>>> #include <pthread.h>
>>>
>>> void *run_thr_adp( void *p )
>>> {
>>> 	int size = 5000*1000;
>>> 	char *test;
>>>
>>> 	printf("malloc start\n");
>>> 	test = (char *)malloc(size);
>>> 	if (test)
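>>> 		/* %p is the portable conversion for pointers; %lx with a
>>> 		   pointer argument is undefined behavior. */
>>> 		printf("malloc success, start=%p, end=%p\n",
>>> 			(void *)test, (void *)(test+size-1));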
>>> 	printf("malloc end\n");
>>>
>>> 	sleep(600);
>>> 	return NULL;
>>> }
>>>
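>>> void startThread( void *arg )
>>> {
>>>     pthread_t tid;
>>>     /* arg is passed through to the thread function unchanged. */
>>>     if (pthread_create( &tid, NULL, run_thr_adp, arg ) != 0)
>>>         fprintf(stderr, "pthread_create failed\n");
>>> }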
>>>
>>> int main()
>>> {
>>> 	int i;
>>> 	for (i = 0; i < 10; i++)
>>> 	{
>>> 		startThread(0);
>>> //		sleep(1);
>>> 	}
>>> 	sleep(1000);
>>>
>>> 	return 0;
>>> }
>>>
>>
>>
>> .
>>
> 
> 



