This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.



[Bug libc/20425] New: unbalanced and poor utilization of memory in glibc arenas may cause memory bloat and subsequent OOM


https://sourceware.org/bugzilla/show_bug.cgi?id=20425

            Bug ID: 20425
           Summary: unbalanced and poor utilization of memory in glibc
                    arenas may cause memory bloat and subsequent OOM
           Product: glibc
           Version: 2.12
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: libc
          Assignee: unassigned at sourceware dot org
          Reporter: sumeet.keswani at hpe dot com
                CC: drepper.fsp at gmail dot com
  Target Milestone: ---

Created attachment 9411
  --> https://sourceware.org/bugzilla/attachment.cgi?id=9411&action=edit
Reproducer

Summary:

The [g]libc allocator is not doing a good job of reusing memory across arenas.

Furthermore, the use of arenas is unbalanced: some arenas see 4 times the
median number of allocations while other arenas see very few.

In addition, a significant amount of memory sits unused, which results in bloat.

We believe this bug was exposed by the fix for
https://sourceware.org/bugzilla/show_bug.cgi?id=19048

Although our reproducer uses glibc-2.12.1-192 (built by Red Hat, which includes
the fix for Bug 19048) to demonstrate the issue, we strongly suspect the issue
also exists in the latest version of glibc.
(Note that glibc-2.12.1-166, built by Red Hat, does not show this issue, because
it lacks the fix for Bug 19048.)


Reproducer:

Attached is the reproducer, which contains the files needed to reproduce the
issue.

pthread_arena.exe.out is the output of a run with stats enabled.
pthread_arena.exe.out.tcl.xls contains those stats, post-processed.
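
For reference, the sketch below is not the attached reproducer; the thread
count, iteration count, and block sizes are invented for illustration. It shows
the kind of pthread allocate/free workload that binds threads to per-thread
arenas and then dumps per-arena usage with glibc's malloc_stats():

    /* Hypothetical sketch, NOT the attached reproducer: many threads
     * repeatedly allocate and free blocks so each thread gets bound to
     * an arena, then per-arena statistics are printed to stderr. */
    #include <malloc.h>
    #include <pthread.h>
    #include <stdlib.h>

    #define NTHREADS 500      /* assumption: mirrors the 500 threads below */
    #define NITER    10000    /* made-up iteration count */

    static void *worker(void *arg)
    {
        (void) arg;
        for (int i = 0; i < NITER; i++) {
            void *p = malloc(1024 + (i % 64) * 256);  /* mixed sizes */
            free(p);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t tids[NTHREADS];

        for (int i = 0; i < NTHREADS; i++)
            pthread_create(&tids[i], NULL, worker, NULL);
        for (int i = 0; i < NTHREADS; i++)
            pthread_join(tids[i], NULL);

        malloc_stats();   /* per-arena "Arena N:" sections on stderr */
        return 0;
    }

Build with, e.g., "gcc -O2 -pthread sketch.c".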

It's a 24-core machine with 96 GiB of memory,
so there are 192 arenas (8 * 24).
The median size of an arena is about 16M.
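
For context, the 192 comes from glibc's default cap of 8 arenas per core on
64-bit. On a glibc with per-thread arena support, the cap can be lowered with
mallopt() (or the MALLOC_ARENA_MAX environment variable); a minimal sketch
follows, where the value 24 is just this machine's core count, not a
recommendation:

    /* Sketch: cap the number of arenas before any threads are created.
     * M_ARENA_MAX is the mallopt() knob behind MALLOC_ARENA_MAX; the
     * #ifdef guards against older headers that do not define it. */
    #include <malloc.h>

    int main(void)
    {
    #ifdef M_ARENA_MAX
        mallopt(M_ARENA_MAX, 24);   /* e.g. one arena per core */
    #endif
        /* ... create threads and run the workload ... */
        return 0;
    }

The same effect can be had at run time by setting MALLOC_ARENA_MAX=24 in the
environment before starting the process.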

If you look at iteration 1 (called set 1), you'll see that all arenas are right
around the median;
i.e., the 500 threads are spread fairly evenly across the 192 arenas.

On the 2nd and subsequent iterations, however, several arenas grow to 4+ times
the median (and a few drop to almost nothing).
What this says is that after the initial population of the arenas, the
algorithm for choosing an arena when one is needed is very poor,
causing over-subscription on many arenas and under-subscription on a few.
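
One way to watch the imbalance develop (an illustration only; the attached
reproducer gathers its stats differently) is glibc's malloc_info(), which
writes an XML report with one <heap> element per arena; comparing the per-heap
sizes between iterations shows which arenas are over- or under-subscribed:

    /* Sketch: dump per-arena statistics as XML.  malloc_info() has been
     * available in glibc since 2.10. */
    #include <malloc.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        void *p = malloc(1 << 20);   /* some activity so the arena has data */
        malloc_info(0, stderr);      /* one <heap nr="..."> element per arena */
        free(p);
        return 0;
    }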


Impact on the application:
Most database applications account for the memory they use.
The application _is_ freeing memory; the [g]libc allocator is simply not doing
a good job of reusing it.

Consequently, the amount of memory _used_ by the application far exceeds what
the application accounts for
(i.e. the application believes it is using 3G, but the RSS is actually 7G, due
to the poor utilization caused by this bug). This can result in an OOM
error/exception when the application goes to allocate more memory and there is
none available on the machine.
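
To make the gap concrete, the sketch below (assuming Linux /proc; this
illustrates the measurement, it is not the tooling we used) compares what
malloc reports as in use with the process RSS:

    /* Sketch: compare malloc's in-use byte count with the resident set
     * size.  Note mallinfo() returns ints, so its counters wrap above
     * 2 GiB; good enough to illustrate the idea, not for a 7G process. */
    #include <malloc.h>
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        struct mallinfo mi = mallinfo();
        printf("malloc in-use bytes (uordblks): %d\n", mi.uordblks);

        FILE *f = fopen("/proc/self/status", "r");
        if (f) {
            char line[256];
            while (fgets(line, sizeof line, f))
                if (strncmp(line, "VmRSS:", 6) == 0)
                    fputs(line, stdout);   /* resident set size in kB */
            fclose(f);
        }
        return 0;
    }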


If arena subscription could be made balanced, that might solve the problem.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
