This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

[PATCH] [v2] malloc: Consistently apply trim_threshold to all heaps

From: Mel Gorman <mgorman at suse dot de>
To: Carlos O'Donell <carlos at redhat dot com>
Cc: Rik van Riel <riel at redhat dot com>, KOSAKI Motohiro <kosaki dot motohiro at gmail dot com>, Konstantin Serebryany <kcc at google dot com>, Minchan Kim <minchan dot kim at gmail dot com>, libc-alpha at sourceware dot org
Date: Tue, 10 Feb 2015 20:27:03 +0000
Subject: [PATCH] [v2] malloc: Consistently apply trim_threshold to all heaps
Authentication-results: sourceware.org; auth=none
References: <20150209140608 dot GD2395 at suse dot de> <54D91E06 dot 7060603 at redhat dot com> <20150209224947 dot GA21275 at suse dot de> <54DA25D0 dot 3050501 at redhat dot com> <20150210165414 dot GF2395 at suse dot de> <54DA5599 dot 2070804 at redhat dot com>

Trimming heaps is a balance between saving memory and the system overhead
required to update page tables and discard allocated pages. The malloc
option M_TRIM_THRESHOLD is a tunable that users are meant to use to decide
where this balance point is but it is only applied to the main arena.

For scalability reasons, glibc malloc has per-thread heaps but these are
shrunk with madvise() if there is one page free at the top of the heap.
In some circumstances this can lead to high system overhead if a thread
has a control flow like

    while (data_to_process) {
        buf = malloc(large_size);
        do_stuff();
        free(buf);
    }

For a large size, the free() will call madvise (pagetable teardown, page
free and TLB flush) every time followed immediately by a malloc (fault,
kernel page alloc, zeroing and charge accounting). The kernel overhead
can dominate such a workload.

This patch allows the user to tune when madvise gets called by applying
the trim threshold to the per-thread heaps.

This is a basic test-case for the worst case scenario where every free is
a madvise followed by a an alloc

/* gcc bench-free.c -lpthread -o bench-free */
static int num = 1024;

void __attribute__((noinline,noclone)) dostuff (void *p)
{
}

void *worker (void *data)
{
  int i;

  for (i = num; i--;)
    {
      void *m = malloc (48*4096);
      dostuff (m);
      free (m);
    }

  return NULL;
}

int main()
{
  int i;
  pthread_t t;
  void *ret;
  if (pthread_create (&t, NULL, worker, NULL))
    exit (2);
  if (pthread_join (t, &ret))
    exit (3);
  return 0;
}

Before the patch, this resulted in 1024 calls to madvise. With the patch
applied, madvise is called twice because the default trim threshold is
high enough to avoid the problem. This a more complex case where there is
a mix of mallocs and frees. It's simply a different worker function for
the test case above

void *worker (void *data)
{
  int i;
  int j = 0;
  void *free_index[num];

  for (i = num; i--;)
    {
      void *m = malloc ((i % 58) *4096);
      dostuff (m);
      if (i % 2 == 0) {
        free (m);
      } else {
        free_index[j++] = m;
      }
    }
  for (; j >= 0; j--)
    {
      free(free_index[j]);
    }

  return NULL;
}

glibc 2.21 calls malloc 90305 times but with the patch applied, it's
called 13438. Increasing the trim threshold will decrease the number of
times it's called if a user ever detected that madvise overhead and
refaulting was a problem.

ebizzy is meant to generate a workload resembling common web application
server workloads. It is threaded with a large working set that at its core
has an allocation, do_stuff, free loop that also hits this case. The primary
metric of the benchmark is records processed per second. This is running on
my desktop which is a single socket machine with an I7-4770 and 8 cores.
Each thread count was run for 30 seconds. It was only run once as the
performance difference is so high that the variation is insignificant.

                glibc 2.21              patch
threads 1            10230              44114
threads 2            19153              84925
threads 4            34295             134569
threads 8            51007             183387

Note that the saving happens to be a concidence as the size allocated
by ebizzy was less than the default threshold. If a different number of
chunks were specified then it may also be necessary to tune the threshold
to compensate

By default on my machine, the patch is roughly quadrupling the performance
of this benchmark. The difference in system CPU usage illustrates why.

ebizzy running 1 thread with glibc 2.21
10230 records/s 306904
real 30.00 s
user  7.47 s
sys  22.49 s

22.49 seconds was spent in the kernel for a workload runinng 30 seconds. With the
patch applied

ebizzy running 1 thread with patch applied
44126 records/s 1323792
real 30.00 s
user 29.97 s
sys   0.00 s

system CPU usage was zero with the patch applied. strace shows that glibc
running this workload calls madvise approximately 9000 times a second. With
the patch applied madvise was called twice during the workload (or 0.06
times per second).

  * malloc/arena.c (free): Apply trim threshold to per-thread heaps
    as well as the main arena.
---
 ChangeLog      | 5 +++++
 malloc/arena.c | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index dc1ed1ba1249..b860b2fe1850 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,8 @@
+2015-02-10  Mel Gorman  <mgorman@suse.de>
+
+	* malloc/arena.c (free): Apply trim threshold to per-thread heaps
+	as well as the main arena.
+
 2015-02-06  Carlos O'Donell  <carlos@systemhalted.org>
 
 	* version.h (RELEASE): Set to "stable".
diff --git a/malloc/arena.c b/malloc/arena.c
index 886defb074a2..a78d4835a825 100644
--- a/malloc/arena.c
+++ b/malloc/arena.c
@@ -696,7 +696,7 @@ heap_trim (heap_info *heap, size_t pad)
     }
   top_size = chunksize (top_chunk);
   extra = (top_size - pad - MINSIZE - 1) & ~(pagesz - 1);
-  if (extra < (long) pagesz)
+  if (extra < (long) mp_.trim_threshold)
     return 0;
 
   /* Try to shrink. */

Follow-Ups:
- Re: [PATCH] [v2] malloc: Consistently apply trim_threshold to all heaps
  - From: Siddhesh Poyarekar

References:
- [PATCH] [RFC] malloc: Reduce worst-case behaviour with madvise and refault overhead
  - From: Mel Gorman
- Re: [PATCH] [RFC] malloc: Reduce worst-case behaviour with madvise and refault overhead
  - From: Carlos O'Donell
- Re: [PATCH] [RFC] malloc: Reduce worst-case behaviour with madvise and refault overhead
  - From: Mel Gorman
- Re: [PATCH] [RFC] malloc: Reduce worst-case behaviour with madvise and refault overhead
  - From: Carlos O'Donell
- Re: [PATCH] [RFC] malloc: Reduce worst-case behaviour with madvise and refault overhead
  - From: Mel Gorman
- Re: [PATCH] [RFC] malloc: Reduce worst-case behaviour with madvise and refault overhead
  - From: Carlos O'Donell

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]