This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[Bug libc/12486] New: Data race in Glibc's malloc debugging routines


http://sourceware.org/bugzilla/show_bug.cgi?id=12486

           Summary: Data race in Glibc's malloc debugging routines
           Product: glibc
           Version: 2.13
            Status: NEW
          Severity: normal
          Priority: P2
         Component: libc
        AssignedTo: drepper.fsp@gmail.com
        ReportedBy: kaushikv@umich.edu


-------------------------------------
Data race in Glibc's malloc debugging routines
-------------------------------------

Authors: Kaushik Veeraraghavan & Jason Flinn
Email: {kaushikv,jflinn}@umich.edu
Date: February 13, 2011

--------
Summary:
--------

When running Glibc with MALLOC_DEBUG enabled, a data race in how the
malloc statistics counters (e.g.: n_mmaps, max_n_mmaps, etc.)  are
accessed can cause an assert() to incorrectly fire and abort program
execution.

---------------
Affected versions:
---------------

Glibc-2.13 to Glibc-2.5.1. (Possibly affects versions prior to
Glibc-2.5.1 as well, but I haven't verified this.)

-----
Setup:
-----

Target architecture: x86 (32-bit)

Kernel: Linux 2.6.26 #1575 SMP Fri Feb 11 21:39:52 EST 2011 i686 i686
i386 GNU/Linux

Configuration: ../configure --disable-profile --enable-add-ons
--enable-kernel=2.6.0 --without-gd --without-selinux

gcc version 4.1.2 20080704 (Red Hat 4.1.2-44)

GNU ld version 2.17.50.0.6-9.el5 20061020

------
Details:
------

Glibc implements a number of assertion checks in its malloc routines
that can be enabled by compiling with -DMALLOC_DEBUG. These checks
verify the state of malloc's internal data structures: an assertion
firing likely indicates that a user program has trashed memory.

However, when testing a multi-threaded program linked to Glibc w/
-DMALLOC_DEBUG enabled, we found that the "assert(mp_.n_mmaps <=
mp_.max_n_mmaps);" in malloc/malloc.c:do_check_malloc_state(), line
2935, can fire incorrectly due to a data race in Glibc's malloc
routines.

The buggy interleaving is as follows:

Thread 1                                                                   
Thread 2

sYSMALLOc()                                                            
do_check_malloc_state()
{                                                                              
    {
...                                                                            
    ...
/* update statistics */
if (++mp_.n_mmaps > mp_.max_n_mmaps)
                                                                               
    assert(mp_.n_mmaps <= mp_.max_n_mmaps);
      mp_.max_n_mmaps = mp_.n_mmaps;             
...                                                                            
   ...
}                                                                              
    }

Specifically, observe that the increment of "mp_.n_mmaps" in
malloc/malloc.c:sYSMALLOc(), line 3060 is not atomic with the update
"mp_.max_n_mmaps = mp_.n_mmaps;". Hence, as shown above, it is
possible that Thread 1 has incremented n_mmaps but not updated
max_n_mmaps yet. If Thread 2 concurrently execues the
"assert(mp_.n_mmaps <= mp_.max_n_mmaps);" check in
malloc/malloc.c:do_check_malloc_state(), line 2935, the assertion will
fire incorrectly and abort the program.

-----------
Recreate race:
-----------

This race can be easily triggered by adding a sleep() before updating
"max_n_mmaps" and before the "assert()", as shown in the diff below. A
simple test program with two threads that allocate a large chunk of
memory (say, 1 GB), triggers the race on my machine.

2934a2935
>   sleep (1);
3060c3061,3062
<     if (++mp_.n_mmaps > mp_.max_n_mmaps)
---
> 	if (++mp_.n_mmaps > mp_.max_n_mmaps) {
> 	  sleep (2);
3061a3064
> 	}

-------
Solution:
-------

Acquiring a lock before updating statistics counters for the default
case is probably too expensive in terms of performance.  Additionally,
since the assert() checks are disabled in the default case, racy
updates to 'max_n_mmaps' and other counters will report false
statistics but are otherwise benign.

However, to address the issue that a racy update can cause an assert
to fire and abort user programs, we propose the solution that a lock
be grabbed before accessing (read/update) statistics counters when
MALLOC_DEBUG is enabled. As a user running Glibc with the MALLOC_DEBUG
flag has already accepted the associated performance penalty, grabbing
a lock to ensure correctness is a viable solution.

The patch pasted below fixes the race described above, and also
ensures that a lock is held when accessing malloc's various statistics
counters which ensures that correct values are reported when
MALLOC_DEBUG is enabled.


diff -uNr /Users/kaushikv/Desktop/glibc-2.13/malloc/arena.c
glibc-2.13/malloc/arena.c
--- /Users/kaushikv/Desktop/glibc-2.13/malloc/arena.c    2011-02-12
19:30:45.000000000 -0500
+++ glibc-2.13/malloc/arena.c    2011-02-12 20:22:37.000000000 -0500
@@ -409,7 +409,14 @@
 #if DEFAULT_TOP_PAD != 0
   mp_.top_pad        = DEFAULT_TOP_PAD;
 #endif
+#ifdef MALLOC_DEBUG
+  mutex_init(&mmap_mutex);
+  (void)mutex_lock(&mmap_mutex);
+#endif
   mp_.n_mmaps_max    = DEFAULT_MMAP_MAX;
+#ifdef MALLOC_DEBUG
+  (void)mutex_unlock(&mmap_mutex);
+#endif
   mp_.mmap_threshold = DEFAULT_MMAP_THRESHOLD;
   mp_.trim_threshold = DEFAULT_TRIM_THRESHOLD;
   mp_.pagesize       = malloc_getpagesize;
diff -uNr /Users/kaushikv/Desktop/glibc-2.13/malloc/hooks.c
glibc-2.13/malloc/hooks.c
--- /Users/kaushikv/Desktop/glibc-2.13/malloc/hooks.c    2011-02-12
19:30:39.000000000 -0500
+++ glibc-2.13/malloc/hooks.c    2011-02-12 20:41:43.000000000 -0500
@@ -562,8 +562,13 @@
   ms->sbrked_mem_bytes = main_arena.system_mem;
   ms->trim_threshold = mp_.trim_threshold;
   ms->top_pad = mp_.top_pad;
+#ifdef MALLOC_DEBUG
+  (void)mutex_lock(&mmap_mutex);
+#endif
   ms->n_mmaps_max = mp_.n_mmaps_max;
-  ms->mmap_threshold = mp_.mmap_threshold;
+#ifdef MALLOC_DEBUG
+  (void)mutex_unlock(&mmap_mutex);
+#endif  ms->mmap_threshold = mp_.mmap_threshold;
   ms->check_action = check_action;
   ms->max_sbrked_mem = main_arena.max_system_mem;
 #ifdef NO_THREADS
@@ -571,10 +576,16 @@
 #else
   ms->max_total_mem = 0;
 #endif
+#ifdef MALLOC_DEBUG
+  (void)mutex_lock(&mmap_mutex);
+#endif
   ms->n_mmaps = mp_.n_mmaps;
   ms->max_n_mmaps = mp_.max_n_mmaps;
   ms->mmapped_mem = mp_.mmapped_mem;
   ms->max_mmapped_mem = mp_.max_mmapped_mem;
+#ifdef MALLOC_DEBUG
+  (void)mutex_unlock(&mmap_mutex);
+#endif
   ms->using_malloc_checking = using_malloc_checking;
   ms->max_fast = get_max_fast();
 #ifdef PER_THREAD
@@ -654,17 +665,29 @@
   main_arena.system_mem = ms->sbrked_mem_bytes;
   mp_.trim_threshold = ms->trim_threshold;
   mp_.top_pad = ms->top_pad;
+#ifdef MALLOC_DEBUG
+  (void)mutex_lock(&mmap_mutex);
+#endif
   mp_.n_mmaps_max = ms->n_mmaps_max;
+#ifdef MALLOC_DEBUG
+  (void)mutex_unlock(&mmap_mutex);
+#endif
   mp_.mmap_threshold = ms->mmap_threshold;
   check_action = ms->check_action;
   main_arena.max_system_mem = ms->max_sbrked_mem;
 #ifdef NO_THREADS
   mp_.max_total_mem = ms->max_total_mem;
 #endif
+#ifdef MALLOC_DEBUG
+  (void)mutex_lock(&mmap_mutex);
+#endif
   mp_.n_mmaps = ms->n_mmaps;
   mp_.max_n_mmaps = ms->max_n_mmaps;
   mp_.mmapped_mem = ms->mmapped_mem;
   mp_.max_mmapped_mem = ms->max_mmapped_mem;
+#ifdef MALLOC_DEBUG
+  (void)mutex_unlock(&mmap_mutex);
+#endif
   /* add version-dependent code here */
   if (ms->version >= 1) {
     /* Check whether it is safe to enable malloc checking, or whether
diff -uNr /Users/kaushikv/Desktop/glibc-2.13/malloc/malloc.c
glibc-2.13/malloc/malloc.c
--- /Users/kaushikv/Desktop/glibc-2.13/malloc/malloc.c    2011-02-12
19:30:27.000000000 -0500
+++ glibc-2.13/malloc/malloc.c    2011-02-12 20:22:37.000000000 -0500
@@ -2586,7 +2586,7 @@
 #define check_malloc_state(A)

 #else
-
+static mutex_t mmap_mutex; /* Mutex used to protect malloc parameter updates
*/
 #define check_chunk(A,P)              do_check_chunk(A,P)
 #define check_free_chunk(A,P)         do_check_free_chunk(A,P)
 #define check_inuse_chunk(A,P)        do_check_inuse_chunk(A,P)
@@ -2932,6 +2932,9 @@
   assert(total <= (unsigned long)(mp_.max_total_mem));
   assert(mp_.n_mmaps >= 0);
 #endif
+#ifdef MALLOC_DEBUG
+  (void)mutex_lock(&mmap_mutex);
+#endif
   assert(mp_.n_mmaps <= mp_.max_n_mmaps);

   assert((unsigned long)(av->system_mem) <=
@@ -2939,6 +2942,9 @@

   assert((unsigned long)(mp_.mmapped_mem) <=
      (unsigned long)(mp_.max_mmapped_mem));
+#ifdef MALLOC_DEBUG
+  (void)mutex_unlock(&mmap_mutex);
+#endif

 #ifdef NO_THREADS
   assert((unsigned long)(mp_.max_total_mem) >=
@@ -2989,7 +2995,9 @@

   size_t          pagemask  = mp_.pagesize - 1;
   bool            tried_mmap = false;
-
+#ifdef MALLOC_DEBUG
+  bool            mmap_test = false;
+#endif

 #if HAVE_MMAP

@@ -3000,8 +3008,17 @@
     rather than expanding top.
   */

-  if ((unsigned long)(nb) >= (unsigned long)(mp_.mmap_threshold) &&
-      (mp_.n_mmaps < mp_.n_mmaps_max)) {
+#ifdef MALLOC_DEBUG
+  (void)mutex_lock(&mmap_mutex);
+  mmap_test = ((unsigned long)(nb) >= (unsigned long)(mp_.mmap_threshold) &&
+           (mp_.n_mmaps < mp_.n_mmaps_max));
+  (void)mutex_unlock(&mmap_mutex);
+
+  if (mmap_test) {
+#else
+  if (((unsigned long)(nb) >= (unsigned long)(mp_.mmap_threshold) &&
+       (mp_.n_mmaps < mp_.n_mmaps_max)) {
+#endif

     char* mm;             /* return value from mmap call*/

@@ -3056,13 +3073,18 @@
       }

     /* update statistics */
-
+#ifdef MALLOC_DEBUG
+    (void)mutex_lock(&mmap_mutex);
+#endif
     if (++mp_.n_mmaps > mp_.max_n_mmaps)
       mp_.max_n_mmaps = mp_.n_mmaps;

     sum = mp_.mmapped_mem += size;
     if (sum > (unsigned long)(mp_.max_mmapped_mem))
       mp_.max_mmapped_mem = sum;
+#ifdef MALLOC_DEBUG
+    (void)mutex_unlock(&mmap_mutex);
+#endif
 #ifdef NO_THREADS
     sum += av->system_mem;
     if (sum > (unsigned long)(mp_.max_total_mem))
@@ -3542,8 +3564,14 @@
       return;
     }

+#ifdef MALLOC_DEBUG
+  (void)mutex_lock(&mmap_mutex);
+#endif
   mp_.n_mmaps--;
   mp_.mmapped_mem -= total_size;
+#ifdef MALLOC_DEBUG
+  (void)mutex_unlock(&mmap_mutex);
+#endif

   int ret __attribute__ ((unused)) = munmap((char *)block, total_size);

@@ -3592,10 +3620,16 @@
   assert((p->prev_size == offset));
   set_head(p, (new_size - offset)|IS_MMAPPED);

+#ifdef MALLOC_DEBUG
+  (void)mutex_lock(&mmap_mutex);
+#endif
   mp_.mmapped_mem -= size + offset;
   mp_.mmapped_mem += new_size;
   if ((unsigned long)mp_.mmapped_mem > (unsigned long)mp_.max_mmapped_mem)
     mp_.max_mmapped_mem = mp_.mmapped_mem;
+#ifdef MALLOC_DEBUG
+  (void)mutex_unlock(&mmap_mutex);
+#endif
 #ifdef NO_THREADS
   if ((unsigned long)(mp_.mmapped_mem + arena_mem + main_arena.system_mem) >
       mp_.max_total_mem)
@@ -5406,9 +5440,15 @@
       assert((newp->prev_size == offset));

       /* update statistics */
+#ifdef MALLOC_DEBUG
+    (void)mutex_lock(&mmap_mutex);
+#endif
       sum = mp_.mmapped_mem += newsize - oldsize;
       if (sum > (unsigned long)(mp_.max_mmapped_mem))
     mp_.max_mmapped_mem = sum;
+#ifdef MALLOC_DEBUG
+    (void)mutex_unlock(&mmap_mutex);
+#endif
 #ifdef NO_THREADS
       sum += main_arena.system_mem;
       if (sum > (unsigned long)(mp_.max_total_mem))
@@ -5725,10 +5765,26 @@
      we would not be able to later free/realloc space internal
      to a segregated mmap region.
   */
+
+#ifdef MALLOC_DEBUG
+    (void)mutex_lock(&mmap_mutex);
+#endif
   mmx = mp_.n_mmaps_max;   /* disable mmap */
   mp_.n_mmaps_max = 0;
+#ifdef MALLOC_DEBUG
+    (void)mutex_unlock(&mmap_mutex);
+#endif
+
   mem = _int_malloc(av, size);
+
+#ifdef MALLOC_DEBUG
+    (void)mutex_lock(&mmap_mutex);
+#endif
   mp_.n_mmaps_max = mmx;   /* reset mmap */
+#ifdef MALLOC_DEBUG
+    (void)mutex_unlock(&mmap_mutex);
+#endif
+
   if (mem == 0)
     return 0;

@@ -5959,8 +6015,16 @@
   mi.fordblks = avail;
   mi.uordblks = av->system_mem - avail;
   mi.arena = av->system_mem;
+
+#ifdef MALLOC_DEBUG
+    (void)mutex_lock(&mmap_mutex);
+#endif
   mi.hblks = mp_.n_mmaps;
   mi.hblkhd = mp_.mmapped_mem;
+#ifdef MALLOC_DEBUG
+    (void)mutex_unlock(&mmap_mutex);
+#endif
+
   mi.fsmblks = fastavail;
   mi.keepcost = chunksize(av->top);
   mi.usmblks = mp_.max_total_mem;
@@ -5976,11 +6040,22 @@
   int i;
   mstate ar_ptr;
   struct mallinfo mi;
-  unsigned int in_use_b = mp_.mmapped_mem, system_b = in_use_b;
+#ifdef MALLOC_DEBUG
+    unsigned int in_use_b, system_b;
+#else
+    unsigned int in_use_b = mp_.mmapped_mem, system_b = in_use_b;
+#endif
 #if THREAD_STATS
   long stat_lock_direct = 0, stat_lock_loop = 0, stat_lock_wait = 0;
 #endif

+#ifdef MALLOC_DEBUG
+    (void)mutex_lock(&mmap_mutex);
+    in_use_b = mp_.mmapped_mem;
+    system_b = in_use_b;
+    (void)mutex_unlock(&mmap_mutex);
+#endif
+
   if(__malloc_initialized < 0)
     ptmalloc_init ();
 #ifdef _LIBC
@@ -6019,11 +6094,18 @@
 #ifdef NO_THREADS
   fprintf(stderr, "max system bytes = %10u\n", (unsigned
int)mp_.max_total_mem);
 #endif
+
+#ifdef MALLOC_DEBUG
+  (void)mutex_lock(&mmap_mutex);
+#endif
 #if HAVE_MMAP
   fprintf(stderr, "max mmap regions = %10u\n", (unsigned int)mp_.max_n_mmaps);
   fprintf(stderr, "max mmap bytes   = %10lu\n",
       (unsigned long)mp_.max_mmapped_mem);
 #endif
+#ifdef MALLOC_DEBUG
+  (void)mutex_unlock(&mmap_mutex);
+#endif
 #if THREAD_STATS
   fprintf(stderr, "heaps created    = %10d\n",  stat_n_heaps);
   fprintf(stderr, "locked directly  = %10ld\n", stat_lock_direct);
@@ -6094,7 +6176,13 @@
       res = 0;
     else
 #endif
+#ifdef MALLOC_DEBUG
+  (void)mutex_lock(&mmap_mutex);
+#endif
       mp_.n_mmaps_max = value;
+#ifdef MALLOC_DEBUG
+  (void)mutex_unlock(&mmap_mutex);
+#endif
       mp_.no_dyn_threshold = 1;
     break;

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]