
GNU C Library master sources branch dj/malloc updated. glibc-2.23-554-gf9a7d78


This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".

The branch, dj/malloc has been updated
       via  f9a7d78b73ab0bb943413d8641e5d36c15dddc79 (commit)
      from  e4650ee4a81d530a3062621c65fcfc76c575ecbe (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
http://sourceware.org/git/gitweb.cgi?p=glibc.git;a=commitdiff;h=f9a7d78b73ab0bb943413d8641e5d36c15dddc79

commit f9a7d78b73ab0bb943413d8641e5d36c15dddc79
Author: Carlos O'Donell <carlos@systemhalted.org>
Date:   Sat Jul 16 22:19:03 2016 -0400

    Enhance the tracer with new data and fixes.
    
    * Increase trace entry to 64 bytes.
    
    The following patch increases the trace entry to 64 bytes, still a
    proper multiple of the shared memory window size. While we have
    doubled the entry size, the on-disk format is still smaller than the
    ASCII version. In the future we may wish to add variable-sized
    records, but for now the simplicity of this method works well.
    
    With the extra bytes we are going to:
    - Record internal size information for incoming (free) and outgoing
      chunks (malloc, calloc, realloc, etc.).
      - Simplifies accounting of RSS usage and provides an extra cross-check
        between malloc<->free based on internal chunk sizes.
    - Record alignment information for memalign and posix_memalign.
      - Continues to extend the tracer to the full API.
    - Leave 128 bits of padding for future uses.
      - Useful for more path information (see the sketch below).
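    
    As a sketch of the resulting record (the size2, size3, and pad
    fields are from the mtrace.h hunk below; the leading thread, type,
    and path fields are abbreviated into two 32-bit words here for
    illustration):
    
        #include <stddef.h>
        #include <stdint.h>
    
        /* Illustrative layout only; see malloc/mtrace.h for the real
           bitfield breakdown of the first two words.  */
        struct trace_record_sketch
        {
          uint32_t thread;        /* recording thread id */
          uint32_t type_and_path; /* event type plus path bits */
          void *ptr1;             /* FREE/REALLOC input; POSIX_MEMALIGN error */
          void *ptr2;             /* new allocation */
          size_t size;            /* requested size */
          size_t size2;           /* internal input size, or alignment */
          size_t size3;           /* internal size of new allocation */
          size_t pad[2];          /* 128 bits reserved for future uses */
        };
        /* On LP64: 8 + 5 * 8 + 16 == 64 bytes.  */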
    
    Additionally, __MTB_TYPE_POSIX_MEMALIGN is added solely to record
    the trace so that we can hard-fail in the workload converter when we
    see such an entry.
    
    Lastly, C_MEMALIGN, C_VALLOC, C_PVALLOC, and C_POSIX_MEMALIGN
    workload entries are added for the sake of completeness.
    
    Builds on x86_64; capture looks good and it works.
    
    * Teach trace_dump about the new entries.
    
    The following patch teaches trace_dump about the new posix_memalign
    entry. It also teaches trace_dump about the new size2 and size3
    fields. Tested by tracing a program that uses malloc, free, and
    memalign and verifying that the extra fields show the expected chunk
    sizes and alignments dumped with trace_dump.
    
    Tested on x86_64 with no apparent problems.
    
    * Teach trace2wl and trace_run about new entries.
    
    (a) trace2wl changes:
    
    The following patch teaches trace2wl how to output entries for
    valloc and pvalloc; it does so exactly the same way as for malloc,
    since from the perspective of the API they are identical.
    
    Additionally trace2wl is taught how to output an event for memalign,
    storing alignment and size in the event record.
    
    Lastly, posix_memalign is detected and the converter aborts if it is
    seen.  It is my opinion that we should not ignore this data during
    conversion.  If we see a need for it we should implement it later.
    
    (b) trace_run changes:
    
    Some cosmetic cleanup in printing 'pthread_t', which is always the
    address of the struct pthread in memory; to make debugging easier we
    print the value as a hex pointer.
    
    Teach the simulator how to run memalign. With the newly recorded
    alignment information we double-check that the resulting memory is
    correctly aligned.
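    
    The check is the usual power-of-two mask test; a minimal standalone
    sketch of what trace_run.c does after each memalign call:
    
        #include <stddef.h>
        #include <stdint.h>
    
        /* Nonzero when P satisfies the power-of-two alignment A.  */
        static int
        is_aligned (const void *p, size_t a)
        {
          return ((uintptr_t) p & (a - 1)) == 0;
        }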
    
    We do not implement valloc and pvalloc; they will abort the
    simulator.  This is incremental progress.
    
    Tested on x86_64 by converting and running a multithreaded test
    application that calls calloc, malloc, free, and memalign.
    
    * Disable recursive traces and save new data.
    
    (a) Adds support for disabling recursively recorded traces, e.g.
    realloc calling malloc no longer produces both a realloc and a
    malloc trace event. We solve this by using a per-thread variable to
    disable new trace creation, but allow path bits to be set.  This
    lets us record the code paths taken, but only record one public API
    event.
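    
    The resulting pattern, as it appears throughout the malloc.c hunks
    below (shown here with the nested malloc inside realloc):
    
        /* Suppress the nested call's own trace entry; path bits the
           callee sets still land in the current realloc record because
           __MTB_TRACE_PATH ignores the per-thread flag.  */
        __MTB_THREAD_TRACE_DISABLE ();
        newp = __libc_malloc (bytes);
        __MTB_THREAD_TRACE_ENABLE ();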
    
    (b) Save internal chunk size information into trace events for all
    APIs.  The most important is free, where we record the freed size;
    this makes it easier for tooling to compute running ideal RSS
    values.
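    
    For example, a one-pass tally over a mapped trace could look like
    this (a hypothetical post-processing sketch, not part of this
    commit; 'trace' and 'count' are assumed to come from the mapped
    file, and size2 is skipped for the memalign family, where it holds
    the alignment rather than a chunk size):
    
        size_t rss = 0;
        for (size_t i = 0; i < count; i++)
          {
            __malloc_trace_buffer_ptr t = &trace[i];
            if (t->size2 != 0
                && t->type != __MTB_TYPE_MEMALIGN
                && t->type != __MTB_TYPE_POSIX_MEMALIGN)
              rss -= t->size2;  /* internal size of chunk given back */
            if (t->size3 != 0)
              rss += t->size3;  /* internal size of chunk handed out */
          }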
    
    Tested on x86_64 with some small applications and test programs.

diff --git a/malloc/malloc.c b/malloc/malloc.c
index 35ac622..82608e1 100644
--- a/malloc/malloc.c
+++ b/malloc/malloc.c
@@ -1139,8 +1139,21 @@ volatile __malloc_trace_map_entry *__malloc_trace_buffer = NULL;
 /* The file we're mapping them to.  */
 char * __malloc_trace_filename = NULL;
 
+/* Global trace enable flag.  Default off.
+   If global trace enable is 1 then tracing is carried out for all
+   threads.  Otherwise no threads trace calls.  */
 volatile int __malloc_trace_enabled = 0;
 
+/* Per-thread trace enable flag.  Default on.
+   If thread trace enable is 1 then tracing for the thread behaves as expected
+   per the global trace enabled value.
+   If thread trace enable is 0 then __MTB_TRACE_ENTRY and __MTB_TRACE_SET
+   do nothing, only __MTB_TRACE_PATH sets path bits i.e. no new traces are
+   created, the existing trace is used to store path bits.
+   The purpose of this is to allow the implementation to nest public API
+   calls, track paths, without creating multiple nested trace events.  */
+__thread int __malloc_thread_trace_enabled = 1;
+
 static __thread int __malloc_trace_last_num = -1;
 static __thread __malloc_trace_buffer_ptr trace_ptr;
 
@@ -1228,7 +1241,7 @@ __mtb_trace_entry (uint32_t type, size_t size, void *ptr1)
 	    {
 	      /* FIXME: Better handling of errors?  */
 	      _m_printf("Can't open trace_buffer file %s\n", __malloc_trace_filename);
-	      __malloc_trace_enabled = 0;
+	      atomic_store_release (&__malloc_trace_enabled, 0);
 	      return;
 	    }
 
@@ -1245,7 +1258,7 @@ __mtb_trace_entry (uint32_t type, size_t size, void *ptr1)
 	    {
 	      /* FIXME: Better handling of errors?  */
 	      _m_printf("Can't map trace_buffer file %s\n", __malloc_trace_filename);
-	      __malloc_trace_enabled = 0;
+	      atomic_store_release (&__malloc_trace_enabled, 0);
 	      return;
 	    }
 
@@ -1295,9 +1308,11 @@ __mtb_trace_entry (uint32_t type, size_t size, void *ptr1)
   trace_ptr->path_munmap = 0;
   trace_ptr->path_m_f_realloc = 0;
   trace_ptr->path = 0;
-  trace_ptr->size = size;
   trace_ptr->ptr1 = ptr1;
   trace_ptr->ptr2 = 0;
+  trace_ptr->size = size;
+  trace_ptr->size2 = 0;
+  trace_ptr->size3 = 0;
 }
 
 /* Initialize the trace buffer and backing file.  The file is
@@ -1371,21 +1386,47 @@ size_t __malloc_trace_sync (void)
   return atomic_load_relaxed (&__malloc_trace_count);
 }
 
-
-#define __MTB_TRACE_ENTRY(type,size,ptr1)		   \
-  if (__builtin_expect (__malloc_trace_enabled, 0)) \
-    __mtb_trace_entry (__MTB_TYPE_##type,size,ptr1);			   \
-  else							   \
-    trace_ptr = 0;
-
-#define __MTB_TRACE_PATH(mpath)		       \
-  if (__builtin_expect (trace_ptr != NULL, 1)) \
+/* CONCURRENCY NOTES: The load acquire here synchronizes with the store release
+   from __malloc_trace_init to ensure that all threads see the initialization
+   done by the first thread that calls __malloc_trace_init.  The load acquire
+   also synchronizes with the store releases in __mtb_trace_entry to ensure
+   that all error cleanup is visible.  Lastly it synchronizes with the store
+   releases from __malloc_trace_pause, __malloc_trace_unpause, and
+   __malloc_trace_top to ensure that all external changes are visible to the
+   current thread.  */
+#define __MTB_TRACE_ENTRY(type, size, ptr1)		   		      \
+  if (__glibc_unlikely (atomic_load_acquire (&__malloc_trace_enabled))	      \
+      && __glibc_unlikely (__malloc_thread_trace_enabled))		      \
+    __mtb_trace_entry (__MTB_TYPE_##type,size,ptr1);
+
+/* Ignore __malloc_thread_trace_enabled and set path bits.  This allows us to
+   track the path of a call without additional traces.  For example realloc
+   can call malloc and free without making new trace, but we record the paths
+   taken in malloc and free.  */
+#define __MTB_TRACE_PATH(mpath)						      \
+  if (__glibc_unlikely (trace_ptr != NULL))				      \
     trace_ptr->path_##mpath = 1;
 
-#define __MTB_TRACE_SET(var,value) \
-  if (__builtin_expect (trace_ptr != NULL, 1)) \
+#define __MTB_TRACE_SET(var,value)					      \
+  if (__glibc_unlikely (__malloc_thread_trace_enabled)			      \
+      && __glibc_unlikely (trace_ptr != NULL))				      \
     trace_ptr->var = value;
 
+/* Allow __MTB_TRACE_ENTRY to create new trace entries.  */
+#define __MTB_THREAD_TRACE_ENABLE()					      \
+  ({									      \
+    __malloc_thread_trace_enabled = 1;					      \
+  })
+
+/* Disallow __MTB_TRACE_ENTRY from creating new trace
+   entries. Use of __MTB_TRACE_SET becomes a NOOP, but
+   __MTB_TRACE_PATH still sets the unique path bit in
+   the trace (all path bits are unique).  */
+#define __MTB_THREAD_TRACE_DISABLE()					      \
+  ({									      \
+    __malloc_thread_trace_enabled = 0;					      \
+  })
+
 #else
 void __malloc_trace_init (char *filename) {}
 size_t __malloc_trace_pause (void) { return 0; }
@@ -2677,7 +2718,7 @@ sysmalloc (INTERNAL_SIZE_T nb, mstate av)
       /* Don't try if size wraps around 0 */
       if ((unsigned long) (size) > (unsigned long) (nb))
         {
-	  __MTB_TRACE_PATH(mmap);
+	  __MTB_TRACE_PATH (mmap);
           mm = (char *) (MMAP (0, size, PROT_READ | PROT_WRITE, 0));
 
           if (mm != MAP_FAILED)
@@ -2852,7 +2893,11 @@ sysmalloc (INTERNAL_SIZE_T nb, mstate av)
           /* Call the `morecore' hook if necessary.  */
           void (*hook) (void) = atomic_forced_read (__after_morecore_hook);
           if (__builtin_expect (hook != NULL, 0))
-            (*hook)();
+	    {
+	      __MTB_THREAD_TRACE_DISABLE ();
+              (*hook)();
+	      __MTB_THREAD_TRACE_ENABLE ();
+	    }
         }
       else
         {
@@ -2999,7 +3044,11 @@ sysmalloc (INTERNAL_SIZE_T nb, mstate av)
                       /* Call the `morecore' hook if necessary.  */
                       void (*hook) (void) = atomic_forced_read (__after_morecore_hook);
                       if (__builtin_expect (hook != NULL, 0))
-                        (*hook)();
+			{
+			  __MTB_THREAD_TRACE_DISABLE ();
+                          (*hook)();
+			  __MTB_THREAD_TRACE_ENABLE ();
+			}
                     }
                 }
 
@@ -3162,7 +3211,11 @@ systrim (size_t pad, mstate av)
       /* Call the `morecore' hook if necessary.  */
       void (*hook) (void) = atomic_forced_read (__after_morecore_hook);
       if (__builtin_expect (hook != NULL, 0))
-        (*hook)();
+	{
+	  __MTB_THREAD_TRACE_DISABLE ();
+          (*hook)();
+	  __MTB_THREAD_TRACE_ENABLE ();
+	}
       new_brk = (char *) (MORECORE (0));
 
       LIBC_PROBE (memory_sbrk_less, 2, new_brk, extra);
@@ -3220,7 +3273,7 @@ munmap_chunk (mchunkptr p)
   /* If munmap failed the process virtual memory address space is in a
      bad shape.  Just leave the block hanging around, the process will
      terminate shortly anyway since not much can be done.  */
-  __MTB_TRACE_PATH(munmap);
+  __MTB_TRACE_PATH (munmap);
   __munmap ((char *) block, total_size);
 }
 
@@ -3312,6 +3365,8 @@ __libc_malloc (size_t bytes)
   mstate ar_ptr;
   void *victim;
 
+  __MTB_TRACE_ENTRY (MALLOC, bytes, NULL);
+
 #if USE_TCACHE
   /* int_free also calls request2size, be careful to not pad twice.  */
   size_t tbytes = request2size(bytes);
@@ -3327,11 +3382,7 @@ __libc_malloc (size_t bytes)
       tcache_list = &tcache;
       (void) mutex_unlock (&tcache_mutex);
     }
-#endif
-
-  __MTB_TRACE_ENTRY (MALLOC,bytes,NULL);
 
-#if USE_TCACHE
   if (tc_idx < mp_.tcache_max
       && tcache.entries[tc_idx] != NULL
       && tcache.initted == 1)
@@ -3340,7 +3391,8 @@ __libc_malloc (size_t bytes)
       tcache.entries[tc_idx] = e->next;
       tcache.counts[tc_idx] --;
       __MTB_TRACE_PATH (thread_cache);
-      __MTB_TRACE_SET(ptr2, e);
+      __MTB_TRACE_SET (ptr2, e);
+      __MTB_TRACE_SET (size3, tbytes);
       return (void *) e;
     }
 #endif
@@ -3350,7 +3402,12 @@ __libc_malloc (size_t bytes)
   if (__builtin_expect (hook != NULL, 0))
     {
       __MTB_TRACE_PATH (hook);
-      return (*hook)(bytes, RETURN_ADDRESS (0));
+      __MTB_THREAD_TRACE_DISABLE ();
+      victim = (*hook)(bytes, RETURN_ADDRESS (0));
+      __MTB_THREAD_TRACE_ENABLE ();
+      if (victim != NULL)
+	__MTB_TRACE_SET (size3, chunksize (mem2chunk (victim)));
+      return victim;
     }
 
 #if 0 && USE_TCACHE
@@ -3439,6 +3496,7 @@ __libc_malloc (size_t bytes)
 	(void) mutex_unlock (&ar_ptr->mutex);
 
       __MTB_TRACE_SET(ptr2, ent);
+      __MTB_TRACE_SET (size3, chunksize (mem2chunk (ent)));
       return ent;
 	}
     }
@@ -3464,6 +3522,8 @@ __libc_malloc (size_t bytes)
   assert (!victim || chunk_is_mmapped (mem2chunk (victim)) ||
           ar_ptr == arena_for_chunk (mem2chunk (victim)));
   __MTB_TRACE_SET(ptr2, victim);
+  if (victim != NULL)
+    __MTB_TRACE_SET (size3, chunksize (mem2chunk (victim)));
   return victim;
 }
 libc_hidden_def (__libc_malloc)
@@ -3475,12 +3535,21 @@ __libc_free (void *mem)
   mchunkptr p;                          /* chunk corresponding to mem */
 
   __MTB_TRACE_ENTRY (FREE, 0, mem);
+  /* It's very important we record the free size in the trace.
+     This makes verification of tracked malloc<->free's much
+     easier by adding size as a way to correlate the entries also.
+     It also makes it easier to do running tallies of ideal RSS usage
+     from trace data.  */
+  if (mem != 0)
+    __MTB_TRACE_SET (size2, chunksize (mem2chunk (mem)));
 
   void (*hook) (void *, const void *)
     = atomic_forced_read (__free_hook);
   if (__builtin_expect (hook != NULL, 0))
     {
+      __MTB_THREAD_TRACE_DISABLE ();
       (*hook)(mem, RETURN_ADDRESS (0));
+      __MTB_THREAD_TRACE_ENABLE ();
       return;
     }
 
@@ -3520,25 +3589,45 @@ __libc_realloc (void *oldmem, size_t bytes)
 
   void *newp;             /* chunk to return */
 
+  /* The realloc event may include calls to malloc and free, and in those
+     cases we disable the recursive tracing, but continue to record the
+     traced paths.  */
+  __MTB_TRACE_ENTRY (REALLOC, bytes, oldmem);
+  if (oldmem != 0)
+    __MTB_TRACE_SET (size2, chunksize (mem2chunk (oldmem)));
+
   void *(*hook) (void *, size_t, const void *) =
     atomic_forced_read (__realloc_hook);
   if (__builtin_expect (hook != NULL, 0))
-    return (*hook)(oldmem, bytes, RETURN_ADDRESS (0));
-
-  __MTB_TRACE_ENTRY(REALLOC,bytes,oldmem);
+    {
+      __MTB_THREAD_TRACE_DISABLE ();
+      newp = (*hook)(oldmem, bytes, RETURN_ADDRESS (0));
+      __MTB_THREAD_TRACE_ENABLE ();
+      __MTB_TRACE_SET (ptr2, newp);
+      if (newp != 0)
+	__MTB_TRACE_SET (size3, chunksize (mem2chunk (newp)));
+      return newp;
+    }
 
 #if REALLOC_ZERO_BYTES_FREES
   if (bytes == 0 && oldmem != NULL)
     {
-      __libc_free (oldmem); return 0;
+      __MTB_THREAD_TRACE_DISABLE ();
+      __libc_free (oldmem);
+      __MTB_THREAD_TRACE_ENABLE ();
+      return 0;
     }
 #endif
 
   /* realloc of null is supposed to be same as malloc */
   if (oldmem == 0)
     {
+      __MTB_THREAD_TRACE_DISABLE ();
       newp = __libc_malloc (bytes);
+      __MTB_THREAD_TRACE_ENABLE ();
       __MTB_TRACE_SET (ptr2, newp);
+      if (newp != 0)
+	__MTB_TRACE_SET (size3, chunksize (mem2chunk (newp)));
       return newp;
     }
 
@@ -3575,10 +3664,14 @@ __libc_realloc (void *oldmem, size_t bytes)
 	 always make a copy (and do not free the old chunk).  */
       if (DUMPED_MAIN_ARENA_CHUNK (oldp))
 	{
+	  void *newmem;
 	  /* Must alloc, copy, free. */
-	  void *newmem = __libc_malloc (bytes);
+	  __MTB_THREAD_TRACE_DISABLE ();
+	  newmem = __libc_malloc (bytes);
+	  __MTB_THREAD_TRACE_ENABLE ();
 	  if (newmem == 0)
 	    return NULL;
+
 	  /* Copy as many bytes as are available from the old chunk
 	     and fit into the new size.  NB: The overhead for faked
 	     mmapped chunks is only SIZE_SZ, not 2 * SIZE_SZ as for
@@ -3586,6 +3679,8 @@ __libc_realloc (void *oldmem, size_t bytes)
 	  if (bytes > oldsize - SIZE_SZ)
 	    bytes = oldsize - SIZE_SZ;
 	  memcpy (newmem, oldmem, bytes);
+          __MTB_TRACE_SET (ptr2, newmem);
+	  __MTB_TRACE_SET (size3, chunksize (mem2chunk (newmem)));
 	  return newmem;
 	}
 
@@ -3596,6 +3691,7 @@ __libc_realloc (void *oldmem, size_t bytes)
       if (newp)
 	{
 	  __MTB_TRACE_SET (ptr2, chunk2mem (newp));
+	  __MTB_TRACE_SET (size3, chunksize ((mchunkptr) newp));
 	  return chunk2mem (newp);
 	}
 #endif
@@ -3603,19 +3699,23 @@ __libc_realloc (void *oldmem, size_t bytes)
       if (oldsize - SIZE_SZ >= nb)
 	{
 	  __MTB_TRACE_SET (ptr2, oldmem);
+	  __MTB_TRACE_SET (size3, chunksize (mem2chunk (oldmem)));
 	  return oldmem;                         /* do nothing */
 	}
 
       __MTB_TRACE_PATH (m_f_realloc);
 
       /* Must alloc, copy, free. */
+      __MTB_THREAD_TRACE_DISABLE ();
       newmem = __libc_malloc (bytes);
+      __MTB_THREAD_TRACE_ENABLE ();
       if (newmem == 0)
         return 0;              /* propagate failure */
 
       memcpy (newmem, oldmem, oldsize - 2 * SIZE_SZ);
       munmap_chunk (oldp);
       __MTB_TRACE_SET (ptr2, newmem);
+      __MTB_TRACE_SET (size3, chunksize (mem2chunk (newmem)));
       return newmem;
     }
 
@@ -3632,7 +3732,9 @@ __libc_realloc (void *oldmem, size_t bytes)
       /* Try harder to allocate memory in other arenas.  */
       LIBC_PROBE (memory_realloc_retry, 2, bytes, oldmem);
       __MTB_TRACE_PATH (m_f_realloc);
+      __MTB_THREAD_TRACE_DISABLE ();
       newp = __libc_malloc (bytes);
+      __MTB_THREAD_TRACE_ENABLE ();
       if (newp != NULL)
         {
           memcpy (newp, oldmem, oldsize - SIZE_SZ);
@@ -3640,6 +3742,8 @@ __libc_realloc (void *oldmem, size_t bytes)
         }
     }
 
+  if (newp != 0)
+    __MTB_TRACE_SET (size3, chunksize (mem2chunk (newp)));
   __MTB_TRACE_SET (ptr2, newp);
   return newp;
 }
@@ -3652,8 +3756,11 @@ __libc_memalign (size_t alignment, size_t bytes)
   void *rv;
 
   __MTB_TRACE_ENTRY (MEMALIGN, bytes, NULL);
+  __MTB_TRACE_SET (size2, alignment);
   rv = _mid_memalign (alignment, bytes, address);
   __MTB_TRACE_SET (ptr2, rv);
+  if (rv != 0)
+    __MTB_TRACE_SET (size3, chunksize (mem2chunk (rv)));
   return rv;
 }
 
@@ -3666,11 +3773,21 @@ _mid_memalign (size_t alignment, size_t bytes, void *address)
   void *(*hook) (size_t, size_t, const void *) =
     atomic_forced_read (__memalign_hook);
   if (__builtin_expect (hook != NULL, 0))
-    return (*hook)(alignment, bytes, address);
+    {
+      __MTB_THREAD_TRACE_DISABLE ();
+      p = (*hook)(alignment, bytes, address);
+      __MTB_THREAD_TRACE_ENABLE ();
+      return p;
+    }
 
   /* If we need less alignment than we give anyway, just relay to malloc.  */
   if (alignment <= MALLOC_ALIGNMENT)
-    return __libc_malloc (bytes);
+    {
+      __MTB_THREAD_TRACE_DISABLE ();
+      p = __libc_malloc (bytes);
+      __MTB_THREAD_TRACE_ENABLE ();
+      return p;
+    }
 
   /* Otherwise, ensure that it is at least a minimum chunk size */
   if (alignment < MINSIZE)
@@ -3733,8 +3850,11 @@ __libc_valloc (size_t bytes)
   void *rv;
 
   __MTB_TRACE_ENTRY (VALLOC, bytes, NULL);
+  __MTB_TRACE_SET (size2, pagesize);
   rv = _mid_memalign (pagesize, bytes, address);
   __MTB_TRACE_SET (ptr2, rv);
+  if (rv != 0)
+    __MTB_TRACE_SET (size3, chunksize (mem2chunk (rv)));
   return rv;
 }
 
@@ -3760,6 +3880,8 @@ __libc_pvalloc (size_t bytes)
   __MTB_TRACE_ENTRY (PVALLOC, bytes, NULL);
   rv = _mid_memalign (pagesize, rounded_bytes, address);
   __MTB_TRACE_SET (ptr2, rv);
+  if (rv != 0)
+    __MTB_TRACE_SET (size3, chunksize (mem2chunk (rv)));
   return rv;
 }
 
@@ -3794,11 +3916,14 @@ __libc_calloc (size_t n, size_t elem_size)
     {
       sz = bytes;
       __MTB_TRACE_PATH (hook);
+      __MTB_THREAD_TRACE_DISABLE ();
       mem = (*hook)(sz, RETURN_ADDRESS (0));
+      __MTB_THREAD_TRACE_ENABLE ();
       if (mem == 0)
         return 0;
 
       __MTB_TRACE_SET (ptr2, mem);
+      __MTB_TRACE_SET (size3, chunksize (mem2chunk (mem)));
       return memset (mem, 0, sz);
     }
 
@@ -3856,6 +3981,7 @@ __libc_calloc (size_t n, size_t elem_size)
     return 0;
 
   p = mem2chunk (mem);
+  __MTB_TRACE_SET (size3, chunksize (p));
   __MTB_TRACE_SET (ptr2, mem);
 
   /* Two optional cases in which clearing not necessary */
@@ -5881,13 +6007,17 @@ __posix_memalign (void **memptr, size_t alignment, size_t size)
 {
   void *mem;
 
+  __MTB_TRACE_ENTRY (POSIX_MEMALIGN, size, 0);
+  __MTB_TRACE_SET (size2, alignment);
   /* Test whether the SIZE argument is valid.  It must be a power of
      two multiple of sizeof (void *).  */
   if (alignment % sizeof (void *) != 0
       || !powerof2 (alignment / sizeof (void *))
       || alignment == 0)
-    return EINVAL;
-
+    {
+      __MTB_TRACE_SET (ptr1, (void *) EINVAL);
+      return EINVAL;
+    }
 
   void *address = RETURN_ADDRESS (0);
   mem = _mid_memalign (alignment, size, address);
@@ -5895,9 +6025,12 @@ __posix_memalign (void **memptr, size_t alignment, size_t size)
   if (mem != NULL)
     {
       *memptr = mem;
+      __MTB_TRACE_SET (ptr2, mem);
+      __MTB_TRACE_SET (size3, chunksize (mem2chunk (mem)));
       return 0;
     }
 
+  __MTB_TRACE_SET (ptr1, (void *) ENOMEM);
   return ENOMEM;
 }
 weak_alias (__posix_memalign, posix_memalign)
diff --git a/malloc/mtrace.h b/malloc/mtrace.h
index dcb20bb..6ce663d 100644
--- a/malloc/mtrace.h
+++ b/malloc/mtrace.h
@@ -36,9 +36,23 @@ struct __malloc_trace_buffer_s {
   uint32_t path_hook:1; /* A hook was used to complete the request */
   uint32_t path:16; /* remaining bits */
 
+  /* FREE - pointer to allocation to free.
+     REALLOC - pointer to original allocation.
+     POSIX_MEMALIGN - error code */
   void *ptr1;
+  /* pointer to new allocation. */
   void *ptr2;
+  /* requested size. */
   size_t size;
+  /* FREE - internal size of deallocation.
+     REALLOC - internal size of original allocation.
+     MEMALIGN - alignment.
+     POSIX_MEMALIGN - alignment.  */
+  size_t size2;
+  /* internal size of new allocation.  */
+  size_t size3;
+  /* Pad out to 64-bytes for future uses and mmap'd window alignment.  */
+  size_t pad[2];
 };
 
 typedef struct __malloc_trace_buffer_s *__malloc_trace_buffer_ptr;
@@ -62,33 +76,35 @@ size_t __malloc_trace_stop (void);
 size_t __malloc_trace_sync (void);
 
 
-#define __MTB_TYPE_UNUSED	0
+#define __MTB_TYPE_UNUSED		0
 
 /* ptr1 is 0x1234, size is sizeof(void *) - there is one of these at
    the beginning of the trace.  */
-#define __MTB_TYPE_MAGIC	255
+#define __MTB_TYPE_MAGIC		255
 
 /* ptr2 = malloc (size) */
-#define __MTB_TYPE_MALLOC	1
+#define __MTB_TYPE_MALLOC		1
 
 /* ptr2 = calloc (size) */
-#define __MTB_TYPE_CALLOC	2
+#define __MTB_TYPE_CALLOC		2
 
 /* free (ptr1) */
-#define __MTB_TYPE_FREE		3
+#define __MTB_TYPE_FREE			3
 
 /* ptr2 = realloc (ptr1, size) */
-#define __MTB_TYPE_REALLOC	4
+#define __MTB_TYPE_REALLOC		4
 
-/* ptr2 = memalign (size, (int)ptr2) */
-#define __MTB_TYPE_MEMALIGN	5
+/* ptr2 = memalign (size2, size) */
+#define __MTB_TYPE_MEMALIGN		5
 
 /* ptr2 = valloc (size) */
-#define __MTB_TYPE_VALLOC	6
+#define __MTB_TYPE_VALLOC		6
 
 /* ptr2 = pvalloc (size) */
-#define __MTB_TYPE_PVALLOC	7
+#define __MTB_TYPE_PVALLOC		7
 
+/* ptr2 = posix_memalign (ptr1, size2, size)  */
+#define __MTB_TYPE_POSIX_MEMALIGN	8
 
 typedef enum {
   MSCAN_UNUSED,
@@ -117,3 +133,7 @@ void __malloc_scan_chunks (void (*callback)(void * /*ptr*/, size_t /*length*/, i
 #define C_ALLOC_SYNCS 9
 #define C_NTHREADS 10
 #define C_START_THREAD 11
+#define C_MEMALIGN 12
+#define C_VALLOC 13
+#define C_PVALLOC 14
+#define C_POSIX_MEMALIGN 15
diff --git a/malloc/trace2wl.cc b/malloc/trace2wl.cc
index 5fe7b55..e0f4a51 100644
--- a/malloc/trace2wl.cc
+++ b/malloc/trace2wl.cc
@@ -275,16 +275,27 @@ main(int argc, char **argv)
 
 	case __MTB_TYPE_MALLOC:
 	case __MTB_TYPE_CALLOC:
+	case __MTB_TYPE_VALLOC:
+	case __MTB_TYPE_PVALLOC:
 	  acq_ptr (thread, pa2);
 	  if (pa2 && pa2->valid)
-	    printf ("%d: pointer %p malloc'd again?  %d:%s\n", i, pa2->ptr, pa2->reason_idx, pa2->reason);
-	  thread->add (r->type == __MTB_TYPE_MALLOC ? C_MALLOC : C_CALLOC);
+	    printf ("%d: pointer %p alloc'd again?  %d:%s\n", i, pa2->ptr, pa2->reason_idx, pa2->reason);
+
+	  if (r->type == __MTB_TYPE_MALLOC)
+	    thread->add (C_MALLOC);
+	  if (r->type == __MTB_TYPE_CALLOC)
+	    thread->add (C_CALLOC);
+	  if (r->type == __MTB_TYPE_VALLOC)
+	    thread->add (C_VALLOC);
+	  if (r->type == __MTB_TYPE_PVALLOC)
+	    thread->add (C_PVALLOC);
+
 	  thread->add_int (pa2 ? pa2->idx : 0);
 	  thread->add_int (r->size);
 	  if (pa2)
 	    {
 	      pa2->valid = 1;
-	      pa2->reason = "malloc";
+	      pa2->reason = "alloc";
 	      pa2->reason_idx = i;
 	    }
 	  break;
@@ -334,6 +345,28 @@ main(int argc, char **argv)
 	    }
 
 	  break;
+
+	case __MTB_TYPE_MEMALIGN:
+	  acq_ptr (thread, pa2);
+	  if (pa2 && pa2->valid)
+	    printf ("%d: pointer %p memalign'd again?  %d:%s\n", i, pa2->ptr, pa2->reason_idx, pa2->reason);
+	  thread->add (C_MEMALIGN);
+	  thread->add_int (pa2 ? pa2->idx : 0);
+	  thread->add_int (r->size2);
+	  thread->add_int (r->size);
+	  if (pa2)
+	    {
+	      pa2->valid = 1;
+	      pa2->reason = "memalign";
+	      pa2->reason_idx = i;
+	    }
+	  break;
+
+	case __MTB_TYPE_POSIX_MEMALIGN:
+	  printf ("%d: Unsupported posix_memalign call.\n", i);
+	  exit (1);
+	  break;
+
 	}
     }
 
diff --git a/malloc/trace_dump.c b/malloc/trace_dump.c
index d3a72f2..7b691d9 100644
--- a/malloc/trace_dump.c
+++ b/malloc/trace_dump.c
@@ -42,14 +42,15 @@ data_looks_like_raw_trace (unsigned char *data, long n_data)
 }
 
 const char * const typenames[] = {
-  "unused  ",
-  "malloc  ",
-  "calloc  ",
-  "free    ",
-  "realloc ",
+  "unused",
+  "malloc",
+  "calloc",
+  "free",
+  "realloc",
   "memalign",
-  "valloc  ",
-  "pvalloc  ",
+  "valloc",
+  "pvalloc",
+  "posix_memalign",
 };
 
 void
@@ -62,7 +63,8 @@ dump_raw_trace (unsigned char *data, long n_data)
 
   printf ("%ld out of %ld events captured (I think)\n", head, head);
 
-  printf ("threadid type     path     ptr1             size             ptr2\n");
+  printf ("%8s %8s %8s %16s %16s %16s %16s %16s\n",
+	  "threadid", "type", "path", "ptr1", "size", "ptr2", "size2", "size3");
 
   while (data <= edata - sizeof (struct __malloc_trace_buffer_s))
     {
@@ -73,9 +75,11 @@ dump_raw_trace (unsigned char *data, long n_data)
 	case __MTB_TYPE_UNUSED:
 	  break;
 	default:
-	  printf ("%08x %s %c%c%c%c%c%c%c%c %016llx %016llx %016llx\n",
+	  /* Consider 'memalign' to be the largest API word we want to align
+	     on so make the name 8 chars wide at a minimum.  */
+	  printf ("%08x %8s %c%c%c%c%c%c%c%c %016llx %016llx %016llx %016llx %016llx\n",
 		  t->thread,
-		  t->type == __MTB_TYPE_MAGIC ? "magic   " : typenames[t->type],
+		  t->type == __MTB_TYPE_MAGIC ? "magic" : typenames[t->type],
 		  t->path_thread_cache ? 'T' : '-',
 		  t->path_cpu_cache ? 'c' : '-',
 		  t->path_cpu_cache2 ? 'C' : '-',
@@ -86,7 +90,9 @@ dump_raw_trace (unsigned char *data, long n_data)
 		  t->path_hook ? 'H' : '-',
 		  (long long unsigned int) (size_t) t->ptr1,
 		  (long long unsigned int) t->size,
-		  (long long unsigned int) (size_t) t->ptr2);
+		  (long long unsigned int) (size_t) t->ptr2,
+		  (long long unsigned int) t->size2,
+		  (long long unsigned int) t->size3);
 	  break;
 	}
 
diff --git a/malloc/trace_run.c b/malloc/trace_run.c
index e64b189..a70de36 100644
--- a/malloc/trace_run.c
+++ b/malloc/trace_run.c
@@ -144,11 +144,11 @@ int threads_done = 0;
 //#define MDEBUG 1
 #define mprintf(...) (void)1
 
-#define myabort() my_abort_2(me, __LINE__)
+#define myabort() my_abort_2(thrc, __LINE__)
 void
-my_abort_2 (pthread_t me, int line)
+my_abort_2 (pthread_t thrc, int line)
 {
-  fprintf(stderr, "Abort thread %d at line %d\n", (int)me, line);
+  fprintf(stderr, "Abort thread %p at line %d\n", (void *)thrc, line);
   abort();
 }
 
@@ -199,8 +199,8 @@ static void free_wipe (size_t idx)
 static void *
 thread_common (void *my_data_v)
 {
-  pthread_t me = pthread_self ();
-  size_t p1, p2, sz;
+  pthread_t thrc = pthread_self ();
+  size_t p1, p2, sz, sz2;
   unsigned char *cp = my_data_v;
   ticks_t my_malloc_time = 0, my_malloc_count = 0;
   ticks_t my_calloc_time = 0, my_calloc_count = 0;
@@ -215,14 +215,14 @@ thread_common (void *my_data_v)
     {
       if (cp > data + n_data)
 	myabort();
-      dprintf("op %d:%ld is %d\n", (int)me, cp-data, *cp);
+      dprintf("op %p:%ld is %d\n", (void *)thrc, cp-data, *cp);
       switch (*cp++)
 	{
 	case C_NOP:
 	  break;
 
 	case C_DONE:
-	  dprintf("op %d:%ld DONE\n", (int)me, cp-data);
+	  dprintf("op %p:%ld DONE\n", (void *)thrc, cp-data);
 	  pthread_mutex_lock (&stat_mutex);
 	  malloc_time += my_malloc_time;
 	  calloc_time += my_calloc_time;
@@ -238,10 +238,46 @@ thread_common (void *my_data_v)
 	  pthread_mutex_unlock(&stop_mutex);
 	  return NULL;
 
+	case C_MEMALIGN:
+	  p2 = get_int (&cp);
+	  sz2 = get_int (&cp);
+	  sz = get_int (&cp);
+	  dprintf("op %p:%ld %ld = MEMALIGN %ld %ld\n", (void *)thrc, cp-data, p2, sz2, sz);
+	  /* we can't force memalign to return NULL (fail), so just skip it.  */
+	  if (p2 == 0)
+	    break;
+	  if (p2 > n_ptrs)
+	    myabort();
+	  stime = rdtsc_s();
+	  Q1;
+	  if (ptrs[p2])
+	    {
+	      free ((void *)ptrs[p2]);
+	      atomic_rss (-sizes[p2]);
+	    }
+	  ptrs[p2] = memalign (sz2, sz);
+	  /* Verify the alignment matches what is expected.  */
+	  if (((size_t)ptrs[p2] & (sz2 - 1)) != 0)
+	    myabort ();
+	  sizes[p2] = sz;
+	  mprintf("%p = memalign(%lx, %lx)\n", ptrs[p2], sz2, sz);
+	  Q2;
+	  etime = rdtsc_e();
+	  if (ptrs[p2] != NULL)
+	    atomic_rss (sz);
+	  if (etime < stime)
+	    {
+	      printf("s: %llx e:%llx  d:%llx\n", (long long)stime, (long long)etime, (long long)(etime-stime));
+	    }
+	  my_malloc_time += etime - stime;
+	  my_malloc_count ++;
+	  wmem(ptrs[p2], sz);
+	  break;
+
 	case C_MALLOC:
 	  p2 = get_int (&cp);
 	  sz = get_int (&cp);
-	  dprintf("op %d:%ld %ld = MALLOC %ld\n", (int)me, cp-data, p2, sz);
+	  dprintf("op %p:%ld %ld = MALLOC %ld\n", (void *)thrc, cp-data, p2, sz);
 	  /* we can't force malloc to return NULL (fail), so just skip it.  */
 	  if (p2 == 0)
 	    break;
@@ -273,7 +309,7 @@ thread_common (void *my_data_v)
 	case C_CALLOC:
 	  p2 = get_int (&cp);
 	  sz = get_int (&cp);
-	  dprintf("op %d:%ld %ld = CALLOC %ld\n", (int)me, cp-data, p2, sz);
+	  dprintf("op %p:%ld %ld = CALLOC %ld\n", (void *)thrc, cp-data, p2, sz);
 	  /* we can't force calloc to return NULL (fail), so just skip it.  */
 	  if (p2 == 0)
 	    break;
@@ -301,7 +337,7 @@ thread_common (void *my_data_v)
 	  p2 = get_int (&cp);
 	  p1 = get_int (&cp);
 	  sz = get_int (&cp);
-	  dprintf("op %d:%ld %ld = REALLOC %ld %ld\n", (int)me, cp-data, p2, p1, sz);
+	  dprintf("op %p:%ld %ld = REALLOC %ld %ld\n", (void *)thrc, cp-data, p2, p1, sz);
 	  if (p1 > n_ptrs)
 	    myabort();
 	  if (p2 > n_ptrs)
@@ -341,7 +377,7 @@ thread_common (void *my_data_v)
 	  p1 = get_int (&cp);
 	  if (p1 > n_ptrs)
 	    myabort();
-	  dprintf("op %d:%ld FREE %ld\n", (int)me, cp-data, p1);
+	  dprintf("op %p:%ld FREE %ld\n", (void *)thrc, cp-data, p1);
 	  free_wipe (p1);
 	  if (ptrs[p1])
 	    atomic_rss (-sizes[p1]);
@@ -357,7 +393,7 @@ thread_common (void *my_data_v)
 
 	case C_SYNC_W:
 	  p1 = get_int(&cp);
-	  dprintf("op %d:%ld SYNC_W %ld\n", (int)me, cp-data, p1);
+	  dprintf("op %p:%ld SYNC_W %ld\n", (void *)thrc, cp-data, p1);
 	  if (p1 > n_syncs)
 	    myabort();
 	  pthread_mutex_lock (&mutexes[p1]);
@@ -369,7 +405,7 @@ thread_common (void *my_data_v)
 
 	case C_SYNC_R:
 	  p1 = get_int(&cp);
-	  dprintf("op %d:%ld SYNC_R %ld\n", (int)me, cp-data, p1);
+	  dprintf("op %p:%ld SYNC_R %ld\n", (void *)thrc, cp-data, p1);
 	  if (p1 > n_syncs)
 	    myabort();
 	  pthread_mutex_lock (&mutexes[p1]);

-----------------------------------------------------------------------

Summary of changes:
 malloc/malloc.c     |  201 ++++++++++++++++++++++++++++++++++++++++++---------
 malloc/mtrace.h     |   40 ++++++++---
 malloc/trace2wl.cc  |   39 +++++++++-
 malloc/trace_dump.c |   28 +++++---
 malloc/trace_run.c  |   62 +++++++++++++----
 5 files changed, 299 insertions(+), 71 deletions(-)


hooks/post-receive
-- 
GNU C Library master sources

