This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: extend dl-minimal malloc implementation


On 06/20/2017 07:17 PM, DJ Delorie wrote:
> This is a minor (hah!) enhancement to the trivial malloc
> implementation in elf/dl-malloc.c to make free() work more
> generically, and allow for reusing free'd chunks.  Like the core
> malloc(), it uses a heap-based allocator with a header before each
> chunk, and a free list.  However, unlike the core allocator, there's
> only one free list and it's unsorted - the design goal here is to keep
> the *code* simple and easy to understand, knowing that this code
> should only be used during early program load (i.e. the free list
> should typically be small enough that naive algorithms are sufficient)

It will allow us to use dynarrays and scratch buffers in the dynamic
linker because we get a working realloc.

> +/* For each chunk of memory we work with in this minimal malloc, we
> +   store one 32-bit "size" field before it, and reuse the first word
> +   of the user-space memory to store a linked list of free chunks when
> +   the chunk is freed.  This is similar in concept to the regular
> +   malloc's design.
> + */
> +typedef struct free_chunk {
> +  /* This is just to avoid "packed" */
> +  uint32_t filler;
> +  /* This is the only variable that takes up space between user chunks.  */
> +  uint32_t size;
> +  /* The address of "next" is what we return to the user.  */
> +  struct free_chunk *next;
> +} free_chunk;

The struct is technically invalid C.  We might be better off with
pointer arithmetic.  But considering that malloc/malloc.c uses pretty
much the same construct, I'm not sure if it's worth changing this.

> +/* Convert between user and chunk pointers.  */
> +#define ptr2chunk(ptr) (free_chunk *)((char *)ptr - sizeof (uint32_t)*2)
> +#define chunk2ptr(ch) (void *)(& ch->next)
> +/* Extra bytes we need between chunks.  */
> +#define OVERHEAD sizeof(uint32_t)
> +#define TOOBIG (uint32_t)(~0L)
> +
> +/* Debugging - make sure the free_chunk struct doesn't have a 32-bit
> +   hole between size and next.  */
> +static char
> +  check_size [(sizeof(free_chunk) == sizeof(uint32_t) * 2 + sizeof(void *)) ? 1 : -1]
> +  __attribute__((used));

You can use _Static_assert for that.  (By the way: For such constructs,
should be __attribute__ ((unused)) to suppress the warning;
__attribute__ ((used)) suppresses the warning *and* emits the object to
the assembly file, which is not what we want.)

> +/* We have a singly linked list of free'd chunks.  Newly freed chunks
> +   are added at the head end, and chunks are removed from
> +   anywhere.  */
> +static free_chunk free_list = { 0, 0, NULL };

Can we turn this into a single pointer, to save .bss space?

> +static free_chunk *
> +find_on_free_list (void *old_pointer, size_t n, int zero_it)

zero_it should be bool (also in other places).

> +/* Combination malloc/calloc/realloc - allocates memory, possibly
> +   extending an existing chunk (realloc), possibly zeroing it if
> +   needed (calloc).  You shouldn't ask for both realloc and calloc at
> +   the same time, of course.  */
> +static void *
> +morecore(void *old_pointer, size_t n, int zero_it)
>  {
> +  free_chunk *rv;
> +
> +  djnl();
> +  djprintf("morecore\n", zero_it ? -n : n, old_pointer);
> +
> +  /* We round up the requested size so that we have room for the
> +     "size" of the next chunk after this one is properly aligned.
> +     This means the smallest chunk is 12 bytes on 64-bit
> +     platforms.  */
> +  if (n < sizeof(void *))
> +    n = sizeof(void *);
> +  n = ALIGN_UP (n + OVERHEAD, MALLOC_ALIGNMENT) - OVERHEAD;
> +
> +  /* Special cases for realloc.  */
> +  if (old_pointer)
> +    {
> +      free_chunk *old_chunk = ptr2chunk (old_pointer);
> +
> +      /* Shrinking - just reuse the old chunk.  */
> +      if (old_chunk->size >= n)
> +	return old_pointer;

Can we put the tail on the free list?

> -  /* Make sure the allocation pointer is ideally aligned.  */
> -  alloc_ptr = (void *) 0 + (((alloc_ptr - (void *) 0) + MALLOC_ALIGNMENT - 1)
> -			    & ~(MALLOC_ALIGNMENT - 1));
> +  /* If we're realloc'ing the last block, we either extend it (by
> +     re-allocating it over the old block) or we copy it (because a new
> +     noncontiguous mapping was needed).  */

What does “last” mean here?

>  /* This will rarely be called.  */
>  void weak_function
>  free (void *ptr)
>  {

> +  /* else if it's really big we can't store it on the free list.  */
> +  if (ch->size == TOOBIG)
> +    return;

I think we should just fail TOOBIG allocations.

> +
> +  /* else store the chunk on the free list.  */
> +  ch->next = free_list.next;
> +  free_list.next = ch;
> +  djchunk("free_list\n", &free_list);
> +  djchunk("free_list->next\n", free_list.next);
>  }

Would it make sense to keep the free list order by increasing addresses
and attempt to coalesce neighboring chunks?  The free list should be
really short, and this should help us to reduce fragmentation.

Another thing that might be worth considering is to keep the free list
pointer in a parameter.  Then we can use the same code to implement the
async-signal-safe malloc for TLS data (with a per-thread free list).
The free list is beneficial there because we have some space in the TCB
on many architectures which we could reuse for TLS data if the allocator
is sufficiently flexible.

Thanks,
Florian


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]