This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH][malloc] Avoid atomics in have_fastchunks


On 09/19/2017 03:11 PM, Wilco Dijkstra wrote:
> Carlos O'Donell wrote:
>> Does this patch change the number of times malloc_consolidate might
>> be called? Do you have any figures on this? That would be a user visible
>> change (and require a bug #).
> 
> The number of calls isn't fixed already. I'll have a go at hacking the malloc
> test to see how much variation there is and whether my patch changes it.

I'm only curious; this isn't a requirement on the submission of this patch.

> Btw what is your opinion on how to add generic single-threaded optimizations
> that work for all targets? Rather than doing more target hacks, I'd like to add
> something similar to what we did with stdio getc/putc, i.e. add a high-level check for
> the single-threaded case that uses a different code path (with no/relaxed atomics
> and no locks for the common cases).

I don't have an opinion on this for malloc. I haven't thought much about this kind
of optimization. I'm mostly thinking about how to speed up the multi-threaded
case.
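
For concreteness, here is a rough sketch of the kind of high-level
single-threaded check described in the quoted text above. It is not glibc
code: the flag `single_threaded` and the function `counter_bump` are made-up
names, and the sketch assumes that something outside it clears the flag
before a second thread can start running.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag: assumed to be set to false (by code not shown here)
   before a second thread is created.  */
static bool single_threaded = true;

static atomic_long counter;

/* Increment the counter and return the new value.  */
static long
counter_bump (void)
{
  if (single_threaded)
    {
      /* No other thread can race with us, so a relaxed load and store
         are enough; no atomic read-modify-write is needed.  */
      long v = atomic_load_explicit (&counter, memory_order_relaxed) + 1;
      atomic_store_explicit (&counter, v, memory_order_relaxed);
      return v;
    }

  /* Multi-threaded path: a real atomic read-modify-write.  */
  return atomic_fetch_add_explicit (&counter, 1, memory_order_acq_rel) + 1;
}
```

The point of the single branch at the top is that the common
single-threaded case never executes an atomic read-modify-write at all.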

> To give an idea how much this helps, creating a dummy thread that does nothing
> slows down x64 malloc/free by 2x (it has jumps that skip the 1-byte lock prefix...).

You might go even further down this path. If each thread has its own arena, then
we don't need to do any locking in the arena, except for the arena selection path?
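
As a toy illustration of that idea only: the sketch below gives each thread
its own bump-pointer arena via thread-local storage, so the allocation path
touches no shared state and takes no lock. The names `toy_arena`,
`toy_alloc` and `ARENA_BYTES` are invented for the example, nothing here
reflects how glibc arenas are actually laid out, and it ignores freeing
entirely (including a free issued from a different thread, which would
still need some synchronization).

```c
#include <stddef.h>

#define ARENA_BYTES 4096

/* Hypothetical per-thread arena: a fixed buffer plus a bump offset.  */
struct toy_arena
{
  _Alignas (16) unsigned char buf[ARENA_BYTES];
  size_t used;
};

/* One arena per thread; no other thread ever accesses it.  */
static _Thread_local struct toy_arena my_arena;

/* Allocate n bytes from the calling thread's arena, rounded up to a
   multiple of 16.  Returns NULL when the arena is exhausted.  */
static void *
toy_alloc (size_t n)
{
  size_t aligned = (n + 15) & ~(size_t) 15;

  /* Reject overflow in the rounding and requests that do not fit.  */
  if (aligned < n || aligned > ARENA_BYTES - my_arena.used)
    return NULL;

  void *p = my_arena.buf + my_arena.used;
  my_arena.used += aligned;
  return p;
}
```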

> An alternative would be to move all the fastbin handling into the t-cache - but
> then I bet it's much easier just to write a fast modern allocator...

Writing a fast modern allocator is not easy :-)

-- 
Cheers,
Carlos.

