[PATCH v8 6/6] elf: Optimize _dl_new_hash in dl-new-hash.h

Siddhesh Poyarekar siddhesh@gotplt.org
Mon May 16 16:44:28 GMT 2022


On 16/05/2022 22:08, Noah Goldstein wrote:
>> Thanks for the context, this should go into comments.  A wishlist bug
>> would be nice but I suspect it'll just gather dust.  Maybe it's still
>> useful for someone coming in after 10-15 years looking for more context
>> on it.
> 
> I'll add a comment in the next version.

Thanks.

>> I would prefer the earlier variant in generic code, with (if necessary)
>> the scheduling hack being a sysdep for x86.  Other architectures that
>> want to use the latter should #include it and also post microbenchmark
>> results so that we keep track of how we arrived at that decision.
> 
> I'm happy to switch it back but I don't think the scheduling hack is x86
> oriented. I don't think re-ordering could ever de-optimize things.
> The only real architectural assumption is a reasonably fast
> 32-bit multiply which is true for both the more generic earlier version
> and the current one.

I don't entirely disagree, but I think the conservative stance here is 
to keep the scheduling hack in the sysdep that has actually been tested 
and then bring it out into generic if it has been experimentally 
verified to be a universal win for all architectures we support.  That 
is, if we find that every architecture is including the sysdep version, 
then it's time to bring it out to replace the generic version.

FWIW, I'm lowering the bar for acceptance because you only need to 
verify that the scheduling hack is better for the architecture you're 
interested in, not all architectures we support :)
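
For anyone reading the archive later, here is a rough sketch of the kind of 
difference being discussed.  It is not the code from the patch: the function 
names are made up for illustration, and the two-bytes-per-iteration form is 
only an assumption about what a "scheduling" variant could look like.  Both 
compute the usual GNU hash (start at 5381, multiply by 33, add each byte), 
so they must agree on every input:

```c
#include <stdint.h>

/* Hypothetical name: straightforward variant, one byte per iteration.
   Each step depends on the previous multiply-add.  */
static uint32_t
hash_simple (const char *s)
{
  uint32_t h = 5381;
  for (unsigned char c = *s; c != '\0'; c = *++s)
    h = h * 33 + c;
  return h;
}

/* Hypothetical name: two bytes per iteration.  Uses the identity
   (h * 33 + c0) * 33 + c1 == h * (33 * 33) + c0 * 33 + c1  (mod 2^32),
   so c0 * 33 does not have to wait for the multiply on h.  */
static uint32_t
hash_unrolled (const char *s)
{
  const unsigned char *p = (const unsigned char *) s;
  uint32_t h = 5381;
  for (;;)
    {
      uint32_t c0 = p[0];
      if (c0 == 0)
        return h;
      uint32_t c1 = p[1];
      if (c1 == 0)
        return h * 33 + c0;
      h = h * (33 * 33) + c0 * 33 + c1;
      p += 2;
    }
}
```

Whether the second form is actually faster on a given target is exactly 
what the microbenchmark results are meant to establish.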

Thanks,
Siddhesh

