This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] Faster strlen


On Tue, Oct 09, 2012 at 06:51:15AM -0700, Andi Kleen wrote:
> Ondřej Bílka <neleai@seznam.cz> writes:
> >
> > I also benchmarked Atom and added a variant which is identical to
> > strlen-sse2-pminub except that bsf is replaced by a table lookup.
> 
> Is your micro benchmark just a tight loop or does it fill the caches?
The starting position is random within an 8 MB interval, and sizes are
chosen randomly within the same order of magnitude.
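
To illustrate the input selection just described, here is a minimal
sketch in C. It is not the actual benchmark: the helper names
(make_arena, random_size, run_strlen_trials), the xorshift generator,
and the reading of "same order of magnitude" as a power-of-two range
[2^order, 2^(order+1)) are all assumptions, and timing code is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ARENA_SIZE ((size_t)8 << 20)   /* 8 MB interval for start positions */

/* Small xorshift64 generator (an assumption; any PRNG would do). */
static uint64_t bench_rng = 88172645463325252ULL;

static uint64_t bench_next(void)
{
    bench_rng ^= bench_rng << 13;
    bench_rng ^= bench_rng >> 7;
    bench_rng ^= bench_rng << 17;
    return bench_rng;
}

/* Allocate the arena and fill it with non-NUL bytes; NULL on failure. */
static char *make_arena(void)
{
    char *arena = malloc(ARENA_SIZE);
    if (arena != NULL)
        memset(arena, 'x', ARENA_SIZE);
    return arena;
}

/* Size in [2^order, 2^(order+1)); order must be at most 22 here. */
static size_t random_size(unsigned order)
{
    size_t base = (size_t)1 << order;
    return base + (size_t)(bench_next() % base);
}

/* Run `calls` strlen calls on strings planted at random offsets.
   Returns how many calls returned the planted length. */
static size_t run_strlen_trials(char *arena, unsigned order, size_t calls)
{
    size_t correct = 0;
    for (size_t i = 0; i < calls; i++) {
        size_t len = random_size(order);
        size_t start = (size_t)(bench_next() % (ARENA_SIZE - len));
        arena[start + len] = '\0';          /* plant the terminator */
        if (strlen(arena + start) == len)
            correct++;
        arena[start + len] = 'x';           /* restore for the next call */
    }
    return correct;
}
```

A real measurement would wrap the strlen call in a timer; for example,
run_strlen_trials(arena, 6, 1000) exercises sizes from 64 to 127 bytes.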
> 
> I have doubts that table lookups are a good idea if it blows away
> the working set in L1 for the application.
It does not have this problem. It does lookups only for powers of 2,
which fit in 11 cache lines.
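
One way to read the 11-cache-line figure, sketched in C below: isolate
the lowest set bit of the 16-bit mask (a power of 2) and use it
directly as an index into a sparse byte table. This layout is my
assumption for illustration, not necessarily what the assembly does;
the names first_bit_table, first_set_bit_lookup and
count_cache_lines_touched are hypothetical, and the count assumes
64-byte cache lines and a line-aligned table.

```c
#include <assert.h>
#include <stdint.h>

/* Sparse table: only entries at power-of-two indices are ever read. */
static uint8_t first_bit_table[(1u << 15) + 1];

static void init_first_bit_table(void)
{
    for (unsigned bit = 0; bit < 16; bit++)
        first_bit_table[1u << bit] = (uint8_t)bit;
}

/* Index of the lowest set bit; mask must be nonzero. */
static unsigned first_set_bit_lookup(uint16_t mask)
{
    unsigned lowest = mask & (0u - mask);   /* isolate the lowest set bit */
    return first_bit_table[lowest];
}

/* Count distinct 64-byte lines covered by the 16 power-of-two indices,
   assuming the table starts on a cache-line boundary. */
static unsigned count_cache_lines_touched(void)
{
    unsigned count = 0;
    long last_line = -1;
    for (unsigned bit = 0; bit < 16; bit++) {
        long line = (long)((1u << bit) / 64);
        if (line != last_line) {
            count++;
            last_line = line;
        }
    }
    return count;
}
```

Indices 1 through 32 share the first line, and each of 64, 128, ...,
32768 lands in a line of its own, which gives 1 + 10 = 11 lines.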

However, it has the problem that the Atom L2 cache has high latency.
When I add 8 random reads between calls, performance becomes the same
as pminub.
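
The random reads between calls can be sketched like this in C. Again
this is only an illustration under assumptions: the buffer, the
generator, and the name pollute_cache are hypothetical, and the real
benchmark may pick addresses differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define POLLUTE_READS 8   /* number of random reads between strlen calls */

/* Small xorshift64 generator (an assumption; any PRNG would do). */
static uint64_t pollute_rng = 0x9E3779B97F4A7C15ULL;

static uint64_t pollute_next(void)
{
    pollute_rng ^= pollute_rng << 13;
    pollute_rng ^= pollute_rng >> 7;
    pollute_rng ^= pollute_rng << 17;
    return pollute_rng;
}

/* Read POLLUTE_READS bytes at random offsets in buf. The volatile
   qualifier keeps the compiler from dropping the loads; the sum is
   returned so the caller can consume it. */
static unsigned pollute_cache(const volatile unsigned char *buf, size_t size)
{
    unsigned sum = 0;
    for (int i = 0; i < POLLUTE_READS; i++)
        sum += buf[pollute_next() % size];
    return sum;
}
```

Calling pollute_cache on a buffer much larger than L1 before each
strlen call makes it likely that the lookup table has to be fetched
from L2 again.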
> 
> Microbenchmarks that do not use caches much can be very misleading
> here. Even if it's slightly slower not doing table lookups 
> is usually preferred for functions like this, simply because it lessens
> the impact on the caches.
> 
> I would recommend to measure what happens both if the microbenchmark
> stresses data cache and icache. Otherwise you risk winning
> benchmarks, but making real apps slower.
> 
> -Andi
> 
> -- 
> ak@linux.intel.com -- Speaking for myself only

-- 

Typo in the code

