This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
[RFC] Possible performance problem with strcmp on core2.
- From: Ondřej Bílka <neleai at seznam dot cz>
- To: libc-alpha at sourceware dot org
- Date: Sat, 20 Jun 2015 13:16:32 +0200
- Subject: [RFC] Possible performance problem with strcmp on core2.
- References: <20150620083525 dot GA31992 at domone>
I updated the page above to include avx2 data; the profiler used is here:
http://kam.mff.cuni.cz/~ondra/benchmark_string/strcmp_profile200615.tar.bz2
I also found a possible regression on core2; see:
http://kam.mff.cuni.cz/~ondra/benchmark_string/core2/strcmp_profile/results_gcc/result.html
http://kam.mff.cuni.cz/~ondra/benchmark_string/core2/strcmp_profile/results_rand/result.html
The problem is that for larger sizes the ssse3 variant is still faster,
but overall it is worse due to its high startup cost. Its big
instruction cache footprint is also a problem. When the code is in the
icache it is beneficial for strings larger than 128 bytes; when it is
cold, icache misses push that break-even point to ~400 bytes:
http://kam.mff.cuni.cz/~ondra/benchmark_string/core2/strcmp_profile/results_rand_noicache/result.html
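
To make the size dependence concrete, here is a minimal sketch of a
warm-cache timing loop. It is not the profiler linked above and it does
not reproduce the cold-icache case; the helper name strcmp_ns_per_call
and the choice of equal all-'x' strings are assumptions made only for
illustration. It assumes a POSIX system with clock_gettime.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for clock_gettime under strict -std */
#endif
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: average wall-clock nanoseconds per strcmp call on
   two equal strings of `len` bytes, measured over `iters` calls.
   Returns a negative value if allocation fails. */
double strcmp_ns_per_call(size_t len, unsigned iters)
{
    assert(iters > 0);

    char *a = malloc(len + 1);
    char *b = malloc(len + 1);
    if (a == NULL || b == NULL) {
        free(a);
        free(b);
        return -1.0;
    }
    /* Equal strings force strcmp to scan all `len` bytes. */
    memset(a, 'x', len);
    a[len] = '\0';
    memset(b, 'x', len);
    b[len] = '\0';

    /* Call through a volatile pointer so the compiler cannot fold or
       hoist the call out of the loop. */
    int (*volatile cmp)(const char *, const char *) = strcmp;
    volatile int sink = 0;

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (unsigned i = 0; i < iters; i++)
        sink = sink + cmp(a, b);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    free(a);
    free(b);

    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
              + (double)(t1.tv_nsec - t0.tv_nsec);
    return ns / iters;
}
```

Calling it for a range of lengths (say 16, 128, 400 and 1024 bytes)
gives per-call costs that can be compared between implementations. A
tight loop like this keeps the code hot, so it only speaks to the
in-icache case.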
So how much do we care about core2? I know of several functions that
could be improved, but I put them in the backlog because this isn't that
important for me to optimize, and it is also nontrivial to make the
switch without harming the smaller sizes, which are the hot path, as
opposed to this cold one.
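
One possible shape for such a switch, sketched under assumptions: check
a short prefix with a cheap byte loop and hand over to the bulk routine
only after that. This is not how glibc does it; the name
strcmp_two_path and the cutoff SHORT_LIMIT = 16 are hypothetical, and
plain strcmp stands in for the wide-vector path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cutoff; a real value would have to come from profiling. */
enum { SHORT_LIMIT = 16 };

/* Compare like strcmp, but resolve anything decided within the first
   SHORT_LIMIT bytes without entering the bulk path. */
int strcmp_two_path(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    const unsigned char *p = (const unsigned char *)a;
    const unsigned char *q = (const unsigned char *)b;

    for (size_t i = 0; i < SHORT_LIMIT; i++) {
        /* Stop at the first difference or at the shared terminator. */
        if (p[i] != q[i] || p[i] == '\0')
            return (p[i] > q[i]) - (p[i] < q[i]);
    }

    /* The first SHORT_LIMIT bytes are equal and contain no NUL, so both
       strings continue; the stand-in bulk routine handles the rest. */
    return strcmp(a + SHORT_LIMIT, b + SHORT_LIMIT);
}
```

The prefix loop is extra work for long strings, so the cutoff trades
hot-path latency against bulk throughput, which is the tuning problem
described above.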
So how should we proceed?