This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Re: [PATCH] Faster strchr implementation.

From: Ondřej Bílka <neleai at seznam dot cz>
To: Liubov Dmitrieva <liubov dot dmitrieva at gmail dot com>
Cc: GNU C Library <libc-alpha at sourceware dot org>
Date: Fri, 9 Aug 2013 18:44:20 +0200
Subject: Re: [PATCH] Faster strchr implementation.
References: <20130807140911 dot GA31968 at domone dot kolej dot mff dot cuni dot cz> <CAHjhQ926EE-MYDJR5Eftf+DUefBg-Gox0pw57vZ7XUwsO3OPJg at mail dot gmail dot com> <20130808190716 dot GA4589 at domone dot kolej dot mff dot cuni dot cz> <CAHjhQ92+C6uXyrUhTd3OWuoa6v2SeUaKLBuqaNX5Sqtn4ANBdg at mail dot gmail dot com> <CAHjhQ90S-1uBhwV44KODTcQkr=0U-P+_9Pu0O=RbYYY9e82JCA at mail dot gmail dot com>

On Fri, Aug 09, 2013 at 06:05:43PM +0400, Liubov Dmitrieva wrote:
>
Yes, thanks. I updated these pages with them:

http://kam.mff.cuni.cz/~ondra/benchmark_string/strcmp_profile.html
http://kam.mff.cuni.cz/~ondra/benchmark_string/strchr_profile.html 
http://kam.mff.cuni.cz/~ondra/benchmark_string/strrchr_profile.html

As you can see from the graphs, strcmp is 10% faster on haswell, atom and
silvermont.

As for the avx2 implementation, on haswell it is around 20% faster on large
inputs, but the problem is that most of the time the inputs are small. Using
avx is around 3% slower than strcmp_new on the gcc test. One possible cause is
the bigger latency of avx2; it may be worth adding an avx2 ifunc.
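
A minimal sketch of what selecting an avx2 variant at run time could look
like. It is not the glibc ifunc mechanism (which resolves the symbol at
relocation time); it approximates it with a function pointer chosen on first
call. The names strcmp_baseline, strcmp_avx2_placeholder, select_strcmp and
dispatched_strcmp are hypothetical, and both variants simply forward to
strcmp so that the sketch stays runnable.

```c
#include <string.h>

typedef int (*strcmp_fn)(const char *, const char *);

/* Hypothetical stand-ins: a real build would put the generic and the
   avx2 code here. Both forward to strcmp to keep the sketch runnable. */
static int strcmp_baseline(const char *a, const char *b)
{
    return strcmp(a, b);
}

static int strcmp_avx2_placeholder(const char *a, const char *b)
{
    return strcmp(a, b);
}

/* Pick a variant from the CPU features, similar in spirit to what an
   ifunc resolver does. The builtins are GCC/Clang x86 extensions, so
   other targets fall back to the baseline. */
static strcmp_fn select_strcmp(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return strcmp_avx2_placeholder;
#endif
    return strcmp_baseline;
}

/* Resolve once on first use, then call through the cached pointer. */
int dispatched_strcmp(const char *a, const char *b)
{
    static strcmp_fn impl;
    if (!impl)
        impl = select_strcmp();
    return impl(a, b);
}
```

A real ifunc avoids the per-call pointer check, which matters when most
inputs are small.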

As far as strchr is concerned, the no_bsf version is still faster on atom and
silvermont. My implementation wins for larger sizes, but it pays a heavy penalty
for switching to the 64-byte loop, while the strchr_no_bsf loop is cheap.
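
To illustrate the trade-off between a cheap byte loop and a 64-byte block
loop, here is a rough sketch in portable C. It is not the code from the patch
and uses word-at-a-time tests instead of SSE/AVX instructions. The name
sketch_strchr is hypothetical, and the sketch assumes the string is readable
up to the end of the 64-byte aligned block that holds its terminating NUL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONES  0x0101010101010101ULL
#define HIGHS 0x8080808080808080ULL

/* Nonzero iff at least one byte of v is zero. */
static uint64_t has_zero(uint64_t v)
{
    return (v - ONES) & ~v & HIGHS;
}

/* Assumption: s is readable up to the end of the 64-byte aligned block
   that contains its terminating NUL. */
char *sketch_strchr(const char *s, int c)
{
    unsigned char ch = (unsigned char)c;
    uint64_t pattern = ONES * ch;   /* ch repeated in every byte */

    assert(s != NULL);

    /* Cheap byte loop until s is 64-byte aligned; short strings
       usually return from here. */
    while ((uintptr_t)s % 64 != 0) {
        if ((unsigned char)*s == ch)
            return (char *)s;
        if (*s == '\0')
            return NULL;
        s++;
    }

    /* Block loop: test 64 bytes per iteration for a NUL or a match. */
    for (;;) {
        uint64_t hit = 0;
        for (int i = 0; i < 8; i++) {
            uint64_t w;
            memcpy(&w, s + 8 * i, sizeof w);
            hit |= has_zero(w) | has_zero(w ^ pattern);
        }
        if (hit)
            break;
        s += 64;
    }

    /* The current block holds a match or the terminator; find it. */
    for (;; s++) {
        if ((unsigned char)*s == ch)
            return (char *)s;
        if (*s == '\0')
            return NULL;
    }
}
```

The block loop has a fixed setup and exit cost, so it only pays off once the
string is long enough to amortize it.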

For strrchr, the new implementation is faster than the current ones, both
practically and asymptotically, on haswell, atom and silvermont.

As for haswell, I fixed the strchr and strrchr benchmarks to call the correct
avx2 implementation. The new version is here:

http://kam.mff.cuni.cz/~ondra/benchmark_string/strchr_profile090813.tar.bz2
http://kam.mff.cuni.cz/~ondra/benchmark_string/strrchr_profile090813.tar.bz2

It was a bug in the implementation, now fixed.
They are tested automatically.

-- 

endothermal recalibration
