This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |
Other format: | [Raw text] |
On 06/01/2015 09:52 PM, OndÅej BÃlka wrote:
On Mon, Jun 01, 2015 at 09:36:47AM -0500, Steven Munroe wrote:On Mon, 2015-06-01 at 14:28 +0200, OndÅej BÃlka wrote:On Mon, Jun 01, 2015 at 04:11:28PM +0530, Rajalakshmi Srinivasaraghavan wrote:This patch optimizes strcasestr function for power >= 7 systems. This patch uses optimized strlen and strnlen for calculating string length and the average improvement of this optimization is ~40%. This patch is tested on powerpc64 and powerpc64le. Attached the benchresults with this new patch.Thats not enough. As strcasestr that I submited is around three times slower your implementation would likely be regression over generic one. A problem here is that you use moronic algorithm. Fix algorithm first before trying to optimize it.This is not very helpful. You are demanding changes without clear explanation and justification. What is wrong with Raja's algorithm? What is insufficient in the benchmark data she has provided? And why do you think your specific design applies to PowerISA and POWER7/POWER8 micro-architecture. What data do you have that justified this objection?I replied on strstr patch thread on why what she submitted is performance regression. So I will repeat arguments from other thread which still apply. First was problem with quadratic behaviour. She tried to fix it but it isn't a fix at all. Just benchmark strcasestr ("aaa...(4000 times)...aaa", "aaa...(2000 times)...aab") That call would take around hundred times than before which is unacceptable.
This is already handled in the patch.If the needle len is more than 2048, it calls default string/strcasestr.c
Which benchmark are you referring as bogus? The benchtest result attached in the previous thread was created using benchtests/bench-strcasestr.c . Since your proposed benchtest changes were not yet committed, I have used default ones.If we ignore that red flag second problem was that benchmark she used is bogus. It test with periodic haystacks, needle is copy of first bytes of haystack with last byte set to something else.
. . .
Just use same patch like I send with ((unsigned) rand())%16 + 1 and you will see completely different numbers in benchmark.
Benchtest results attached with these changes. . .
As I don't have powerpc access now apply my patches [PATCH v5] Generic string skeleton [PATCH v5 4*] Generic string search functions (strstr, strcasestr, memmem)
I have attached the benchtest result with the above patches appliedalong with benchtests/bench-strcasestr.c changes.(similiar changed as proposed by you for benchtests/bench-strstr.c).
The result attached clearly shows improvement.
and run (preferably fixed) benchmark with these. As gains that I see on x64 are bigger than ones gained by this assembly you will likely see that generic implementation is indeed better and it would be pointless to try review that only to remove it shortly after adding to improve performance.
-- Thanks Rajalakshmi S
Attachment:
newresults
Description: Text document
Attachment:
new_benchtestcode
Description: Text document
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |