This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] powerpc: strcasestr optimization


On Tue, Jun 02, 2015 at 01:47:03PM -0500, Steven Munroe wrote:
> On Mon, 2015-06-01 at 18:22 +0200, OndÅej BÃlka wrote:
> > On Mon, Jun 01, 2015 at 09:36:47AM -0500, Steven Munroe wrote:
> > > On Mon, 2015-06-01 at 14:28 +0200, OndÅej BÃlka wrote:
> > > > On Mon, Jun 01, 2015 at 04:11:28PM +0530, Rajalakshmi Srinivasaraghavan wrote:
> > > > > 
> > > > > This patch optimizes strcasestr function for power >= 7 systems.
> > > > > This patch uses optimized strlen and strnlen for calculating
> > > > > string length and the average improvement of this optimization is ~40%.
> > > > > This patch is tested on powerpc64 and powerpc64le.
> > > > > Attached the benchresults with this new patch.
> > > > > 
> > > > Thats not enough. As strcasestr that I submited is around three times
> > > > slower your implementation would likely be regression over generic one.
> > > > 
> > > > A problem here is that you use moronic algorithm. Fix algorithm first
> > > > before trying to optimize it.
> > > > 
> > > 
> > > This is not very helpful. You are demanding changes without clear
> > > explanation and justification.
> > > 
> > > What is wrong with Raja's algorithm? What is insufficient in the
> > > benchmark data she has provided? And why do you think your specific
> > > design applies to PowerISA and POWER7/POWER8 micro-architecture.
> > > 
> > > What data do you have that justified this objection?
> > 
> > I replied on strstr patch thread on why what she submitted is
> > performance regression. So I will repeat arguments from other thread
> > which still apply.
> > 
> > First was problem with quadratic behaviour. She tried to fix it but it
> > isn't a fix at all. Just benchmark 
> 
> Which is fixed in the latest patch
> > 
> > strcasestr ("aaa...(4000 times)...aaa", "aaa...(2000 times)...aab")
> > 
Did you read also following line? It isn't solved at all.




> > That call would take around hundred times than before which is
> > unacceptable.
> > 
> > If we ignore that red flag second problem was that benchmark she used is
> > bogus. It test with periodic haystacks, needle is copy of first bytes of
> > haystack with last byte set to something else.
> > 
> Have you looked at he latest patch? If the benchmark provided does not
> cover your assertion, then it is your responsibility to provide and
> upstream the appropriate benchmark. And convince the community that
> benchmark covers a valid issue.
> 
That was my comment why benchmark isn't feasible to prove performance as
I could for almost any patch write benchmark that shows its regression
by selecting suitable data. Question is what happens in reality and for
that you shouldn't use artifical benchmarks.
> > snip
> > 
> > As microarchitecture one doesn't need to know details to see that
> > something is fundamentally wrong. A algorithm here is essentially
> > byte-by-byte check which essentially does following
> > 
> 
> That assertion is at odds with my experience. Every day I see code that
> somebody thinks is optimized but in reality is not. Current processors
> are vastly more complicated then most realize and RISC processors have
> stronger interactions between compiler and micro-architecture.
> 
> So it is on you have to prove that your generic optimization actually
> works for more then one platform.
> 
I have same especially when author submits an assembly improvement where
algorithm is completely wrong and doesn't cover known performance
issues.

> > snip
> 
> > 
> > As that was covered my final comment was that it isn't substantial
> > speedup. My implemantation is generic and gives large boost 
> > so why should be powerpc different.
> > 
> > As I don't have powerpc access now apply my patches
> > 
> > [PATCH v5] Generic string skeleton
> > [PATCH v5 4*] Generic string search functions (strstr, strcasestr, memmem)
> > 
> Every one has access to the the GCC Compile farm 
> 
> https://gcc.gnu.org/wiki/CompileFarm
> 
> So there is no excuse to not have data to back you assertion.

Except that all power machines except osuosl are offline. There may be
osuosl ones but gcc11[012].osuosl.org which don't resolve


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]