This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] powerpc: strstr optimization


On Wed, 2015-05-06 at 18:58 +0530, Rajalakshmi Srinivasaraghavan wrote:
> 
> On 05/06/2015 02:44 AM, OndÅej BÃlka wrote:
> >
> > It does. AFAIR for practical strings on x64 its only around 3 times slower than
> > theoretical lower bound.
> >
> > A lower bound is strlen call as you must find terminating null when
> > needle isn't present.
> >
> > I hope that main idea was clear and you will
> > use it. Code I send is complex as it contains ten different speedups
> > that improve performance.
> >
> >
> >> In my optimization there are totally three branches.
> >> case 1# if needle len >= 8 (common case)
> >> case 2# if needle < 8 (common case)
> >> case 3# For page boundary cross haystack. (rare case)
> >>
> > Checking if its case 1 or 2 is relatively costy as in practice haystacks
> > tend to be short. Most apps that use strstr have 90% call haystack less than 64 bytes long.
> >
> > You don't need to know needle size if you know that it doesn't contain
> > first pair. Also its rare that it contains first pair of characters twice.
> strlen of needle is needed in all the cases so as to make sure haystack 
> is longer than needle
> as part of initial check before proceeding to search. Once this check is 
> passed, strchr is
> called to find the  position of first character.
> 
I think we should move on with (commit) Raji's patch as is. 

While we appreciate you suggestions Ondrej, not all optimization apply
equally to all platforms. 

> >
> >> For case 1, I always compare 8 characters at a time using cmpb instruction.
> >> -> perform basic strlen, strnlen, checks. Move haystack based on
> >> strchr result.
> >> -> Load first 8 bytes(taking care of aligned load)from haystack and
> >> needle and compare them.
> >> If they are same proceed to next 8 bytes of haystack and needle.
> >> Else load next doubleword from haystack.Form a doubleword with last
> >> seven characters from first load and first character from
> >> second load and compare again with needle. Proceed in a similar way.
> >> This gives good improvement.
> >>
> > Thats looks worse than naive algorithm on practical strings. Most
> > instructions you do are useless as you have mismatch at first character.
> >
> > Following function should beat it.
> >
> > char *strstr (char *s, char *n)
> > {
> >    if (n[0] == 0)
> >      return s;
> >    while (s = strchr (s, n[0]))
> >      {
> >        long i;
> >        for (i=1; n[i] && n[i] == s[i]; i++);
> >        if (n[i] == 0)
> >          return s;
> >        s++;
> >      }
> > }
> Calling strchr in a loop is not showing  improvement when compared to 
> the proposed patch.
> >
> > Or use musl strstr idea. While implementation is terrible as it in
> > practice unnecessary spends 90% time on preprocessing their loop is
> > relatively good for portable implementation.
> >
> >
> 



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]