This is the mail archive of the mailing list for the glibc project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 4/*] Generic string memchr and strnlen

On Tue, Jul 28, 2015 at 02:07:35PM +0100, Wilco Dijkstra wrote:
> > OndÅej BÃlka wrote:
> > On Mon, Jul 27, 2015 at 02:42:47PM -0400, Chris Metcalf wrote:
> > > On 07/27/2015 12:56 PM, OndÅej BÃlka wrote:
> > > >Best example is strrchr where almost all architectures got it wrong
> > > >first time until I tell them otherwise. Problem is that they calculate
> > > >position of byte at each iteration instead just at end which is
> > > >ineffective. Now tile and powerpc have this problem. My proposed
> > > >alternative
> > > >
> > > >return memrchr (s, c, strlen (s) + 1);
> > > >
> > > >would beat these on suitable inputs.
> > >
> > > Your comment and your code snippet don't really match, so can I ask
> > > you to elucidate a little what your concern is?
> > >
> > It is mix of two concerns, so its both First a snippet was there
> > mainly to show that generic code could be faster. I mentioned that
> > it applies for some inputs, as it would need larger inputs than usual
> > to overcome initialization and call overhead of assembly one.
> I believe the above sequence is the right one to use in the generic code,
> especially when you have an optimized memrchr that scans backwards
> (simpler inner loop as you quit at first match). The current generic
> implementation repeatedly calls strchr so will be very slow if there
> are multiple matches.
Then could you review a generic patch that I am about to ping?
> > > So I'm not sure which point you are making but unless you know
> > > something about the average distribution of the characters in the
> > > strrchr() string to suggest they are likely to occur in the last third
> > > of the string more than 50% of the time, I don't think I'm convinced.
> So what we really need is better statistics on strrchr and friends. Looking
> at GLIBC sources, it seems that >90% search for '/' in a path. 
> Do you have actual stats for strrchr OndÅej? That would really help solving
> this issue. What I'd like to know is:
> 1. Average length of the strings
> 2. What percentage fails to match
> 3. Average number of matches per string if it matches at least once
> 4. Average relative position within string of last match
> With that info we could create a simple patch for bench-strrchr.c to make
> it use realistic inputs. Then based on that we can fix the generic code and
> let maintainers further tune their optimized implementations if they do not
> beat the generic code.

Yes, I used dryrun for that. You don't need to model microbenchmark from
that data which could go wrong in several ways but use dryrun to
directly replay these.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]