This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

RE: [RFC] Statistics of non-ASCII characters in strings

From: "Wilco Dijkstra" <wdijkstr at arm dot com>
To: "'Rich Felker'" <dalias at libc dot org>
Cc: <libc-alpha at sourceware dot org>
Date: Mon, 5 Jan 2015 15:40:42 -0000
Subject: RE: [RFC] Statistics of non-ASCII characters in strings
Authentication-results: sourceware.org; auth=none
References: <001401d01df6$0f7cc5a0$2e7650e0$ at com> <E1Y34Yu-0004LC-KH at fencepost dot gnu dot org> <A610E03AD50BFC4D95529A36D37FA55E38C7897324 at GEORGE dot Emea dot Arm dot com> <20141222225044 dot GU4574 at brightrain dot aerifal dot cx> <A610E03AD50BFC4D95529A36D37FA55E38C7897326 at GEORGE dot Emea dot Arm dot com> <20141223002521 dot GV4574 at brightrain dot aerifal dot cx>

> Rich Felker wrote:
> On Mon, Dec 22, 2014 at 11:53:39PM +0000, Wilco Dijkstra wrote:
> > Rich Felker wrote:
> > > It's not quite clear to me from your reply, but I get the impression
> > > you're comparing an ASCII-optimized strlen to a non-optimized one,
> > > rather than comparing it to the alternate optimization that also works
> > > for non-ASCII bytes. This is a standard implementation trade-off, not
> > > Aarch64-specific, and in general the right answer is to use the
> > > slightly more expensive code that works for all bytes rather than the
> > > version that has to take a slow-path when it gets a false-positive nul
> > > terminator on non-ASCII bytes.
> >
> > No the comparison is between the existing optimized version (which
> > already processes 16 characters per iteration) and an even more optimized
> > version. Some of the speedup is due to my trick to process ASCII strings faster.
> > Non-ascii strings are still at least as fast as the original version (usually much
> > faster), so it's definitely not a slow path, but rather a fast path plus an
> > extremely fast path.
> 
> Can you clarify how you're doing this? Since you can't know a priori
> whether the data will be ASCII or not, I don't understand how you can
> first use a test that gives the wrong result for non-ASCII and then a
> secondary test to get the right result without hurting performance for
> non-ASCII.

I'll post my patch later, but the fast check is a subset of the slower,
more accurate check, so if the fast check fails it has executed just 3
extra instructions compared to doing only the accurate check. If the
accurate check detects no zeroes, I use a 2nd loop as a fallback for
non-ASCII strings. This only uses the accurate check so that you don't 
pay this overhead more than once per call.

So if we get mostly ASCII strings we always use the fastest case. 
If we get mostly non-ASCII strings we only pay the cost of ~1 cycle for 
the additional instructions (which is more than made up for by various 
other improvements, so we're never worse off than the previous version).

That means the only hard case for the branch predictor is if there are
a similar number of ASCII and non-ASCII strings and they happen randomly.
If the figure of >98% strings being ASCII 
(http://sourceware.org/ml/libc-alpha/2014-12/msg00879.html) is true for 
GLIBC string functions too then all is well. 

> Of course the standard filesystem paths are the same, but the result
> of things like strlen("/usr/local/bin") is not the important case.
> What's much more likely to matter is string operations encountered
> processing symbol tables/relocations, parsing source code or
> text-based data, etc. These are cases where you're going to encounter
> a lot of ASCII-only strings regardless of the user's language.

A fast parser/scanner typically does not use any library functions.
Last time I wrote a scanner, I computed the length, hash and whether a
character was a valid identifier char using just 4 instructions per
character.

> > (I believe OndÃej had some
> > results that show string functions are most often used on path names).
> 
> Even if this is true it's likely to be irrelevant since strlen's on
> pathnames are almost certainly accompanied by syscalls on those
> pathnames, and the syscall overhead dominates the cost of the strlen.
> (I wonder if that result was for the kernel rather than userspace,
> though..?)

Overall string functions are likely a tiny percentage of total time. 
But that doesn't mean speeding them up is not beneficial - a faster strlen
would still reduce overall number of cycles to do a given task (eg. OndÃej
mentioned repeated strlen/strcpy of 500 byte long PATH variable).

Wilco

Follow-Ups:
- Re: [RFC] Statistics of non-ASCII characters in strings
  - From: OndÅej BÃlka
- Re: [RFC] Statistics of non-ASCII characters in strings
  - From: keld

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]