This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [RFC] Statistics of non-ASCII characters in strings


On Tue, Dec 23, 2014 at 06:25:07PM +0300, Alexander Monakov wrote:
> 
> 
> On Tue, 23 Dec 2014, Florian Weimer wrote:
> > Why can't you do the equivalent of
> > 
> >   X = ((X & 0x80) >> 1) | (X & 0x7F);
> > 
> > before the new check?  Does this lengthen the dependency chain too much?
> 
> If I understood the previous discussion correctly, there's another possibility.
> Wilco's proposal is to use a zero-byte matcher that would give a false
> positive on byte 0x80.  One can use such a matcher to skip from the beginning of
> the string to the first occurrence of either 0x0 or 0x80 in the string, and then
> continue with normal strlen from there.

This sounds like a very good approach.

Rich
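
For illustration only, here is a minimal sketch of the skip-then-strlen idea
quoted above. It is not code from the thread or from glibc. The names
has_zero_or_0x80 and sketch_strlen are hypothetical, and the exact matcher is
an assumption: the thread only says that the matcher gives a false positive on
0x80, so this sketch uses one that fires when the low 7 bits of a byte are all
zero, that is, on 0x00 and 0x80 only. Like word-at-a-time string functions in
general, it loads whole aligned words and may therefore read bytes after the
terminator within the same aligned word, which strictly portable C does not
guarantee to be valid.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONES  ((uint64_t)0x0101010101010101ULL)
#define LOW7  (ONES * 0x7f)   /* 0x7f in every byte */
#define HIGHS (ONES * 0x80)   /* 0x80 in every byte */

/* Assumed matcher: nonzero iff some byte of x is 0x00 or 0x80.
   (b & 0x7f) + 0x7f sets bit 7 exactly when the low 7 bits of b are
   nonzero, and the sum is at most 0xfe, so no carry crosses a byte. */
static int has_zero_or_0x80(uint64_t x)
{
    return (~((x & LOW7) + LOW7) & HIGHS) != 0;
}

/* Hypothetical strlen variant: skip quickly over words that contain
   neither 0x00 nor 0x80, then let the normal strlen finish. */
size_t sketch_strlen(const char *s)
{
    const char *p = s;

    /* Go byte by byte until p is word-aligned, so that the word loads
       below never straddle a page boundary. */
    while ((uintptr_t)p % sizeof(uint64_t) != 0) {
        if (*p == '\0')
            return (size_t)(p - s);
        p++;
    }

    /* Fast skip: stop at the first word holding a 0x00 or 0x80 byte. */
    for (;;) {
        uint64_t w;
        memcpy(&w, p, sizeof w);
        if (has_zero_or_0x80(w))
            break;
        p += sizeof w;
    }

    /* The candidate may be a false positive (0x80), so continue with
       the normal strlen from the start of this word. */
    return (size_t)(p - s) + strlen(p);
}
```

For a string with no 0x80 byte the fast loop runs all the way to the word
holding the terminator; otherwise the fallback takes over at the first word
that contains 0x80, so the result should always agree with strlen.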

