This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: strxfrm output stability
- From: Zack Weinberg <zackw at panix dot com>
- To: libc-alpha at sourceware dot org
- Date: Wed, 9 Sep 2015 12:54:36 -0400
- Subject: Re: strxfrm output stability
- References: <55EF4F95 dot 4020703 at redhat dot com> <20150908211805 dot 36E5E2C3A73 at topped-with-meat dot com> <55EF529E dot 7070108 at redhat dot com> <55EF5494 dot 8030506 at cs dot ucla dot edu> <55F01BDC dot 70908 at redhat dot com> <55F06230 dot 8080003 at cs dot ucla dot edu>
On 09/09/2015 12:45 PM, Paul Eggert wrote:
> Florian Weimer wrote:
>>> I'll go out on a limb and say that no sane application uses strxfrm,
>>> either on disk or off.
>> PostgreSQL uses it to avoid calling strcoll on strings which have
>> distinctly ordered prefixes in their strxfrm output.
>
> Good catch, and that is what I get for going out on a limb.
>
> Although after looking at it a bit, it is not a true counterexample, as
> the PostgreSQL code is crazy. For example, it inspects strxfrm output
> for upper-case ASCII letters?! Overall PostgreSQL's use of strxfrm has
> the smell of someone applying theory from the 1970s without having
> measured whether performance actually improves in the typical case
> nowadays, and it might be amusing to replace PostgreSQL's use of strxfrm
> with a function that always returns the empty string; the code would
> still work, and might even run faster.
https://lwn.net/Articles/653411/ indicates that PostgreSQL's use of
strxfrm is new in 9.5 -- formerly only strcoll was used -- and provided
a dramatic performance improvement. See
http://pgeoghegan.blogspot.com/2015/01/abbreviated-keys-exploiting-locality-to.html
for additional detail.
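
For anyone unfamiliar with the technique, here is a minimal sketch of the
abbreviated-key idea discussed above: compare fixed-size prefixes of the
strxfrm output with memcmp, and call strcoll only when the prefixes tie.
This is not PostgreSQL's code. ABBREV_LEN, make_abbrev_key and
abbrev_compare are hypothetical names, and the key is rebuilt on every
call for brevity, whereas a real sort would presumably compute it once
per string and cache it. The sketch also transforms the whole string and
then truncates, because ISO C leaves the destination indeterminate when
strxfrm's result does not fit.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { ABBREV_LEN = 8 };  /* hypothetical prefix size, chosen for illustration */

/* Store the first ABBREV_LEN bytes of strxfrm(s) in key, zero-padded.
   Returns 0 on success, -1 if the temporary buffer cannot be allocated. */
static int make_abbrev_key(const char *s, unsigned char key[ABBREV_LEN])
{
    size_t need = strxfrm(NULL, s, 0);   /* length of the full transform */
    char *full = malloc(need + 1);
    if (full == NULL)
        return -1;
    strxfrm(full, s, need + 1);          /* full transform, NUL-terminated */
    memset(key, 0, ABBREV_LEN);
    memcpy(key, full, need < ABBREV_LEN ? need : ABBREV_LEN);
    free(full);
    return 0;
}

/* Compare a and b in the current LC_COLLATE locale.  The cheap prefix
   comparison decides most cases; strcoll is the authoritative tiebreak
   and also the fallback if a key could not be built. */
static int abbrev_compare(const char *a, const char *b)
{
    unsigned char ka[ABBREV_LEN], kb[ABBREV_LEN];
    assert(a != NULL && b != NULL);
    if (make_abbrev_key(a, ka) == 0 && make_abbrev_key(b, kb) == 0) {
        int c = memcmp(ka, kb, ABBREV_LEN);
        if (c != 0)
            return c;                    /* prefixes differ: order is known */
    }
    return strcoll(a, b);                /* tie or failure: full comparison */
}
```

A caller would normally run setlocale(LC_COLLATE, "") first; without it
the "C" locale is in effect and the result matches strcmp. Note that a
key function returning all zeros would still sort correctly, since every
comparison would fall through to strcoll, which is the sense in which
Paul's empty-string replacement "would still work".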
zw