This is the mail archive of the
libc-locales@sourceware.org
mailing list for the GNU libc locales project.
[Bug localedata/17750] wrong collation order of diacritics in most locales
- From: "keld at keldix dot com" <sourceware-bugzilla at sourceware dot org>
- To: libc-locales at sourceware dot org
- Date: Tue, 23 Dec 2014 18:11:54 +0000
- Subject: [Bug localedata/17750] wrong collation order of diacritics in most locales
- Auto-submitted: auto-generated
- References: <bug-17750-716 at http dot sourceware dot org/bugzilla/>
https://sourceware.org/bugzilla/show_bug.cgi?id=17750
--- Comment #2 from keld at keldix dot com <keld at keldix dot com> ---
On Tue, Dec 23, 2014 at 04:25:27AM +0000, aoliva at sourceware dot org wrote:
> https://sourceware.org/bugzilla/show_bug.cgi?id=17750
>
> Bug ID: 17750
> Summary: wrong collation order of diacritics in most locales
> Product: glibc
> Version: unspecified
> Status: NEW
> Severity: normal
> Priority: P2
> Component: localedata
> Assignee: unassigned at sourceware dot org
> Reporter: aoliva at sourceware dot org
> CC: libc-locales at sourceware dot org
>
> http://www.unicode.org/reports/tr10/tr10-30.html states:
>
> <quote>
> Normally, all differences in sorting are assessed from the start to the end of
> the string. If all of the base letters are the same, the first accent
> difference determines the final order. In row 1 of Table 5, the first accent
> difference is on the o, so that is what determines the order. In some French
> dictionary ordering traditions, however, it is the last accent difference that
> determines the order, as shown in row 2.
> </quote>
>
> Table 5 says:
>
> <pre>
> Normal Accent Ordering cote < coté < côte < côté
> Backward Accent Ordering cote < côte < coté < côté
> </pre>
>
> However, glibc implements backward accent ordering for all locales except de_DE
> and lb_LU.
>
> Unicode CLDR 26 confirms this is wrong: the only file in
> http://unicode.org/cldr/trac/browser/tags/release-26/common/collation/ that has
> settings backwards="on" is fr_CA.xml.
This was probably done because if a string contains more than one accented
letter, the word or name is probably French, and then the French rules should
be followed.
This would mean that CLDR is wrong.
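
To make the difference between the two orderings in Table 5 concrete, here is
a minimal Python sketch. It is not how glibc or CLDR implement collation: the
helper names (split_levels, sort_key) are hypothetical, and the accent weights
are simply the code points of the combining marks, which is an assumption that
only happens to suffice for these four words. The primary level compares base
letters; the secondary level compares accents either front to back (normal) or
back to front (backward).

```python
import unicodedata

def split_levels(word):
    """Split a word into base letters and per-letter combining marks."""
    bases, accents = [], []
    for ch in unicodedata.normalize("NFC", word):
        decomposed = unicodedata.normalize("NFD", ch)
        bases.append(decomposed[0])     # primary level: the base letter
        accents.append(decomposed[1:])  # secondary level: "" if unaccented
    return bases, accents

def sort_key(word, backward=False):
    """Two-level key; 'backward' reverses only the accent level."""
    bases, accents = split_levels(word)
    if backward:
        accents = accents[::-1]
    return (bases, accents)

words = ["côté", "coté", "côte", "cote"]

normal = sorted(words, key=sort_key)
backward = sorted(words, key=lambda w: sort_key(w, backward=True))

print("normal:  ", " < ".join(normal))
print("backward:", " < ".join(backward))
```

With the normal key the first accent difference decides (cote, coté, côte,
côté); with the backward key the last accent difference decides (cote, côte,
coté, côté), matching the two rows of Table 5.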
Best regards
Keld