This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Re: [COMMITTED] lt_LT locale: Base collation on copy "iso14651_t1" [BZ #22524]

From: Aurelien Jarno <aurelien at aurel32 dot net>
To: GNU C Library <libc-alpha at sourceware dot org>
Date: Thu, 14 Dec 2017 15:09:12 +0100
Subject: Re: [COMMITTED] lt_LT locale: Base collation on copy "iso14651_t1" [BZ #22524]
Authentication-results: sourceware.org; auth=none
References: <s9dmv2t1zim.fsf@taka.site> <20171213203626.GA13829@aurel32.net> <s9dind91xun.fsf@taka.site>

On 2017-12-14 10:07, Mike FABIAN wrote:
> Aurelien Jarno <aurelien@aurel32.net> wrote:
> 
> > On 2017-12-08 07:53, Mike FABIAN wrote:
> >> 
> >>             [BZ #22524]
> >>             * localedata/Makefile: Add lt_LT.UTF-8 to test-input
> >>             and to the list of locales to be built for testing.
> >>             * localedata/lt_LT.UTF-8.in: New file for testing the collation.
> >>             * localedata/locales/lt_LT (LC_COLLATE): Use “copy "iso14651_t1"”
> >>             and build the collation rules upon that.
> >
> > The lt_LT locale and a few other ones (et_EE and tr_TR) used to sort
> > upper case letters before lower case ones. Basing the collation on
> > iso14651_t1 actually changes that. I don't know if the change is
> > intentional or not.
> 
> Yes, I know. In some locales I kept it, for example in et_EE I kept
> it by adding something like this:
> 
>     % Uppercase first:
>     % (This is not in the CLDR rules, but the old et_EE locale before I based
>     % the collation on iso14651_t1 did uppercase first. I don’t know whether
>     % there is a good reason for this, but let’s keep it for the moment.
>     % This reimplementation of the Estonian sorting just reproduces the same
>     % order as before (except fixing some bugs,
>     % see: https://sourceware.org/bugzilla/show_bug.cgi?id=22517#c1)).
>     reorder-after <RES-1>
>     <CAP>
>     <MIN>
> 
> But actually CLDR sorts upper case first only for 3 languages:
> 
>     mfabian@taka:/local/mfabian/src/cldr-svn/trunk/common/collation
>     $ grep 'caseFirst upper' *
>     cu.xml:[caseFirst upper]
>     da.xml:                                 [caseFirst upper]
>     mt.xml:[caseFirst upper]  # DMS MSA 200:2009
> 
> So I’ll certainly keep it for our Danish locale. (I am currently
> updating localedata/locales/iso14651_t1_common to the latest
> version released by ISO, see: https://www.iso.org/standard/68309.html.
> To do that I have to adapt our collation rules in many locales,
> including da_DK.)
> 
> If we trust CLDR, I think we should do uppercase first only for the
> above 3 languages.
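
For readers unfamiliar with what "uppercase first" changes in practice, here is a minimal sketch. It is not glibc's collation algorithm and does not read any locale file; `collation_key` and its `upper_first` parameter are hypothetical names used only for illustration. It assumes a simplified two-level comparison: letters are compared case-insensitively first, and case is used only to break ties.

```python
def collation_key(word, upper_first):
    """Toy two-level sort key (an illustration, not glibc's implementation)."""
    # First level: compare the letters without regard to case.
    primary = word.lower()
    # Tie-break level: per-character case flags, consulted only when the
    # first level is equal. The preferred case maps to 0 so it sorts first.
    if upper_first:
        case_level = [0 if ch.isupper() else 1 for ch in word]
    else:
        case_level = [0 if ch.islower() else 1 for ch in word]
    return (primary, case_level)


words = ["abc", "Abc", "ABC", "abd"]

# Uppercase first, as the old lt_LT, et_EE and tr_TR locales are said to do.
upper = sorted(words, key=lambda w: collation_key(w, True))

# Lowercase first, the behaviour the thread attributes to plain iso14651_t1.
lower = sorted(words, key=lambda w: collation_key(w, False))

print(upper)
print(lower)
```

Only the relative order of strings that differ in case alone changes; "abd" sorts last in both variants because the case-insensitive comparison decides it first.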
 
OK, thanks for the explanation. If the change is intentional, then it's
all fine. And thanks for the good work on locales.

Aurelien

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurelien@aurel32.net                 http://www.aurel32.net
