This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Re: [PATCH] cs_CZ locale: fix collation [BZ #22336]


On 11/24/2017 03:47 AM, Mike FABIAN wrote:
> 
>             [BZ #22336]
>             * localedata/locales/cs_CZ (LC_COLLATE): Use “copy "iso14651_t1"”
>             and implement the collation rules for cs from CLDR on top of that.
>             * Makefile: Add cs_CZ.UTF-8 to test-input and to the list
>             of locales to be built for testing.
>             * cs_CZ.UTF-8.in: New file with test data to test the Czech sorting.
> 
> Difference in sorting of the added test file (added in the patch)
> before and after applying the sorting changes of the patch:
> 
> $ diff -u cs_CZ.UTF-8.in.old cs_CZ.UTF-8.in 
> --- cs_CZ.UTF-8.in.old  2017-11-24 16:47:52.688348458 +0530
> +++ cs_CZ.UTF-8.in      2017-11-24 16:42:08.364221944 +0530
> @@ -1,7 +1,3 @@
> -ȥ
> -Ȥ
> -ʒ
> -Ʒ
>  a
>  a
>  a
> @@ -65,7 +61,6 @@
>  cenných
>  cenným
>  cenou
> -cH
>  cvrček
>  cz
>  cZ
> @@ -94,8 +89,9 @@
>  H
>  hruška
>  ch
> -CH
> +cH
>  Ch
> +CH
>  chřestýšům
>  Chřestýšům
>  chřipka
> @@ -188,6 +184,8 @@
>  Z
>  ź
>  Ź
> +ȥ
> +Ȥ
>  za
>  Za
>  źa
> @@ -209,6 +207,8 @@
>  Žb
>  žluva
>  Žluva
> +ʒ
> +Ʒ
>  0
>  1
>  1
> 
> I think "cH" was sorted completely wrong before (among the "c" entries,
> between "cenou" and "cvrček") and "CH" slightly wrong as well (before
> "Ch" instead of after it). So this patch seems not only to base the
> Czech LC_COLLATE implementation on the iso14651_t1 file, as requested
> in this bug, but also to improve the sorting of the uppercase/lowercase
> variants of the ch digraph.
> 
> And of course it improves the sorting of some non-Czech characters
> like ʒ and ȥ because these were not handled at all in the old
> Czech LC_COLLATE implementation.
> 

Looks good to me, and yes, it fixes what looks like incorrect sorting of
the case variants of the ch digraph.
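
For readers unfamiliar with the ch digraph ordering discussed above, here is
a minimal Python sketch of the idea. It is not the glibc implementation and
does not read the cs_CZ locale file; the names ALPHABET, elements, base_letter
and czech_key are hypothetical and exist only for this illustration. It
assumes a simplified Czech alphabet in which "ch" is one letter sorted after
"h", accents other than č, ř, š, ž are a secondary difference, and case is a
tertiary difference with lowercase first.

```python
import unicodedata

# Simplified Czech alphabet (an assumption for this sketch): "ch" is a single
# letter placed after "h"; č, ř, š, ž are letters of their own.
ALPHABET = ["a", "b", "c", "č", "d", "e", "f", "g", "h", "ch", "i", "j", "k",
            "l", "m", "n", "o", "p", "q", "r", "ř", "s", "š", "t", "u", "v",
            "w", "x", "y", "z", "ž"]
RANK = {letter: i for i, letter in enumerate(ALPHABET)}


def elements(word):
    """Split a word into collation elements; c+h in any case is one element."""
    out = []
    i = 0
    while i < len(word):
        if word[i:i + 2].lower() == "ch":
            out.append(word[i:i + 2])
            i += 2
        else:
            out.append(word[i])
            i += 1
    return out


def base_letter(element):
    """Return the lowercase letter that decides the primary weight."""
    low = element.lower()
    if low in RANK:
        return low
    # Other accented letters (e.g. ý, ů) fall back to their unaccented base.
    decomposed = unicodedata.normalize("NFD", low)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def czech_key(word):
    """Three-level sort key: letters first, then accents, then case."""
    elems = elements(unicodedata.normalize("NFC", word))
    # Characters outside the alphabet get one shared rank after all letters.
    primary = [RANK.get(base_letter(e), len(ALPHABET)) for e in elems]
    secondary = [e.lower() != base_letter(e) for e in elems]
    tertiary = [tuple(ch.isupper() for ch in e) for e in elems]
    return (primary, secondary, tertiary)


words = ["CH", "chřestýšům", "Ch", "hruška", "cz", "cH", "H", "ch", "cenou"]
print(sorted(words, key=czech_key))
# ['cenou', 'cz', 'H', 'hruška', 'ch', 'cH', 'Ch', 'CH', 'chřestýšům']
```

Under these assumptions all four case variants of the digraph sort after
"hruška" and before "chřestýšům", in the order ch, cH, Ch, CH, which matches
the order in the patched test file quoted above.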

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

-- 
Cheers,
Carlos.

