This is the mail archive of the libc-locales@sourceware.org mailing list for the GNU libc locales project.



[Bug localedata/22336] cs_CZ LC_COLLATE does not use i18n


https://sourceware.org/bugzilla/show_bug.cgi?id=22336

--- Comment #1 from Mike FABIAN <maiku.fabian at gmail dot com> ---
Created attachment 10632
  --> https://sourceware.org/bugzilla/attachment.cgi?id=10632&action=edit
0001-cs_CZ-locale-Base-collation-on-iso14651_t1-BZ-22336.patch

Patch to fix the problem.

Difference in the sorting of the test file added by the patch,
before and after applying the patch's sorting changes:

$ diff -u cs_CZ.UTF-8.in.old cs_CZ.UTF-8.in 
--- cs_CZ.UTF-8.in.old  2017-11-24 16:47:52.688348458 +0530
+++ cs_CZ.UTF-8.in      2017-11-24 16:42:08.364221944 +0530
@@ -1,7 +1,3 @@
-ȥ
-Ȥ
-ʒ
-Ʒ
 a
 a
 a
@@ -65,7 +61,6 @@
 cenných
 cenným
 cenou
-cH
 cvrček
 cz
 cZ
@@ -94,8 +89,9 @@
 H
 hruška
 ch
-CH
+cH
 Ch
+CH
 chřestýšům
 Chřestýšům
 chřipka
@@ -188,6 +184,8 @@
 Z
 ź
 Ź
+ȥ
+Ȥ
 za
 Za
 źa
@@ -209,6 +207,8 @@
 Žb
 žluva
 Žluva
+ʒ
+Ʒ
 0
 1
 1

I think "cH" was sorted completely wrong before, and "CH" slightly
wrong as well. So this patch seems not only to base the Czech
LC_COLLATE implementation on the iso14651_t1 file, as requested in
this bug, but also to improve the sorting of the uppercase/lowercase
variants of the ch digraph.
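
To illustrate the digraph ordering seen in the diff above (ch, cH, Ch, CH
all sorting together after h), here is a minimal Python sketch. It is not
how glibc implements collation: the ALPHABET table, the helper names and
the two-level key (letter rank, then case) are assumptions made for
illustration, and accent differences are ignored entirely.

```python
import unicodedata

# Simplified Czech primary alphabet (an assumption for this sketch):
# "ch" counts as one letter placed between "h" and "i".
ALPHABET = ["a", "b", "c", "č", "d", "e", "f", "g", "h", "ch", "i", "j",
            "k", "l", "m", "n", "o", "p", "q", "r", "ř", "s", "š", "t",
            "u", "v", "w", "x", "y", "z", "ž"]
RANK = {letter: i for i, letter in enumerate(ALPHABET)}


def elements(word):
    """Split a word into collation elements; c+h in any case is one element."""
    out, i = [], 0
    while i < len(word):
        if word[i:i + 2].lower() == "ch":
            out.append(word[i:i + 2])
            i += 2
        else:
            out.append(word[i])
            i += 1
    return out


def base(element):
    """Map an element to its primary letter, dropping accents not in ALPHABET."""
    low = element.lower()
    if low in RANK:
        return low
    return unicodedata.normalize("NFD", low)[0]


def sort_key(word):
    """Primary level: letter ranks. Last level: case, lowercase first."""
    elems = elements(word)
    primary = [RANK.get(base(e), len(ALPHABET)) for e in elems]
    case = [tuple(ch.isupper() for ch in e) for e in elems]
    return (primary, case)


words = ["CH", "Ch", "hruška", "cH", "H", "ch", "chřipka", "cvrček", "cz"]
print(sorted(words, key=sort_key))
```

With this key, all four case variants of the digraph share one primary
weight, so they are separated only by case, which is the behaviour the
new test file shows.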

And of course it improves the sorting of some non-Czech characters
like ʒ and ȥ because these were not handled at all in the old
Czech LC_COLLATE implementation.
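
One way to check the ordering against an installed locale is to sort with
the C library's collation through Python's locale module. This is only a
sketch: the function name sort_with_locale is hypothetical, it assumes a
cs_CZ.UTF-8 locale is installed, and the result depends on the glibc
version providing that locale.

```python
import locale


def sort_with_locale(words, locale_name="cs_CZ.UTF-8"):
    """Sort words by the named locale's LC_COLLATE rules.

    Returns None if the locale is not available on this system.
    """
    previous = locale.setlocale(locale.LC_COLLATE)  # query current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        return sorted(words, key=locale.strxfrm)
    finally:
        # Restore whatever collation locale was active before the call.
        locale.setlocale(locale.LC_COLLATE, previous)


print(sort_with_locale(["za", "ȥ", "Z", "ʒ", "Žb", "0"]))
```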

