This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.
[Bug locale/22898] Some Chinese characters cannot be sorted by adding sorting rules to LC_COLLATE
- From: "maiku.fabian at gmail dot com" <sourceware-bugzilla at sourceware dot org>
- To: glibc-bugs at sourceware dot org
- Date: Mon, 26 Feb 2018 07:43:49 +0000
- Subject: [Bug locale/22898] Some Chinese characters cannot be sorted by adding sorting rules to LC_COLLATE
- Auto-submitted: auto-generated
- References: <bug-22898-131@http.sourceware.org/bugzilla/>
https://sourceware.org/bugzilla/show_bug.cgi?id=22898
--- Comment #1 from Mike FABIAN <maiku.fabian at gmail dot com> ---
diff --git a/localedata/en_GB.UTF-8.in b/localedata/en_GB.UTF-8.in
new file mode 100644
index 0000000000..b365767bac
--- /dev/null
+++ b/localedata/en_GB.UTF-8.in
@@ -0,0 +1,10 @@
+a
+A
+ĉ
+Ĉ
+𠮞 ; <U00020B9E>
+𫡅 ; <U0002B845>
So the test file expects U+2B845 to be sorted at this position.
+b
+B
+c
+C
diff --git a/localedata/locales/en_GB b/localedata/locales/en_GB
index 5b895574ac..e114a3a440 100644
--- a/localedata/locales/en_GB
+++ b/localedata/locales/en_GB
@@ -60,6 +60,19 @@ END LC_CTYPE
LC_COLLATE
% Copy the template from ISO/IEC 14651
copy "iso14651_t1"
+
+collating-symbol <ccirc>
+
+reorder-after <AFTER-A>
+<ccirc>
+
+<U0108> <ccirc>;<BASE>;<CAP>;<U0108>
+<U0109> <ccirc>;<BASE>;<MIN>;<U0109>
+<U00020B9E> <ccirc>;<BASE>;<CAP>;<U00020B9E>
+<U0002B845> <ccirc>;<BASE>;<CAP>;<U0002B845>
Here we have a rule to sort U+2B845 like the collating symbol <ccirc>, which is
reordered after the Latin letter a.
+
+reorder-end
+
END LC_COLLATE
LC_MONETARY
But when running "make check" one gets:
$ grep ^FAIL tests.sum
FAIL: localedata/sort-test
And the test output contains:
en_GB.UTF-8 collate-test FAIL
--- en_GB.UTF-8.in 2018-02-26 10:53:50.810558237 +0100
+++ /local/mfabian/src/glibc-build/localedata/en_GB.UTF-8.out 2018-02-26 13:36:16.922398151 +0100
@@ -1,9 +1,9 @@
+𫡅 ; <U0002B845>
a
A
ĉ
Ĉ
𠮞 ; <U00020B9E>
-𫡅 ; <U0002B845>
b
B
c
So U+20B9E is sorted as expected, but U+2B845 is not. U+2B845 is sorted as if
there were no rules at all for this character, so it ends up before a.
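The same comparison can be sketched outside the glibc test suite. The following
is only a rough illustration, not the code that sort-test runs. It assumes the
modified en_GB locale has been compiled and is visible to the process (for
example through LOCPATH). The helper names sort_lines and misplaced are
hypothetical, and only the bare characters are sorted, without the
" ; <U...>" suffixes used in the test file.

```python
import locale


def sort_lines(lines, locale_name):
    """Return lines sorted by the LC_COLLATE rules of locale_name."""
    old = locale.setlocale(locale.LC_COLLATE)  # remember the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
        return sorted(lines, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, old)  # always restore it


def misplaced(expected, actual):
    """Return (index, expected, actual) for every position that differs."""
    return [(i, e, a) for i, (e, a) in enumerate(zip(expected, actual)) if e != a]


if __name__ == "__main__":
    # Expected order taken from en_GB.UTF-8.in in the patch above.
    expected = ["a", "A", "\u0109", "\u0108", "\U00020B9E", "\U0002B845",
                "b", "B", "c", "C"]
    try:
        actual = sort_lines(expected, "en_GB.UTF-8")
    except locale.Error:
        print("en_GB.UTF-8 is not available to this process")
    else:
        for index, want, got in misplaced(expected, actual):
            # Print code points only, to avoid terminal encoding problems.
            print(f"position {index}: expected U+{ord(want):04X}, "
                  f"got U+{ord(got):04X}")
```

If the rule for U+2B845 were honoured, the loop should print nothing. With the
behaviour reported here, one would expect mismatches starting at position 0,
because U+2B845 moves to the front of the list.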