This is the mail archive of the libc-locales@sourceware.org mailing list for the GNU libc locales project.
[Bug localedata/21547] New: Tibetan script collation broken (Dzongkha and Tibetan)
- From: "elie dot roux at telecom-bretagne dot eu" <sourceware-bugzilla at sourceware dot org>
- To: libc-locales at sourceware dot org
- Date: Mon, 05 Jun 2017 10:32:15 +0000
- Subject: [Bug localedata/21547] New: Tibetan script collation broken (Dzongkha and Tibetan)
- Auto-submitted: auto-generated
https://sourceware.org/bugzilla/show_bug.cgi?id=21547
Bug ID: 21547
Summary: Tibetan script collation broken (Dzongkha and Tibetan)
Product: glibc
Version: 2.24
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: localedata
Assignee: unassigned at sourceware dot org
Reporter: elie.roux@telecom-bretagne.eu
CC: libc-locales at sourceware dot org
Target Milestone: ---
Hello,
Sorting Tibetan or Dzongkha text does not work properly with the current locale data.
With the following test file:
$ cat tibt_order_test.txt
ལྔ
ང
ཅ
རྔ
སྔ
བརྔ
བསྔ
I get the following wrong result:
$ LC_COLLATE="dz_BT.utf8" sort tibt_order_test.txt
ང
བརྔ
བསྔ
རྔ
ལྔ
སྔ
ཅ
The correct result would be:
ང
རྔ
ལྔ
སྔ
བརྔ
བསྔ
ཅ
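The comparison above can be scripted. The following is a minimal sketch, not part of the original report: the file name tibt_order_expected.txt and the LOC variable are hypothetical, and it sets LC_ALL rather than LC_COLLATE (an assumption on my side, so that an LC_ALL already present in the environment cannot override the collation locale). It skips the check when the locale is not installed.

```shell
#!/bin/sh
# Locale to test; defaults to the one used in the report (LOC is a hypothetical name).
LOC=${1:-dz_BT.utf8}

# The seven test syllables from the report, in the original (unsorted) file order.
printf '%s\n' ལྔ ང ཅ རྔ སྔ བརྔ བསྔ > tibt_order_test.txt

# The order the report states as correct.
printf '%s\n' ང རྔ ལྔ སྔ བརྔ བསྔ ཅ > tibt_order_expected.txt

# Only compare if the locale is actually installed on this system.
if locale -a 2>/dev/null | grep -qx "$LOC"; then
    # diff exits non-zero when the sorted output deviates from the expected order.
    if LC_ALL="$LOC" sort tibt_order_test.txt | diff -u tibt_order_expected.txt -; then
        echo "collation matches expected order for $LOC"
    else
        echo "collation differs from expected order for $LOC"
    fi
else
    echo "$LOC not installed; skipping comparison"
fi
```

Passing another locale name as the first argument (for example bo_CN.utf8) runs the same check against that locale.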
The dz and bo locales have the same collation data in CLDR.
See https://github.com/eroux/tibetan-collation for more on Tibetan collation.
Result of locale -a:
bo_CN
bo_CN.utf8
bo_IN
bo_IN.utf8
C
C.UTF-8
dz_BT
dz_BT.utf8
en_GB.utf8
en_US.utf8
fr_FR.utf8
POSIX
Thank you,
--
Elie