This is the mail archive of the libc-locales@sourceware.org mailing list for the GNU libc locales project.



[Bug localedata/21547] New: Tibetan script collation broken (Dzongkha and Tibetan)


https://sourceware.org/bugzilla/show_bug.cgi?id=21547

            Bug ID: 21547
           Summary: Tibetan script collation broken (Dzongkha and Tibetan)
           Product: glibc
           Version: 2.24
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: localedata
          Assignee: unassigned at sourceware dot org
          Reporter: elie.roux@telecom-bretagne.eu
                CC: libc-locales at sourceware dot org
  Target Milestone: ---

Hello,

Tibetan and Dzongkha sorting does not work properly with the current locale data.

With the following test file:

$ cat tibt_order_test.txt
ལྔ
ང
ཅ
རྔ
སྔ
བརྔ
བསྔ

I get the following wrong result:

$ LC_COLLATE="dz_BT.utf8" sort tibt_order_test.txt
ང
བརྔ
བསྔ
རྔ
ལྔ
སྔ
ཅ

The correct result would be:

ང
རྔ
ལྔ
སྔ
བརྔ
བསྔ
ཅ
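
The comparison above can also be scripted. The following is only a minimal sketch: the function name check_collation and the file holding the expected order are hypothetical and not part of glibc or coreutils. It sorts the input under the given LC_COLLATE value and diffs the result against the expected order.

```shell
#!/bin/sh
# Hypothetical helper (not part of glibc or coreutils): sort a file under
# a given collation locale and compare the result with an expected order.
#   $1 = locale name, e.g. dz_BT.utf8
#   $2 = input file with one entry per line
#   $3 = file listing the same entries in the expected order
check_collation() {
    # LC_ALL is emptied for this command only, because a non-empty LC_ALL
    # in the environment would override LC_COLLATE.
    # The exit status is that of diff: 0 when the order matches.
    LC_ALL= LC_COLLATE="$1" sort "$2" | diff -u "$3" -
}
```

Assuming the expected order shown above is saved as expected_order.txt (a hypothetical file name), the call would be: check_collation dz_BT.utf8 tibt_order_test.txt expected_order.txt. A non-zero exit status together with a diff means the sorted output differs from the expected order. If the named locale is not installed, sort may fall back to the C locale, so it is worth checking locale -a first.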

The dz and bo locales have the same collation data in CLDR.

See https://github.com/eroux/tibetan-collation for more on Tibetan collation.

Result of locale -a:

bo_CN
bo_CN.utf8
bo_IN
bo_IN.utf8
C
C.UTF-8
dz_BT
dz_BT.utf8
en_GB.utf8
en_US.utf8
fr_FR.utf8
POSIX

Thank you,
-- 
Elie

