This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.
[Bug localedata/13063] 'sort -u' will erase some Chinese characters
- From: "arthur200126 at gmail dot com" <sourceware-bugzilla at sourceware dot org>
- To: glibc-bugs at sourceware dot org
- Date: Sun, 22 Jan 2017 23:56:17 +0000
- Subject: [Bug localedata/13063] 'sort -u' will erase some Chinese characters
- Auto-submitted: auto-generated
- References: <bug-13063-131@http.sourceware.org/bugzilla/>
https://sourceware.org/bugzilla/show_bug.cgi?id=13063
Mingye Wang <arthur200126 at gmail dot com> changed:
What    |Removed    |Added
----------------------------------------------------------------------------
CC      |           |arthur200126 at gmail dot com
--- Comment #6 from Mingye Wang <arthur200126 at gmail dot com> ---
This bug is seen not only with CJK Extension A (extA) characters, but also
with simple punctuation and kana.
$ printf '%s\n' , 。 : ¥ あ か ア カ a b c , . : $ | LC_COLLATE=zh_CN.UTF-8 sort -u
,
:
.
$
,
a
b
c
(uniq does the same thing.)
It seems that glibc is just eating away anything not on that list. (What kind
of equivalence assumption is that?)
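
For comparison, here is a minimal sketch that counts how many lines survive
'sort -u' under a given locale. It assumes a POSIX shell and a standard sort;
count_unique is a hypothetical helper written for this illustration, not an
existing tool. In the C locale comparison is byte-wise, so no two distinct
lines can compare equal and nothing should be dropped:

```shell
# count_unique LOCALE LINE...
# Hypothetical helper: prints how many lines remain after `sort -u`
# when the given locale is used for collation.
count_unique() {
    locale_name=$1
    shift
    # tr strips the padding that some wc implementations add.
    printf '%s\n' "$@" | LC_ALL="$locale_name" sort -u | wc -l | tr -d ' '
}

# Nine distinct inputs; byte-wise comparison keeps all of them.
count_unique C 。 ¥ あ か ア カ a b c
# prints 9
```

Running the same helper with zh_CN.UTF-8 instead of C on an affected glibc
should, going by the output above, print a smaller number. The exact count
depends on the glibc version and the installed locale data.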