This is the mail archive of the libc-locales@sourceware.org mailing list for the GNU libc locales project.
[Bug localedata/13063] 'sort -u' will erase some Chinese characters
- From: "maiku.fabian at gmail dot com" <sourceware-bugzilla at sourceware dot org>
- To: libc-locales at sourceware dot org
- Date: Thu, 20 Jul 2017 08:01:58 +0000
- Subject: [Bug localedata/13063] 'sort -u' will erase some Chinese characters
- Auto-submitted: auto-generated
- References: <bug-13063-716@http.sourceware.org/bugzilla/>
https://sourceware.org/bugzilla/show_bug.cgi?id=13063
--- Comment #7 from Mike FABIAN <maiku.fabian at gmail dot com> ---
(In reply to Mingye Wang from comment #6)
> This bug is not only seen with extA characters, but also seen with simple
> punctuations and/or kanas.
>
> $ printf '%s\n' , 。 : ¥ あ か ア カ a b c , . : $ | LC_COLLATE=zh_CN.UTF-8 sort -u
> ,
> :
> .
> $
> ,
> a
> b
> c
>
> (uniq does the same thing.)
>
> It seems that glibc is just eating away anything not on that list. (What
> kind of equivalence assumption is that?)
This is caused by the collation symbol UNDEFINED not working correctly.
See:
https://sourceware.org/bugzilla/show_bug.cgi?id=18978
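
To illustrate why a collation problem makes 'sort -u' drop lines, here is a
toy model in Python. It is only a sketch and not glibc's actual algorithm: the
WEIGHTS table, toy_collate and sort_unique are hypothetical names made up for
this example, and it assumes that characters without a collation weight end up
contributing nothing to the comparison.

```python
from functools import cmp_to_key

# Hypothetical toy weight table: only these characters have collation weights.
WEIGHTS = {",": 1, ".": 2, ":": 3, "$": 4, "a": 5, "b": 6, "c": 7}


def toy_collate(a: str, b: str) -> int:
    """strcoll()-like comparison that skips characters with no weight (assumption)."""
    ka = [WEIGHTS[ch] for ch in a if ch in WEIGHTS]
    kb = [WEIGHTS[ch] for ch in b if ch in WEIGHTS]
    return (ka > kb) - (ka < kb)


def bytewise(a: str, b: str) -> int:
    """Plain code point comparison, for contrast with toy_collate."""
    return (a > b) - (a < b)


def sort_unique(lines, collate):
    """Mimic 'sort -u': sort, then keep one line per run of lines that compare equal."""
    result = []
    for line in sorted(lines, key=cmp_to_key(collate)):
        if not result or collate(result[-1], line) != 0:
            result.append(line)
    return result


lines = ["。", "¥", "あ", "か", "ア", "カ", "a", "b", "c", ",", ".", ":", "$"]
print(sort_unique(lines, toy_collate))  # unweighted lines collapse into one
print(sort_unique(lines, bytewise))     # every distinct line survives
```

In this model the six lines made only of unweighted characters all compare
equal to each other, so the uniqueness step keeps just one of them, while the
plain code point comparison keeps all 13 lines.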