This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.



[Bug locale/18927] Different strings should never collate as equal


https://sourceware.org/bugzilla/show_bug.cgi?id=18927

--- Comment #10 from Egmont Koblinger <egmont at gmail dot com> ---
The 0x01 byte, the bytes of invalid UTF-8 sequences, and the bytes of
unrecognized Unicode code points (e.g. U+AC00) all get converted to the exact
same token. That is, any two of "가" (U+AC00), "각" (U+AC01),
"\x01\x01\x01" (^A^A^A), "\x80\x80\x80" (invalid), "\xd0\xfe\xff" (invalid),
etc. collate the same.

