This is the mail archive of the
glibc-bugs@sourceware.org
mailing list for the glibc project.
[Bug locale/19883] New: Unicode encodings should be limited to characters U+0010FFFF and below
- From: "jsm28 at gcc dot gnu.org" <sourceware-bugzilla at sourceware dot org>
- To: glibc-bugs at sourceware dot org
- Date: Tue, 29 Mar 2016 22:00:37 +0000
- Subject: [Bug locale/19883] New: Unicode encodings should be limited to characters U+0010FFFF and below
- Auto-submitted: auto-generated
https://sourceware.org/bugzilla/show_bug.cgi?id=19883
Bug ID: 19883
Summary: Unicode encodings should be limited to characters
U+0010FFFF and below
Product: glibc
Version: 2.23
Status: NEW
Severity: normal
Priority: P2
Component: locale
Assignee: unassigned at sourceware dot org
Reporter: jsm28 at gcc dot gnu.org
Target Milestone: ---
There is code in various places in glibc that allows for UTF-8 encodings
representing characters above U+0010FFFF, which were valid in the 2003 edition
of ISO 10646 but not in the 2011 edition.
Such code should be identified and removed. Such encodings should be treated
as invalid on input. Values above U+0010FFFF should also be treated as invalid
for UCS-4, wchar_t and any equivalent encodings, in the same way that values
above U+7FFFFFFF and values in the surrogate pair range already are (or should
be) for such encodings. They should not be converted to such UTF-8 encodings
on conversion to UTF-8.
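
As an illustration of the behaviour requested above, here is a minimal sketch
in C. It is not glibc code: the names is_valid_scalar, utf8_decode_strict and
utf8_encode_strict are hypothetical and exist only for this example. The
sketch assumes that rejecting a sequence means returning 0, and it treats
5- and 6-byte lead bytes, overlong forms, surrogates and anything above
U+0010FFFF as invalid in both directions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a glibc function): nonzero if c may be encoded. */
static int is_valid_scalar(uint32_t c)
{
    if (c > 0x10FFFF)
        return 0;                     /* beyond the Unicode code space */
    if (c >= 0xD800 && c <= 0xDFFF)
        return 0;                     /* surrogate range */
    return 1;
}

/* Decode one UTF-8 sequence from s[0..n). Returns the number of bytes
   consumed, or 0 if the input is invalid or truncated. */
static size_t utf8_decode_strict(const unsigned char *s, size_t n,
                                 uint32_t *out)
{
    assert(s != NULL && out != NULL);
    if (n == 0)
        return 0;

    unsigned char b0 = s[0];
    size_t len;
    uint32_t c, min;

    if (b0 < 0x80) {
        *out = b0;
        return 1;
    } else if ((b0 & 0xE0) == 0xC0) {
        len = 2; c = b0 & 0x1F; min = 0x80;
    } else if ((b0 & 0xF0) == 0xE0) {
        len = 3; c = b0 & 0x0F; min = 0x800;
    } else if ((b0 & 0xF8) == 0xF0) {
        len = 4; c = b0 & 0x07; min = 0x10000;
    } else {
        /* Stray continuation byte, or a 5-/6-byte lead (0xF8..0xFD),
           or 0xFE/0xFF: all rejected. */
        return 0;
    }

    if (n < len)
        return 0;                     /* truncated sequence */
    for (size_t i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                 /* not a continuation byte */
        c = (c << 6) | (uint32_t)(s[i] & 0x3F);
    }
    if (c < min)
        return 0;                     /* overlong encoding */
    if (!is_valid_scalar(c))
        return 0;                     /* surrogate or above U+0010FFFF */

    *out = c;
    return len;
}

/* Encode c as UTF-8 into out. Returns the number of bytes written, or 0
   if c is not encodable (so no 5-/6-byte output is ever produced). */
static size_t utf8_encode_strict(uint32_t c, unsigned char out[4])
{
    if (!is_valid_scalar(c))
        return 0;
    if (c < 0x80) {
        out[0] = (unsigned char)c;
        return 1;
    }
    if (c < 0x800) {
        out[0] = (unsigned char)(0xC0 | (c >> 6));
        out[1] = (unsigned char)(0x80 | (c & 0x3F));
        return 2;
    }
    if (c < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (c >> 12));
        out[1] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (c & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (c >> 18));
    out[1] = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (c & 0x3F));
    return 4;
}
```

With this sketch, the bytes F4 8F BF BF decode to U+0010FFFF, while
F4 90 80 80 (which would be U+00110000) and any sequence starting with a
5-byte lead such as F8 are rejected. Likewise, utf8_encode_strict refuses
0x110000 instead of emitting a longer sequence.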