This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
[PATCH] BZ #19575: Clarify status of entries in GB 18030-2005.
- From: "Carlos O'Donell" <carlos at redhat dot com>
- To: Andreas Schwab <schwab at suse dot de>, GNU C Library <libc-alpha at sourceware dot org>
- Date: Mon, 8 Feb 2016 15:28:25 -0500
- Subject: [PATCH] BZ #19575: Clarify status of entries in GB 18030-2005.
In bug 19575, Florian Weimer asks about the status of glibc's
support for GB 18030-2005, since ICU and Emacs produce slightly
different results from glibc.
The following patch adds clarifying comments to the GB 18030-2005
character map to explain why glibc uses the mapping it does and
why that mapping is best practice.
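To make the difference concrete, here is a minimal sketch (not part of the
patch) that reads charmap entries and shows, for each byte sequence, the
code point glibc maps it to and the commented-out PUA code point it
replaces. The sample lines are copied from the hunks below; the names
SAMPLE, parse_entries and compare are hypothetical helpers, and the
sketch assumes `%` is the comment character, as it is in this charmap.

```python
import re

# Sample lines copied from the hunks in this patch. A real run would read
# localedata/charmaps/GB18030 instead of this hard-coded excerpt.
SAMPLE = """\
% <UE78D> /xa6/xd9 <Private Use>
% <UE78E> /xa6/xda <Private Use>
% <UE78F> /xa6/xdb <Private Use>
<UFE10> /xa6/xd9 PRESENTATION FORM FOR VERTICAL COMMA
<UFE11> /xa6/xdb PRESENTATION FORM FOR VERTICAL IDEOGRAPHIC COMMA
<UFE12> /xa6/xda PRESENTATION FORM FOR VERTICAL IDEOGRAPHIC FULL STOP
"""

# Optional leading "%" (entry commented out), then <Uxxxx>, the /xNN bytes,
# and the character name.
ENTRY = re.compile(
    r"^(%\s*)?<U([0-9A-Fa-f]{4,8})>\s+((?:/x[0-9A-Fa-f]{2})+)\s+(.*)$"
)


def parse_entries(text):
    """Yield (active, code_point, byte_sequence, name) for each entry."""
    for line in text.splitlines():
        m = ENTRY.match(line)
        if m is None:
            continue  # free-form comment or unrelated line
        commented, cp_hex, raw, name = m.groups()
        encoded = bytes(int(h, 16) for h in raw.split("/x")[1:])
        yield commented is None, int(cp_hex, 16), encoded, name


def compare(text):
    """Map each byte sequence to its active and commented-out code points."""
    table = {}
    for active, cp, encoded, _name in parse_entries(text):
        slot = table.setdefault(encoded, {"active": None, "disabled": None})
        slot["active" if active else "disabled"] = cp
    return table


def fmt(cp):
    """Format a code point as U+XXXX, or '-' when there is none."""
    return "-" if cp is None else f"U+{cp:04X}"


if __name__ == "__main__":
    for encoded, slot in sorted(compare(SAMPLE).items()):
        print(encoded.hex(), "glibc:", fmt(slot["active"]),
              "standard (PUA):", fmt(slot["disabled"]))
```

An implementation that follows the published standard literally would
return the PUA code point for these byte sequences, which would account
for the kind of difference described in the bug.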
localedata/
2016-02-08 Carlos O'Donell <carlos@redhat.com>
* charmaps/GB18030
diff --git a/localedata/charmaps/GB18030 b/localedata/charmaps/GB18030
index 863a123..c48276e 100644
--- a/localedata/charmaps/GB18030
+++ b/localedata/charmaps/GB18030
@@ -57234,6 +57234,12 @@ CHARMAP
<UE78A> /xa6/xbe <Private Use>
<UE78B> /xa6/xbf <Private Use>
<UE78C> /xa6/xc0 <Private Use>
+% The newest GB 18030-2005 standard still uses some private use area
+% code points. Any implementation which has Unicode 4.1 or newer
+% support should not use these PUA code points, and instead should
+% map these entries to their equivalent non-PUA code points which
+% in this case map from <UFE10> to <UFE19>. This recommendation is
+% based on "CJKV Information Processing" by Dr. Ken Lunde.
% <UE78D> /xa6/xd9 <Private Use>
% <UE78E> /xa6/xda <Private Use>
% <UE78F> /xa6/xdb <Private Use>
@@ -62997,6 +63003,10 @@ CHARMAP
<UFE0D> /x84/x31/x82/x33 VARIATION SELECTOR-14
<UFE0E> /x84/x31/x82/x34 VARIATION SELECTOR-15
<UFE0F> /x84/x31/x82/x35 VARIATION SELECTOR-16
+% The code points from <UFE10> to <UFE19> are an adjustment
+% of the GB 18030-2005 standard to account for the fact that
+% with Unicode 4.1 support we can now correctly represent those
+% entries, which, in the standard, used PUA code points.
<UFE10> /xa6/xd9 PRESENTATION FORM FOR VERTICAL COMMA
<UFE11> /xa6/xdb PRESENTATION FORM FOR VERTICAL IDEOGRAPHIC COMMA
<UFE12> /xa6/xda PRESENTATION FORM FOR VERTICAL IDEOGRAPHIC FULL STOP
---
Cheers,
Carlos.