This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: [PATCH] BZ #19575: Clarify status of entries in GB 18030-2005.
- From: "Carlos O'Donell" <carlos at redhat dot com>
- To: Andreas Schwab <schwab at suse dot de>
- Cc: GNU C Library <libc-alpha at sourceware dot org>
- Date: Wed, 10 Feb 2016 09:50:42 -0500
On 02/10/2016 09:14 AM, Andreas Schwab wrote:
> "Carlos O'Donell" <carlos@redhat.com> writes:
>
>> On 02/10/2016 04:15 AM, Andreas Schwab wrote:
>>> "Carlos O'Donell" <carlos@redhat.com> writes:
>>>
>>>> This statement is only partly correct. Some of the mappings were updated
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>> but 24 mappings for PUA code points still remained.
>>>
>>> What are the updated mappings apart from the 24 being left?
>>
>> Sorry, I don't quite understand the question.
>>
>> Could you please clarify exactly what you would like to know?
>
> Which are those updated mappings?
So you would like to know which mappings changed between GB 18030-2000
and GB 18030-2005? I don't have such a list. In "CJKV Information Processing"
it is noted that there are 2 major areas of revision for 2000 -> 2005:
* Acknowledgment of CJK Unified Ideographs Extension B --- 42,711 hanzi
* Acknowledgment of the six regional scripts: Korean, Mongolian, Tai Le, Tibetan, Uyghur, and Yi.
So GB 18030-2005 supports all 42,711 hanzi characters and the six scripts (all
in the 4-byte regions). There are also 4 pictorial glyph corrections.
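To make the "4-byte regions" remark concrete, here is a minimal sketch of how the length of a GB 18030 sequence can be told from its leading bytes. The function name `gb18030_seq_len` is hypothetical and this is not glibc's converter; it only checks the byte-range structure (1-byte ASCII, 2-byte, 4-byte) and says nothing about which code point a sequence maps to.

```python
def gb18030_seq_len(data, i=0):
    """Return the length (1, 2 or 4) of the GB 18030 sequence that starts
    at data[i], or 0 if the bytes there do not form a valid sequence.

    Hypothetical helper for illustration; structure check only.
    """
    b1 = data[i]
    if b1 <= 0x7F:
        return 1                      # single byte (ASCII range)
    if not 0x81 <= b1 <= 0xFE:
        return 0                      # 0x80 and 0xFF are not lead bytes
    if i + 1 >= len(data):
        return 0                      # truncated input
    b2 = data[i + 1]
    if 0x40 <= b2 <= 0xFE and b2 != 0x7F:
        return 2                      # two-byte sequence
    if 0x30 <= b2 <= 0x39:
        # four-byte sequence: lead, digit, lead-range byte, digit
        if (i + 3 < len(data)
                and 0x81 <= data[i + 2] <= 0xFE
                and 0x30 <= data[i + 3] <= 0x39):
            return 4
    return 0
```

The second byte alone distinguishes the two multi-byte forms: 0x30-0x39 can only continue a 4-byte sequence, which is where the supplementary-plane hanzi live.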
May I ask why such a list of updated mappings is relevant here?
The only important thing here is that with those 24 PUA mappings made
into non-PUA equivalents the *entire* GB 18030-2005 can be represented
in Unicode without the use of PUA code points. That is great because
it means normal unmodified programs can process and represent those
characters correctly.
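As a rough illustration of what "without the use of PUA code points" means in practice, the sketch below scans already-decoded text for Private Use Area code points. The three ranges are the ones defined by Unicode; the names `PUA_RANGES`, `is_pua` and `find_pua` are made up for this example and are not part of glibc.

```python
# Unicode Private Use Area ranges.
PUA_RANGES = (
    (0xE000, 0xF8FF),      # Private Use Area in the BMP
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A (plane 15)
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B (plane 16)
)


def is_pua(ch):
    """True if the single character ch is a private-use code point."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in PUA_RANGES)


def find_pua(text):
    """Return (index, 'U+XXXX') for every private-use code point in text."""
    return [(i, f"U+{ord(ch):04X}") for i, ch in enumerate(text) if is_pua(ch)]
```

An empty result from `find_pua` on converted text means every character has an assigned Unicode code point. Whether a given converter yields PUA code points for the 24 entries in question depends on the mapping table it uses, which this sketch does not cover.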
In summary:
- glibc supports GB 18030-2005.
- glibc modifies GB 18030-2005 to map those 24 entries to non-PUA code
points, so that the implementation uses no PUA code points at all.
- My comments are there to indicate the modifications for non-PUA code
points (which deviate from the standard).
Cheers,
Carlos.