This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.


Re: Unicode 3.2 support (4)


Thanks again for this update, Bruno.  :-)

Just a reminder that the current implementation of the GB18030
conversion table in glibc (even with your latest additions) does not
comply with the full GB18030 standard and will not pass the official
certification test. (See the heated discussion in January 2002.) Here
are the facts:

   1. GB18030 is intended to be a UTF: Just as UTF-8 is ASCII-compatible
      and can map to all unassigned-yet-legal codepoints in Unicode, so is
      GB18030 GB2312/GBK-compatible and can map to all unassigned-yet-legal
      codepoints in Unicode.

   2. The current GB18030 implementation in glibc is incomplete:
      it does not conform to Point 1 above.  Remember, GB18030 is
      not just another encoding; it needs special treatment.
      Throwing away "unassigned-yet-legal" codepoints means data-loss.

   3. No Linux distribution can pass the Chinese government's
      certification with glibc's GB18030.  We at ThizLinux Laboratory Ltd.
      had to fix glibc's GB18030 table in a jiffy when we discovered
      the deficiency.  After that, we were able to pass the certification
      with A+ grade.

   4. Even Red Hat had to add a patch to fill the "holes" in glibc-2.2.5's
      gb18030.c in order to meet the certification requirement.

   5. Even Microsoft Windows' GB18030 module converts the full range
      of GB18030<->Unicode properly.  (Why should glibc's GB18030 table
      do less?)

   6. It is up to the Chinese IT Standardization Technical Committee (of
      CESI) et al. to decide what is GB18030 compliant and what is
      not.  It is not up to glibc to decide.  Glibc's code can do its
      own thing, but it just won't pass the official GB18030
      certification.
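
To make Point 1 concrete for the supplementary planes: GB18030 assigns
every code point from U+10000 to U+10FFFF a four-byte sequence by plain
linear arithmetic, starting at 0x90 0x30 0x81 0x30, so no table entry is
needed and nothing in that range is unmappable. The following is only a
minimal sketch of that arithmetic in Python; the function and constant
names are my own for illustration and are not taken from glibc's
gb18030.c, and the table-driven BMP part of the mapping is not covered.

```python
# Sketch of the GB18030 four-byte mapping for U+10000..U+10FFFF.
# Names are illustrative only; they are not glibc identifiers.

FIRST_SUPPLEMENTARY = 0x10000
LAST_CODE_POINT = 0x10FFFF

# Four-byte sequences are ordered as a mixed-radix number:
# byte 1 in 0x81..0xFE (126 values), byte 2 in 0x30..0x39 (10 values),
# byte 3 in 0x81..0xFE (126 values), byte 4 in 0x30..0x39 (10 values).
# BASE_INDEX is the linear index of 0x90 0x30 0x81 0x30.
BASE_INDEX = (0x90 - 0x81) * 10 * 126 * 10


def ucs_to_fourbyte(cp: int) -> bytes:
    """Map a supplementary-plane code point to its four-byte sequence."""
    if not FIRST_SUPPLEMENTARY <= cp <= LAST_CODE_POINT:
        raise ValueError("code point outside U+10000..U+10FFFF")
    index = BASE_INDEX + (cp - FIRST_SUPPLEMENTARY)
    index, b4 = divmod(index, 10)
    index, b3 = divmod(index, 126)
    b1, b2 = divmod(index, 10)
    return bytes((0x81 + b1, 0x30 + b2, 0x81 + b3, 0x30 + b4))


def fourbyte_to_ucs(seq: bytes) -> int:
    """Inverse of ucs_to_fourbyte for the supplementary-plane range."""
    if len(seq) != 4:
        raise ValueError("expected exactly four bytes")
    b1, b2, b3, b4 = seq
    if not (0x81 <= b1 <= 0xFE and 0x30 <= b2 <= 0x39
            and 0x81 <= b3 <= 0xFE and 0x30 <= b4 <= 0x39):
        raise ValueError("not a well-formed four-byte sequence")
    index = (((b1 - 0x81) * 10 + (b2 - 0x30)) * 126 + (b3 - 0x81)) * 10
    index += b4 - 0x30
    cp = index - BASE_INDEX + FIRST_SUPPLEMENTARY
    if not FIRST_SUPPLEMENTARY <= cp <= LAST_CODE_POINT:
        raise ValueError("sequence outside the supplementary-plane range")
    return cp


if __name__ == "__main__":
    for cp in (0x10000, 0x2A6D6, 0x10FFFF):
        seq = ucs_to_fourbyte(cp)
        print(f"U+{cp:04X} -> {seq.hex(' ')} -> U+{fourbyte_to_ucs(seq):04X}")
```

The two functions are exact inverses over the whole range, whether or
not a given code point has been assigned a character yet, which is the
sense in which a converter that drops unassigned code points loses data.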

I have spoken more than enough on this issue, as have other
individuals, but no one was able to convince certain key glibc people
otherwise.  Which is a sad thing, because all distributions intended to
be GB18030-compliant must patch (and have patched) glibc's GB18030 table
one way or another.  What is the point of keeping glibc's incomplete
GB18030 table the way it is then?  *sigh*...

Best regards,

Anthony


On Wed, Apr 17, 2002 at 04:45:48PM +0200, Bruno Haible wrote:
> 
> Here is a patch to upgrade the GB18030 charmap and iconv converter to
> Unicode 3.2.
> 
> 
> ChangeLog:
> 2002-04-15  Bruno Haible  <bruno@clisp.org>
> 
> 	* iconvdata/gb18030.c (__twobyte_to_ucs, __fourbyte_to_ucs,
> 	__ucs_to_gb18030_tab1, __ucs_to_gb18030_tab2): Update to Unicode 3.2.
> 
> localedata/ChangeLog:
> 2002-04-15  Bruno Haible  <bruno@clisp.org>
> 
> 	* charmaps/GB18030: Update for Unicode 3.2:
> 	Add <U0220>, <U034F>, <U0363>..<U036F>, <U03D8>..<U03D9>,
> 	<U03F6>, <U048A>..<U048B>, <U04C5>..<U04C6>, <U04C9>..<U04CA>,
> 	<U04CD>..<U04CE>, <U0500>..<U050F>, <U066E>..<U066F>, <U07B1>,
> 	<U10F7>..<U10F8>, <U1700>..<U170C>, <U170E>..<U1714>, <U1720>..<U1736>,
> 	<U1740>..<U1753>, <U1760>..<U1770>, <U1772>..<U1773>, <U2047>,
> 	<U204E>..<U2052>, <U2057>, <U205F>..<U2063>, <U2071>, <U20B0>..<U20B1>,
> 	<U20E4>..<U20EA>, <U213D>..<U214B>, <U21F4>..<U21FF>, <U22F2>..<U22FF>,
> 	<U237C>, <U239B>..<U23CE>, <U24EB>..<U24FE>, <U2596>..<U259F>,
> 	<U25F8>..<U25FF>, <U2616>..<U2617>, <U2672>..<U267D>, <U2680>..<U2689>,
> 	<U2768>..<U2775>, <U27D0>..<U27EB>, <U27F0>..<U27FF>, <U2900>..<U2AFF>,
> 	<U303B>..<U303D>, <U3095>..<U3096>, <U309F>..<U30A0>, <U30FF>,
> 	<U31F0>..<U31FF>, <U3251>..<U325F>, <U32B1>..<U32BF>, <UA4A2>..<UA4A3>,
> 	<UA4B4>, <UA4C1>, <UA4C5>, <UFA30>..<UFA6A>, <UFDFC>, <UFE00>..<UFE0F>,
> 	<UFE45>..<UFE46>, <UFE73>, <UFF5F>..<UFF60>. Update width table.
> 
> [The patch is too large for this mailing list. You can download it from
> ftp://ftp.ilog.fr/pub/Users/haible/gnu/glibc-unicode32-patch4.bz2 .]

-- 
Anthony Fok Tung-Ling
ThizLinux Laboratory   <anthony@thizlinux.com> http://www.thizlinux.com/
Debian Chinese Project <foka@debian.org>       http://www.debian.org/intl/zh/
Come visit Our Lady of Victory Camp!           http://www.olvc.ab.ca/

