This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Re: [PATCH] [BZ 17588 13064] Update UTF-8 charmap and width to Unicode 7.0.0

From: "Carlos O'Donell" <carlos at redhat dot com>
To: Alexandre Oliva <aoliva at redhat dot com>
Cc: Pravin Satpute <psatpute at redhat dot com>, Siddhesh Poyarekar <siddhesh at redhat dot com>, Mike FABIAN <mfabian at redhat dot com>, libc-alpha at sourceware dot org, Jens Petersen <petersen at redhat dot com>
Date: Fri, 20 Feb 2015 13:57:37 -0500
Subject: Re: [PATCH] [BZ 17588 13064] Update UTF-8 charmap and width to Unicode 7.0.0
Authentication-results: sourceware.org; auth=none
References: <573624784 dot 8871393 dot 1416848051220 dot JavaMail dot zimbra at redhat dot com> <orzjb3o7yf dot fsf at free dot home> <s9dy4qir6fu dot fsf at ari dot site> <orfvce7y90 dot fsf at free dot home> <s9d388duu5r dot fsf at ari dot site> <orioh35mbq dot fsf at free dot home> <20141223111038 dot GA5172 at spoyarek dot pnq dot redhat dot com> <119234933 dot 5523688 dot 1422972847328 dot JavaMail dot zimbra at redhat dot com> <or7fvnlbeo dot fsf at livre dot home> <orwq3njuvc dot fsf at livre dot home> <54E23EC9 dot 5020400 at redhat dot com> <ortwyig5xa dot fsf at livre dot home>

On 02/18/2015 06:23 PM, Alexandre Oliva wrote:
> 	[BZ #17588]
> 	[BZ #13064]
> 	[BZ #14094]
> 	[BZ #17998]
> 	* unicode-gen/Makefile: New.
> 	* unicode-gen/unicode-license.txt: New, from Unicode.
> 	* unicode-gen/UnicodeData.txt: New, from Unicode.
> 	* unicode-gen/DerivedCoreProperties.txt: New, from Unicode.
> 	* unicode-gen/EastAsianWidth.txt: New, from Unicode.
> 	* unicode-gen/gen_unicode_ctype.py: New generator, from Mike
> 	FABIAN <mfabian@redhat.com>.
> 	* unicode-gen/ctype_compatibility.py: New verifier, from
> 	Pravin Satpute <psatpute@redhat.com> and Mike FABIAN.
> 	* unicode-gen/ctype_compatibility_test_cases.py: New verifier
> 	module, from Mike FABIAN.
> 	* unicode-gen/utf8_gen.py: New generator, from Pravin Satpute
> 	and Mike FABIAN.
> 	* unicode-gen/utf8_compatibility.py: New verifier, from Pravin
> 	Satpute and Mike FABIAN.
> 	* charmaps/UTF-8: Update.
> 	* locales/i18n: Update.
> 	* gen-unicode-ctype.c: Remove.
> 	* tst-ctype-de_DE.ISO-8859-1.in: Adjust, islower now returns
> 	true for ordinal indicators.

Looks good to me. Please feel free to commit.

One nit:

-% Character width according to Unicode 5.0.0.
+% Character width according to Unicode 7.0.0.
 % - Default width is 1.
 % - Double-width characters have width 2; generated from
 %        "grep '^[^;]*;[WF]' EastAsianWidth.txt"
-%   and  "grep '^[^;]*;[^WF]' EastAsianWidth.txt"
 % - Non-spacing characters have width 0; generated from PropList.txt or
 %   "grep '^[^;]*;[^;]*;[^;]*;[^;]*;NSM;' UnicodeData.txt"
 % - Format control characters have width 0; generated from
 %   "grep '^[^;]*;[^;]*;Cf;' UnicodeData.txt"
-% - Zero width characters have width 0; generated from
-%   "grep '^[^;]*;ZERO WIDTH ' UnicodeData.txt"

Why even mention the `grep` commands used to generate this data?
The comment should just say to use the scripts. Nobody should be
misled into thinking that this data was actually generated this way,
nor do I want anyone doing it this way ever again.
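
For readers unfamiliar with the pattern, here is a rough Python sketch of what
the remaining `grep '^[^;]*;[WF]'` selects: lines whose second
semicolon-separated field starts with W (wide) or F (fullwidth). The helper
name `wide_lines` and the sample lines are made up for illustration; they only
imitate the layout of EastAsianWidth.txt and are not taken from the patch or
its scripts.

```python
import re

# Same pattern as the grep in the header comment:
# anything up to the first ';', then a field starting with W or F.
WIDE_RE = re.compile(r'^[^;]*;[WF]')


def wide_lines(lines):
    """Return the lines whose East Asian Width field starts with W or F."""
    return [line for line in lines if WIDE_RE.search(line)]


# Illustrative sample only, imitating the EastAsianWidth.txt layout.
SAMPLE = [
    "0020;Na # Zs SPACE",
    "1100..115F;W # Lo [96] HANGUL CHOSEONG KIYEOK..HANGUL CHOSEONG FILLER",
    "FF01..FF03;F # Po [3] FULLWIDTH EXCLAMATION MARK..FULLWIDTH NUMBER SIGN",
    "# comment line",
]

if __name__ == "__main__":
    for line in wide_lines(SAMPLE):
        print(line)
```

This only shows what the regular expression matches; it is not a suggestion to
regenerate the data this way.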

So shouldn't `write_header_width` simply not output any of this?
I understand we're trying to minimize the initial diff, but in a
follow-up cleanup we should remove all of it and just say:

"% Character width according to Unicode 7.0.0."

Thoughts?
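
As a minimal sketch of that cleanup, assuming `write_header_width` takes an
open output file and a version string (the real signature in utf8_gen.py may
well differ, and both parameters here are assumptions), the trimmed header
writer could be as small as:

```python
import io


def write_header_width(outfile, unicode_version="7.0.0"):
    """Write only the one-line width header, with no grep recipes.

    Hypothetical signature: the parameters are assumed for this sketch
    and are not taken from the actual generator script.
    """
    outfile.write(
        "% Character width according to Unicode {}.\n".format(unicode_version)
    )


if __name__ == "__main__":
    buf = io.StringIO()
    write_header_width(buf)
    print(buf.getvalue(), end="")
```

Anything the real function emits besides the comment lines is deliberately
left out of this sketch.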

Cheers,
Carlos.
