This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[Bug libc/4335] New: EastAsianAmbiguous character width is always 1 in UTF-8 locale


According to /usr/share/i18n/charmaps/UTF-8.gz,
Character width is 1 by default.  W(Wide) and F(Full Width) are 2.

% Character width according to Unicode 3.2.
% - Default width is 1.
% - Double-width characters have width 2; generated from
%        "grep '^[^;]*;[WF]' EastAsianWidth.txt"
%   and  "grep '^[^;]*;[^WF]' EastAsianWidth.txt"
% - Non-spacing characters have width 0; generated from PropList.txt or
%   "grep '^[^;]*;[^;]*;[^;]*;[^;]*;NSM;' UnicodeData.txt"
% - Format control characters have width 0; generated from
%   "grep '^[^;]*;[^;]*;Cf;' UnicodeData.txt"
% - Zero width characters have width 0; generated from
%   "grep '^[^;]*;ZERO WIDTH ' UnicodeData.txt"

A(Ambiguous) is expected that it is context-sensitive,
but its width is always 1 irrelevant to context.

According to http://www.unicode.org/reports/tr11/#Recommendations

> When mapping Unicode to East Asian legacy character encodings
> 
>     * Wide Unicode characters always map to fullwidth characters.
>     * Narrow (and neutral) Unicode characters always map to halfwidth characters.
>     * Halfwidth Unicode characters always map to halfwidth characters.
>     * Ambiguous Unicode characters always map to fullwidth characters.

I think EastAsianAmbiguous character width should be 2 in CJK UTF-8 locale.

-- 
           Summary: EastAsianAmbiguous character width is always 1 in UTF-8
                    locale
           Product: glibc
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: libc
        AssignedTo: drepper at redhat dot com
        ReportedBy: d+bugzilla at vdr dot jp
                CC: glibc-bugs at sources dot redhat dot com


http://sourceware.org/bugzilla/show_bug.cgi?id=4335

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]