This is the mail archive of the libc-locales@sourceware.org mailing list for the GNU libc locales project.



[Bug localedata/20864] New: iconv: cp936 missing single-byte euro sign (0x80, U+20AC), not same as GBK


https://sourceware.org/bugzilla/show_bug.cgi?id=20864

            Bug ID: 20864
           Summary: iconv: cp936 missing single-byte euro sign (0x80,
                    U+20AC), not same as GBK
           Product: glibc
           Version: unspecified
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: localedata
          Assignee: unassigned at sourceware dot org
          Reporter: arthur200126 at gmail dot com
                CC: libc-locales at sourceware dot org
  Target Milestone: ---

The addition of a single-byte euro sign at 0x80 in CP936 is possibly the most
well-known difference between the Windows code page and the GBK specification.
However, current versions of glibc seem to alias CP936 to GBK and do not
exhibit this behavior.

The following session comes from GNU bash running in a UTF-8 console. $''
denotes bash's ANSI C-style quoting, where \xhh generates a raw hex byte and
\uhhhh generates the representation of U+hhhh in the current locale.

# iconv (Ubuntu GLIBC 2.23-0ubuntu4) 2.23
$ iconv -f cp936 -t utf-8 <<< $'\x80'
iconv: illegal input sequence at position 0
$ iconv -t cp936 -f utf-8 <<< $'\u20ac' | hexdump -C
iconv: illegal input sequence at position 0

Expected behavior (from libiconv) is shown below.

# iconv (GNU libiconv 1.14)
$ iconv -f cp936 -t utf-8 <<< $'\x80'
€
$ iconv -t cp936 -f utf-8 <<< $'\u20ac' | hexdump -C
00000000  80 0a                                             |..|
00000002
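
The two sessions above can be condensed into one round-trip check. This is
only a sketch: the function name cp936_euro_status is hypothetical, and it
assumes iconv, od and tr are on PATH. It uses printf with octal escapes
instead of $'' quoting so that no trailing newline is sent and the result does
not depend on the current locale.

```shell
# Hypothetical helper (not part of glibc or libiconv): reports whether the
# installed iconv maps CP936 byte 0x80 to U+20AC and back.
cp936_euro_status() {
    # '\200' is the single byte 0x80; printf adds no trailing newline.
    decoded=$(printf '\200' | iconv -f cp936 -t utf-8 2>/dev/null \
        | od -An -tx1 | tr -d ' \n')
    # '\342\202\254' is U+20AC encoded as UTF-8 (e2 82 ac).
    encoded=$(printf '\342\202\254' | iconv -f utf-8 -t cp936 2>/dev/null \
        | od -An -tx1 | tr -d ' \n')
    # A failed conversion leaves the corresponding variable empty.
    if [ "$decoded" = "e282ac" ] && [ "$encoded" = "80" ]; then
        echo "supported"
    else
        echo "missing"
    fi
}

cp936_euro_status
```

Given the sessions above, this should print "missing" with the glibc iconv
shown and "supported" with the libiconv one.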
