This is the mail archive of the libc-locales@sourceware.org mailing list for the GNU libc locales project.
[Bug localedata/22371] U+FFE2 and U+FFE4, iconv does not convert to HALFWIDTH(EUC-JISX0213)
- From: "nakajima.akira at nttcom dot co.jp" <sourceware-bugzilla at sourceware dot org>
- To: libc-locales at sourceware dot org
- Date: Thu, 02 Nov 2017 06:11:02 +0000
- Subject: [Bug localedata/22371] U+FFE2 and U+FFE4, iconv does not convert to HALFWIDTH(EUC-JISX0213)
- Auto-submitted: auto-generated
- References: <bug-22371-716@http.sourceware.org/bugzilla/>
https://sourceware.org/bugzilla/show_bug.cgi?id=22371
--- Comment #5 from Akira Nakajima <nakajima.akira at nttcom dot co.jp> ---
According to the URLs below, \xa1\xef is mapped to U+00A5 (halfwidth YEN SIGN)
in EUC-JISX0213 and EUC-JIS-2004.
The same applies to \xa1\xb1, which is mapped to U+203E (OVERLINE).
"However, with Unicode 3.2.0 the mappings differ in 3 codepoints."
http://search.cpan.org/~dankogai/Encode-JIS2K-0.03/JIS2K.pm#what_is_JIS_X_0213_anyway?
=============================================
http://charset.uic.jp/show/eucjisx0213/
http://x0213.org/codetable/euc-jis-2004-with-char.txt
char JIS Unicode
 ̄ 0xA1B1 U+203E # OVERLINE Windows: U+FFE3
― 0xA1BD U+2014 # EM DASH Windows: U+2015
¥ 0xA1EF U+00A5 # YEN SIGN Windows: U+FFE5
=============================================
=============================================
perl 5.24.3
# perl -e 'use Encode; use Encode::JISX0213; print encode("euc-jisx0213", "\x{00a5}");' | od -tx1
0000000 a1 ef
# perl -e 'use Encode; use Encode::JISX0213; print encode("euc-jisx0213", "\x{ffe5}");' | od -tx1
0000000 a1 ef
=============================================
However, Python and the charmap "/usr/local/share/i18n/charmaps/EUC-JISX0213.gz"
map \xa1\xef to U+FFE5 (FULLWIDTH YEN SIGN) instead.
I don't know which mapping is correct.
=============================================
Python 3.6.2
# python3 -c "print(u'\u00a5'.encode('euc-jisx0213'))"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
UnicodeEncodeError: 'euc_jisx0213' codec can't encode character '\xa5' in position 0: illegal multibyte sequence
# python3 -c "print(u'\uffe5'.encode('euc-jisx0213'))"
b'\xa1\xef'
=============================================
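The two Python one-liners above can be extended to all three disputed
positions. The following is only a sketch: the helper names (DIFFERING,
try_encode, report) are made up for illustration, the table is copied from the
euc-jis-2004-with-char.txt excerpt quoted earlier, and the result may depend on
the Python version in use.

```python
import unicodedata

# The three positions where the quoted mapping tables disagree:
# EUC bytes -> (code point in the x0213.org table, "Windows" code point).
DIFFERING = {
    b"\xa1\xb1": ("\u203e", "\uffe3"),
    b"\xa1\xbd": ("\u2014", "\u2015"),
    b"\xa1\xef": ("\u00a5", "\uffe5"),
}


def try_encode(ch, codec="euc_jisx0213"):
    """Return the encoded bytes, or None if the codec rejects the character."""
    try:
        return ch.encode(codec)
    except UnicodeEncodeError:
        return None


def report(codec="euc_jisx0213"):
    """For each disputed byte pair, record what the codec does with both candidates."""
    rows = []
    for euc, candidates in DIFFERING.items():
        for ch in candidates:
            rows.append((euc, ch, try_encode(ch, codec)))
    return rows


if __name__ == "__main__":
    for euc, ch, out in report():
        result = out.hex() if out is not None else "not encodable"
        print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {result} "
              f"(table byte pair {euc.hex()})")
```

Running the script prints one line per candidate code point, so the halfwidth
and fullwidth behaviour can be compared side by side.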
=============================================
/usr/local/share/i18n/charmaps/EUC-JISX0213.gz (Fedora 26)
<UFFE3> /xa1/xb1 FULLWIDTH MACRON
<UFFE5> /xa1/xef FULLWIDTH YEN SIGN
=============================================
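To check what the installed charmap says for a given byte sequence, the entries
can be read programmatically. This is a rough sketch that assumes every entry
has the "<UXXXX> /xHH/xHH NAME" shape shown above; the names ENTRY,
parse_charmap_line and find_entries are hypothetical, and the file path is the
one quoted in this comment, which may differ on other systems.

```python
import gzip
import re

# One charmap entry: <UXXXX>, then /xHH byte escapes, then the character name.
ENTRY = re.compile(r"^<U([0-9A-Fa-f]{4,8})>\s+((?:/x[0-9A-Fa-f]{2})+)\s+(.*)$")


def parse_charmap_line(line):
    """Return (code point, encoded bytes, name), or None for non-entry lines."""
    m = ENTRY.match(line.strip())
    if m is None:
        return None
    code_point = int(m.group(1), 16)
    # "/xa1/xef" splits into ["", "a1", "ef"]; drop the empty first element.
    encoded = bytes(int(h, 16) for h in m.group(2).split("/x")[1:])
    return code_point, encoded, m.group(3).strip()


def find_entries(path, wanted):
    """Yield every entry in a gzipped charmap whose byte sequence equals `wanted`."""
    with gzip.open(path, "rt", encoding="ascii", errors="replace") as fh:
        for line in fh:
            entry = parse_charmap_line(line)
            if entry is not None and entry[1] == wanted:
                yield entry


if __name__ == "__main__":
    # Path as quoted in this comment; adjust for the local installation.
    path = "/usr/local/share/i18n/charmaps/EUC-JISX0213.gz"
    for code_point, encoded, name in find_entries(path, b"\xa1\xef"):
        print(f"U+{code_point:04X} {encoded.hex()} {name}")
```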