This is the mail archive of the
libc-locales@sourceware.org
mailing list for the GNU libc locales project.
[Bug localedata/20242] New: Wrong Unicode mapping for code units 0xFA and 0xFB in cp1125
- From: "felix.von.s at posteo dot de" <sourceware-bugzilla at sourceware dot org>
- To: libc-locales at sourceware dot org
- Date: Sat, 11 Jun 2016 12:27:36 +0000
- Auto-submitted: auto-generated
https://sourceware.org/bugzilla/show_bug.cgi?id=20242
Bug ID: 20242
Summary: Wrong Unicode mapping for code units 0xFA and 0xFB in
cp1125
Product: glibc
Version: 2.25
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: localedata
Assignee: unassigned at sourceware dot org
Reporter: felix.von.s at posteo dot de
CC: libc-locales at sourceware dot org
Target Milestone: ---
According to IBM documentation at [1], code unit 0xFA in code page 1125
corresponds to U+00F7 DIVISION SIGN and not U+00B7 MIDDLE DOT. Similarly, 0xFB
is U+00B1 PLUS-MINUS SIGN and not U+221A SQUARE ROOT.
The file localedata/charmaps/CP1125 cites the source code of the ICU
library as its origin, but ICU has apparently never contained this mapping [2].
This mistake has already been copied by GNU libiconv, the Python encodings
module and probably others. Amusingly enough, it has also been incorporated
back into the ICU library: [3].
More generally, it might be advisable to look for other incorrect mappings.
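One way to run such a check is sketched below. This is a minimal sketch, not a definitive tool: the helper names find_mismatches and describe are hypothetical, the expected values are only the two stated in this report, and it assumes the Python build ships a cp1125 codec. A fuller check would load the complete expected table from the IBM reference [1].

```python
import unicodedata

# Expected values as stated in this report (from the IBM table it cites).
# Only these two code units are covered; this is not a full CP1125 table.
EXPECTED_PER_REPORT = {
    0xFA: "\u00f7",  # DIVISION SIGN
    0xFB: "\u00b1",  # PLUS-MINUS SIGN
}


def find_mismatches(codec, expected):
    """Return (byte, actual, expected) for each byte the codec decodes differently."""
    mismatches = []
    for byte, want in sorted(expected.items()):
        try:
            got = bytes([byte]).decode(codec)
        except UnicodeDecodeError:
            got = None  # the byte is unmapped in this codec
        if got != want:
            mismatches.append((byte, got, want))
    return mismatches


def describe(char):
    """Format a character as 'U+XXXX NAME', or '<unmapped>' for None."""
    if char is None:
        return "<unmapped>"
    return "U+%04X %s" % (ord(char), unicodedata.name(char, "<unnamed>"))


if __name__ == "__main__":
    try:
        for byte, got, want in find_mismatches("cp1125", EXPECTED_PER_REPORT):
            print("0x%02X: codec gives %s, report expects %s"
                  % (byte, describe(got), describe(want)))
    except LookupError:
        # Raised when the codec name is unknown to this Python build.
        print("this Python build has no cp1125 codec")
```

The comparison is deliberately independent of any particular codec, so the same helper can be pointed at other single-byte code pages once an authoritative expected table is available.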
[1] <https://www.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.admin.nls.doc/doc/r0051847.html>
[2] <https://web.archive.org/web/20030925042016/http://oss.software.ibm.com/cvs/icu/charset/data/ucm/ibm-1125_P100-1997.ucm?rev=1.1&content-type=text/x-cvsweb-markup>
[3] <https://ssl.icu-project.org/repos/icu/data/trunk/charset/data/ucm/glibc-CP1125-2.3.3.ucm>