This is the mail archive of the
glibc-bugs@sourceware.org
mailing list for the glibc project.
[Bug localedata/20903] New: glibc's Windows single-byte pages don't map like Windows for previously unmapped points
- From: "arthur200126 at gmail dot com" <sourceware-bugzilla at sourceware dot org>
- To: glibc-bugs at sourceware dot org
- Date: Fri, 02 Dec 2016 04:11:50 +0000
- Subject: [Bug localedata/20903] New: glibc's Windows single-byte pages don't map like Windows for previously unmapped points
- Auto-submitted: auto-generated
https://sourceware.org/bugzilla/show_bug.cgi?id=20903
Bug ID: 20903
Summary: glibc's Windows single-byte pages don't map like
Windows for previously unmapped points
Product: glibc
Version: unspecified
Status: UNCONFIRMED
Severity: minor
Priority: P2
Component: localedata
Assignee: unassigned at sourceware dot org
Reporter: arthur200126 at gmail dot com
CC: libc-locales at sourceware dot org
Target Milestone: ---
Contrary to what the MSDN "App UI" archives say, Windows code pages don't map
unassigned bytes such as 0x81 in cp1252 to U+FFFD. Instead, Windows maps each
such byte to the Unicode code point of the same value, as Latin-1 does for the
C1 range. This bidirectional "windows code page behavior" is documented in the
"best fit" charts.[1]
[1]:
http://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WindowsBestFit/readme.txt
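
To illustrate the mapping described above, here is a minimal Python sketch. It
is not glibc code and is not taken from the best-fit charts. The function names
are hypothetical, and it assumes that Python's built-in "cp1252" codec leaves
the unassigned bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D) undefined, so those bytes
are the only ones that reach the fallback path.

```python
def decode_cp1252_windows(data: bytes) -> str:
    """Decode cp1252, mapping unassigned bytes to the same-valued code point.

    Hypothetical helper: approximates the Windows behavior described in the
    bug report, not an official API.
    """
    out = []
    for b in data:
        try:
            out.append(bytes([b]).decode("cp1252"))
        except UnicodeDecodeError:
            # Byte is unassigned in the strict table: use U+00XX instead
            # of U+FFFD, as Windows is reported to do.
            out.append(chr(b))
    return "".join(out)


def encode_cp1252_windows(text: str) -> bytes:
    """Inverse of decode_cp1252_windows (the mapping is bidirectional)."""
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode("cp1252")
        except UnicodeEncodeError:
            cp = ord(ch)
            if cp >= 0x100:
                raise
            try:
                bytes([cp]).decode("cp1252")
            except UnicodeDecodeError:
                # The same-valued byte is unassigned in strict cp1252, so
                # the code point maps straight back to that byte.
                out.append(cp)
            else:
                # The byte is assigned to another character (for example
                # 0x80 is the euro sign), so U+0080 has no mapping here.
                raise
    return bytes(out)
```

With these helpers, decode_cp1252_windows(b"\x80\x81") gives the euro sign
followed by U+0081, and every one of the 256 byte values survives a
decode/encode round trip.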
Since these mappings are seldom implemented outside of Windows, and nobody
would want to use an incomplete set of C1 characters in interchange, this bug
is set to minor severity. Most likely only archaeologists[2] trying to decode
or reproduce multiply-encoded UTF-8[3][4] will find it helpful.
[2]: "ftfy: fixes text for you", <https://ftfy.readthedocs.io>
[3]: "WTF-8, a transformation format of code page 1252",
<http://www-uxsup.csx.cam.ac.uk/~fanf2/hermes/doc/qsmtp/draft-fanf-wtf8.html>
[4]: " the future of publishing at
W3C", <https://twitter.com/koalie/status/506821684687413248>