This is the mail archive of the
glibc-bugs@sourceware.org
mailing list for the glibc project.
[Bug localedata/19932] New: mbrtowc returns (size_t) -1 in C locale
- From: "eggert at gnu dot org" <sourceware-bugzilla at sourceware dot org>
- To: glibc-bugs at sourceware dot org
- Date: Sat, 09 Apr 2016 08:14:46 +0000
- Subject: [Bug localedata/19932] New: mbrtowc returns (size_t) -1 in C locale
- Auto-submitted: auto-generated
https://sourceware.org/bugzilla/show_bug.cgi?id=19932
Bug ID: 19932
Summary: mbrtowc returns (size_t) -1 in C locale
Product: glibc
Version: 2.22
Status: NEW
Severity: normal
Priority: P2
Component: localedata
Assignee: unassigned at sourceware dot org
Reporter: eggert at gnu dot org
CC: libc-locales at sourceware dot org
Target Milestone: ---
Created attachment 9173
--> https://sourceware.org/bugzilla/attachment.cgi?id=9173&action=edit
test mbrtowc in the C locale
This follows up on a bug reported by Björn Jacke against GNU grep 2.23; see
<http://bugs.gnu.org/23234>. The bug occurs because GNU grep uses mbrtowc to
detect encoding errors, and because glibc mbrtowc reports an encoding error in
the C locale when given a byte in the range 128-255 decimal.
It was always the intent of POSIX that all 256 bytes be valid characters in the
C locale, as that was the traditional behavior. The standard did not state this
clearly, an omission that is planned to be fixed in a future version of POSIX;
see <http://austingroupbugs.net/view.php?id=663#c2738> (2015-07-02).
Glibc should be fixed to conform to this, i.e., mbrtowc should never return
(size_t) -1 in the C locale.
I plan to work around this bug in the gnulib mbrtowc module, which should fix
the grep bug; but this is a hack and will slow grep down a bit. The bug should
be fixed in glibc.
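
For illustration only, a workaround of the kind mentioned above might look
roughly like the sketch below. This is not the gnulib code: the names
in_c_locale and c_safe_mbrtowc are hypothetical, the locale check by name is
an assumption, and so is the choice to map a rejected byte to the wide
character with the same value.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper (not gnulib's): nonzero if the current LC_CTYPE
   locale is named "C" or "POSIX".  */
static int
in_c_locale (void)
{
  const char *name = setlocale (LC_CTYPE, NULL);
  return (name != NULL
          && (strcmp (name, "C") == 0 || strcmp (name, "POSIX") == 0));
}

/* Hypothetical wrapper: behaves like mbrtowc, except that in the C
   locale a byte that mbrtowc rejects is accepted as a single-byte
   character.  For simplicity this sketch requires PS to be non-null.  */
size_t
c_safe_mbrtowc (wchar_t *pwc, const char *s, size_t n, mbstate_t *ps)
{
  assert (ps != NULL);

  size_t ret = mbrtowc (pwc, s, n, ps);

  if ((ret == (size_t) -1 || ret == (size_t) -2)
      && s != NULL && n > 0 && in_c_locale ())
    {
      /* Assumption of this sketch: the wide character is the byte value.  */
      if (pwc != NULL)
        *pwc = (unsigned char) *s;

      /* The state is unspecified after an error, so reset it.  */
      memset (ps, 0, sizeof *ps);
      return 1;
    }

  return ret;
}
```

The extra locale check on the error path is the kind of overhead that makes
such a wrapper slower than a libc that accepts all 256 bytes directly.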
Please see the attached program for an illustration of the bug. The program
should output nothing and exit with status 0, but on glibc it outputs lines
like the following:
byte 0x80 (0200) encoding error
byte 0x81 (0201) encoding error
...
byte 0xff (0377) encoding error
and exits with status 1.
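
The attachment is not reproduced in this message. A minimal sketch of a check
along the same lines, which may differ from the attached program, is given
below. The function name count_c_locale_encoding_errors is hypothetical, and
the output format is copied from the lines quoted above.

```c
#include <limits.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: feed every byte value to mbrtowc in the C locale,
   print one line per byte reported as an encoding error, and return the
   number of such bytes.  */
int
count_c_locale_encoding_errors (void)
{
  int errors = 0;

  /* Select the C locale explicitly; programs also start in it.  */
  setlocale (LC_ALL, "C");

  for (int i = 0; i <= UCHAR_MAX; i++)
    {
      unsigned char byte = (unsigned char) i;
      wchar_t wc;
      mbstate_t state;

      /* Start each conversion from the initial shift state.  */
      memset (&state, 0, sizeof state);

      if (mbrtowc (&wc, (const char *) &byte, 1, &state) == (size_t) -1)
        {
          printf ("byte 0x%02x (%04o) encoding error\n", i, i);
          errors++;
        }
    }

  return errors;
}
```

A main function that returns count_c_locale_encoding_errors () != 0 would
print nothing and exit with status 0 when every byte is accepted, and exit
with status 1 otherwise.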