This is the mail archive of the
glibc-bugs@sourceware.org
mailing list for the glibc project.
[Bug locale/19726] Converting UCS4LE to INTERNAL with iconv() does not update pointers and lengths in error-case.
- From: "cvs-commit at gcc dot gnu.org" <sourceware-bugzilla at sourceware dot org>
- To: glibc-bugs at sourceware dot org
- Date: Wed, 25 May 2016 15:19:48 +0000
- Subject: [Bug locale/19726] Converting UCS4LE to INTERNAL with iconv() does not update pointers and lengths in error-case.
- Auto-submitted: auto-generated
- References: <bug-19726-131 at http dot sourceware dot org/bugzilla/>
https://sourceware.org/bugzilla/show_bug.cgi?id=19726
--- Comment #1 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".
The branch, master has been updated
via 7ab1de21067d72460ac14089bf6541b10fc14c80 (commit)
via 8f25676c83eef5c85db98f9cf027890fbe810447 (commit)
via a42a95c43133d69b1108f582cffa0f6986a9c3da (commit)
via 52f8a48e24563daa807f94824ce9782b9a9eece9 (commit)
via ee518b7070b1bcb41382b6db10f513e071b2c20e (commit)
via 6896776c3c9c32fd22324e6de6737dd69ae73213 (commit)
via 5bd11b19099b3f22d821515f9c93f1ecc1a7e15e (commit)
via 421c5278d83e72740150259960a431706ac343f9 (commit)
via 81c6380887c6d62c56e5f0f85a241f759f58b2fd (commit)
via 3b704e26b33e35d99de920f8462d8e438f89be39 (commit)
via 4690dab084f854bf0013b5eaabcf90c2d5b692ff (commit)
via 9b7f05599a92dead97d6683bc838a57bc63ac52b (commit)
via c70e9913d2fc2d0bf6a3ca98a4dece759d40a4ec (commit)
from 5ff81530dd14552a48a8fcb119e5867a1b504cc6 (commit)
Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.
- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=7ab1de21067d72460ac14089bf6541b10fc14c80
commit 7ab1de21067d72460ac14089bf6541b10fc14c80
Author: Stefan Liebler <stli@linux.vnet.ibm.com>
Date: Wed May 25 17:18:06 2016 +0200
Fix UTF-16 surrogate handling. [BZ #19727]
According to the latest Unicode standard, a conversion from/to UTF-xx has
to report an error if the character value is in range of an utf16 surrogate
(0xd800..0xdfff). See
https://sourceware.org/ml/libc-help/2015-12/msg00015.html.
Thus this patch fixes this behaviour for converting from utf32 to internal
and from internal to utf8.
Furthermore, the conversion from utf16 to internal does not report an error
if the input-stream consists of two low-surrogate values. If a uint16_t
value is in the range of 0xd800 .. 0xdfff, the next uint16_t value is
checked to see whether it is in the range of a low surrogate
(0xdc00 .. 0xdfff). Afterwards these two uint16_t values are interpreted as
a high- and low-surrogate pair. But there is no test whether the first
uint16_t value is really in the range of a high surrogate
(0xd800 .. 0xdbff). If both uint16_t values are in the range of a low
surrogate, they are treated as a valid high- and low-surrogate pair.
This patch adds this test.
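The added check can be illustrated with a minimal C sketch. This is not the code from iconvdata/utf-16.c: the function name utf16_decode_one and the status enum are hypothetical, and byte-order handling is left out. It only shows why the first unit has to be tested for a high surrogate in addition to testing the second unit for a low surrogate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes for this sketch (not glibc's __GCONV_* values).  */
enum utf16_status { UTF16_OK = 0, UTF16_ILLEGAL = -1, UTF16_INCOMPLETE = -2 };

/* Decode one code point from in[0..len).  On success, store the code point
   in *cp and the number of consumed 16-bit units in *used.  */
static enum utf16_status
utf16_decode_one (const uint16_t *in, size_t len, uint32_t *cp, size_t *used)
{
  assert (in != NULL && cp != NULL && used != NULL);
  if (len == 0)
    return UTF16_INCOMPLETE;

  uint16_t u1 = in[0];
  if (u1 < 0xd800 || u1 > 0xdfff)
    {
      /* Not a surrogate: the unit is the code point.  */
      *cp = u1;
      *used = 1;
      return UTF16_OK;
    }

  /* u1 is a surrogate.  Only a high surrogate (0xd800 .. 0xdbff) may start
     a pair; this is the test that was missing.  */
  if (u1 > 0xdbff)
    return UTF16_ILLEGAL;

  if (len < 2)
    return UTF16_INCOMPLETE;

  /* The second unit must be a low surrogate (0xdc00 .. 0xdfff).  */
  uint16_t u2 = in[1];
  if (u2 < 0xdc00 || u2 > 0xdfff)
    return UTF16_ILLEGAL;

  *cp = 0x10000 + ((uint32_t) (u1 - 0xd800) << 10) + (uint32_t) (u2 - 0xdc00);
  *used = 2;
  return UTF16_OK;
}
```

With this check, the input { 0xdc00, 0xdc00 } is rejected, while { 0xd83d, 0xde00 } decodes to U+1F600.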
This patch also adds a new testcase, which checks UTF conversions with
input values in the range of UTF16 surrogates. The test converts from
UTF-xx to INTERNAL, from INTERNAL to UTF-xx and directly from UTF-xx to
UTF-yy. The latter conversion is needed because s390 has iconv-modules
which convert from/to UTF in one step.
The new testcase was tested on s390, power and intel machines.
ChangeLog:
[BZ #19727]
* iconvdata/utf-16.c (BODY): Report an error if first word is not a
valid high surrogate.
* iconvdata/utf-32.c (BODY): Report an error if the value is in range
of an utf16 surrogate.
* iconv/gconv_simple.c (BODY): Likewise.
* iconvdata/bug-iconv12.c: New file.
* iconvdata/Makefile (tests): Add bug-iconv12.
rename test
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=8f25676c83eef5c85db98f9cf027890fbe810447
commit 8f25676c83eef5c85db98f9cf027890fbe810447
Author: Stefan Liebler <stli@linux.vnet.ibm.com>
Date: Wed May 25 17:18:06 2016 +0200
Fix ucs4le_internal_loop in error case. [BZ #19726]
When converting from UCS4LE to INTERNAL, the input-value is checked for a
too large value and the iconv() call sets errno to EILSEQ. In this case the
inbuf argument of the iconv() call should point to the invalid character,
but it points to the beginning of the inbuf.
Thus this patch updates the pointers inptrp and outptrp before returning in
this error case.
This patch also adds a new testcase for this issue.
The new test was tested on s390, power and intel machines.
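The effect of updating the pointers can be sketched in a simplified, self-contained loop. This is an illustration only and not the code of ucs4le_internal_loop: the function name ucs4le_to_host_loop and the LOOP_* codes are hypothetical, and the handling of incomplete trailing input and of the ignore flag is left out. The names inptrp and outptrp are taken from the ChangeLog entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes for this sketch.  */
enum { LOOP_EMPTY_INPUT = 0, LOOP_ILLEGAL_INPUT = -1, LOOP_FULL_OUTPUT = -2 };

/* Convert little-endian 4-byte units to host-order uint32_t values.
   *inptrp and *outptrp are advanced past everything that was converted,
   also when the loop stops at an illegal value.  */
static int
ucs4le_to_host_loop (const unsigned char **inptrp, const unsigned char *inend,
                     uint32_t **outptrp, uint32_t *outend)
{
  assert (inptrp != NULL && outptrp != NULL);
  const unsigned char *inptr = *inptrp;
  uint32_t *outptr = *outptrp;
  int result = LOOP_EMPTY_INPUT;

  while (inend - inptr >= 4)
    {
      if (outptr == outend)
        {
          result = LOOP_FULL_OUTPUT;
          break;
        }

      uint32_t inval = (uint32_t) inptr[0]
                       | ((uint32_t) inptr[1] << 8)
                       | ((uint32_t) inptr[2] << 16)
                       | ((uint32_t) inptr[3] << 24);

      if (inval > 0x7fffffff)
        {
          /* Illegal value: stop here, inptr still points at it.  */
          result = LOOP_ILLEGAL_INPUT;
          break;
        }

      *outptr++ = inval;
      inptr += 4;
    }

  /* Publish the progress on every exit path.  Returning from the error
     path without these two stores leaves the caller's pointers at the
     start of the buffers, which is the behaviour described in the bug.  */
  *inptrp = inptr;
  *outptrp = outptr;
  return result;
}
```

For an input of two valid characters followed by 0xffffffff, the caller sees the input pointer 8 bytes past the start of the buffer, i.e. at the invalid character, and two converted values in the output.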
ChangeLog:
[BZ #19726]
* iconv/gconv_simple.c (ucs4le_internal_loop): Update inptrp and
outptrp in case of an illegal input.
* iconv/tst-iconv6.c: New file.
* iconv/Makefile (tests): Add tst-iconv6.
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=a42a95c43133d69b1108f582cffa0f6986a9c3da
commit a42a95c43133d69b1108f582cffa0f6986a9c3da
Author: Stefan Liebler <stli@linux.vnet.ibm.com>
Date: Wed May 25 17:18:06 2016 +0200
S390: Fix utf32 to utf16 handling of low surrogates (disable cu42).
According to the latest Unicode standard, a conversion from/to UTF-xx has
to report an error if the character value is in range of an utf16 surrogate
(0xd800..0xdfff). See
https://sourceware.org/ml/libc-help/2015-12/msg00015.html.
Thus the cu42 instruction, which converts from utf32 to utf16, has to be
disabled because it does not report an error in case of a value in range of
a low surrogate (0xdc00..0xdfff). The etf3eh variant is removed and the c
and vector variants are adjusted to handle a value in the range of a utf16
low surrogate correctly.
ChangeLog:
* sysdeps/s390/utf16-utf32-z9.c: Disable cu42 instruction and report
an error in case of a value in range of an utf16 low surrogate.
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=52f8a48e24563daa807f94824ce9782b9a9eece9
commit 52f8a48e24563daa807f94824ce9782b9a9eece9
Author: Stefan Liebler <stli@linux.vnet.ibm.com>
Date: Wed May 25 17:18:05 2016 +0200
S390: Fix utf32 to utf8 handling of low surrogates (disable cu41).
According to the latest Unicode standard, a conversion from/to UTF-xx has
to report an error if the character value is in range of an utf16 surrogate
(0xd800..0xdfff). See
https://sourceware.org/ml/libc-help/2015-12/msg00015.html.
Thus the cu41 instruction, which converts from utf32 to utf8, has to be
disabled because it does not report an error in case of a value in range of
a low surrogate (0xdc00..0xdfff). The etf3eh variant is removed and the c
and vector variants are adjusted to handle a value in the range of a utf16
low surrogate correctly.
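A software check of the kind the c variant needs can be sketched as follows. This is a minimal illustration and not the code from sysdeps/s390/utf8-utf32-z9.c: the function name utf32_to_utf8 and its return convention (0 for an illegal value) are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode the code point cp as UTF-8 into out, which must have room for
   4 bytes.  Returns the number of bytes written (1 .. 4), or 0 if cp is
   not a legal value.  */
static size_t
utf32_to_utf8 (uint32_t cp, unsigned char *out)
{
  assert (out != NULL);

  /* Reject the whole utf16 surrogate range, high and low.  */
  if (cp >= 0xd800 && cp <= 0xdfff)
    return 0;
  /* Reject values beyond the Unicode code space.  */
  if (cp > 0x10ffff)
    return 0;

  if (cp < 0x80)
    {
      out[0] = (unsigned char) cp;
      return 1;
    }
  if (cp < 0x800)
    {
      out[0] = (unsigned char) (0xc0 | (cp >> 6));
      out[1] = (unsigned char) (0x80 | (cp & 0x3f));
      return 2;
    }
  if (cp < 0x10000)
    {
      out[0] = (unsigned char) (0xe0 | (cp >> 12));
      out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3f));
      out[2] = (unsigned char) (0x80 | (cp & 0x3f));
      return 3;
    }
  out[0] = (unsigned char) (0xf0 | (cp >> 18));
  out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3f));
  out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3f));
  out[3] = (unsigned char) (0x80 | (cp & 0x3f));
  return 4;
}
```

In this sketch a low surrogate such as 0xdc00 yields 0, whereas U+20AC is encoded as the three bytes 0xe2 0x82 0xac.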
ChangeLog:
* sysdeps/s390/utf8-utf32-z9.c: Disable cu41 instruction and report
an error in case of a value in range of an utf16 low surrogate.
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=ee518b7070b1bcb41382b6db10f513e071b2c20e
commit ee518b7070b1bcb41382b6db10f513e071b2c20e
Author: Stefan Liebler <stli@linux.vnet.ibm.com>
Date: Wed May 25 17:18:05 2016 +0200
S390: Use s390-64 specific iconv-modules on s390-32, too.
This patch reworks the existing s390 64bit specific iconv modules in order
to use them on s390 31bit, too.
Thus the parts for subdirectory iconvdata in sysdeps/s390/s390-64/Makefile
were moved to sysdeps/s390/Makefile so that they apply on 31bit, too.
All those modules are moved from sysdeps/s390/s390-64 directory to
sysdeps/s390.
The iso-8859-1 to/from cp037 module was adjusted to use the brct (branch
relative on count) instruction on 31bit s390 instead of brctg, because
brctg is a zarch instruction and is not available on a 31bit kernel.
The utf modules are using zarch instructions, thus the directive
machinemode zarch_nohighgprs was added to the inline assemblies to omit the
high-gprs flag in the shared libraries. Otherwise they can't be loaded on a
31bit kernel.
The ifunc resolvers were adjusted in order to call the etf3eh or vector
variants only if zarch instructions are available (64bit kernel in 31bit
compat-mode).
Furthermore some variable types were changed. E.g. unsigned long long would
be a register pair on s390 31bit, but we want only one single register.
For variables of type size_t the register contents have to be enlarged from
a 32bit to a 64bit value on 31bit, because the inline assemblies use 64bit
values in such cases.
ChangeLog:
* sysdeps/s390/s390-64/Makefile (iconvdata-subdirectory):
Move to ...
* sysdeps/s390/Makefile: ... here.
* sysdeps/s390/s390-64/iso-8859-1_cp037_z900.c: Move to ...
* sysdeps/s390/iso-8859-1_cp037_z900.c: ... here.
(BRANCH_ON_COUNT): New define.
(TR_LOOP): Use BRANCH_ON_COUNT instead of brctg.
* sysdeps/s390/s390-64/utf16-utf32-z9.c: Move to ...
* sysdeps/s390/utf16-utf32-z9.c: ... here and adjust to
run on s390-32, too.
* sysdeps/s390/s390-64/utf8-utf16-z9.c: Move to ...
* sysdeps/s390/utf8-utf16-z9.c: ... here and adjust to
run on s390-32, too.
* sysdeps/s390/s390-64/utf8-utf32-z9.c: Move to ...
* sysdeps/s390/utf8-utf32-z9.c: ... here and adjust to
run on s390-32, too.
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=6896776c3c9c32fd22324e6de6737dd69ae73213
commit 6896776c3c9c32fd22324e6de6737dd69ae73213
Author: Stefan Liebler <stli@linux.vnet.ibm.com>
Date: Wed May 25 17:18:05 2016 +0200
S390: Optimize utf16-utf32 module.
This patch reworks the s390 specific module to convert between utf16 and
utf32.
Now ifunc is used to choose either the c or etf3eh (with convert utf
instruction) variants at runtime.
Furthermore a new vector variant for z13 is introduced which will be built
and chosen if vector support is available at build / runtime.
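The idea of selecting a variant at runtime can be sketched in portable C with a function pointer. Note that this is only an analogy: glibc uses the GNU ifunc mechanism, where the dynamic linker calls the resolver, while this sketch resolves lazily on the first call. All names here (convert_c, convert_vector_standin, have_vector_support, resolve_convert) are hypothetical, the "vector" variant is a plain C stand-in, and the conversion itself assumes input without surrogates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef size_t (*convert_fn) (const uint16_t *in, size_t n, uint32_t *out);

/* Plain C variant: widen each 16-bit unit to 32 bit.  */
static size_t
convert_c (const uint16_t *in, size_t n, uint32_t *out)
{
  for (size_t i = 0; i < n; i++)
    out[i] = in[i];
  return n;
}

/* Stand-in for an optimized variant.  In this sketch it does the same
   work as convert_c; a real one would use vector instructions.  */
static size_t
convert_vector_standin (const uint16_t *in, size_t n, uint32_t *out)
{
  return convert_c (in, n, out);
}

/* Hypothetical capability probe; always reports "no vector support".  */
static int
have_vector_support (void)
{
  return 0;
}

/* Resolver: pick the variant that matches the reported capability.  */
static convert_fn
resolve_convert (int has_vector)
{
  return has_vector ? convert_vector_standin : convert_c;
}

/* Public entry point: resolve once, then always call the chosen variant.  */
static size_t
convert (const uint16_t *in, size_t n, uint32_t *out)
{
  static convert_fn impl;
  assert (in != NULL && out != NULL);
  if (impl == NULL)
    impl = resolve_convert (have_vector_support ());
  return impl (in, n, out);
}
```

The caller only ever uses convert(); which variant runs behind it is decided by the resolver, so a fallback stays available when the optimized variant cannot be used.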
In case of converting utf32 to utf16, the vector variant optimizes input
of 2byte utf16 characters. The convert utf instruction is used if a utf16
surrogate is found.
For the other direction utf16 to utf32, the cu24 instruction can't be
re-enabled, because it does not report an error if the input-stream
consists of a single low surrogate utf16 char (e.g. 0xdc00). This applies
to the newest z13, too. Thus there is only the c or the new vector variant,
which can handle utf16 surrogate characters.
This patch also fixes some whitespace errors. Furthermore, the etf3eh
variant is handling the "UTF-xx//IGNORE" case now. Before, it ignored the
ignore-case and always stopped at an error.
ChangeLog:
* sysdeps/s390/s390-64/utf16-utf32-z9.c: Use ifunc to select c,
etf3eh or new vector loop-variant.
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=5bd11b19099b3f22d821515f9c93f1ecc1a7e15e
commit 5bd11b19099b3f22d821515f9c93f1ecc1a7e15e
Author: Stefan Liebler <stli@linux.vnet.ibm.com>
Date: Wed May 25 17:18:05 2016 +0200
S390: Optimize utf8-utf16 module.
This patch reworks the s390 specific module to convert between utf8 and
utf16.
Now ifunc is used to choose either the c or etf3eh (with convert utf
instruction) variants at runtime. Furthermore a new vector variant for z13
is introduced which will be built and chosen if vector support is available
at build / runtime.
In case of converting utf8 to utf16, the vector variant optimizes input of
1byte utf8 characters. The convert utf instruction is used if a multibyte
utf8 character is found.
For the other direction utf16 to utf8, the cu21 instruction can't be
re-enabled, because it does not report an error if the input-stream
consists of a single low surrogate utf16 char (e.g. 0xdc00). This applies
to the newest z13, too. Thus there is only the c or the new vector variant,
which can handle 1..4 byte utf8 characters.
The c variant from utf16 to utf8 has been fixed. If a high surrogate was at
the end of the input-buffer, then errno was set to EINVAL and the
input-pointer pointed just after the high surrogate. Now it points to the
beginning of the high surrogate.
This patch also fixes some whitespace errors. The c variant from utf8 to
utf16 now checks that tail-bytes start with 0b10... and that the value is
not in the range of a utf16 surrogate.
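The two checks named above, tail-bytes starting with 0b10 and no value in the surrogate range, can be sketched in a small decoder. This is an illustration and not the code from the s390 module: the function name utf8_decode_one and the status enum are hypothetical. The sketch additionally rejects overlong encodings and values above 0x10ffff, which goes beyond what this commit message states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes for this sketch.  */
enum utf8_status { UTF8_OK = 0, UTF8_ILLEGAL = -1, UTF8_INCOMPLETE = -2 };

/* Decode one code point from in[0..len).  On success, store the code point
   in *cp and the number of consumed bytes in *used.  */
static enum utf8_status
utf8_decode_one (const unsigned char *in, size_t len, uint32_t *cp,
                 size_t *used)
{
  assert (in != NULL && cp != NULL && used != NULL);
  if (len == 0)
    return UTF8_INCOMPLETE;

  unsigned char b0 = in[0];
  size_t need;
  uint32_t value, min;

  if (b0 < 0x80)
    {
      *cp = b0;
      *used = 1;
      return UTF8_OK;
    }
  else if ((b0 & 0xe0) == 0xc0)
    need = 2, value = b0 & 0x1f, min = 0x80;
  else if ((b0 & 0xf0) == 0xe0)
    need = 3, value = b0 & 0x0f, min = 0x800;
  else if ((b0 & 0xf8) == 0xf0)
    need = 4, value = b0 & 0x07, min = 0x10000;
  else
    return UTF8_ILLEGAL;

  for (size_t i = 1; i < need; i++)
    {
      if (i >= len)
        return UTF8_INCOMPLETE;
      /* Every tail-byte must have the form 0b10xxxxxx.  */
      if ((in[i] & 0xc0) != 0x80)
        return UTF8_ILLEGAL;
      value = (value << 6) | (uint32_t) (in[i] & 0x3f);
    }

  /* Reject overlong forms, values beyond 0x10ffff and utf16 surrogates.  */
  if (value < min || value > 0x10ffff || (value >= 0xd800 && value <= 0xdfff))
    return UTF8_ILLEGAL;

  *cp = value;
  *used = need;
  return UTF8_OK;
}
```

The byte sequence 0xed 0xa0 0x80 would encode 0xd800 and is therefore rejected, as is 0xe2 0x41 0xac because its second byte is not a tail-byte.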
Furthermore, the etf3eh variants are handling the "UTF-xx//IGNORE" case
now. Before they ignored the ignore-case and always stopped at an error.
ChangeLog:
* sysdeps/s390/s390-64/utf8-utf16-z9.c: Use ifunc to select c,
etf3eh or new vector loop-variant.
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=421c5278d83e72740150259960a431706ac343f9
commit 421c5278d83e72740150259960a431706ac343f9
Author: Stefan Liebler <stli@linux.vnet.ibm.com>
Date: Wed May 25 17:18:05 2016 +0200
S390: Optimize utf8-utf32 module.
This patch reworks the s390 specific module to convert between utf8 and
utf32.
Now ifunc is used to choose either the c or etf3eh (with convert utf
instruction) variants at runtime.
Furthermore a new vector variant for z13 is introduced which will be built
and chosen if vector support is available at build / runtime.
The vector variants optimize input of 1byte utf8 characters. The convert
utf instruction is used if a multibyte utf8 character is found.
This patch also fixes some whitespace errors. The c variants are rejecting
UTF-16 surrogates and values above 0x10ffff now.
Furthermore, the etf3eh variants are handling the "UTF-xx//IGNORE" case
now. Before they ignored the ignore-case and always stopped at an error.
ChangeLog:
* sysdeps/s390/s390-64/utf8-utf32-z9.c: Use ifunc to select c, etf3eh
or new vector loop-variant.
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=81c6380887c6d62c56e5f0f85a241f759f58b2fd
commit 81c6380887c6d62c56e5f0f85a241f759f58b2fd
Author: Stefan Liebler <stli@linux.vnet.ibm.com>
Date: Wed May 25 17:18:05 2016 +0200
S390: Optimize iso-8859-1 to ibm037 iconv-module.
This patch reworks the s390 specific module which used the z900
translate one to one instruction. Now the g5 translate instruction is used,
because it outperforms the troo instruction.
ChangeLog:
* sysdeps/s390/s390-64/iso-8859-1_cp037_z900.c (TROO_LOOP):
Rename to TR_LOOP and use tr instead of troo instruction.
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=3b704e26b33e35d99de920f8462d8e438f89be39
commit 3b704e26b33e35d99de920f8462d8e438f89be39
Author: Stefan Liebler <stli@linux.vnet.ibm.com>
Date: Wed May 25 17:18:04 2016 +0200
S390: Optimize builtin iconv-modules.
This patch introduces a s390 specific gconv_simple.c file which provides
optimized versions for z13 with vector instructions, which will be chosen
at runtime via ifunc.
The optimized conversions can convert between internal and ascii, ucs4,
ucs4le, ucs2, ucs2le.
If the build-environment lacks vector support, then iconv/gconv_simple.c
is used without any change. Otherwise iconv/gconv_simple.c is used to
create conversion loop routines without vector instructions as fallback, if
vector instructions aren't available at runtime.
ChangeLog:
* sysdeps/s390/multiarch/gconv_simple.c: New File.
* sysdeps/s390/multiarch/Makefile (sysdep_routines): Add gconv_simple.
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=4690dab084f854bf0013b5eaabcf90c2d5b692ff
commit 4690dab084f854bf0013b5eaabcf90c2d5b692ff
Author: Stefan Liebler <stli@linux.vnet.ibm.com>
Date: Wed May 25 17:18:04 2016 +0200
S390: Optimize 8bit-generic iconv modules.
This patch introduces a s390 specific 8bit-generic.c file which provides an
optimized version for z13 with translate-/vector-instructions, which will
be chosen at runtime via ifunc.
If the build-environment lacks vector support, then
iconvdata/8bit-generic.c is used without any change. Otherwise
iconvdata/8bit-generic.c is used to create conversion loop routines without
vector instructions as fallback, if vector instructions aren't available at
runtime.
The vector routines can only be used with charsets where the maximum UCS4
value fits in 1 byte size. Then the hardware translate-instruction is used
to translate between up to 256 generic characters and "1 byte UCS4"
characters at once. The vector instructions are used to convert between
the "1 byte UCS4" and UCS4.
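The two-step scheme can be sketched in scalar C. This only illustrates the data flow and is not the s390 code: the function names translate_bytes and widen_to_ucs4 are hypothetical, the table passed in stands in for the generated to_ucs1 table, and plain loops take the place of the translate and vector instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Step 1: translate every input byte through a 256-entry table.  This is
   the job of the hardware translate-instruction; the result is the
   "1 byte UCS4" form.  */
static void
translate_bytes (const unsigned char *in, size_t n,
                 const unsigned char table[256], unsigned char *out)
{
  assert (in != NULL && table != NULL && out != NULL);
  for (size_t i = 0; i < n; i++)
    out[i] = table[in[i]];
}

/* Step 2: widen each 1-byte value to a 4-byte UCS4 value.  This is the
   job of the vector instructions.  */
static void
widen_to_ucs4 (const unsigned char *in, size_t n, uint32_t *out)
{
  assert (in != NULL && out != NULL);
  for (size_t i = 0; i < n; i++)
    out[i] = in[i];
}
```

The restriction stated above follows directly from step 1: a byte-to-byte table can only produce values up to 0xff, so a charset containing a character above U+00FF cannot take this path.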
The gen-8bit.sh script in sysdeps/s390/multiarch generates the conversion
table to_ucs1. Therefore an override define generate-8bit-table, which is
originally defined in iconvdata/Makefile, is added to
sysdeps/s390/multiarch/Makefile. This version calls the gen-8bit.sh in the
iconvdata folder and the s390 one.
ChangeLog:
* sysdeps/s390/multiarch/8bit-generic.c: New File.
* sysdeps/s390/multiarch/gen-8bit.sh: New File.
* sysdeps/s390/multiarch/Makefile (generate-8bit-table):
New override define.
* sysdeps/s390/multiarch/iconv/skeleton.c: Likewise.
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=9b7f05599a92dead97d6683bc838a57bc63ac52b
commit 9b7f05599a92dead97d6683bc838a57bc63ac52b
Author: Stefan Liebler <stli@linux.vnet.ibm.com>
Date: Wed May 25 17:18:04 2016 +0200
S390: Configure check for vector support in gcc.
The S390 specific test checks if the gcc has support for vector registers
by compiling an inline assembly which clobbers vector registers.
On success the macro HAVE_S390_VX_GCC_SUPPORT is defined.
This macro can be used to determine if e.g. clobbering vector registers
is allowed or not.
ChangeLog:
* config.h.in (HAVE_S390_VX_GCC_SUPPORT): New macro undefine.
* sysdeps/s390/configure.ac: Add test for S390 vector register
support in gcc.
* sysdeps/s390/configure: Regenerated.
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=c70e9913d2fc2d0bf6a3ca98a4dece759d40a4ec
commit c70e9913d2fc2d0bf6a3ca98a4dece759d40a4ec
Author: Stefan Liebler <stli@linux.vnet.ibm.com>
Date: Wed May 25 17:18:04 2016 +0200
S390: Get rid of make warning: overriding recipe for target gconv-modules.
This patch introduces a way to provide an architecture dependent
gconv-modules file. Before this patch, the gconv-modules file was normally
installed from src-dir/iconvdata/gconv-modules. The S390 Makefile had
overridden the installation recipe (with a make warning) in order to
install the gconv-modules-s390 file from build-dir.
The iconvdata/Makefile provides another recipe, which copies the
gconv-modules file from src to build dir, where it is used by the
testcases. Thus the testcases do not use the currently built s390-modules.
This patch uses build-dir/iconvdata/gconv-modules for installation, which
is generated by concatenating src-dir/iconvdata/gconv-modules and the
architecture specific one. The latter one can be specified by setting the
variable sysdeps-gconv-modules in sysdeps/.../Makefile.
The architecture specific gconv-modules file is emitted before the common
one because otherwise these modules aren't used in all possible
conversions. E.g. the conversion from INTERNAL to UTF-16 used the common
UTF-16.so module instead of UTF16_UTF32_Z9.so.
This way, the s390-Makefile does not need to override the recipe for
gconv-modules and no warning is emitted anymore.
Since we no longer support empty objpfx the conditional test in
iconvdata/Makefile is removed.
ChangeLog:
* iconvdata/Makefile ($(inst_gconvdir)/gconv-modules):
Install file from $(objpfx)gconv-modules.
($(objpfx)gconv-modules): Concatenate architecture specific file
in variable sysdeps-gconv-modules and gconv-modules in src dir.
* sysdeps/s390/gconv-modules: New file.
* sysdeps/s390/s390-64/Makefile: ($(inst_gconvdir)/gconv-modules):
Deleted.
($(objpfx)gconv-modules-s390): Deleted.
(sysdeps-gconv-modules): New variable.
-----------------------------------------------------------------------
Summary of changes:
ChangeLog | 100 ++
config.h.in | 4 +
iconv/Makefile | 2 +-
iconv/gconv_simple.c | 5 +-
iconv/tst-iconv6.c | 117 +++
iconvdata/Makefile | 10 +-
iconvdata/bug-iconv12.c | 263 ++++++
iconvdata/utf-16.c | 12 +
iconvdata/utf-32.c | 2 +-
sysdeps/s390/Makefile | 31 +
sysdeps/s390/configure | 32 +
sysdeps/s390/configure.ac | 21 +
sysdeps/s390/gconv-modules | 50 +
sysdeps/s390/iso-8859-1_cp037_z900.c | 262 ++++++
sysdeps/s390/multiarch/8bit-generic.c | 415 +++++++++
sysdeps/s390/multiarch/Makefile | 14 +
sysdeps/s390/multiarch/gconv_simple.c | 1266 ++++++++++++++++++++++++++
sysdeps/s390/multiarch/gen-8bit.sh | 6 +
sysdeps/s390/multiarch/iconv/skeleton.c | 21 +
sysdeps/s390/s390-64/Makefile | 81 --
sysdeps/s390/s390-64/iso-8859-1_cp037_z900.c | 237 -----
sysdeps/s390/s390-64/utf16-utf32-z9.c | 337 -------
sysdeps/s390/s390-64/utf8-utf16-z9.c | 471 ----------
sysdeps/s390/s390-64/utf8-utf32-z9.c | 511 -----------
sysdeps/s390/utf16-utf32-z9.c | 605 ++++++++++++
sysdeps/s390/utf8-utf16-z9.c | 818 +++++++++++++++++
sysdeps/s390/utf8-utf32-z9.c | 862 ++++++++++++++++++
27 files changed, 4910 insertions(+), 1645 deletions(-)
create mode 100644 iconv/tst-iconv6.c
create mode 100644 iconvdata/bug-iconv12.c
create mode 100644 sysdeps/s390/Makefile
create mode 100644 sysdeps/s390/gconv-modules
create mode 100644 sysdeps/s390/iso-8859-1_cp037_z900.c
create mode 100644 sysdeps/s390/multiarch/8bit-generic.c
create mode 100644 sysdeps/s390/multiarch/gconv_simple.c
create mode 100644 sysdeps/s390/multiarch/gen-8bit.sh
create mode 100644 sysdeps/s390/multiarch/iconv/skeleton.c
delete mode 100644 sysdeps/s390/s390-64/iso-8859-1_cp037_z900.c
delete mode 100644 sysdeps/s390/s390-64/utf16-utf32-z9.c
delete mode 100644 sysdeps/s390/s390-64/utf8-utf16-z9.c
delete mode 100644 sysdeps/s390/s390-64/utf8-utf32-z9.c
create mode 100644 sysdeps/s390/utf16-utf32-z9.c
create mode 100644 sysdeps/s390/utf8-utf16-z9.c
create mode 100644 sysdeps/s390/utf8-utf32-z9.c