This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.



[Bug locale/19726] Converting UCS4LE to INTERNAL with iconv() does not update pointers and lengths in error-case.


https://sourceware.org/bugzilla/show_bug.cgi?id=19726

--- Comment #1 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU C Library master sources".

The branch, master has been updated
       via  7ab1de21067d72460ac14089bf6541b10fc14c80 (commit)
       via  8f25676c83eef5c85db98f9cf027890fbe810447 (commit)
       via  a42a95c43133d69b1108f582cffa0f6986a9c3da (commit)
       via  52f8a48e24563daa807f94824ce9782b9a9eece9 (commit)
       via  ee518b7070b1bcb41382b6db10f513e071b2c20e (commit)
       via  6896776c3c9c32fd22324e6de6737dd69ae73213 (commit)
       via  5bd11b19099b3f22d821515f9c93f1ecc1a7e15e (commit)
       via  421c5278d83e72740150259960a431706ac343f9 (commit)
       via  81c6380887c6d62c56e5f0f85a241f759f58b2fd (commit)
       via  3b704e26b33e35d99de920f8462d8e438f89be39 (commit)
       via  4690dab084f854bf0013b5eaabcf90c2d5b692ff (commit)
       via  9b7f05599a92dead97d6683bc838a57bc63ac52b (commit)
       via  c70e9913d2fc2d0bf6a3ca98a4dece759d40a4ec (commit)
      from  5ff81530dd14552a48a8fcb119e5867a1b504cc6 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=7ab1de21067d72460ac14089bf6541b10fc14c80

commit 7ab1de21067d72460ac14089bf6541b10fc14c80
Author: Stefan Liebler <stli@linux.vnet.ibm.com>
Date:   Wed May 25 17:18:06 2016 +0200

    Fix UTF-16 surrogate handling. [BZ #19727]

    According to the latest Unicode standard, a conversion from/to UTF-xx has
    to report an error if the character value is in range of an utf16 surrogate
    (0xd800..0xdfff). See
    https://sourceware.org/ml/libc-help/2015-12/msg00015.html.
    Thus this patch fixes this behaviour for converting from utf32 to internal
    and from internal to utf8.
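
To illustrate the surrogate check described above, here is a minimal
sketch in C of a UTF-8 encoder that refuses surrogate values. The name
encode_utf8 and its "return 0 on error" convention are assumptions made
for this example; it is not the code in iconv/gconv_simple.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not glibc code: encode CP as UTF-8 into OUT.
   Returns the number of bytes written (1..4), or 0 if CP is a UTF-16
   surrogate (0xd800..0xdfff) or lies above 0x10ffff.  */
static size_t
encode_utf8 (uint32_t cp, unsigned char out[4])
{
  assert (out != NULL);

  /* Surrogate values must be reported as errors, not encoded.  */
  if (cp >= 0xd800 && cp <= 0xdfff)
    return 0;
  if (cp > 0x10ffff)
    return 0;

  if (cp < 0x80)
    {
      out[0] = (unsigned char) cp;
      return 1;
    }
  if (cp < 0x800)
    {
      out[0] = (unsigned char) (0xc0 | (cp >> 6));
      out[1] = (unsigned char) (0x80 | (cp & 0x3f));
      return 2;
    }
  if (cp < 0x10000)
    {
      out[0] = (unsigned char) (0xe0 | (cp >> 12));
      out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3f));
      out[2] = (unsigned char) (0x80 | (cp & 0x3f));
      return 3;
    }
  out[0] = (unsigned char) (0xf0 | (cp >> 18));
  out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3f));
  out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3f));
  out[3] = (unsigned char) (0x80 | (cp & 0x3f));
  return 4;
}
```

With this sketch, encode_utf8 (0x20ac, buf) writes the three bytes
e2 82 ac, while encode_utf8 (0xd800, buf) returns 0. A real converter
would report EILSEQ at this point instead.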

    Furthermore the conversion from utf16 to internal does not report an error
    if the input-stream consists of two low-surrogate values. If a uint16_t
    value is in the range of 0xd800 .. 0xdfff, the next uint16_t value is
    checked to see whether it is in the range of a low surrogate
    (0xdc00 .. 0xdfff). Afterwards these two uint16_t values are interpreted
    as a high- and low-surrogate pair. But there is no test whether the first
    uint16_t value is really in the range of a high surrogate
    (0xd800 .. 0xdbff). If both uint16_t values are in the range of a low
    surrogate, they are treated as a valid high- and low-surrogate pair.
    This patch adds this test.
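
The missing high-surrogate test can be sketched as follows. This is a
minimal, self-contained illustration; the function decode_utf16 and the
u16_status codes are hypothetical names for this example and do not
appear in iconvdata/utf-16.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes for this sketch.  */
enum u16_status { U16_OK, U16_ILLEGAL, U16_INCOMPLETE };

/* Decode one character from the UTF-16 units IN[0..LEN).  On U16_OK,
   *CP holds the character value and *USED the number of units
   consumed (1 or 2).  */
static enum u16_status
decode_utf16 (const uint16_t *in, size_t len, uint32_t *cp, size_t *used)
{
  assert (in != NULL && cp != NULL && used != NULL);
  if (len == 0)
    return U16_INCOMPLETE;

  uint16_t u1 = in[0];
  if (u1 < 0xd800 || u1 > 0xdfff)
    {
      /* Not a surrogate: the unit is the character itself.  */
      *cp = u1;
      *used = 1;
      return U16_OK;
    }

  /* The first unit must be a HIGH surrogate (0xd800..0xdbff).  Without
     this test, two low surrogates would pass as a valid pair.  */
  if (u1 > 0xdbff)
    return U16_ILLEGAL;

  if (len < 2)
    return U16_INCOMPLETE;

  /* The second unit must be a low surrogate (0xdc00..0xdfff).  */
  uint16_t u2 = in[1];
  if (u2 < 0xdc00 || u2 > 0xdfff)
    return U16_ILLEGAL;

  *cp = 0x10000 + (((uint32_t) (u1 - 0xd800) << 10)
                   | (uint32_t) (u2 - 0xdc00));
  *used = 2;
  return U16_OK;
}
```

For the input { 0xdc00, 0xdc00 } this sketch returns U16_ILLEGAL, whereas
the pair { 0xd83d, 0xde00 } decodes to 0x1f600.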

    This patch also adds a new testcase, which checks UTF conversions with
    input values in range of UTF16 surrogates. The test converts from UTF-xx
    to INTERNAL, from INTERNAL to UTF-xx and directly from UTF-xx to UTF-yy.
    The latter conversion is needed because s390 has iconv-modules, which
    convert from/to UTF in one step.
    The new testcase was tested on a s390, power and intel machine.

    ChangeLog:

        [BZ #19727]
        * iconvdata/utf-16.c (BODY): Report an error if first word is not a
        valid high surrogate.
        * iconvdata/utf-32.c (BODY): Report an error if the value is in range
        of an utf16 surrogate.
        * iconv/gconv_simple.c (BODY): Likewise.
        * iconvdata/bug-iconv12.c: New file.
        * iconvdata/Makefile (tests): Add bug-iconv12.

    rename test

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=8f25676c83eef5c85db98f9cf027890fbe810447

commit 8f25676c83eef5c85db98f9cf027890fbe810447
Author: Stefan Liebler <stli@linux.vnet.ibm.com>
Date:   Wed May 25 17:18:06 2016 +0200

    Fix ucs4le_internal_loop in error case. [BZ #19726]

    When converting from UCS4LE to INTERNAL, the input-value is checked for a
    too large value and the iconv() call sets errno to EILSEQ. In this case
    the inbuf argument of the iconv() call should point to the invalid
    character, but it points to the beginning of the inbuf.
    Thus this patch updates the pointers inptrp and outptrp before returning
    in this error case.
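
The idea of publishing the pointers on every exit path can be sketched
like this. It is a simplified illustration, not the code of
ucs4le_internal_loop: the function name, the loop_status codes and the
0x7fffffff limit used for "too large" are assumptions of this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes for this sketch.  */
enum loop_status
{
  LOOP_EMPTY_INPUT, LOOP_INCOMPLETE_INPUT, LOOP_FULL_OUTPUT,
  LOOP_ILLEGAL_INPUT
};

/* Convert little-endian 4-byte values from *INPTRP..INEND to host-order
   uint32_t values at *OUTPTRP..OUTEND.  */
static enum loop_status
ucs4le_loop_sketch (const unsigned char **inptrp, const unsigned char *inend,
                    uint32_t **outptrp, uint32_t *outend)
{
  assert (inptrp != NULL && outptrp != NULL);
  const unsigned char *inptr = *inptrp;
  uint32_t *outptr = *outptrp;
  enum loop_status status = LOOP_EMPTY_INPUT;

  while ((size_t) (inend - inptr) >= 4)
    {
      if (outptr == outend)
        {
          status = LOOP_FULL_OUTPUT;
          break;
        }

      uint32_t ch = ((uint32_t) inptr[0]
                     | ((uint32_t) inptr[1] << 8)
                     | ((uint32_t) inptr[2] << 16)
                     | ((uint32_t) inptr[3] << 24));

      /* Assumed limit for this sketch: values above 0x7fffffff are
         treated as illegal input.  */
      if (ch > 0x7fffffff)
        {
          status = LOOP_ILLEGAL_INPUT;
          break;
        }

      *outptr++ = ch;
      inptr += 4;
    }

  if (status == LOOP_EMPTY_INPUT && inptr != inend)
    status = LOOP_INCOMPLETE_INPUT;

  /* Publish the progress on every exit path, including the error one,
     so that the caller sees *INPTRP pointing at the offending value.  */
  *inptrp = inptr;
  *outptrp = outptr;
  return status;
}
```

If the third 4-byte value of the input is illegal, the caller gets
LOOP_ILLEGAL_INPUT with *inptrp advanced by 8 bytes and two values
written to the output. Returning from inside the loop without the two
final assignments would leave both pointers at their start values, which
matches the symptom described in this bug.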

    This patch also adds a new testcase for this issue.
    The new test was tested on a s390, power, intel machine.

    ChangeLog:

        [BZ #19726]
        * iconv/gconv_simple.c (ucs4le_internal_loop): Update inptrp and
        outptrp in case of an illegal input.
        * iconv/tst-iconv6.c: New file.
        * iconv/Makefile (tests): Add tst-iconv6.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=a42a95c43133d69b1108f582cffa0f6986a9c3da

commit a42a95c43133d69b1108f582cffa0f6986a9c3da
Author: Stefan Liebler <stli@linux.vnet.ibm.com>
Date:   Wed May 25 17:18:06 2016 +0200

    S390: Fix utf32 to utf16 handling of low surrogates (disable cu42).

    According to the latest Unicode standard, a conversion from/to UTF-xx has
    to report an error if the character value is in range of an utf16 surrogate
    (0xd800..0xdfff). See
    https://sourceware.org/ml/libc-help/2015-12/msg00015.html.

    Thus the cu42 instruction, which converts from utf32 to utf16, has to be
    disabled because it does not report an error in case of a value in range of
    a low surrogate (0xdc00..0xdfff). The etf3eh variant is removed and the c
    and vector variants are adjusted to handle the value in range of an utf16
    low surrogate correctly.

    ChangeLog:

        * sysdeps/s390/utf16-utf32-z9.c: Disable cu42 instruction and report
        an error in case of a value in range of an utf16 low surrogate.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=52f8a48e24563daa807f94824ce9782b9a9eece9

commit 52f8a48e24563daa807f94824ce9782b9a9eece9
Author: Stefan Liebler <stli@linux.vnet.ibm.com>
Date:   Wed May 25 17:18:05 2016 +0200

    S390: Fix utf32 to utf8 handling of low surrogates (disable cu41).

    According to the latest Unicode standard, a conversion from/to UTF-xx has
    to report an error if the character value is in range of an utf16 surrogate
    (0xd800..0xdfff). See
    https://sourceware.org/ml/libc-help/2015-12/msg00015.html.

    Thus the cu41 instruction, which converts from utf32 to utf8, has to be
    disabled because it does not report an error in case of a value in range of
    a low surrogate (0xdc00..0xdfff). The etf3eh variant is removed and the c
    and vector variants are adjusted to handle the value in range of an utf16
    low surrogate correctly.

    ChangeLog:

        * sysdeps/s390/utf8-utf32-z9.c: Disable cu41 instruction and report
        an error in case of a value in range of an utf16 low surrogate.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=ee518b7070b1bcb41382b6db10f513e071b2c20e

commit ee518b7070b1bcb41382b6db10f513e071b2c20e
Author: Stefan Liebler <stli@linux.vnet.ibm.com>
Date:   Wed May 25 17:18:05 2016 +0200

    S390: Use s390-64 specific iconv-modules on s390-32, too.

    This patch reworks the existing s390 64bit specific iconv modules in order
    to use them on s390 31bit, too.

    Thus the parts for subdirectory iconvdata in sysdeps/s390/s390-64/Makefile
    were moved to sysdeps/s390/Makefile so that they apply on 31bit, too.
    All those modules are moved from sysdeps/s390/s390-64 directory to
    sysdeps/s390.

    The iso-8859-1 to/from cp037 module was adjusted to use the brct (branch
    relative on count) instruction on 31bit s390 instead of brctg, because
    brctg is a zarch instruction and is not available on a 31bit kernel.

    The utf modules are using zarch instructions, thus the directive
    machinemode zarch_nohighgprs was added to the inline assemblies to omit
    the high-gprs flag in the shared libraries. Otherwise they can't be loaded
    on a 31bit kernel.
    The ifunc resolvers were adjusted in order to call the etf3eh or vector
    variants only if zarch instructions are available (64bit kernel in 31bit
    compat-mode).
    Furthermore some variable types were changed. E.g. unsigned long long
    would be a register pair on s390 31bit, but we want only one single
    register.
    For variables of type size_t the register contents have to be enlarged
    from a 32bit to a 64bit value on 31bit, because the inline assemblies use
    64bit values in such cases.

    ChangeLog:

        * sysdeps/s390/s390-64/Makefile (iconvdata-subdirectory):
        Move to ...
        * sysdeps/s390/Makefile: ... here.
        * sysdeps/s390/s390-64/iso-8859-1_cp037_z900.c: Move to ...
        * sysdeps/s390/iso-8859-1_cp037_z900.c: ... here.
        (BRANCH_ON_COUNT): New define.
        (TR_LOOP): Use BRANCH_ON_COUNT instead of brctg.
        * sysdeps/s390/s390-64/utf16-utf32-z9.c: Move to ...
        * sysdeps/s390/utf16-utf32-z9.c: ... here and adjust to
        run on s390-32, too.
        * sysdeps/s390/s390-64/utf8-utf16-z9.c: Move to ...
        * sysdeps/s390/utf8-utf16-z9.c: ... here and adjust to
        run on s390-32, too.
        * sysdeps/s390/s390-64/utf8-utf32-z9.c: Move to ...
        * sysdeps/s390/utf8-utf32-z9.c: ... here and adjust to
        run on s390-32, too.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=6896776c3c9c32fd22324e6de6737dd69ae73213

commit 6896776c3c9c32fd22324e6de6737dd69ae73213
Author: Stefan Liebler <stli@linux.vnet.ibm.com>
Date:   Wed May 25 17:18:05 2016 +0200

    S390: Optimize utf16-utf32 module.

    This patch reworks the s390 specific module to convert between utf16 and
    utf32.
    Now ifunc is used to choose either the c or etf3eh (with convert utf
    instruction) variants at runtime.
    Furthermore a new vector variant for z13 is introduced which will be built
    and chosen if vector support is available at build / runtime.

    In case of converting utf32 to utf16, the vector variant optimizes input
    of 2byte utf16 characters. The convert utf instruction is used if an utf16
    surrogate is found.

    For the other direction utf16 to utf32, the cu24 instruction can't be
    re-enabled, because it does not report an error, if the input-stream
    consists of a single low surrogate utf16 char (e.g. 0xdc00). This applies
    to the newest z13, too. Thus there is only the c or the new vector
    variant, which can handle utf16 surrogate characters.

    This patch also fixes some whitespace errors. Furthermore, the etf3eh
    variant is handling the "UTF-xx//IGNORE" case now. Before, it ignored the
    ignore-case and always stopped at an error.

    ChangeLog:

        * sysdeps/s390/s390-64/utf16-utf32-z9.c: Use ifunc to select c,
        etf3eh or new vector loop-variant.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=5bd11b19099b3f22d821515f9c93f1ecc1a7e15e

commit 5bd11b19099b3f22d821515f9c93f1ecc1a7e15e
Author: Stefan Liebler <stli@linux.vnet.ibm.com>
Date:   Wed May 25 17:18:05 2016 +0200

    S390: Optimize utf8-utf16 module.

    This patch reworks the s390 specific module to convert between utf8 and
    utf16.
    Now ifunc is used to choose either the c or etf3eh (with convert utf
    instruction) variants at runtime. Furthermore a new vector variant for z13
    is introduced which will be built and chosen if vector support is
    available at build / runtime.

    In case of converting utf8 to utf16, the vector variant optimizes input of
    1byte utf8 characters. The convert utf instruction is used if a multibyte
    utf8 character is found.

    For the other direction utf16 to utf8, the cu21 instruction can't be
    re-enabled, because it does not report an error, if the input-stream
    consists of a single low surrogate utf16 char (e.g. 0xdc00). This applies
    to the newest z13, too. Thus there is only the c or the new vector
    variant, which can handle 1..4 byte utf8 characters.

    The c variant from utf16 to utf8 has been fixed. If a high surrogate was
    at the end of the input-buffer, then errno was set to EINVAL and the
    input-pointer pointed just after the high surrogate. Now it points to the
    beginning of the high surrogate.

    This patch also fixes some whitespace errors. The c variant from utf8 to
    utf16 is now checking that tail-bytes start with 0b10... and the value is
    not in range of an utf16 surrogate.
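
The two checks named above (tail-bytes start with 0b10 and the value is
not a surrogate) can be sketched in C as follows. The function
decode_utf8 and the u8_status codes are hypothetical names for this
example. The sketch decodes to a character value rather than to utf16,
and it is not the code from utf8-utf16-z9.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes for this sketch.  */
enum u8_status { U8_OK, U8_ILLEGAL, U8_INCOMPLETE };

/* Decode one character from the UTF-8 bytes IN[0..LEN).  On U8_OK, *CP
   holds the character value and *USED the number of bytes consumed.  */
static enum u8_status
decode_utf8 (const unsigned char *in, size_t len, uint32_t *cp, size_t *used)
{
  assert (in != NULL && cp != NULL && used != NULL);
  if (len == 0)
    return U8_INCOMPLETE;

  size_t need;
  uint32_t ch, min;
  unsigned char b0 = in[0];
  if (b0 < 0x80)
    { need = 1; ch = b0; min = 0; }
  else if ((b0 & 0xe0) == 0xc0)
    { need = 2; ch = b0 & 0x1f; min = 0x80; }
  else if ((b0 & 0xf0) == 0xe0)
    { need = 3; ch = b0 & 0x0f; min = 0x800; }
  else if ((b0 & 0xf8) == 0xf0)
    { need = 4; ch = b0 & 0x07; min = 0x10000; }
  else
    return U8_ILLEGAL;

  for (size_t i = 1; i < need; i++)
    {
      if (i == len)
        return U8_INCOMPLETE;
      /* Every tail-byte must have the form 0b10xxxxxx.  */
      if ((in[i] & 0xc0) != 0x80)
        return U8_ILLEGAL;
      ch = (ch << 6) | (uint32_t) (in[i] & 0x3f);
    }

  /* Reject overlong forms, values above 0x10ffff and values in the
     range of a UTF-16 surrogate.  */
  if (ch < min || ch > 0x10ffff || (ch >= 0xd800 && ch <= 0xdfff))
    return U8_ILLEGAL;

  *cp = ch;
  *used = need;
  return U8_OK;
}
```

With this sketch the bytes e2 82 ac decode to 0x20ac. The sequence
ed a0 80, which would encode the surrogate value 0xd800, and the
sequence e2 41 ac, which has a bad tail-byte, are both rejected with
U8_ILLEGAL.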

    Furthermore, the etf3eh variants are handling the "UTF-xx//IGNORE" case
    now. Before they ignored the ignore-case and always stopped at an error.

    ChangeLog:

        * sysdeps/s390/s390-64/utf8-utf16-z9.c: Use ifunc to select c,
        etf3eh or new vector loop-variant.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=421c5278d83e72740150259960a431706ac343f9

commit 421c5278d83e72740150259960a431706ac343f9
Author: Stefan Liebler <stli@linux.vnet.ibm.com>
Date:   Wed May 25 17:18:05 2016 +0200

    S390: Optimize utf8-utf32 module.

    This patch reworks the s390 specific module to convert between utf8 and
    utf32.
    Now ifunc is used to choose either the c or etf3eh (with convert utf
    instruction) variants at runtime.
    Furthermore a new vector variant for z13 is introduced which will be built
    and chosen if vector support is available at build / runtime.
    The vector variants optimize input of 1byte utf8 characters. The convert
    utf instruction is used if a multibyte utf8 character is found.

    This patch also fixes some whitespace errors. The c variants are rejecting
    UTF-16 surrogates and values above 0x10ffff now.
    Furthermore, the etf3eh variants are handling the "UTF-xx//IGNORE" case
    now. Before they ignored the ignore-case and always stopped at an error.

    ChangeLog:

        * sysdeps/s390/s390-64/utf8-utf32-z9.c: Use ifunc to select c, etf3eh
        or new vector loop-variant.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=81c6380887c6d62c56e5f0f85a241f759f58b2fd

commit 81c6380887c6d62c56e5f0f85a241f759f58b2fd
Author: Stefan Liebler <stli@linux.vnet.ibm.com>
Date:   Wed May 25 17:18:05 2016 +0200

    S390: Optimize iso-8859-1 to ibm037 iconv-module.

    This patch reworks the s390 specific module which used the z900
    translate one to one instruction. Now the g5 translate instruction is used,
    because it outperforms the troo instruction.

    ChangeLog:

        * sysdeps/s390/s390-64/iso-8859-1_cp037_z900.c (TROO_LOOP):
        Rename to TR_LOOP and usage of tr instead of troo instruction.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=3b704e26b33e35d99de920f8462d8e438f89be39

commit 3b704e26b33e35d99de920f8462d8e438f89be39
Author: Stefan Liebler <stli@linux.vnet.ibm.com>
Date:   Wed May 25 17:18:04 2016 +0200

    S390: Optimize builtin iconv-modules.

    This patch introduces a s390 specific gconv_simple.c file which provides
    optimized versions for z13 with vector instructions, which will be chosen
    at runtime via ifunc.
    The optimized conversions can convert between internal and ascii, ucs4,
    ucs4le, ucs2, ucs2le.
    If the build-environment lacks vector support, then iconv/gconv_simple.c
    is used without any change. Otherwise iconvdata/gconv_simple.c is used to
    create conversion loop routines without vector instructions as fallback,
    if vector instructions aren't available at runtime.

    ChangeLog:

        * sysdeps/s390/multiarch/gconv_simple.c: New File.
        * sysdeps/s390/multiarch/Makefile (sysdep_routines): Add gconv_simple.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=4690dab084f854bf0013b5eaabcf90c2d5b692ff

commit 4690dab084f854bf0013b5eaabcf90c2d5b692ff
Author: Stefan Liebler <stli@linux.vnet.ibm.com>
Date:   Wed May 25 17:18:04 2016 +0200

    S390: Optimize 8bit-generic iconv modules.

    This patch introduces a s390 specific 8bit-generic.c file which provides
    an optimized version for z13 with translate-/vector-instructions, which
    will be chosen at runtime via ifunc.
    If the build-environment lacks vector support, then
    iconvdata/8bit-generic.c is used without any change. Otherwise
    iconvdata/8bit-generic.c is used to create conversion loop routines
    without vector instructions as fallback, if vector instructions aren't
    available at runtime.

    The vector routines can only be used with charsets where the maximum UCS4
    value fits in 1 byte size. Then the hardware translate-instruction is used
    to translate between up to 256 generic characters and "1 byte UCS4"
    characters at once. The vector instructions are used to convert between
    the "1 byte UCS4" and UCS4.

    The gen-8bit.sh script in sysdeps/s390/multiarch generates the conversion
    table to_ucs1. Therefore an override define generate-8bit-table, which is
    originally defined in iconvdata/Makefile, is added to
    sysdeps/s390/multiarch/Makefile. This version calls the gen-8bit.sh in the
    iconvdata folder and the s390 one.

    ChangeLog:

        * sysdeps/s390/multiarch/8bit-generic.c: New File.
        * sysdeps/s390/multiarch/gen-8bit.sh: New File.
        * sysdeps/s390/multiarch/Makefile (generate-8bit-table):
        New override define.
        * sysdeps/s390/multiarch/iconv/skeleton.c: Likewise.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=9b7f05599a92dead97d6683bc838a57bc63ac52b

commit 9b7f05599a92dead97d6683bc838a57bc63ac52b
Author: Stefan Liebler <stli@linux.vnet.ibm.com>
Date:   Wed May 25 17:18:04 2016 +0200

    S390: Configure check for vector support in gcc.

    The S390 specific test checks if the gcc has support for vector registers
    by compiling an inline assembly which clobbers vector registers.
    On success the macro HAVE_S390_VX_GCC_SUPPORT is defined.
    This macro can be used to determine if e.g. clobbering vector registers
    is allowed or not.

    ChangeLog:

        * config.h.in (HAVE_S390_VX_GCC_SUPPORT): New macro undefine.
        * sysdeps/s390/configure.ac: Add test for S390 vector register
        support in gcc.
        * sysdeps/s390/configure: Regenerated.

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=c70e9913d2fc2d0bf6a3ca98a4dece759d40a4ec

commit c70e9913d2fc2d0bf6a3ca98a4dece759d40a4ec
Author: Stefan Liebler <stli@linux.vnet.ibm.com>
Date:   Wed May 25 17:18:04 2016 +0200

    S390: Get rid of make warning: overriding recipe for target gconv-modules.

    This patch introduces a way to provide an architecture dependent
    gconv-modules file. Before this patch, the gconv-modules file was normally
    installed from src-dir/iconvdata/gconv-modules. The S390 Makefile had
    overridden the installation recipe (with a make warning) in order to
    install the gconv-module-s390 file from build-dir.
    The iconvdata/Makefile provides another recipe, which copies the
    gconv-modules file from src to build dir, which is used by the testcases.
    Thus the testcases do not use the currently built s390-modules.

    This patch uses build-dir/iconvdata/gconv-modules for installation, which
    is generated by concatenating src-dir/iconvdata/gconv-modules and the
    architecture specific one. The latter one can be specified by setting the
    variable sysdeps-gconv-modules in sysdeps/.../Makefile.

    The architecture specific gconv-modules file is emitted before the common
    one because these modules aren't used in all possible conversions. E.g.
    the conversion from INTERNAL to UTF-16 used the common UTF-16.so module
    instead of UTF16_UTF32_Z9.so.

    This way, the s390-Makefile does not need to override the recipe for
    gconv-modules and no warning is emitted anymore.
    Since we no longer support empty objpfx the conditional test in
    iconvdata/Makefile is removed.

    ChangeLog:

        * iconvdata/Makefile ($(inst_gconvdir)/gconv-modules):
        Install file from $(objpfx)gconv-modules.
        ($(objpfx)gconv-modules): Concatenate architecture specific file
        in variable sysdeps-gconv-modules and gconv-modules in src dir.
        * sysdeps/s390/gconv-modules: New file.
        * sysdeps/s390/s390-64/Makefile: ($(inst_gconvdir)/gconv-modules):
        Deleted.
        ($(objpfx)gconv-modules-s390): Deleted.
        (sysdeps-gconv-modules): New variable.

-----------------------------------------------------------------------

Summary of changes:
 ChangeLog                                    |  100 ++
 config.h.in                                  |    4 +
 iconv/Makefile                               |    2 +-
 iconv/gconv_simple.c                         |    5 +-
 iconv/tst-iconv6.c                           |  117 +++
 iconvdata/Makefile                           |   10 +-
 iconvdata/bug-iconv12.c                      |  263 ++++++
 iconvdata/utf-16.c                           |   12 +
 iconvdata/utf-32.c                           |    2 +-
 sysdeps/s390/Makefile                        |   31 +
 sysdeps/s390/configure                       |   32 +
 sysdeps/s390/configure.ac                    |   21 +
 sysdeps/s390/gconv-modules                   |   50 +
 sysdeps/s390/iso-8859-1_cp037_z900.c         |  262 ++++++
 sysdeps/s390/multiarch/8bit-generic.c        |  415 +++++++++
 sysdeps/s390/multiarch/Makefile              |   14 +
 sysdeps/s390/multiarch/gconv_simple.c        | 1266 ++++++++++++++++++++++++++
 sysdeps/s390/multiarch/gen-8bit.sh           |    6 +
 sysdeps/s390/multiarch/iconv/skeleton.c      |   21 +
 sysdeps/s390/s390-64/Makefile                |   81 --
 sysdeps/s390/s390-64/iso-8859-1_cp037_z900.c |  237 -----
 sysdeps/s390/s390-64/utf16-utf32-z9.c        |  337 -------
 sysdeps/s390/s390-64/utf8-utf16-z9.c         |  471 ----------
 sysdeps/s390/s390-64/utf8-utf32-z9.c         |  511 -----------
 sysdeps/s390/utf16-utf32-z9.c                |  605 ++++++++++++
 sysdeps/s390/utf8-utf16-z9.c                 |  818 +++++++++++++++++
 sysdeps/s390/utf8-utf32-z9.c                 |  862 ++++++++++++++++++
 27 files changed, 4910 insertions(+), 1645 deletions(-)
 create mode 100644 iconv/tst-iconv6.c
 create mode 100644 iconvdata/bug-iconv12.c
 create mode 100644 sysdeps/s390/Makefile
 create mode 100644 sysdeps/s390/gconv-modules
 create mode 100644 sysdeps/s390/iso-8859-1_cp037_z900.c
 create mode 100644 sysdeps/s390/multiarch/8bit-generic.c
 create mode 100644 sysdeps/s390/multiarch/gconv_simple.c
 create mode 100644 sysdeps/s390/multiarch/gen-8bit.sh
 create mode 100644 sysdeps/s390/multiarch/iconv/skeleton.c
 delete mode 100644 sysdeps/s390/s390-64/iso-8859-1_cp037_z900.c
 delete mode 100644 sysdeps/s390/s390-64/utf16-utf32-z9.c
 delete mode 100644 sysdeps/s390/s390-64/utf8-utf16-z9.c
 delete mode 100644 sysdeps/s390/s390-64/utf8-utf32-z9.c
 create mode 100644 sysdeps/s390/utf16-utf32-z9.c
 create mode 100644 sysdeps/s390/utf8-utf16-z9.c
 create mode 100644 sysdeps/s390/utf8-utf32-z9.c

