This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
[PATCH] Fix up regcomp/regexec
- From: Jakub Jelinek <jakub at redhat dot com>
- To: Ulrich Drepper <drepper at gmail dot com>
- Cc: libc-alpha at sources dot redhat dot com
- Date: Fri, 30 Dec 2011 17:18:16 +0100
- Subject: [PATCH] Fix up regcomp/regexec
- Reply-to: Jakub Jelinek <jakub at redhat dot com>
Hi!
When building glibc with trunk gcc, many regex tests fail:
make[2]: *** [/builddir/build/BUILD/glibc-2.14-16c6f99/build-x86_64-redhat-linux/posix/runtests.out] Error 1
make[2]: *** [/builddir/build/BUILD/glibc-2.14-16c6f99/build-x86_64-redhat-linux/posix/runptests.out] Error 1
make[2]: *** [/builddir/build/BUILD/glibc-2.14-16c6f99/build-x86_64-redhat-linux/posix/bug-regex16.out] Error 1
make[2]: *** [/builddir/build/BUILD/glibc-2.14-16c6f99/build-x86_64-redhat-linux/posix/bug-regex18.out] Error 1
make[2]: *** [/builddir/build/BUILD/glibc-2.14-16c6f99/build-x86_64-redhat-linux/posix/bug-regex20.out] Error 1
/[[:lower:]]+/: Unmatched [ or [^
make[2]: *** [/builddir/build/BUILD/glibc-2.14-16c6f99/build-x86_64-redhat-linux/posix/transbug.out] Error 1
make[2]: *** [/builddir/build/BUILD/glibc-2.14-16c6f99/build-x86_64-redhat-linux/posix/tst-rxspencer.out] Error 1
make[2]: *** [/builddir/build/BUILD/glibc-2.14-16c6f99/build-x86_64-redhat-linux/posix/tst-boost.out] Error 1
make[2]: *** [/builddir/build/BUILD/glibc-2.14-16c6f99/build-x86_64-redhat-linux/posix/tst-pcre.out] Error 1
The problem is that parse_bracket_symbol is miscompiled, and it turns
out this is because of an incorrect attribute on re_string_fetch_byte_case.
Unlike re_string_peek_byte_case, this one is really not pure: it modifies
memory (increments pstr->cur_idx). With the pure attribute, GCC assumed it
doesn't and cached the presumed value of regexp->cur_idx in a variable
across the
  for (;; ++i)
    {
      if (i >= BRACKET_NAME_BUF_SIZE)
        return REG_EBRACK;
      if (token->type == OP_OPEN_CHAR_CLASS)
        ch = re_string_fetch_byte_case (regexp);
      else
        ch = re_string_fetch_byte (regexp);
      if (re_string_eoi(regexp))
        return REG_EBRACK;
      if (ch == delim && re_string_peek_byte (regexp, 0) == ']')
        break;
      elem->opr.name[i] = ch;
    }
re_string_fetch_byte_case (regexp) call and used that during
re_string_peek_byte, so on e.g.
#include <regex.h>
#include <stdlib.h>

int
main (void)
{
  regex_t reg;
  if (regcomp (&reg, "x[[:alnum:]]z", 0) != REG_NOERROR)
    abort ();
  return 0;
}
testcase it wouldn't terminate on the second ':' character, because
it would see re_string_peek_byte (regexp, 0) returning again ':' instead
of ']'.
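
To make the distinction concrete, here is a minimal standalone sketch of
the same loop shape. It is not the glibc code: struct cursor, cursor_peek,
cursor_fetch, cursor_eoi and scan_bracket_name are hypothetical stand-ins
for re_string_t and its helpers, and the buffer-size check is simplified.
The point is only that the peek helper can truthfully be marked pure,
while the fetch helper writes cur_idx and so must not be.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for re_string_t: a buffer plus a read cursor.  */
struct cursor
{
  const char *buf;
  size_t len;
  size_t cur_idx;
};

/* Only reads memory, so the pure attribute is a truthful promise.  */
static unsigned char __attribute__ ((pure))
cursor_peek (const struct cursor *c, size_t off)
{
  return (unsigned char) c->buf[c->cur_idx + off];
}

/* Advances cur_idx, i.e. writes memory.  It must NOT be declared pure,
   otherwise the compiler may keep using a stale cur_idx after the call.  */
static unsigned char
cursor_fetch (struct cursor *c)
{
  return (unsigned char) c->buf[c->cur_idx++];
}

/* Nonzero once the cursor has consumed the whole buffer.  */
static int
cursor_eoi (const struct cursor *c)
{
  return c->cur_idx >= c->len;
}

/* Copy a class name such as "alnum" out of "alnum:]...", stopping at
   DELIM followed by ']'.  Returns the name length, or -1 if the
   terminator is missing or the name does not fit.  */
static int
scan_bracket_name (struct cursor *c, unsigned char delim,
                   char *name, size_t name_size)
{
  size_t i;
  for (i = 0;; ++i)
    {
      unsigned char ch;
      if (i + 1 >= name_size)
        return -1;
      ch = cursor_fetch (c);
      if (cursor_eoi (c))
        return -1;
      /* This peek must observe the cur_idx updated by cursor_fetch.  */
      if (ch == delim && cursor_peek (c, 0) == ']')
        break;
      name[i] = (char) ch;
    }
  name[i] = '\0';
  return (int) i;
}
```

Calling scan_bracket_name on a cursor over "alnum:]]z" with ':' as the
delimiter stops at the ':' that is followed by ']'. If cursor_fetch were
wrongly marked pure, the compiler would be entitled to evaluate the peek
against the old index, which is the failure described above.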
Fixed thusly:
2011-12-30  Jakub Jelinek  <jakub@redhat.com>

	* posix/regex_internal.c (re_string_fetch_byte_case): Remove
	pure attribute.

--- libc/posix/regex_internal.c.jj 2011-11-23 11:06:23.000000000 +0100
+++ libc/posix/regex_internal.c 2011-12-30 16:56:38.948973129 +0100
@@ -868,7 +868,7 @@ re_string_peek_byte_case (const re_strin
 }
 
 static unsigned char
-internal_function __attribute ((pure))
+internal_function
 re_string_fetch_byte_case (re_string_t *pstr)
 {
   if (BE (!pstr->mbs_allocated, 1))

Jakub