This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Sync gnulib regex with glibc


Hi Paul,

While syncing the gnulib regex files with glibc, I found that a gnulib
change triggers a regression in a glibc test case:

posix/bug-regex28.c:

struct tests
{
  const char *regex;
  const char *string;
  reg_syntax_t syntax;
  int retval;
} tests[] = {
#define EGREP RE_SYNTAX_EGREP
#define EGREP_NL (RE_SYNTAX_EGREP | RE_DOT_NEWLINE) & ~RE_HAT_LISTS_NOT_NEWLINE
  { "a.b", "a\nb", EGREP, -1 },
  { "a.b", "a\nb", EGREP_NL, 0 },
  { "a[^x]b", "a\nb", EGREP, -1 },
  { "a[^x]b", "a\nb", EGREP_NL, 0 },

Basically, the test expects "a.b" with RE_SYNTAX_EGREP not to match
"a\nb", but with the way RE_SYNTAX_EGREP is now defined in gnulib
this no longer holds:

--- posix/regex.h       2017-12-21 12:57:39.814761254 -0200
+++ ../../gnulib/gnulib-lib/lib/regex.h 2017-09-14 14:20:50.809030813 -0300

-#define RE_SYNTAX_GREP                                                 \
-  (RE_BK_PLUS_QM              | RE_CHAR_CLASSES                                \
-   | RE_HAT_LISTS_NOT_NEWLINE | RE_INTERVALS                           \
-   | RE_NEWLINE_ALT)
-
-#define RE_SYNTAX_EGREP                                                        \
-  (RE_CHAR_CLASSES        | RE_CONTEXT_INDEP_ANCHORS                   \
-   | RE_CONTEXT_INDEP_OPS | RE_HAT_LISTS_NOT_NEWLINE                   \
-   | RE_NEWLINE_ALT       | RE_NO_BK_PARENS                            \
-   | RE_NO_BK_VBAR)
-
-#define RE_SYNTAX_POSIX_EGREP                                          \
-  (RE_SYNTAX_EGREP | RE_INTERVALS | RE_NO_BK_BRACES                    \
-   | RE_INVALID_INTERVAL_ORD)
+# define RE_SYNTAX_GREP                                                        \
+  ((RE_SYNTAX_POSIX_BASIC | RE_NEWLINE_ALT)                            \
+   & ~(RE_CONTEXT_INVALID_DUP | RE_DOT_NOT_NULL))
+
+# define RE_SYNTAX_EGREP                                               \
+  ((RE_SYNTAX_POSIX_EXTENDED | RE_INVALID_INTERVAL_ORD | RE_NEWLINE_ALT) \
+   & ~(RE_CONTEXT_INVALID_OPS | RE_DOT_NOT_NULL))
+
+/* POSIX grep -E behavior is no longer incompatible with GNU.  */
+# define RE_SYNTAX_POSIX_EGREP                                         \
+  RE_SYNTAX_EGREP
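
The effect of the two definitions can be checked without depending on
which regex.h is installed, by spelling both masks out by hand. The
following is only a minimal sketch, assuming glibc's <regex.h> with
_GNU_SOURCE defined. OLD_EGREP and NEW_EGREP are illustrative names
whose values are copied from the removed and added lines of the diff,
and search_with_syntax is a hypothetical helper, not part of
bug-regex28.c.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE 1  /* Exposes the GNU re_* API and RE_* syntax bits.  */
#endif
#include <assert.h>
#include <regex.h>
#include <string.h>

/* RE_SYNTAX_EGREP as removed by the diff (old glibc definition).
   OLD_EGREP is an illustrative name.  */
#define OLD_EGREP                                               \
  (RE_CHAR_CLASSES | RE_CONTEXT_INDEP_ANCHORS                   \
   | RE_CONTEXT_INDEP_OPS | RE_HAT_LISTS_NOT_NEWLINE            \
   | RE_NEWLINE_ALT | RE_NO_BK_PARENS | RE_NO_BK_VBAR)

/* RE_SYNTAX_EGREP as added by the diff (gnulib definition).
   NEW_EGREP is an illustrative name.  */
#define NEW_EGREP                                                        \
  ((RE_SYNTAX_POSIX_EXTENDED | RE_INVALID_INTERVAL_ORD | RE_NEWLINE_ALT) \
   & ~(RE_CONTEXT_INVALID_OPS | RE_DOT_NOT_NULL))

/* Hypothetical helper: compile PATTERN under SYNTAX and search SUBJECT
   from offset 0.  Returns the match offset, -1 if nothing matches, or
   -2 on a compile or internal error.  */
static int
search_with_syntax (reg_syntax_t syntax, const char *pattern,
                    const char *subject)
{
  struct re_pattern_buffer buf;
  const char *err;
  int len, rv;

  assert (pattern != NULL && subject != NULL);

  re_set_syntax (syntax);
  memset (&buf, 0, sizeof buf);
  err = re_compile_pattern (pattern, strlen (pattern), &buf);
  if (err != NULL)
    return -2;

  len = (int) strlen (subject);
  rv = re_search (&buf, subject, len, 0, len, NULL);
  regfree (&buf);
  return rv;
}
```

For example, search_with_syntax (OLD_EGREP, "a.b", "a\nb") and
search_with_syntax (NEW_EGREP, "a.b", "a\nb") compare the two masks on
the first test entry. The difference should come from the new mask
inheriting RE_DOT_NEWLINE through RE_SYNTAX_POSIX_EXTENDED and no
longer setting RE_HAT_LISTS_NOT_NEWLINE.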


The glibc test file, bug-regex28.c, is related to BZ#3957 [1], which is
not strictly related to the RE_SYNTAX_{E}GREP definitions. On the gnulib
side, the change was made fairly recently (2015) by commit 5a5a9388e.

It does look like a correct change, but my concern from the glibc
standpoint is whether it would require a compatibility implementation
(potentially mapping RE_SYNTAX_{E}GREP to the old definition through a
compat symbol).

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=3957

