This is the mail archive of the glibc-bugs-regex@sourceware.org mailing list for the glibc project.


[Bug regex/19376] New: regcomp.c needs to be upgraded to GNU Grep's one


https://sourceware.org/bugzilla/show_bug.cgi?id=19376

            Bug ID: 19376
           Summary: regcomp.c needs to be upgraded to GNU Grep's one
           Product: glibc
           Version: 2.22
            Status: NEW
          Severity: normal
          Priority: P2
         Component: regex
          Assignee: unassigned at sourceware dot org
          Reporter: t.rus76 at ya dot ru
                CC: drepper.fsp at gmail dot com
  Target Milestone: ---

Symptom: GNU Grep does not handle Syriac characters (U+0700 – U+074F) correctly

$ echo 'ÜÜÜÜ' > peace
$ egrep '\<[Ü-Ü]' peace
grep: Invalid collation character
$ awk /'\<[Ü-Ü]'/ peace
ÜÜÜÜ

However, when grep is built with ./configure --with-included-regex,
it works fine and there is no REG_ECOLLATE error:

$ echo ÜÜÜÜ | src/egrep [Ü-Ü]
ÜÜÜÜ
$ echo ÜÜÜÜ | src/egrep [Ü-Ü]
$

This is because GNU Grep bundles an improved version of regcomp.c.
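
When comparing the system grep with a build made using --with-included-regex, grep's exit status separates the outcomes shown above: 0 for a match, 1 for no match, and 2 for an error such as the "Invalid collation character" failure. The following is a minimal sketch, assuming a POSIX shell; the `classify` helper and the `GREP` variable are hypothetical names introduced here for illustration and are not part of grep or glibc.

```shell
#!/bin/sh
# Hypothetical helper: feed one line of input to a grep binary and report
# which of the three documented exit statuses it returned.
# GREP defaults to the system grep; set e.g. GREP=src/grep (assumed path of
# a local build) to test a binary built with --with-included-regex.
classify() {
    pattern=$1
    input=$2
    # $? after the pipeline is the exit status of grep, the last command.
    printf '%s\n' "$input" | ${GREP:-grep} -E -- "$pattern" >/dev/null 2>&1
    case $? in
        0) echo "match" ;;
        1) echo "no match" ;;
        *) echo "error" ;;
    esac
}

# ASCII ranges as a sanity check. Substitute the Syriac range and input
# from the report to see whether the regcomp in use rejects the range.
classify '[a-c]' 'abc'
classify '[x-z]' 'abc'
```

Running the same pattern once with the default grep and once with `GREP` pointing at the locally built binary should show whether the failure comes from the C library's regcomp or from grep itself. The result for non-ASCII ranges depends on the active locale and the glibc version.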

The bug was found here:
http://forum.rosalab.ru/viewtopic.php?f=53&t=6219&p=54747 (in Russian)

It has also been tested and confirmed on Gentoo (both glibc and grep are version 2.22).


I expect there are other bugs that could be fixed with this upgrade.
