This is the mail archive of the glibc-bugs@sources.redhat.com mailing list for the glibc project.



[Bug regex/522] New: [regex] charset-based optimizations inhibited outside glibc


The attached patch enables optimizations based on the active charset being
UTF-8, or a superset of ASCII, even when regex is being compiled outside glibc.

While a more complete solution would use gnulib and gettext's locale_charset
function, this would make central tools such as /bin/sed and /bin/awk require
external files.  Since regex does not need the charset name, but only to know if
it is UTF-8, we can use simple string matching.
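As a rough illustration of the string-matching idea (not the attached patch itself, which is not reproduced in this message), the check could look like the sketch below. The helper names name_is_utf8 and charset_is_utf8 are hypothetical, and the use of nl_langinfo (CODESET) as the source of the charset name is an assumption.

```c
#include <langinfo.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return 1 if NAME spells "UTF-8" or "UTF8",
   ignoring the case of the letters, and 0 otherwise.  */
int
name_is_utf8 (const char *name)
{
  if (name == NULL)
    return 0;

  /* Each test short-circuits on the terminating NUL, so short
     strings are never read past their end.  */
  if ((name[0] != 'U' && name[0] != 'u')
      || (name[1] != 'T' && name[1] != 't')
      || (name[2] != 'F' && name[2] != 'f'))
    return 0;

  name += 3;
  if (*name == '-')
    name++;                     /* The dash is optional.  */
  return strcmp (name, "8") == 0;
}

/* Hypothetical wrapper: ask the C library for the codeset name of
   the current locale (an assumption; the patch may obtain the name
   differently) and match it against "UTF-8".  */
int
charset_is_utf8 (void)
{
  return name_is_utf8 (nl_langinfo (CODESET));
}
```

A caller would normally run setlocale (LC_ALL, "") before charset_is_utf8 so that the environment's locale, rather than the default "C" locale, is examined.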

The patch also detects whether the active charset is a superset of ASCII by
checking that btowc(c)==(wchar_t)c for 0<=c<=127.  This assumes that the
wchar_t encoding is ISO 10646, which always seems to be the case.
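The ASCII-superset test described above can be sketched as follows. This is only an illustration under the stated assumption that wchar_t values are ISO 10646 code points; the function name charset_is_ascii_superset is hypothetical and not taken from the patch.

```c
#include <wchar.h>

/* Hypothetical helper: return 1 if every byte in the range 0..127
   converts to the wide character with the same numeric value in the
   current locale, and 0 otherwise.  btowc returns WEOF for a byte
   that is not a complete character in the initial shift state; WEOF
   never equals a value in 0..127, so the comparison rejects it.  */
int
charset_is_ascii_superset (void)
{
  int c;

  for (c = 0; c < 0x80; c++)
    if (btowc (c) != (wint_t) c)
      return 0;
  return 1;
}
```

The result depends on the locale in effect when the function is called, so a program that wants the user's charset would call setlocale (LC_ALL, "") first.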

-- 
           Summary: [regex] charset-based optimizations inhibited outside
                    glibc
           Product: glibc
           Version: unspecified
            Status: NEW
          Severity: normal
          Priority: P2
         Component: regex
        AssignedTo: bonzini at gnu dot org
        ReportedBy: bonzini at gnu dot org
                CC: glibc-bugs-regex at sources dot redhat dot com,glibc-
                    bugs at sources dot redhat dot com


http://sources.redhat.com/bugzilla/show_bug.cgi?id=522

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.

