This is the mail archive of the libc-help@sourceware.org mailing list for the glibc project.



Questions on fnmatch() and case folding


Some questions have arisen during the Austin Group (the POSIX
maintainers) meetings about adding support in POSIX for
case-insensitive file name matching (see
http://austingroupbugs.net/view.php?id=1031).

It was observed that the glibc implementation of fnmatch() with the
FNM_CASEFOLD flag does NOT do case folding when given an explicit
character class. That is to say, the string "A" does not match the
pattern "[[:lower:]]" even with FNM_CASEFOLD.

I've checked the current master branch on git, and the issue (if
indeed it is an issue) is still present there.

There's also a question about range expressions such as "[Z-a]"
(assuming a POSIX locale): should this match characters such as '_'
(which in ASCII at least lies between upper case Z and lower case a),
and should case insensitivity change the answer?

My personal expectation is that "[[:lower:]]" should match an
uppercase character if case folding is occurring (which it does not in
glibc). Is this a bug?

In the POSIX locale, [:lower:] is the character set
abcdefghijklmnopqrstuvwxyz, and [:upper:] is a similar (upper case)
set. Thus we might expect
[[:upper:]-[:lower:]] to be the same as
[ABCDEFGHIJKLMNOPQRSTUVWXYZ-abcdefghijklmnopqrstuvwxyz]
... but it isn't!

The program below demonstrates...
-- 
Nick

/* FNM_CASEFOLD is a GNU extension; define _GNU_SOURCE so that
 * glibc's <fnmatch.h> declares it. */
#define _GNU_SOURCE
#include <stdio.h>
#include <fnmatch.h>

#define ARRAY_SIZE(a)   (sizeof(a)/sizeof((a)[0]))

int
main(void)
{
        /* Patterns to test: literals, character classes, an
         * equivalence class, and range expressions. */
        const char *pattern[] = {
                "aa", "AA", "[[:lower:]][[:lower:]]", "[a-z][a-z]",
                "[[=a=]][[=a=]]",
                "[[:upper:]-[:lower:]][[:upper:]-[:lower:]]", "[Z-a][Z-a]",
        };
        const char *name[] = {
                "aa", "AA", "aA", "Aa", "aB", "__",
        };
        /* Each pattern/name pair is tried with and without case folding. */
        int flags[] = { FNM_PATHNAME, FNM_CASEFOLD | FNM_PATHNAME };

        for (size_t i = 0; i < ARRAY_SIZE(pattern); i++) {
                for (size_t j = 0; j < ARRAY_SIZE(name); j++) {
                        for (size_t k = 0; k < ARRAY_SIZE(flags); k++) {
                                /* fnmatch() returns 0 on a match. */
                                int match = fnmatch(pattern[i], name[j],
                                        flags[k]);
                                printf("%s %s %s case %s\n", pattern[i],
                                        match == 0 ? "matches" : "does not match",
                                        name[j],
                                        flags[k] & FNM_CASEFOLD ?
                                        "insensitively" : "sensitively");
                        }
                }
                printf("\n");
        }
        return 0;
}

