This is the mail archive of the libc-hacker@sourceware.cygnus.com mailing list for the glibc project.
Note that libc-hacker is a closed list. You may look at the archives of this list, but subscription and posting are not open.
I just checked in a change to fnmatch which changes its behaviour with respect to ranges. Until now you could be surprised if you did something like

  rm [a-c]*

if the locale != C. In some locales this also removes the files beginning with uppercase characters.

The problem was that strcoll() was used. This seemed correct, since the standard says that the collation sequence order determines the range. But the problem is that the collation sequence order is not the collation order. I talked with the original designer of the POSIX i18n interfaces two weeks ago and he explained it. They realized at that time that the collation order is not suitable. Therefore they are using the collation sequence order. The problem is that they do not define it.

From the talks with the guy I learned that the collation sequence order is the order of the collation definitions in the source file. It's a nice way out and I have implemented this now.

One problem remains: the locale definitions must now be corrected. Currently the definitions look like this:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<a>     <A>;<NONE>;<SMALL>;IGNORE
<A>     <A>;<NONE>;<CAPITAL>;IGNORE
<-a>    <A>;<NONE>;<-a>;IGNORE
<a'>    <A>;<ACUTE>;<SMALL>;IGNORE
<A'>    <A>;<ACUTE>;<CAPITAL>;IGNORE
<a!>    <A>;<GRAVE>;<SMALL>;IGNORE
<A!>    <A>;<GRAVE>;<CAPITAL>;IGNORE
<a!!>   <A>;<DOUBLE-GRAVE>;<SMALL>;IGNORE
<A!!>   <A>;<DOUBLE-GRAVE>;<CAPITAL>;IGNORE
<a(>    <A>;<BREVE>;<SMALL>;IGNORE
<A(>    <A>;<BREVE>;<CAPITAL>;IGNORE
<a('>   <A>;<BREVE+ACUTE>;<SMALL>;IGNORE
<A('>   <A>;<BREVE+ACUTE>;<CAPITAL>;IGNORE
<a(!>   <A>;<BREVE+GRAVE>;<SMALL>;IGNORE
<A(!>   <A>;<BREVE+GRAVE>;<CAPITAL>;IGNORE
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

What has to happen is that upper- and lower-case characters must be defined in separate blocks. This has no effect on the collation order, but it ensures that [a-b] does not match A or B, since the lines with A or B in the locale source file are not between the lines with the definitions for a and b.
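To make the effect of the definition order concrete, here is a minimal sketch in Python. It is not the fnmatch code that was checked in. It only models the rule described above, that a range covers every symbol whose definition line lies between the definitions of the two endpoints. The names INTERLEAVED, SEPARATE_BLOCKS, in_range and matches are hypothetical, and the two lists are simplified stand-ins for real locale source files (no accented variants).

```python
# Hypothetical model: each list holds collating symbols in the order in
# which their definitions appear in a (much simplified) locale source file.
INTERLEAVED = ["a", "A", "b", "B", "c", "C", "d", "D"]      # current layout
SEPARATE_BLOCKS = ["a", "b", "c", "d", "A", "B", "C", "D"]  # proposed layout


def in_range(ch, lo, hi, definition_order):
    """Return True if ch is defined between lo and hi (inclusive)
    in the given definition order."""
    position = {sym: i for i, sym in enumerate(definition_order)}
    if ch not in position or lo not in position or hi not in position:
        return False
    return position[lo] <= position[ch] <= position[hi]


def matches(definition_order, lo, hi, candidates):
    """Return the candidates that the range [lo-hi] would match."""
    return [c for c in candidates if in_range(c, lo, hi, definition_order)]


if __name__ == "__main__":
    letters = "aAbBcCdD"
    # Interleaved definitions: uppercase letters fall inside [a-c].
    print(matches(INTERLEAVED, "a", "c", letters))
    # Separate blocks: only the lowercase letters are inside [a-c].
    print(matches(SEPARATE_BLOCKS, "a", "c", letters))
```

With the interleaved list, [a-c] picks up A and B because their definitions sit between those of a and c. With separate blocks it does not, which is the reason the locale definitions need to be reordered.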
I'm a bit reluctant to spend much time on the old locale descriptions. Instead, I'll check in an ISO 14651 collation description in a few moments which already pays attention to this. With the rewrite of the locale data to the new format we can also switch over to using this data.

-- 
---------------.      drepper at gnu.org  ,-.   1325 Chesapeake Terrace
Ulrich Drepper  \    ,-------------------'   \  Sunnyvale, CA 94089 USA
Red Hat          `--' drepper at redhat.com   `------------------------