This is the mail archive of the libc-locales@sourceware.org mailing list for the GNU libc locales project.



[Bug localedata/18934] New: [PATCH] Hungarian collate: fix multiple bugs and add tests


https://sourceware.org/bugzilla/show_bug.cgi?id=18934

            Bug ID: 18934
           Summary: [PATCH] Hungarian collate: fix multiple bugs and add
                    tests
           Product: glibc
           Version: 2.22
            Status: NEW
          Severity: normal
          Priority: P2
         Component: localedata
          Assignee: unassigned at sourceware dot org
          Reporter: egmont at gmail dot com
                CC: libc-locales at sourceware dot org
        Depends on: 18589
  Target Milestone: ---

Created attachment 8587
  --> https://sourceware.org/bugzilla/attachment.cgi?id=8587&action=edit
Fix

Please apply the attached patch which addresses multiple bugs in Hungarian
collation.

It also adds an extensive unittest (including all the examples from the
official rules and much more), a significantly bigger one than any other locale
has.

Note that these unittests pass with glibc-2.21 but fail with 2.22 and current
git due to bug 18589, which points to a broken change in the collate algorithm
that needs to be reverted first.
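
For readers who want to spot-check collation on an installed system, here is a
minimal sketch in Python. It is not the test harness from the attached patch;
the helper names sorted_in_locale and is_already_sorted are made up for this
illustration, and the result depends entirely on which locales and which glibc
version are installed.

```python
import functools
import locale


def sorted_in_locale(words, locale_name):
    """Sort words by the LC_COLLATE rules of locale_name.

    Returns None when the locale cannot be selected (for example because it
    is not installed). Hypothetical helper, not part of glibc or the patch.
    """
    previous = locale.setlocale(locale.LC_COLLATE)  # query current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        # locale.strcoll compares two strings using the active LC_COLLATE.
        return sorted(words, key=functools.cmp_to_key(locale.strcoll))
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)


def is_already_sorted(words, locale_name):
    """True if words are already in collation order, None if locale missing."""
    result = sorted_in_locale(words, locale_name)
    return None if result is None else result == list(words)
```

A list of words written down in the order the official rules require could
then be checked with is_already_sorted(expected_order, "hu_HU.UTF-8"),
assuming that locale is installed on the machine.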

(I know that generally one patch per issue is a cleaner approach, but this time
I apologize for an all-in-one: the patches would heavily conflict, and it would
be really cumbersome to unittest an incremental series. Instead, think about it
as TDD (test driven development): I attach a decent unittest with explanations
and pointers to the rules, and a locale definition that implements it.)

The addressed bugs are (in no particular order):

- The fix to bug 13547 was incorrect. It fixed a corner case, but I didn't
realize that it broke a more frequent one. See details over there.

- Two bugs/inconsistencies wrt. sorting upper/lowercase values, as described in
bug 18587.

- Someone enabled backwards ordering of diacritics by default (bug 17750),
breaking tons of locales including Hungarian.

- Foreign accents should be sorted after the native Hungarian ones; this wasn't
the case so far.
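
To illustrate what forward versus backward ordering of diacritics means in
general, here is a toy two-level comparison in Python. It is only a sketch:
it is not glibc's collation algorithm, it uses the customary French example
words rather than Hungarian data, and the function names split_levels and
collation_key are invented for this illustration.

```python
import unicodedata


def split_levels(word):
    """Split a word into base letters and per-letter accent marks.

    The base letters play the role of the primary level, the accent marks
    attached to each letter the role of the secondary level.
    """
    bases, accents = [], []
    for ch in unicodedata.normalize("NFD", word):
        if unicodedata.combining(ch) and bases:
            # Combining mark: attach it to the preceding base letter.
            accents[-1] += (ord(ch),)
        else:
            bases.append(ch)
            accents.append(())
    return bases, accents


def collation_key(word, backward_accents=False):
    """Toy sort key: base letters first, then accents as a tie-breaker.

    With backward_accents=True the accent level is compared starting from
    the end of the word instead of the beginning.
    """
    bases, accents = split_levels(word)
    if backward_accents:
        accents = accents[::-1]
    return (bases, accents)


words = ["côté", "coté", "côte", "cote"]
forward = sorted(words, key=collation_key)
backward = sorted(words, key=lambda w: collation_key(w, backward_accents=True))
```

In this toy model, forward ordering yields cote, coté, côte, côté, while
backward ordering yields cote, côte, coté, côté. The same kind of tie-break
reversal changes the relative order of words that differ only in accents,
which is why the default matters for a locale that expects forward ordering.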

I hope that these changes will not only fix Hungarian, but also provide a
better overall quality for all the locales and a guideline to follow for other
locale implementations, since these extensive tests probably would have helped
(and probably will help in the future) catch bugs similar to 18589 and 17750
before they get committed.


Referenced Bugs:

https://sourceware.org/bugzilla/show_bug.cgi?id=18589
[Bug 18589] sort-test.sh fails at random

