This is the mail archive of the
libc-locales@sourceware.org
mailing list for the GNU libc locales project.
[Bug localedata/13547] Different strings collate as equal in Hungarian
- From: "egmont at gmail dot com" <sourceware-bugzilla at sourceware dot org>
- To: libc-locales at sourceware dot org
- Date: Tue, 08 Sep 2015 08:35:49 +0000
- Subject: [Bug localedata/13547] Different strings collate as equal in Hungarian
- Auto-submitted: auto-generated
- References: <bug-13547-716 at http dot sourceware dot org/bugzilla/>
https://sourceware.org/bugzilla/show_bug.cgi?id=13547
--- Comment #3 from Egmont Koblinger <egmont at gmail dot com> ---
Please note that the patch applied here was incorrect. It fixed a corner case
but broke a more general one.
The patch tokenizes "ssz" as <s_or_sz><sz> rather than <sz><sz> and orders the
tokens as <s> < <s_or_sz> < <sz>. This fixes the corner case in which the only
difference between two words is "ssz" vs. "szsz".
However, the patch broke the sorting of, e.g., "kasza" <k><a><sz><a> vs.
"kassza" <k><a><s_or_sz><sz><a>. The correct ordering is "kasza" < "kassza"
(since "kassza" is actually <k><a><sz><sz><a>), but with the current solution
they are ordered backwards, because <s_or_sz> precedes <sz>.
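The regression can be reproduced with a toy model. The sketch below is not the
glibc locale definition; the weight table and the helper names
(tokenize_patched, key_patched) are made up for illustration and only mirror
the token order <s> < <s_or_sz> < <sz> described above.

```python
# Toy model of the patched tokenization; NOT the actual glibc locale rules.
# Hypothetical primary weights mirroring <s> < <s_or_sz> < <sz>.
WEIGHT = {"a": 1, "k": 2, "s": 3, "s_or_sz": 4, "sz": 5}


def tokenize_patched(word):
    """Split a word into tokens, treating "ssz" as <s_or_sz><sz>."""
    tokens = []
    i = 0
    while i < len(word):
        if word.startswith("ssz", i):
            tokens += ["s_or_sz", "sz"]
            i += 3
        elif word.startswith("sz", i):
            tokens.append("sz")
            i += 2
        else:
            tokens.append(word[i])
            i += 1
    return tokens


def key_patched(word):
    """Single-level sort key: the list of token weights."""
    return [WEIGHT[t] for t in tokenize_patched(word)]


# Corner case: "ssz" and "szsz" no longer collate as equal.
print(key_patched("kassza") != key_patched("kaszsza"))
# General case: "kassza" now sorts before "kasza", which is backwards.
print(sorted(["kasza", "kassza"], key=key_patched))
```

In this model the third token decides the comparison: "kasza" has <sz> there,
"kassza" has <s_or_sz>, and the latter has the smaller weight.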
The solution is to tokenize both "ssz" and "szsz" as <sz><sz> (as we did
before), but to apply something weaker on top of them, something along the
lines of a "fake accent" (SINGLE-OR-COMPOUND vs. COMPOUND), that can
distinguish the two later.
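One way to picture this idea is a two-level sort key. The sketch below is an
assumption-laden toy, not the fix from bug 18934: the weights, the function
name collation_key, and in particular the choice to sort SINGLE-OR-COMPOUND
before COMPOUND are all illustrative guesses.

```python
# Toy two-level key; NOT the actual fix. All names and weights are hypothetical.
WEIGHT = {"a": 1, "k": 2, "s": 3, "sz": 4}

# Secondary "fake accent" marks. Their relative order is an assumption.
PLAIN = 0               # ordinary letters carry no mark
SINGLE_OR_COMPOUND = 1  # the abbreviated first half of "ssz"
COMPOUND = 2            # a fully written "sz"


def collation_key(word):
    """Return (primary, secondary); the secondary level only breaks ties."""
    primary, secondary = [], []
    i = 0
    while i < len(word):
        if word.startswith("ssz", i):
            # Same primary tokens as "szsz": <sz><sz>.
            primary += [WEIGHT["sz"], WEIGHT["sz"]]
            secondary += [SINGLE_OR_COMPOUND, COMPOUND]
            i += 3
        elif word.startswith("sz", i):
            primary.append(WEIGHT["sz"])
            secondary.append(COMPOUND)
            i += 2
        else:
            primary.append(WEIGHT[word[i]])
            secondary.append(PLAIN)
            i += 1
    return (primary, secondary)


# General case: "kasza" sorts before "kassza" on the primary level alone.
print(sorted(["kassza", "kasza"], key=collation_key))
# Corner case: equal primary level, different secondary level.
print(collation_key("kassza")[0] == collation_key("kaszsza")[0])
print(collation_key("kassza") != collation_key("kaszsza"))
```

Because Python compares tuples element by element, the secondary list is
consulted only when the primary lists are identical, which is the "weaker"
behavior the comment asks for.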
Let's leave this bug closed. A fix is available in bug 18934.