This is the mail archive of the libc-locales@sourceware.org mailing list for the GNU libc locales project.
[Bug localedata/15537] lv_LV: invalid collation for Latvian diacritical letters
- From: "maiku.fabian at gmail dot com" <sourceware-bugzilla at sourceware dot org>
- To: libc-locales at sourceware dot org
- Date: Mon, 20 Nov 2017 08:57:22 +0000
- Subject: [Bug localedata/15537] lv_LV: invalid collation for Latvian diacritical letters
- Auto-submitted: auto-generated
- References: <bug-15537-716@http.sourceware.org/bugzilla/>
https://sourceware.org/bugzilla/show_bug.cgi?id=15537
--- Comment #3 from Mike FABIAN <maiku.fabian at gmail dot com> ---
(In reply to alexander smishlajev from comment #0)
> Besides, current version of Latvian locale contains letter R WITH CEDILLA
> (U0156, U0157), which is now sorted separately from letter R with other
> diacritical marks. This letter is not currently used for Latvian writing in
> Latvia (it was used in the first half of the 20th century, and is still used
> by some Latvian communities outside Latvia), so the sorting rules for this
> letter are not obvious. I think that it would be better to make the first
> weight for letter R WITH CEDILLA equal to R because most of current Latvian
> language users cannot say when to use R with cedilla instead of R.
My patch fixes the problems you report, *except* the problem you
report about R WITH CEDILLA.
I fixed it by throwing away all the existing rules in LC_COLLATE in the
lv_LV locale and using
copy "iso14651_t1"
instead, which pulls in the default sort order.
Then, on top of the default sort order I implemented the same
rules as in
http://unicode.org/cldr/trac/browser/trunk/common/collation/lv.xml
This collation data from CLDR treats R WITH CEDILLA as different from R
at the primary level, i.e. it continues to sort it the same way as the
current lv_LV locale in glibc does.
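
To illustrate what a primary difference means in practice, here is a small
Python sketch. It is a toy two-level collation key, not the real glibc or
CLDR tables: the weights, the make_key helper, and the sample words are all
made up for illustration. With a primary difference, every word starting
with R WITH CEDILLA sorts after every word starting with plain R. With only
a secondary difference (what comment #0 asks for), the cedilla matters only
as a tie-breaker.

```python
# Toy two-level collation key. NOT the real glibc/CLDR data; the weights
# below are invented purely to show primary vs. secondary differences.
BASE = "abcdefghijklmnopqrstuvwxyz"
R_CEDILLA = "\u0157"  # LATIN SMALL LETTER R WITH CEDILLA


def make_key(r_cedilla_is_primary):
    """Build a sort key function (hypothetical helper for this sketch)."""
    # Even primary weights for base letters leave an odd slot after each
    # letter for a tailored letter that should sort directly after it.
    primary = {ch: 2 * i for i, ch in enumerate(BASE)}
    secondary = {ch: 0 for ch in BASE}
    if r_cedilla_is_primary:
        # Own primary weight: a separate letter sorting right after r.
        primary[R_CEDILLA] = primary["r"] + 1
        secondary[R_CEDILLA] = 0
    else:
        # Same primary weight as r: differs only at the secondary level.
        primary[R_CEDILLA] = primary["r"]
        secondary[R_CEDILLA] = 1

    def key(word):
        # Compare all primary weights first, then all secondary weights.
        return (
            tuple(primary[c] for c in word),
            tuple(secondary[c] for c in word),
        )

    return key


words = ["rz", R_CEDILLA + "a", "ra", "sa"]
primary_order = sorted(words, key=make_key(True))
secondary_order = sorted(words, key=make_key(False))
print(primary_order)    # ['ra', 'rz', 'ŗa', 'sa']
print(secondary_order)  # ['ra', 'ŗa', 'rz', 'sa']
```

The first ordering corresponds to what CLDR and the current lv_LV locale
do; the second is the behaviour requested in comment #0.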
I don’t want to deviate from the CLDR collation data for no good reason,
so if this is really wrong it would be good to report a bug
against CLDR. But I guess it is correct because it cites
a Latvian dictionary as a reference.
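
For anyone who wants to check how an installed locale actually orders these
letters before filing such a report, something along these lines should
work. It is only a sketch: the sort_in_locale helper and the sample words
are my own, and it assumes the locale (e.g. lv_LV.UTF-8) has been generated
on the system. If it has not, the helper returns None instead of a list.

```python
import functools
import locale


def sort_in_locale(words, locale_name="lv_LV.UTF-8"):
    """Sort words by the LC_COLLATE rules of locale_name.

    Hypothetical helper for this sketch. Returns None when the
    requested locale is not available on this system.
    """
    # Calling setlocale() with only a category queries the current setting.
    saved = locale.setlocale(locale.LC_COLLATE)
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        # strcoll compares two strings according to the active LC_COLLATE.
        return sorted(words, key=functools.cmp_to_key(locale.strcoll))
    finally:
        # Restore the previous collation setting.
        locale.setlocale(locale.LC_COLLATE, saved)


if __name__ == "__main__":
    sample = ["rz", "\u0157a", "ra", "sa"]
    print(sort_in_locale(sample))
```

The output depends on the glibc version and the locale data installed, so
I am deliberately not quoting an expected result here.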