This is the mail archive of the libc-locales@sourceware.org mailing list for the GNU libc locales project.



[Bug localedata/21547] Tibetan script collation broken (Dzongkha and Tibetan)


https://sourceware.org/bugzilla/show_bug.cgi?id=21547

--- Comment #11 from Mike FABIAN <maiku.fabian at gmail dot com> ---
(In reply to Elie Roux from comment #9)
> I have to say I don't really understand why ICU behaves like that... I think
> we should do two things: 
> 
> - change my rule file so that it contains just one line and fix this oddity
> - report a bug on ICU (maybe it's not a bug per se, but I can't see any
> other way to solve this mystery)
> 
> I'll fix the rule file (possibly today). If you have some time do you think
> you could report the ICU bug?
> 
> Thank you!

Here are the two lines with rules:

    &ཉ<<ྋྙ<གཉ<མཉ<རྙ=ཪྙ<སྙ<བརྙ=བཪྙ<བསྙ
    &གཉ<གཉྫ

And here they are again with the code points added in [] brackets, to
make it easier to see what is going on:

    &ག[0F42]ཉ[0F49]<ག[0F42]ཉ[0F49]ྫ[0FAB]
    &ཉ[0F49]<<ྋ[0F8B]ྙ[0F99]<ག[0F42]ཉ[0F49]<མ[0F58]ཉ[0F49]<ར[0F62]ྙ[0F99]=ཪ[0F6A]ྙ[0F99]<ས[0F66]ྙ[0F99]<བ[0F56]ར[0F62]ྙ[0F99]=བ[0F56]ཪ[0F6A]ྙ[0F99]<བ[0F56]ས[0F66]ྙ[0F99]


So the first line here (the short one) orders U+0F42 U+0F49 U+0FAB after
U+0F42 U+0F49.

But then the second line reorders U+0F42 U+0F49 after U+0F8B U+0F99.

So the reference point U+0F42 U+0F49, after which the first line placed
U+0F42 U+0F49 U+0FAB, has been moved somewhere else by the second line.
Moving that reference point away does not move U+0F42 U+0F49 U+0FAB
along with it, so U+0F42 U+0F49 U+0FAB no longer stays behind
U+0F42 U+0F49.

I.e. the second line overrides the first.

If the order of the lines is reversed, it works, because the line:

    &ཉ[0F49]<<ྋ[0F8B]ྙ[0F99]<ག[0F42]ཉ[0F49]<མ[0F58]ཉ[0F49]<ར[0F62]ྙ[0F99]=ཪ[0F6A]ྙ[0F99]<ས[0F66]ྙ[0F99]<བ[0F56]ར[0F62]ྙ[0F99]=བ[0F56]ཪ[0F6A]ྙ[0F99]<བ[0F56]ས[0F66]ྙ[0F99]

then first reorders U+0F42 U+0F49 somewhere and *after* doing that,
the line:

    &ག[0F42]ཉ[0F49]<ག[0F42]ཉ[0F49]ྫ[0FAB]

inserts U+0F42 U+0F49 U+0FAB after the current position of U+0F42 U+0F49.
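
To make the order dependence concrete, here is a small toy model in
Python. It is only a sketch and not how ICU or glibc actually implement
tailoring: it treats the collation order as a flat list of strings,
ignores the difference between "<" and "<<", and truncates the long rule
after its first two items. The starting order `base` and the helper names
`apply_rule` and `tailor` are made up for illustration.

```python
def apply_rule(order, anchor, items):
    """Toy tailoring: place each item right after the previous one,
    starting after `anchor`. An item already in the list is moved."""
    result = list(order)
    for item in items:
        if item in result:
            result.remove(item)          # moving: drop the old position
        result.insert(result.index(anchor) + 1, item)
        anchor = item                    # the next item goes after this one
    return result


GA_NYA = "0F42 0F49"
GA_NYA_DZA = "0F42 0F49 0FAB"
NYA = "0F49"
X = "0F8B 0F99"

# Hypothetical starting order (assumption): the sequence U+0F42 U+0F49
# initially sorts right after U+0F42 and before U+0F49.
base = ["0F42", GA_NYA, NYA]

short_rule = (GA_NYA, [GA_NYA_DZA])   # the short line
long_rule = (NYA, [X, GA_NYA])        # the long line, truncated


def tailor(rules):
    """Apply the rules one after another to the starting order."""
    order = base
    for anchor, items in rules:
        order = apply_rule(order, anchor, items)
    return order


broken = tailor([short_rule, long_rule])
working = tailor([long_rule, short_rule])

print("short line first:", broken)
print("long line first: ", working)
```

In this model, applying the short line first leaves U+0F42 U+0F49 U+0FAB
behind at the old position of U+0F42 U+0F49, while applying it last puts
it directly after the new position.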

