This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.



[Bug localedata/12031] iconv -t ascii//translit with Greek characters


https://sourceware.org/bugzilla/show_bug.cgi?id=12031

Mike FABIAN <maiku.fabian at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |maiku.fabian at gmail dot com

--- Comment #8 from Mike FABIAN <maiku.fabian at gmail dot com> ---
(In reply to Petter Reinholdtsen from comment #5)
> (In reply to comment #4)
> > gives me "ae,?,a" but in my opinion it should give me "ae,o,a".
> [...]
> > Is this a bug?
> 
> I believe it is a bug.

It works in recent glibc (glibc-2.20-8.fc21.x86_64)
in *all* locales except C/POSIX. 

$ echo 'Æ,æ,Ø,ø,Å,å' | LANG=nb_NO.UTF-8 iconv -t ascii//TRANSLIT
AE,ae,OE,oe,A,a

$ echo 'Æ,æ,Ø,ø,Å,å' | LANG=en_US.UTF-8 iconv -t ascii//TRANSLIT
AE,ae,OE,oe,A,a

$ echo 'Æ,æ,Ø,ø,Å,å' | LANG=POSIX iconv -t ascii//TRANSLIT
iconv: illegal input sequence at position 0

It is independent of the locale because all locales (except C/POSIX)
include translit_neutral where this is defined.
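
A rough sketch for repeating the check above on another system. The
helper name try_translit is made up for illustration, and the locale
names are only examples which must be installed (see "locale -a").
Unlike the commands above, it passes -f UTF-8 explicitly: without -f,
iconv takes the input encoding from the current locale, which is plain
ASCII in C/POSIX, so the result there may differ from what is shown
above.

```shell
#!/bin/sh
# Sketch: run the same transliteration under several locales.
# try_translit is a hypothetical helper name, not part of glibc.
try_translit() {
    # $1 = locale name, $2 = UTF-8 input text
    # -f UTF-8 fixes the input encoding so that it is not derived
    # from the locale's character set.
    printf '%s\n' "$2" | LC_ALL="$1" iconv -f UTF-8 -t ASCII//TRANSLIT 2>&1
}

for loc in nb_NO.UTF-8 en_US.UTF-8 POSIX; do
    printf '%-12s ' "$loc"
    try_translit "$loc" 'Æ,æ,Ø,ø,Å,å' || echo "(iconv failed)"
done
```

Each output line shows the locale name followed by whatever iconv
printed for it, including any error message.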

> The request to change transliteration for æøå is
> http://sourceware.org/bugzilla/show_bug.cgi?id=89 .  Please explain there
> why you believe it should transliterate to ae,o,a and not ae,oe,aa.

For Scandinavian locales, transliterating 'Æ,æ,Ø,ø,Å,å' to 'Ae, ae,
Oe, oe, Aa, aa' is more appropriate. For most other locales,
transliterating å to a is probably OK.  I am a bit puzzled about Æ ->
AE; shouldn't this be transliterated to Ae, even in English locales?
(Same with Ø: transliterating to just O, or maybe Oe, in
translit_neutral for all locales which do not have special rules
seems better.)

The patch attached to

https://sourceware.org/bugzilla/show_bug.cgi?id=89#c5

fixes the transliteration for Norwegian locales (nn_NO and nb_NO).
Probably the same fix should be applied also for Swedish and Finnish
locales (and maybe Icelandic locales as well).

