This is the mail archive of the
glibc-bugs@sourceware.org
mailing list for the glibc project.
[Bug localedata/12031] iconv -t ascii//translit with Greek characters
- From: "maiku.fabian at gmail dot com" <sourceware-bugzilla at sourceware dot org>
- To: glibc-bugs at sourceware dot org
- Date: Mon, 04 May 2015 18:53:52 +0000
- Subject: [Bug localedata/12031] iconv -t ascii//translit with Greek characters
- Auto-submitted: auto-generated
- References: <bug-12031-131 at http dot sourceware dot org/bugzilla/>
https://sourceware.org/bugzilla/show_bug.cgi?id=12031
Mike FABIAN <maiku.fabian at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |maiku.fabian at gmail dot com
--- Comment #8 from Mike FABIAN <maiku.fabian at gmail dot com> ---
(In reply to Petter Reinholdtsen from comment #5)
> (In reply to comment #4)
> > gives me "ae,?,a" but in my opinion it should give me "ae,o,a".
> [...]
> > Is this a bug?
>
> I believe it is a bug.
It works in recent glibc (glibc-2.20-8.fc21.x86_64)
in *all* locales except C/POSIX.
$ echo 'Æ,æ,Ø,ø,Å,å' | LANG=nb_NO.UTF-8 iconv -t ascii//TRANSLIT
AE,ae,OE,oe,A,a
$ echo 'Æ,æ,Ø,ø,Å,å' | LANG=en_US.UTF-8 iconv -t ascii//TRANSLIT
AE,ae,OE,oe,A,a
$ echo 'Æ,æ,Ø,ø,Å,å' | LANG=POSIX iconv -t ascii//TRANSLIT
iconv: illegal input sequence at position 0
It is independent of the locale because all locales (except C/POSIX)
include translit_neutral where this is defined.
> The request to change transliteration for æøå is
> http://sourceware.org/bugzilla/show_bug.cgi?id=89 . Please explain there
> why you believe it should transliterate to ae,o,a and not ae,oe,aa.
For Scandinavian locales, transliterating 'Æ,æ,Ø,ø,Å,å' to 'Ae, ae,
Oe, oe, Aa, aa' is more appropriate. For most other locales,
transliterating å to a is probably OK. I am a bit puzzled about Æ ->
AE; shouldn't this be transliterated to Ae, even in English locales?
(The same goes for Ø: transliterating it to just O, or maybe Oe, in
translit_neutral for all locales which do not have special rules
seems better.)
The patch attached to
https://sourceware.org/bugzilla/show_bug.cgi?id=89#c5
fixes the transliteration for Norwegian locales (nn_NO and nb_NO).
Probably the same fix should be applied also for Swedish and Finnish
locales (and maybe Icelandic locales as well).
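
To make the proposed layering concrete, here is a rough Python sketch of
"neutral table plus per-locale overrides". It is only an illustration and
not how glibc implements transliteration: the function name translit_ascii
and both tables are hypothetical. The neutral mappings are copied from the
iconv output shown above, and the Norwegian overrides follow the
suggestion in this comment rather than any committed locale data.

```python
# Illustrative only: hypothetical tables, not glibc's actual locale data.

# Neutral rules, matching the iconv output observed in non-C locales.
TRANSLIT_NEUTRAL = {
    "Æ": "AE", "æ": "ae",
    "Ø": "OE", "ø": "oe",
    "Å": "A", "å": "a",
}

# Assumed Scandinavian overrides, as suggested in the comment above.
NORWEGIAN_RULES = {"Æ": "Ae", "Ø": "Oe", "Å": "Aa", "å": "aa"}
TRANSLIT_OVERRIDES = {
    "nb_NO": NORWEGIAN_RULES,
    "nn_NO": NORWEGIAN_RULES,
}


def translit_ascii(text, locale_name):
    """Transliterate text to ASCII; locale rules win over neutral rules."""
    base = locale_name.split(".")[0]  # drop a codeset suffix such as ".UTF-8"
    if base in ("C", "POSIX"):
        # C/POSIX does not include the neutral table, so nothing is mapped.
        table = {}
    else:
        table = {**TRANSLIT_NEUTRAL, **TRANSLIT_OVERRIDES.get(base, {})}

    out = []
    for pos, ch in enumerate(text):
        if ord(ch) < 128:
            out.append(ch)          # already ASCII
        elif ch in table:
            out.append(table[ch])   # transliterated replacement
        else:
            # pos is a character index here, not a byte offset as in iconv.
            raise ValueError(f"illegal input sequence at position {pos}")
    return "".join(out)


if __name__ == "__main__":
    for name in ("nb_NO.UTF-8", "en_US.UTF-8"):
        print(name, translit_ascii("Æ,æ,Ø,ø,Å,å", name))
```

With these assumed tables, a locale without its own rules falls back to the
neutral mappings, while nb_NO and nn_NO pick up the Ae/Oe/Aa forms.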