This is the mail archive of the
libc-locales@sourceware.org
mailing list for the GNU libc locales project.
[Bug localedata/2253] unicode combining accents can't be iconv-ed to latin//translit (and others)
- From: "maiku.fabian at gmail dot com" <sourceware-bugzilla at sourceware dot org>
- To: libc-locales at sourceware dot org
- Date: Tue, 05 May 2015 09:40:46 +0000
- Subject: [Bug localedata/2253] unicode combining accents can't be iconv-ed to latin//translit (and others)
- Auto-submitted: auto-generated
- References: <bug-2253-716 at http dot sourceware dot org/bugzilla/>
https://sourceware.org/bugzilla/show_bug.cgi?id=2253
--- Comment #7 from Mike FABIAN <maiku.fabian at gmail dot com> ---
(In reply to Samuel Thibault from comment #6)
> Err, but here e+combineacute *is* representable in latin1, it's eacute. So
> transliteration should not discard the accent.
Yes, maybe.
But is this doable with the glibc transliteration system?
All the glibc/localedata/locales/translit_* files transliterate only
a single character to another character or to a list of characters.
No rule ever starts with a character sequence, so I guess this is not supported.
As Jungshik Shin suggests in comment#1, iconv could
normalize the input to NFC before attempting a transliteration.
It should certainly not do so without transliteration, as Rich Felker
writes in comment#3, but *if* transliteration is used, normalizing to NFC
and then doing the transliteration might be a reasonable approach.
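
To illustrate the normalize-then-convert idea, here is a rough sketch in
Python rather than in glibc's iconv. The function name to_latin1_via_nfc is
made up for this example, and errors="replace" is only a stand-in for
whatever fallback a real transliteration step would apply; none of this
reflects how glibc actually behaves.

```python
import unicodedata


def to_latin1_via_nfc(text: str) -> bytes:
    """Compose combining sequences first, then convert to Latin-1.

    Hypothetical helper: errors="replace" substitutes "?" for anything
    still unrepresentable, standing in for a transliteration fallback.
    """
    composed = unicodedata.normalize("NFC", text)
    return composed.encode("latin-1", errors="replace")


# "e" followed by U+0301 COMBINING ACUTE ACCENT
decomposed = "e\u0301"

# Without normalization the combining accent has no Latin-1 equivalent
# and is lost to the fallback.
print(decomposed.encode("latin-1", errors="replace"))

# With NFC the pair becomes U+00E9, which Latin-1 can represent directly.
print(to_latin1_via_nfc(decomposed))
```

In this sketch the accent survives only because the composition happens
before the conversion, which is the ordering suggested in comment#1.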