This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



[PATCH][BZ 15593] Add transliteration data for "LATIN SMALL LETTER O WITH STROKE" (ø)


Hi

When transliterating text that contains the Danish letter 'ø', iconv
produces a '?' rather than a proper transliteration:

$ echo test ø test | iconv -f utf8 -t ascii//translit
test ? test

This patch adds transliteration data for the small and capital
versions of this letter (LATIN {SMALL,CAPITAL} LETTER O WITH STROKE).
After applying the patch and rebuilding the locale, proper
transliteration is achieved:

$ echo test ø test | iconv -f utf8 -t ascii//translit
test oe test

Tested on Arch Linux and CentOS 6.

The choice of the phonetic transliteration 'oe' is based primarily on my
experience as a native of Denmark (it's how my name is spelled on plane
tickets and in my passport...) and on comments from keld@keldix.com (see
the bugzilla comment thread). These sources also seem to agree:

Wikipedia: https://en.wikipedia.org/wiki/Ø

The Nordic FAQ:
http://www.faqs.org/faqs/nordic-faq/part1_INTRODUCTION/section-7.html

The Danish encyclopedia:
http://www.denstoredanske.dk/Samfund,_jura_og_politik/Sprog/Ortografi/Ø_ø

-Toke



diff --git a/localedata/locales/translit_combining b/localedata/locales/translit_combining
index 44c62f9..97527e6 100644
--- a/localedata/locales/translit_combining
+++ b/localedata/locales/translit_combining
@@ -298,6 +298,8 @@ translit_start
 <U00D5> <U004F>
 % LATIN CAPITAL LETTER O WITH DIAERESIS
 <U00D6> <U004F>
+% LATIN CAPITAL LETTER O WITH STROKE -> "OE"
+<U00D8> "<U004F><U0338>";"<U004F><U0045>"
 % LATIN CAPITAL LETTER U WITH GRAVE
 <U00D9> <U0055>
 % LATIN CAPITAL LETTER U WITH ACUTE
@@ -350,6 +352,8 @@ translit_start
 <U00F5> <U006F>
 % LATIN SMALL LETTER O WITH DIAERESIS
 <U00F6> <U006F>
+% LATIN SMALL LETTER O WITH STROKE -> "oe"
+<U00F8> "<U006F><U0338>";"<U006F><U0065>"
 % LATIN SMALL LETTER U WITH GRAVE
 <U00F9> <U0075>
 % LATIN SMALL LETTER U WITH ACUTE
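
For readers unfamiliar with the translit syntax: each added rule lists two
alternatives separated by ';', and the first one that the target charset
can represent is used. The sketch below is only a rough Python model of
that fallback order, not glibc code; TRANSLIT_RULES, can_encode and
transliterate are hypothetical names made up for illustration.

```python
# Rough model of the fallback order in the two rules added by the patch.
# This is NOT how glibc implements it; all names here are hypothetical.

# Each key maps to its alternatives in the order given in the locale file:
# first the base letter plus U+0338 (combining overlay), then "OE"/"oe".
TRANSLIT_RULES = {
    "\u00d8": ["O\u0338", "OE"],  # LATIN CAPITAL LETTER O WITH STROKE
    "\u00f8": ["o\u0338", "oe"],  # LATIN SMALL LETTER O WITH STROKE
}


def can_encode(text, charset):
    """Return True if every character of text exists in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def transliterate(text, charset="ascii"):
    """Replace unencodable characters with the first encodable alternative."""
    out = []
    for ch in text:
        if can_encode(ch, charset):
            out.append(ch)
            continue
        for candidate in TRANSLIT_RULES.get(ch, []):
            if can_encode(candidate, charset):
                out.append(candidate)
                break
        else:
            # No rule, or no alternative fits the charset.
            out.append("?")
    return "".join(out)


print(transliterate("test \u00f8 test"))  # prints "test oe test"
```

With an ASCII target the first alternative is skipped because U+0338 is
not representable, so the second alternative ("oe" or "OE") is chosen.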


