This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.


[Bug localedata/2872] Transliteration Cyrillic -> ASCII fails


https://sourceware.org/bugzilla/show_bug.cgi?id=2872

--- Comment #11 from Egor Kobylkin <ekobylkin at paypal dot com> ---
I have read the documents linked in Marko Myllynen's comment 8.
My understanding so far is that, apart from any code changes that may be
required (which are not yet clear to me), a translation table is needed for
the transliteration.

Based on
the locale(5) man page http://man7.org/linux/man-pages/man5/locale.5.html
the official Russian GOST 7.79-2000 transliteration table
http://transliteration.ru/gost-7-79-2000/
and the Unicode data file http://www.unicode.org/Public/UNIDATA/UnicodeData.txt
I have created a single-character transliteration table in the form of the
following list:
% CYRILLIC CAPITAL LETTER IO
<U0401> <U0059>
% CYRILLIC CAPITAL LETTER A
<U0410> <U0041>
% CYRILLIC CAPITAL LETTER BE
<U0411> <U0042>
% CYRILLIC CAPITAL LETTER VE
<U0412> <U0056>
etc.
In each entry, the first Unicode value is the Cyrillic letter and the second is
the corresponding ASCII character.
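
To illustrate the entry format, here is a minimal Python sketch that parses
entries of this form and applies them to a string. It only mirrors the mapping
itself and does not reflect how glibc processes locale files. The names
SAMPLE_TABLE, parse_table, format_entry and transliterate are hypothetical, and
the sample holds only the four entries quoted above, not the attached file.

```python
import re
import unicodedata

# Only the four entries quoted in this comment; the attached file is not reproduced.
SAMPLE_TABLE = """\
% CYRILLIC CAPITAL LETTER IO
<U0401> <U0059>
% CYRILLIC CAPITAL LETTER A
<U0410> <U0041>
% CYRILLIC CAPITAL LETTER BE
<U0411> <U0042>
% CYRILLIC CAPITAL LETTER VE
<U0412> <U0056>
"""

# One entry per line: source code point, whitespace, target code point.
ENTRY = re.compile(r"^<U([0-9A-Fa-f]{4,8})>\s+<U([0-9A-Fa-f]{4,8})>$")


def parse_table(text):
    """Return {source_char: target_char} for every '<Uxxxx> <Uyyyy>' line."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and '%' comment lines
        match = ENTRY.match(line)
        if match is None:
            raise ValueError(f"unrecognized entry: {line!r}")
        src, dst = (chr(int(code, 16)) for code in match.groups())
        mapping[src] = dst
    return mapping


def format_entry(src, dst):
    """Render one mapping in the same two-line form as the list above."""
    return f"% {unicodedata.name(src)}\n<U{ord(src):04X}> <U{ord(dst):04X}>"


def transliterate(text, mapping):
    """Replace mapped characters; leave everything else unchanged."""
    return "".join(mapping.get(ch, ch) for ch in text)


if __name__ == "__main__":
    table = parse_table(SAMPLE_TABLE)
    print(transliterate("\u0410\u0411\u0412", table))  # Cyrillic A, BE, VE
    print(format_entry("\u0410", "A"))
```

Characters without an entry pass through unchanged here, which makes gaps in
the table easy to spot when checking it against sample text.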

The file is attached as translit_cyrillic.
I wonder whether it is already useful enough to be included in the Latin-based
locale files via the "include" keyword.

Please let me know what you think. Specifically, my understanding is that this
is the list that Ulrich Drepper was requesting.

I would be grateful if somebody familiar with the logic behind the
transliteration file structure could outline the missing parts, in case the
above is not sufficient to bootstrap the Cyrillic-to-ASCII transliteration.

