This is the mail archive of the glibc-bugs@sourceware.org mailing list for the glibc project.



[Bug localedata/14094] Update locale data to Unicode 6.3


https://sourceware.org/bugzilla/show_bug.cgi?id=14094

--- Comment #9 from Pravin S <pravin.d.s at gmail dot com> ---
(In reply to Rich Felker from comment #1)
> One of the major "local hacks" can be fixed, fixing many other problems at
> the same time, by switching to using the Unicode "Alphabetic" property (from
> DerivedCoreProperties.txt) instead of just categories L* for class alpha.
> Right now there are many languages whose letters are considered
> non-alphabetic by glibc because they're in category Mn or Mc or even Cf.
> There are "local hacks" to fix this for maybe one or two languages, but
> using the right Unicode property would fix it for all languages.

I was almost done with this, but while updating I found that about 248
characters in the ALPHA group of the present i18n CTYPE were added after
gen-unicode-ctype.c processing (Unicode 5.1:
https://github.com/pravins/glibc-i18n/blob/master/unicode5-1/Report ). I am
facing the same issue while upgrading to Unicode 6.3 (246 characters:
https://github.com/pravins/glibc-i18n/blob/master/Report ).

While reading http://www.unicode.org/reports/tr44/#Property_List_Table I found
this note:

"Implementations should simply use the derived properties, and should not try
to rederive them from lists of simple properties and collections of rules,
because of the chances for error and divergence when doing so."  

I agree with Rich. We should take whatever is available from
DerivedCoreProperties.txt rather than rederiving it from raw UnicodeData.txt.
I am writing a script to process the groups from DerivedCoreProperties.txt.
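
A minimal sketch of what such processing could look like, assuming the usual
DerivedCoreProperties.txt line layout ("XXXX..YYYY ; Property # comment"). The
function names and the SAMPLE lines are hypothetical and only illustrate the
format; this is not the script referred to above.

```python
import re

# Matches data lines such as:
#   0041..005A    ; Alphabetic
#   00AA          ; Alphabetic
# (the trailing "# ..." comment is stripped before matching)
_LINE = re.compile(r"^([0-9A-Fa-f]+)(?:\.\.([0-9A-Fa-f]+))?\s*;\s*(\w+)")


def parse_derived_property(lines, prop="Alphabetic"):
    """Return sorted (first, last) code point ranges that carry `prop`."""
    ranges = []
    for line in lines:
        # Drop the comment part and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        m = _LINE.match(line)
        if m is None or m.group(3) != prop:
            continue
        first = int(m.group(1), 16)
        # A single code point has no "..YYYY" part.
        last = int(m.group(2), 16) if m.group(2) else first
        ranges.append((first, last))
    return sorted(ranges)


def is_in_ranges(cp, ranges):
    """True if code point `cp` falls inside any of the ranges."""
    return any(first <= cp <= last for first, last in ranges)


# Illustrative lines in the file's format (assumption: not copied verbatim).
SAMPLE = """\
# Derived Property: Alphabetic
0041..005A    ; Alphabetic # L&  [26] LATIN CAPITAL LETTER A..LATIN CAPITAL LETTER Z
00AA          ; Alphabetic # Lo       FEMININE ORDINAL INDICATOR
093E..0940    ; Alphabetic # Mc   [3] DEVANAGARI VOWEL SIGN AA..DEVANAGARI VOWEL SIGN II
0061..007A    ; Lowercase # L&  [26] LATIN SMALL LETTER A..LATIN SMALL LETTER Z
""".splitlines()

alpha = parse_derived_property(SAMPLE, "Alphabetic")
```

With the real file, the lines could be read with open() and passed in the same
way. The Mc range in the sample shows the point from comment #1: such
characters carry the Alphabetic property even though they are outside the L*
categories.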


