This is the mail archive of the cygwin mailing list for the Cygwin project.
Re: The C locale
On Sep 24 16:03, IWAMURO Motonori wrote:
> 2009/9/22 Andy Koppe <andy.koppe@gmail.com>:
> > Let's use the Windows "ANSI" codepage as the character set for the C
> > locale, for both the conversion functions and filenames. This means
> > CP1252 on Western systems, CP1251 on Cyrillic ones, CP932 on Japanese
> > ones, and so on.
>
> I oppose the approach (the ANSI codepage is used at C locale) because
> CP932 (the codepage for Japanese) is hostile to the UNIX-like tools.
>
> The reason is that the CP932 format contains a lot of meta characters
> as follows.
>
> single character of CP932:
> /[\x00-\x7F\xA0-\xDF]|[\x81-\x9F\xE0-\xFC][\x40-\x7E\x80-\xFC]/

I don't understand.  Are you saying that a single character in CP932
consists of 12 bytes?  As far as I can see, CP932 is S-JIS, which
is just a simple double-byte character set.  What am I missing?

> This has a ruined influence to the tools that don't see locale.

Can you please try to explain the problem in a bit more detail for
those of us not fluent in East Asian languages?  What do you
mean by "hostile" and "ruined influence"?

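
For readers following the byte ranges in the quoted pattern, here is a minimal
sketch of one way to read it: as an alternation, where a CP932 character is
either one byte, or a lead byte followed by a trail byte. The sketch assumes
Python's built-in "cp932" codec agrees with the Windows codepage for the
character used; the helper names split_chars and ascii_trail_bytes are
hypothetical and exist only for this illustration.

```python
import re

# The quoted pattern as a bytes regex: a single byte, OR a lead byte
# followed by a trail byte (so at most two bytes per character).
CP932_CHAR = re.compile(
    rb"[\x00-\x7F\xA0-\xDF]|[\x81-\x9F\xE0-\xFC][\x40-\x7E\x80-\xFC]"
)


def split_chars(data: bytes) -> list[bytes]:
    """Split CP932-encoded bytes into per-character byte strings."""
    return CP932_CHAR.findall(data)


def ascii_trail_bytes(data: bytes) -> list[bytes]:
    """Return trail bytes of double-byte characters that lie in the ASCII range."""
    return [ch[1:] for ch in split_chars(data) if len(ch) == 2 and ch[1] < 0x80]


# U+8868 is a common kanji; in CP932 its trail byte is 0x5C, the backslash.
encoded = "a\u8868b".encode("cp932")

print(split_chars(encoded))        # three characters: 'a', the kanji, 'b'
print(ascii_trail_bytes(encoded))  # the backslash hiding inside the kanji

# A tool that ignores the locale and splits on backslash cuts the kanji in half.
print(encoded.split(b"\\"))
```

The trail-byte range \x40-\x7E overlaps ASCII, including characters such as
backslash, "|", "[" and "{", so a byte-oriented tool can mistake half of a
double-byte character for a metacharacter.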
Thanks,
Corinna
--
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat
--
Problem reports: http://cygwin.com/problems.html
FAQ: http://cygwin.com/faq/
Documentation: http://cygwin.com/docs.html
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple