This is the mail archive of the cygwin-developers mailing list for the Cygwin project.


Re: "C" UTF-8 trouble


According to Andy Koppe on 10/6/2009 12:46 PM:
>> Corinna Vinschen wrote:
>>> Is it really necessary to change Cygwin for this?  I'm wondering if we
>>> should put this into /etc/profile and /etc/csh.login instead.
> 
> And also bash.bashrc (for non-login shells), zprofile and zshrc,
> whatever ksh uses, ... . Yet programs that are invoked without going
> through a shell still wouldn't get the setting. Same for users who've
> modified those files.

For the problematic apps, are they checking just the environment
variables, or are they using setlocale(,NULL) and/or setlocale(,"") to
determine the current/default settings?  Anyone using _just_ the
environment variables is doomed to failure.  POSIX states:

"If the LANG environment variable is not set or is set to the empty
string, the implementation-defined default locale shall be used."

My preference would be that if the environment variables were not set when
cygwin1.dll started, then setlocale(,NULL) returns "C.UTF-8" rather than
"C".  If an application calls setlocale(,"C") and then queries the locale,
it must get "C" back (required by POSIX), so we can't change that to
"C.UTF-8"; but "C" would still imply the UTF-8 charset.  To get a unibyte
charset, applications would have to ask for it via setlocale(,"C.ASCII") or
some such.

Meanwhile, it would make sense to update the base-files package to
explicitly do:

: ${LANG:=C.UTF-8}
export LANG

(in bash syntax; adjust as needed for tcsh) in all of the skeleton files
(such as the sample .bashrc).  That way, the environment variable will be
unchanged if the user set it through the control panel, set to the default
if it was previously unset, and explicitly set for all child processes
that look at just the environment instead of properly calling setlocale().
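
To make the three cases concrete, here is a rough sketch of what a child
process would see.  The helper name lang_after_default is made up for this
illustration and is not part of base-files; the explicit export is my
assumption about what a skeleton file would need, since an assignment made
by := is only inherited by children if the variable is exported.

```shell
# Hypothetical helper: print the LANG value a child process would see
# after the defaulting line runs from a given starting state.
# Usage: lang_after_default unset
#        lang_after_default set VALUE
lang_after_default() (
    # The function body is a subshell, so the caller's LANG is untouched.
    unset LANG
    if [ "$1" = set ]; then
        LANG=$2
    fi
    : "${LANG:=C.UTF-8}"   # assign only if LANG is unset or empty
    export LANG            # assumption: export so children inherit it
    sh -c 'printf "%s\n" "$LANG"'
)

lang_after_default unset            # LANG was not set at all
lang_after_default set ""           # LANG was set to the empty string
lang_after_default set en_US.UTF-8  # LANG was already chosen by the user
```

The first two calls print C.UTF-8 and the third prints en_US.UTF-8, which
matches the POSIX wording that treats an unset LANG and an empty LANG the
same way.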

> Instead of hardcoding "C-UTF8", how about reading the initial LANG
> setting from a config file, e.g. /etc/defaults/locale?

That would also be a cool idea.
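
As a rough sketch only: assuming such a file held plain, unquoted lines of
the form LANG=value (the file format and the helper name
default_lang_from_file are my assumptions, nothing that exists today), a
skeleton file could pick up the default like this and still fall back to
C.UTF-8 when the file is missing.

```shell
# Hypothetical helper: print the LANG value from a defaults file holding
# unquoted lines of the form LANG=value; print C.UTF-8 if none is found.
default_lang_from_file() {
    value=
    if [ -r "$1" ]; then
        # Use the last LANG= line; comments and other settings are skipped.
        value=$(sed -n 's/^LANG=//p' "$1" | tail -n 1)
    fi
    printf '%s\n' "${value:-C.UTF-8}"
}

# Apply the file's default only when the user has not already set LANG.
: "${LANG:=$(default_lang_from_file /etc/defaults/locale)}"
export LANG
```

A user who already has LANG set never hits the file at all, so this keeps
the same "don't override the user" behavior as the hardcoded version.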

-- 
Don't work too hard, make some time for fun as well!

Eric Blake             ebb9@byu.net

