This is the mail archive of the newlib@sourceware.org mailing list for the newlib project.


Re: [PATCH] Read locale settings from environment

From: Jeff Johnston <jjohnstn at redhat dot com>
To: newlib at sourceware dot org
Date: Wed, 25 Feb 2009 14:06:28 -0500
Subject: Re: [PATCH] Read locale settings from environment
References: <20090213180652.GB24274@calimero.vinschen.de> <499DD5D0.8050008@redhat.com> <20090220101406.GA24834@calimero.vinschen.de> <20090220120904.GB24834@calimero.vinschen.de> <20090221103411.GO24834@calimero.vinschen.de>

Corinna Vinschen wrote:

On Feb 20 13:09, Corinna Vinschen wrote:

Ok, here's my new setlocale implementation.  It fixes the following
problems:
[...]
- Per POSIX allow the required "POSIX" locale.  Map it to the "C" locale
  as on Linux.
- If locale is "", honor the environment in the order required by POSIX
  for all supported categories.
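The lookup order mentioned in the last item can be sketched roughly as follows. This is only an illustration, not the code from the patch: the helper name locale_from_env is made up here, and the "C" fallback is an assumption, since POSIX leaves the default implementation-defined when none of the variables is set. POSIX gives LC_ALL the highest priority, then the category's own variable, then LANG, and treats an empty value like an unset one.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not newlib's actual code): return the locale name
   that setlocale(category, "") would pick for one category.
   category_var is the category's environment variable, e.g. "LC_CTYPE". */
const char *
locale_from_env (const char *category_var)
{
  const char *names[3];
  const char *value;
  int i;

  assert (category_var != NULL);

  /* POSIX precedence: LC_ALL, then LC_<category>, then LANG. */
  names[0] = "LC_ALL";
  names[1] = category_var;
  names[2] = "LANG";

  for (i = 0; i < 3; i++)
    {
      value = getenv (names[i]);
      /* An empty value counts as "not set". */
      if (value != NULL && value[0] != '\0')
        return value;
    }

  /* Nothing usable in the environment.  The default is
     implementation-defined; "C" is assumed for this sketch. */
  return "C";
}
```

With LC_ALL unset, LC_CTYPE=en_US.UTF-8 and LANG=de_DE, locale_from_env ("LC_CTYPE") would return "en_US.UTF-8", while locale_from_env ("LC_MESSAGES") would fall through to "de_DE".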


Apart from that, would it be ok to change setlocale() and subsequent
functions using __lc_ctype (e.g. mbtowc_r, wctomb_r, iswXXX) so that all
POSIX-compliant LC_XXX environment variable settings are accepted?  The
currently accepted locales

C[-codeset]

are non-POSIX. The POSIX variant is

[language[_territory][.codeset][@modifier]]

Of course we should keep recognizing the C[-codeset] form for backward
compatibility, but I think we should not restrict ourselves to it.
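To make the two name formats concrete, here is a small sketch of pulling the codeset out of either form. It is an illustration under assumptions, not the parser from the patch: the function name locale_codeset and its return convention are invented for this example, and no validation of the language or territory parts is attempted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the codeset part of a locale name into out.
   Accepts the POSIX form  language[_territory][.codeset][@modifier]
   and the legacy form     C-codeset.
   Returns 1 if a codeset was found and fits, 0 otherwise (out is ""). */
int
locale_codeset (const char *locale, char *out, size_t outsize)
{
  const char *start, *at;
  size_t len;

  assert (locale != NULL && out != NULL && outsize > 0);
  out[0] = '\0';

  at = strchr (locale, '@');          /* start of the modifier, if any */
  if (locale[0] == 'C' && locale[1] == '-')
    start = locale + 2;               /* legacy C-codeset */
  else
    {
      start = strchr (locale, '.');
      /* No '.', or a '.' that belongs to the modifier: no codeset. */
      if (start == NULL || (at != NULL && start > at))
        return 0;
      start++;                        /* skip the '.' */
    }

  /* The codeset runs up to the modifier or the end of the string. */
  len = at != NULL ? (size_t) (at - start) : strlen (start);
  if (len == 0 || len >= outsize)
    return 0;

  memcpy (out, start, len);
  out[len] = '\0';
  return 1;
}
```

Both "en_US.UTF-8" and the legacy "C-UTF-8" yield "UTF-8" here, while a bare "de" or "C" reports that no codeset was given, leaving the caller to pick a default.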

Actually all the related functions only rely on the charset part of the
setting, not the actual language.  So, what we could do is to split away
the charset part along the lines of what is already done in the
LC_MESSAGES part of the code and only check for that in the subsequent
functions.  Instead of checking against __lc_ctype these functions could
check for, say, __lc_charset.  The LC_CTYPE setting could then reflect
the real setting of the environment.  For instance:

LC_ALL=POSIX

  ==>  __lc_ctype == C
       __lc_charset = ISO-8859  (!)

LC_ALL=en_US.UTF-8

  ==>  __lc_ctype == en_US.UTF-8
       __lc_charset = UTF-8

LC_ALL=ja_JP.EUCJP

  ==>  __lc_ctype == ja_JP.EUCJP
       __lc_charset = EUCJP

LC_ALL=de

  ==>  __lc_ctype == de
       __lc_charset = ISO-8859  (!)

LC_ALL=fr_FR.ISO-8859-15

  ==>  __lc_ctype == fr_FR.ISO-8859-15
       __lc_charset = ISO-8859  (!)

Actually the __lc_charset could be a single character like I for ISO, U
for UTF, E for EUCJP, etc, to simplify the checks in mbtowc_r and the
others.
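The mapping in the examples above could be sketched along these lines. This is a rough illustration under assumptions rather than the proposed implementation: struct lc_settings, derive_lc_settings and the buffer size are invented names and values, the two fields merely stand in for __lc_ctype and __lc_charset, and the charset is kept as a string instead of a single character. The ISO-8859 default and the folding of ISO-8859-15 into ISO-8859 simply mirror the rows marked (!).

```c
#include <assert.h>
#include <string.h>

#define LC_NAME_MAX 32   /* arbitrary size chosen for this sketch */

struct lc_settings
{
  char ctype[LC_NAME_MAX];    /* stands in for __lc_ctype */
  char charset[LC_NAME_MAX];  /* stands in for the proposed __lc_charset */
};

/* Hypothetical helper: fill in both values from a locale name.
   Returns 0 on success, -1 if the name does not fit the buffers. */
int
derive_lc_settings (const char *locale, struct lc_settings *out)
{
  const char *dot, *at;
  size_t len;

  assert (locale != NULL && out != NULL);

  /* "POSIX" is mapped to the "C" locale. */
  if (strcmp (locale, "POSIX") == 0)
    locale = "C";
  if (strlen (locale) >= LC_NAME_MAX)
    return -1;
  strcpy (out->ctype, locale);

  /* Default when no codeset is given (the POSIX and "de" rows). */
  strcpy (out->charset, "ISO-8859");

  at = strchr (locale, '@');
  dot = strchr (locale, '.');
  if (dot == NULL || (at != NULL && dot > at))
    return 0;
  dot++;
  len = at != NULL ? (size_t) (at - dot) : strlen (dot);
  if (len == 0)
    return 0;

  /* Every ISO-8859-N part folds into the one ISO-8859 family. */
  if (strncmp (dot, "ISO-8859", 8) == 0)
    return 0;

  /* len is smaller than LC_NAME_MAX because the whole name is. */
  memcpy (out->charset, dot, len);
  out->charset[len] = '\0';
  return 0;
}
```

Functions such as mbtowc_r would then only need to compare the charset field, for instance with strcmp (s.charset, "UTF-8") == 0, regardless of which language and territory the ctype field names.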

What do you say?

Ok. I think the charset should be the full name rather than a single character.

-- Jeff J.

Corinna

