This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Re: Should glibc provide a builtin C.UTF-8 locale?

From: Rich Felker <dalias at libc dot org>
To: Mike FABIAN <mfabian at redhat dot com>
Cc: keld at keldix dot com, Carlos O'Donell <carlos at redhat dot com>, GNU C Library <libc-alpha at sourceware dot org>, Pravin Satpute <psatpute at redhat dot com>, Jens Petersen <petersen at redhat dot com>
Date: Thu, 29 Oct 2015 14:35:44 -0400
Subject: Re: Should glibc provide a builtin C.UTF-8 locale?
Authentication-results: sourceware.org; auth=none
References: <54DB8243 dot 3050903 at redhat dot com> <20151021174936 dot GA26317 at vapier dot lan> <5627DAAE dot 8060703 at redhat dot com> <20151021205540 dot GA30739 at www5 dot open-std dot org> <s9dr3kgfqlx dot fsf at ari dot site> <20151027223455 dot GB8645 at brightrain dot aerifal dot cx> <s9dmvv1tu2n dot fsf at ari dot site>

On Thu, Oct 29, 2015 at 07:20:48PM +0100, Mike FABIAN wrote:
> Rich Felker <dalias@libc.org> wrote:
> 
> >> LC_CTYPE
> >>    almost the same
> >>    - C.UTF-8 just copies the LC_CTYPE from "i18n" (Which is kept
> >>      in sync with the latest Unicode release using some scripts) and
> >>      adds "translit_combining".
> >
> > So C.UTF-8 will have the full character-class data? I'm in favor of
> > that but just want to clarify, since omitting it would also be
> > possible.
> 
> Yes, with the patch I made, it has the full character-class data,
> i.e. exactly the same as in the i18n file.

Sounds good.

> >> LC_MONETARY
> >>    - C.UTF-8 tries to agree with C/POSIX as much as possible
> >>      and thus uses "USD" for int_curr_symbol, "$" for currency_symbol,
> >>      and "." for mon_decimal_point.
> >
> > This is incorrect, at least based on the spec. C requires the values
> > for int_curr_symbol and currency_symbol to be "" in the C locale (7.11
> > Localization <locale.h>, paragraph 2). I think the values you cited
> > are from en_US.
> 
> I wanted to fill in something for int_curr_symbol and currency_symbol
> mainly because "localedef" complains when these fields are empty
> and refuses to generate the binary locale unless one uses the force
> option:
> 
>      -c, --force
>           Write the output files  even if warnings were generated
>           about the input file.
> 
> and this might make one miss real errors.
> 
> Maybe "localedef" should be adapted to allow empty values
> for these two fields if the locale to be generated is C.UTF-8?

Yes, I think so. Putting en_US values in there is inappropriate; it
makes this less a C.UTF-8 locale than a slightly different variant
of en_US.

> >> LC_MESSAGES
> >>    - C.UTF-8 uses the same as C/POSIX
> >>      (for example yesexpr "^[yY]" and noexpr "^[nN]")
> >>    - i18n.UTF-8 apparently tries to avoid English
> >>      (for example yesexpr  "^[+1]" and noexpr "^[-0]")
> >
> > What about error messages? This is probably off-topic, but it might be
> > nice if i18n used the actual errno macro names as strings ("ENOENT",
> > etc.) if it doesn't already.
> 
> There was nothing for error messages in the i18n file. Neither
> in C/POSIX.

OK. The reason I raise this is that I actually got several user
requests for musl to use the raw E* macro names rather than
descriptive English strings in the C locale. I don't think glibc would
want to make such a change in the C locale (and we probably wouldn't
in musl either), but the i18n locale might be a nice place to
experiment with it.

Rich
