This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH] en_CA, es_AR, es_ES: Define yesstr and nostr.


  Hi!

On Mon, Apr 08, 2013 at 01:14:51AM +0200, Keld Simonsen wrote:
> On Sun, Apr 07, 2013 at 11:02:06PM +0200, Petr Baudis wrote:
> >   (Though I'm not particularly fond of having the ASCII contents of the
> > datapoint sequence repeated in the comment, as all data duplication adds
> > a potential for inconsistencies. Ideally, we would just actually write
> > the characters right in the values instead of the codepoints; I didn't
> > find any technical reason why to insist on the <U...> syntax for all
> > characters. But then again, I'm personally unlikely to gather the
> > momentum to do such a change, mainly to verify that it really is 100%
> > safe.)
> 
> The locales are character set independent, so they will run with utf-8, iso-8859-1, iso-8859-15
> and even EBCDIC. They are written in ASCII only, to better the portability between systems with
> different character sets.

  But it's 2013. I claim that portability of locale source files to
EBCDIC is totally irrelevant in glibc and whoever cares should bear the
burden of writing the conversion tools.

  I don't think it would be a big fuss if we just UTF-8-encoded the
locale files, but even if we only embrace ASCII (!) and replace the
<Uxxxx> markup for 7-bit codepoints with the actual ASCII characters,
that would already be a huge practical step forward.
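
  To make the ASCII-only step concrete, here is a minimal sketch in
Python of what such a conversion could look like. It is not an existing
glibc tool; the function name and the set of characters left encoded are
my assumptions. The sketch keeps characters such as the double quote,
angle brackets, and the usual escape and comment characters in <Uxxxx>
form, on the assumption that writing them literally could change how the
file is parsed.

```python
import re

# Assumption: these characters may carry syntactic meaning in a locale
# source file (string quote, symbol brackets, common escape and comment
# characters, separator), so they stay in <Uxxxx> form.
KEEP_ENCODED = set('"<>/\\%#;')

# Matches only the four-hex-digit form, e.g. <U0041>.
_UCS_SYMBOL = re.compile(r"<U([0-9A-Fa-f]{4})>")


def decode_ascii_codepoints(line: str) -> str:
    """Replace <Uxxxx> symbols for printable 7-bit characters with the
    characters themselves; leave everything else untouched."""

    def replace(match: re.Match) -> str:
        codepoint = int(match.group(1), 16)
        char = chr(codepoint)
        if 0x20 <= codepoint <= 0x7E and char not in KEEP_ENCODED:
            return char
        return match.group(0)  # non-ASCII or syntactically risky: keep as is

    return _UCS_SYMBOL.sub(replace, line)


if __name__ == "__main__":
    print(decode_ascii_codepoints('yesstr "<U0079><U0065><U0073>"'))
    print(decode_ascii_codepoints('yesstr "<U0073><U00ED>"'))
```

  With this sketch, the first example line comes out as yesstr "yes",
while the second keeps <U00ED> encoded because it is outside the 7-bit
range. Whether the output is really accepted unchanged by localedef and
by any other consumers of the locale sources is exactly the part that
would still need verifying.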

  The only thing is, I'm not 100% sure if there are any other tools
looking at the locale source files that would break if we did this,
and if it's a big deal to break these tools in case there are any.

> Originally I wrote many locales using some mnemonic scheme, that
> made them easier to read, such as <A> for <U0041>, <B> for <U0042>, <b> for <U0062> etc,
> but Ulrich Drepper did not like that and recoded all the locales to use the <Uxxxx> notation.
> Some of the mnemonics were a bit complex, but IMHO they were far easier to
> proofread than the <Uxxxx> notation, and some came directly from the POSIX standard.
> They were documented in the POSIX.2 standard from 1992, and also in TR 14652.

  Indeed, I think I have seen some of these locale files. But if you
mean <U0041>, why even write <A> if you can write A?

-- 
				Petr "Pasky" Baudis
	For every complex problem there is an answer that is clear,
	simple, and wrong.  -- H. L. Mencken

