This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



mix/match lang/territory (was: [PATCH] localedata: en_NL: new English in the Netherlands locale [BZ #14085])


On 22 Apr 2016 18:10, Florian Weimer wrote:
> * Mike Frysinger:
> > +% English language locale for the Netherlands.
> > +% Internationally oriented users who are physically located in the Netherlands
> > +% use software mainly in the English language.  Therefore they have their
> > +% systems usually configured to US English International.  However, due to the
> > +% geographic location, it can be desirable for certain data to be represented
> > +% according to the local Dutch notation while the rest remains in English.
> 
> Why is this necessary?  Isn't this use case the reason for having
> separate LC_* environment variables, so that you can mix-and-match
> locales like this?  In other words, glibc doesn't need to provide a
> pre-cooked locale.

for the majority of categories, you are certainly correct.  however,
users run into trouble when dealing with categories that commingle
language and territory details.  i highlighted this in the localedef
copy thread, but let's look at just this locale for specifics.

first, these categories can be wholly sourced from elsewhere and are
uninteresting to us for this new locale as they can be handled via
env vars as you described.
 - lang specific categories
  LC_CTYPE       = en_GB
  LC_COLLATE     = en_GB
  LC_MESSAGES    = en_GB
  LC_NAME        = en_GB
 - territory specific categories
  LC_NUMERIC     = en_GB
  LC_PAPER       = nl_NL
  LC_TELEPHONE   = nl_NL
  LC_MEASUREMENT = nl_NL
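
as a rough illustration of the env var approach for these categories, here is a minimal python sketch that builds such an environment.  the mapping is copied from the list above; the helper name build_env is made up for this example, and real systems usually want a codeset suffix (e.g. ".UTF-8") that the list above omits.

```python
import os

# Category -> locale mapping, copied from the list above.
# Bare names are used as in the mail; installed locales often carry a
# codeset suffix such as ".UTF-8" (an assumption, adjust as needed).
CATEGORY_LOCALES = {
    "LC_CTYPE": "en_GB",
    "LC_COLLATE": "en_GB",
    "LC_MESSAGES": "en_GB",
    "LC_NAME": "en_GB",
    "LC_NUMERIC": "en_GB",
    "LC_PAPER": "nl_NL",
    "LC_TELEPHONE": "nl_NL",
    "LC_MEASUREMENT": "nl_NL",
}


def build_env(base=None):
    """Return a copy of the environment with per-category locales set.

    build_env is a hypothetical helper for this example only.
    """
    env = dict(os.environ if base is None else base)
    env.update(CATEGORY_LOCALES)
    # LC_ALL overrides every individual LC_* variable, so drop it.
    env.pop("LC_ALL", None)
    return env
```

the resulting dict can be passed as the env argument of subprocess.run() to start a program with these settings.  note that this only works for categories that can be taken whole from one locale.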

which leaves us with the ones that are actually defined in this locale:
  LC_TIME
   - day/month fields are def language specific.
   - all the other fields are largely territory specific (the way date &
     time are displayed locally).
  LC_ADDRESS
   - country_name & lang_* are def language specific.
   - all the other fields are territory specific.
  LC_IDENTIFICATION
   - clearly includes both lang & territory details, but not nearly as
     important as the categories above.  could just be lived with.

this one can be a bit murky, so i guess it's a maybe:
  LC_MONETARY
   - currency fields are def territory related.
   - the others are semi-lang dependent/personal preference (digit spacing
     and such), but there's no way to customize on a sub-category basis.

so the question before us is how do we want to proceed ?  telling users
"that sucks but that's just how it goes" doesn't seem like the right path
to me long term.  exploding combinations of lang/territories also sucks,
but it's the only way today to accomplish this.

maybe we could spec out a new format for the env vars that'd allow people
to mix & match lang & territory themselves ?  POSIX leaves the format of
locale names up to the implementation after all, as well as the output of
the localedef tool.  we could do something like:
	LANG=[lang]:[territory]
	LANG=en_US:nl_NL
and we'd take care of filling in lang fields using en_US and territory
fields using nl_NL.  this would go beyond just category selection since,
as i described above, some categories commingle lang & territory fields.
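
to make the idea concrete, here is a hypothetical python sketch of how such a value could be resolved.  none of this exists in glibc; the function names, the fallback for a value without ":", and the per-category grouping (taken from the headings earlier in this mail, with LC_MONETARY left out since it is murky) are all assumptions for illustration.

```python
# Hypothetical sketch of the proposed LANG=[lang]:[territory] format.
# Nothing here is implemented by glibc; all names are illustrative.

# Grouping follows the headings used earlier in this mail.
LANG_CATEGORIES = ("LC_CTYPE", "LC_COLLATE", "LC_MESSAGES", "LC_NAME")
TERRITORY_CATEGORIES = (
    "LC_NUMERIC", "LC_PAPER", "LC_TELEPHONE", "LC_MEASUREMENT",
)
# Categories whose fields mix language and territory details.
MIXED_CATEGORIES = ("LC_TIME", "LC_ADDRESS", "LC_IDENTIFICATION")


def split_lang(value):
    """Split 'en_US:nl_NL' into (lang_locale, territory_locale).

    A value without ':' is assumed to name one locale for both parts.
    """
    lang, sep, territory = value.partition(":")
    if not sep:
        return lang, lang
    return lang, territory


def resolve(value):
    """Map each category to the locale(s) its data would come from."""
    lang, territory = split_lang(value)
    plan = {cat: lang for cat in LANG_CATEGORIES}
    plan.update({cat: territory for cat in TERRITORY_CATEGORIES})
    for cat in MIXED_CATEGORIES:
        # Mixed categories need a per-field split, not a single source.
        plan[cat] = {"lang_fields": lang, "territory_fields": territory}
    return plan
```

with this sketch, resolve("en_US:nl_NL") takes LC_MESSAGES from en_US and LC_PAPER from nl_NL, while LC_TIME is marked as needing both, which is the part plain LC_* variables cannot express today.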
-mike


