I am listed as a contributor for the be_BY locale.
We had a heated discussion almost a year ago on the Belarusian i18n mailing list
<email@example.com> about Belarusian locales.
Many Belarusian translators of FOSS software are on the list, and we invited
Bruno Haible to help us with the topic.
Based on the results of that discussion, I suggest introducing a new locale named
be_BY@classic for the classic Belarusian orthography, which is the only productive
(in the linguistic sense) orthography in Belarus nowadays.
This matches Bruno's advice. Whoever commits this change should contact
him if more details are required. Petter Reinholdtsen was also included in
the discussion but dropped out pretty quickly.
The contents of the locale definition follow:
% Belarusian Language Locale for Belarus
% Contact: Alexander Mikhailian
% Email: firstname.lastname@example.org
% Language: be
% Territory: BY
% Revision: 0.5
% Date: 2004-08-24
% Application: general
% Users: general
% Repertoiremap: mnemonic.ds
% Charset: CP1251, UTF-8
% Distribution and use is free, also
% for commercial purposes.
title "Belarusian locale for Belarus, traditional spelling"
source "Belarusian i18n mailing list"
contact "Alexander Mikhailian"
tel "+32 494 60 91 31"
% iso14651_t1 is missing Ukrainian ghe
Alexander, you should consider replacing LC_* sections with
copy "be_BY"
when they are identical to be_BY (i.e. all except LC_TIME,
from what I've seen, and of course LC_IDENTIFICATION);
this would help future maintenance of your locale files.
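A minimal sketch of that suggestion, assuming it refers to the copy keyword of the locale source format and that the file declares % as its comment character (as the header above suggests). The list of categories is illustrative; only categories that really are identical to be_BY should be copied.

```text
% Categories identical to be_BY are pulled in by reference
% instead of being repeated in full.
LC_CTYPE
copy "be_BY"
END LC_CTYPE

LC_COLLATE
copy "be_BY"
END LC_COLLATE

LC_MONETARY
copy "be_BY"
END LC_MONETARY

LC_NUMERIC
copy "be_BY"
END LC_NUMERIC

LC_MESSAGES
copy "be_BY"
END LC_MESSAGES

% LC_TIME and LC_IDENTIFICATION differ from be_BY,
% so they stay fully spelled out in this file.
```

With this layout, a later correction to be_BY carries over to the variant without any change to the variant's own file.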
Created attachment 529
be_BY@classic belarusian locale
OK, here comes the modified version. Sorry for the delay.
I think the word "classic" would be misleading here.
This qualifier of the aforementioned orthography is self-awarded and inaccurate by any definition, as neither the
qualifier nor the orthography is used or recognized as such by anybody except a rather small minority.
I'd suggest naming this be_BY@alternative. This would be 100% accurate by all definitions of "classic" and
"alternative", would accurately reflect the goals of the mentioned minority movement, and there is a precedent for
using the term "alternative" in this part of the world (the DOS Cyrillic code pages of the late 1980s).
Also, you won't have a problem with a multitude of "alternatives", as there now seems to be some kind of standard
for this alternative.
Yes, I took part in the aforementioned discussion, and no, I won't restart the discussion *here* unless asked
for further explanations. However, feel free to contact me privately by e-mail if needed or interested.
A Belarusian locale is very much needed!
> I think the word "classic" would be misleading here.
See the linguistic literature: http://www.knihi.net/index.php?productID=224
"Classic" is the name of the spelling.
We need a Belarusian locale.
I didn't make myself clear, then. This isn't about linguistics at all.
In the Belarusian language community there is a certain interest group promoting the use of several (long obsolete)
orthography rules. This group calls its variant of Belarusian orthography "classic".
Alexander Mikhailian proposes creating an additional be_BY@... branch, which would indicate use of the mentioned
orthography variant. And that's perfectly okay! The qualifier just isn't well chosen.
I wouldn't put "classic" there but rather, e.g., "alternative", because:
The term "classic", by every definition, denotes something well recognized and widely or traditionally used.
However, virtually nobody in the Belarusian community outside the interest group (which isn't numerous and/or
popular!) recognizes the mentioned variant as "classic", neither by knowing or referring to the name, nor by usage
tradition -- as the variant's key orthography features became obsolete about 70 years ago!
Even the group-promoted use of the name "classic" started, it seems, between 1992 and 1994 (judging by two big
publications on Belarusian orthography by one of the group's leaders).
On the other hand, the term "alternative" would be immediately recognizable here, both in popular understanding and
in the group's self-image.
P.S. The book Kirill A. Shutemov pointed me to contains one of the editions of the mentioned orthography variant,
published by the interest group, supervised, even authored, it seems, by one of the interest group leaders.
I really have no interest in getting into the middle of all this. The extension
@classic does indeed seem wrong to me from what I have read. And there is already
a Belarusian locale.
Unless this second language variant is the official one (which I doubt it is), it
is best to just collect a tarball with all the appropriate files and distribute
it separately. There is nothing a separately distributed locale source file
cannot do if it is compiled using localedef upon installation.
Adding variants like this (as opposed to Latin vs Cyrillic, for instance) would
mean we open ourselves to all kinds of fights like this.
So, unless I get some really convincing arguments, I'll close this as WONTFIX.
(In reply to comment #8)
> So, unless I get some really convincing arguments I'll close this as WONTFIX.
If nothing else comes up, I'm supporting this as it goes.
Yury Tarasievich says:
"I think the word "classic" would be misleading here. This qualifier of the
aforementioned orthography is self-awarded and inaccurate by any definition,
as neither the qualifier nor the orthography is used or recognized as such by
anybody except a rather small minority."
I wouldn't like to restart the old discussion with Mr. Tarasievich, who has some
unexplained repugnance for that other orthography and has been the only one to
fight it vehemently in all relevant net discussions.
However, I want to draw your attention to an inaccuracy in his comment, upon
which all his argumentation is built:
Those who follow that "other" orthography are in a slight MAJORITY (not a small
minority) on the Net. You can easily check this: just google any pair of words
spelled differently in the two orthographies.
Let me reiterate (and bring this back to topic):
I am, generally, *in* *favour* of this separation of locales.
The way I see it, if folks want their very own sub-locale, then okay and good riddance.
There's already a Latin-script sub-locale approved, created, I hear, for a user base that has yet to
emerge. So why not one more?
But then, the initially proposed "classic" qualifier is inappropriate and unmerited, whether measured by
popular support or by usage tradition.
And Google hits aren't relevant at all to this exact question of being or not being classic.
Other things, and quite material ones at that, are.
Let it be noted that I do *not* accept even the general quality of Mr. Shupa's arguments.
But that kind of discussion would be well out of scope for this issue.
If the word 'classic' is a point of controversy, then let's name it, e.g.,
'alternative'. Let's not allow our purist discussions to lead us nowhere. We do
need the locale.
Just for the record: I've nothing against naming this branch "alternative".
So what's the result?
Maybe it's time to register be_BY@alternative and to move/copy the existing
*.po files of GNOME/coreutils etc. into this "namespace"?
Also, I think we should change the be_BY locale to follow the norms of standard Belarusian.
BTW, the Debian be-locale-data package already supports the be_BY@alternative
extension quite extensively. Please make this extension available upstream.
IANA has already approved the official name of the alternative
Belarusian orthography variant: be-tarask
Here is the link: http://www.iana.org/assignments/language-subtag-registry
Can we register this locale then?
We really need an additional Belarusian locale (be@tarask, as approved by IANA),
simply because most translations are made for be@tarask, not for be, and you can't
ignore this fact.
Dear Ulrich: Is there any chance we can get the bug fixed in glibc?
We want to use the alternative Belarusian locale in our project (openinkpot.org),
but we don't want to create yet another (216th) patch for glibc.
Created attachment 4525
be_BY@tarask locale definition for glibc
This is a be_BY@tarask locale definition for glibc. Please accept it at last.
I believe Ulrich's aim is to avoid all kinds of fringe variations plaguing glibc locale database. If only a small minority uses be_BY@tarask, it should be distributed separately. If only a small minority uses the current be_BY, be_BY@tarask should perhaps simply be entered as be_BY. If there is a rough equilibrium between the two groups (which you seem to indicate), I would say there is value in having this. (The only real argument in this bug seemed to be about the naming, which seems to have been resolved within the scope of IANA.)
Can you somehow show that the equilibrium is the case? E.g. are there (pre-existing) Wikipedia articles about this, other notable sources (e.g. major newspaper articles) or such?
OK, let me show you that this language variant is strong enough to have its own locale.
1. There are 2 (two) Belarusian Wikipedias: be.wikipedia.org (IANA: be_BY-1959acad, be_BY in glibc) and be-x-old.wikipedia.org (IANA: be_BY-tarask), with quite similar article counts: 24914 (be) vs. 29004 (be-x-old).
2. As for localisation, various open- and closed-source software packages have translations in one or the other language variant. For example, OpenOffice, Mozilla Suite, Firefox, Thunderbird and KDE work with the academic language variant (be_BY-1959acad), while GNOME, Gimp and Xfce4 have a tarashkevitsa (be_BY-tarask) translation (the latter have been forced to use the be_BY locale until now because there is no proper place for their contributions). Mediawiki and some other software packages have translations in both language versions.
3. As for real life, the Belarusian Academy of Science, schools, state publishers and media work in the -1959acad version. However, some other popular media work in the alternative, -tarask version (mainly Radio Liberty for Belarus, Radio Racyja, ARCHE and some private publishers).
I don't know of any official statistics on the percentage of usage of each of the variants, but I think every Belarusian language user will agree that this percentage varies (from 80/20 to 20/80 in different spheres, with overall figures of about 75/25 for -1959acad and -tarask respectively).
You can read a bit more about the roots of the existence of the two language norm variants at: http://en.wikipedia.org/wiki/Taraškievica
So this is not the case of "fringe variations plaguing glibc locale database." :)
*** Bug 4020 has been marked as a duplicate of this bug. ***
*** Bug 7014 has been marked as a duplicate of this bug. ***
Hey, guys, how many more years do you need to accept this trivial patch and finally close this annoying bug?
This is what I've received from a native speaker:
"I don't think this change should be done. Since the time
the bug was reported, usage of the alternative spelling has
significantly decreased both in localization projects as well as in
real/web life, and I don't believe there are a lot of people who would
contribute in this specific locale variant. Having all those variants
at this point is just confusing for users. If I were you, I would just
close the bug without a fix."
Therefore I'm closing this.
I absolutely disagree with such treatment of the bug. Although Ihar does not use it, this does not mean that nobody else uses it.
Of course it is not easy to work properly with the @tarask locale variant (incorrect spelling of some strings etc.) because the bug was opened 12 (TWELVE!) years ago without any action from the glibc maintainers.
This is a trivial case; this is not porting glibc to a brand new kernel. It's just a locale definition, it's attached; what prevents the glibc maintainers from simply copying it into the source tree?
Is this how communication with community should be handled?
If a Unicode CLDR locale variant exists, I think that glibc will absolutely follow suit. The simple reality is that the glibc project is primarily a collection of C-hackers and not linguists. The Unicode CLDR effort has a deeper bench of linguistic experience by virtue of developing Unicode representation. I personally volunteer my own efforts to assist in developing the CLDR locale (at least the translated bits) in support of a minority language community, but I don't think making glibc intervene in internecine conflicts is very productive. I can offer Pootle hosting of PO files representing CLDR regions, languages and scripts to assist a team in developing the core bits of a CLDR locale. Actions will speak much louder than ticket comments.
To start on a CLDR locale for be@tarask:
Register as a translator here:
Work on these three CLDR-related PO files, which conveniently include Wikipedia links for those who are not geographers, linguists or orthographers.
When that is done we'll work on getting the rest of the Unicode CLDR Survey tool completed (plural forms, etc.)
You have my personal commitment of support in trying to develop a CLDR locale for be@tarask. As Sugar Labs Translation Team coordinator, I work with many digitally disadvantaged languages and firmly believe in linguistic self-determination.
Does that sound like a fair alternative to making C-hackers get involved in an internal Belarusian issue?
(In reply to Hleb Valoshka from comment #27)
> I absolutely disagree with such treatment of the bug. Although Ihar does not
> use it, this does not mean that nobody else uses it.
If there are people who really want to use it today, we will add it.
> Of course it is not easy to work properly with the @tarask locale variant
> (incorrect spelling of some strings etc.) because the bug was opened 12
> (TWELVE!) years ago without any action from the glibc maintainers.
We are trying to improve. Rafał and I are currently going through the list
of open bugs related to locales and trying to work through that backlog.
> This is a trivial case; this is not porting glibc to a brand new kernel. It's
> just a locale definition, it's attached; what prevents the glibc maintainers
> from simply copying it into the source tree?
Adding locales is not without cost: each locale needs about 2 MB in
the binary. Some distributions still install all available locales
by default (for example, openSUSE and Fedora do
this). Therefore, having more locales makes the default install
larger. That is OK for locales which are used by some people, but
adding stuff which nobody uses makes no sense.
And it is sometimes hard for us to figure out whether there are
really any users or not, especially for old bug reports where there
was no activity for a few years.
As Chris Leonard writes, if a locale exists in CLDR, this is also
an indication that it is really used by somebody.
Recently I added a ca_ES.utf8@valencia locale, and there I also had
some doubts at first about whether people were really interested in using it.
But when I saw that ca_ES_VALENCIA.xml exists in CLDR, I thought:
“OK, this proves that there is real user interest in that locale”.
> Is this how communication with community should be handled?
We want to be nice to the community, but it is sometimes hard for
us to find out which language communities are really active
and which are not.
Thank you, Mike and Chris, for replying while I was traveling. Indeed, we feel more comfortable adding or modifying locale data if they are copied from CLDR. Actually, our long-term goal is to import locale data from CLDR automatically. Adding locales which are not present in CLDR is possible but always tricky: how do we verify that a locale is correct? How do we tell whether a community willing to use the locale really exists? In this case I was told that it does not exist.
So, Hleb, please prove that the community (at least one person) exists and file a ticket asking to add the locale to CLDR. I suggest adding this locale to glibc (that is, I or someone else will add it) as soon as it draws some reasonable attention from the CLDR maintainers. I don't need to wait until the ticket is closed and published. For now I am reopening this bug report.