This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: de_DE has been using the wrong group separator for over 18 years


On 04/18/2018 11:10 PM, kdex wrote:
On Wednesday, April 18, 2018 11:35:31 AM CEST Florian Weimer wrote:
On 04/18/2018 10:30 AM, kdex wrote:
While the Federal Ministry of Finance may be an interesting (or even
ironic) source to point out, it is in no way normative, and their website
is mostly subject to their team of web developers.

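»Rund 32.000 Experten aus Wirtschaft und Forschung, von Verbraucherseite
und der öffentlichen Hand bringen ihr Fachwissen in den Normungsprozess
ein, den DIN als privatwirtschaftlich organisierter Projektmanager steuert.«
(“Around 32,000 experts from industry and research, from the consumer side
and the public sector contribute their expertise to the standardization
process, which DIN steers as a privately organized project manager.”)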

<https://www.din.de/de/ueber-normen-und-standards/basiswissen>

And as that web page explains, DIN norms aren't normative, either.  Our
users expect that the locales follow actual practice, not what some
document says that they have never seen and nobody has read.  (For
example, I can't easily tell whether the DIN-proposed keyboard layout
for German provides a convenient way to enter the relevant space character.)

By the same argument, you could easily pitch abandoning all norms; most people
will likely not have seen or read any norm in their lives; the majority just
replicates what others do, or whatever Duden states.

Yes, but when it comes to natural languages, it is our job (as glibc maintainers updating locale definitions) to document the existing practice, and not to try to change it. For whatever reason, DIN standards are a poor guide for maintaining locales, and so are the reference materials Duden publishes.

So yes, it is true that there is no requirement to follow DIN norms, nor is
there a requirement to follow Duden's word spellings; though I don't see how a
seemingly arbitrary group separator with no normative grounds is any better.

I can't access the DIN process documents. We would have to review why they rejected the dot separator when it was widely used at the time the standards were first created. There has to be some rationale for the discrepancy.

Based on the publicly available information, the choice to make the dot or the space normative in this context appears to be totally arbitrary.

The purpose of a norm is to have a common system that everyone can follow.
Doesn't deviating from the norm defy the very purpose of having norms in the
first place?

For norms in the area of natural language, the norms should document existing practice. Anything else does not make sense and leads to poor adoption. If there is no consensus, you can document multiple options (see prototype and non-prototype function definitions in C90 for an example from programming languages, where you would expect more rigidity to be appropriate, not less).

Hence, the point is less that locale users need the ability to have U+2009
mapped on their keyboards somewhere, but rather that users should be able to
input regular numbers and rely on their software to use their system locale to
figure out how their numbers should be displayed according to the current
locale.
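The idea that the user types a plain number and the software inserts the locale's group separator can be sketched as follows. This is a minimal illustration, not glibc's implementation: `group_digits` and the `SEPARATORS` table are hypothetical names, and the three candidate separators are simply the ones discussed in this thread, not values read from any locale definition.

```python
# Hypothetical sketch: format an integer with a configurable group separator.
# The candidates below are the characters debated in this thread; they are
# NOT read from glibc's de_DE locale data.
SEPARATORS = {
    "dot": ".",
    "thin space (U+2009)": "\u2009",
    "narrow no-break space (U+202F)": "\u202f",
}


def group_digits(value: int, separator: str, group_size: int = 3) -> str:
    """Insert `separator` between groups of `group_size` digits, from the right."""
    sign = "-" if value < 0 else ""
    digits = str(abs(value))
    groups = []
    while len(digits) > group_size:
        groups.append(digits[-group_size:])  # peel off the rightmost group
        digits = digits[:-group_size]
    groups.append(digits)                    # leading group (1..group_size digits)
    return sign + separator.join(reversed(groups))


if __name__ == "__main__":
    for name, sep in SEPARATORS.items():
        # repr() makes the invisible space characters visible as escapes
        print(f"{name}: {group_digits(1234567, sep)!r}")
```

The input is the same in every case; only the separator chosen by the (assumed) locale data changes what is displayed.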

But that's not how people enter numbers in their word processor.

U+2009 also has the wrong line breaking property in the basic Unicode line breaking algorithm <https://unicode.org/reports/tr14/>, so it makes it quite hard for word processors to do the right thing even if the user manages to enter this character.
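To make the line breaking point concrete, here is a deliberately simplified sketch. The `LINE_BREAK_CLASS` table is hand-copied for four characters only (SP for U+0020, GL for U+00A0 and U+202F, BA for U+2009), and `break_opportunities` is a hypothetical helper that only looks at "break after SP or BA"; it is not an implementation of the full UAX #14 algorithm.

```python
# Simplified sketch, NOT a full UAX #14 implementation.
# Line break classes for the four characters relevant here:
#   SP = space, GL = non-breaking "glue", BA = break after.
LINE_BREAK_CLASS = {
    "\u0020": "SP",  # SPACE
    "\u00a0": "GL",  # NO-BREAK SPACE
    "\u2009": "BA",  # THIN SPACE
    "\u202f": "GL",  # NARROW NO-BREAK SPACE
}


def break_opportunities(text: str) -> list[int]:
    """Return indices i where a line may be broken just before text[i].

    Only the rule 'a break is allowed after SP or BA' is modelled; every
    other character is treated as offering no break opportunity.
    """
    return [
        i
        for i in range(1, len(text))
        if LINE_BREAK_CLASS.get(text[i - 1]) in ("SP", "BA")
    ]


if __name__ == "__main__":
    thin = "1\u2009000\u2009000"    # grouped with THIN SPACE
    narrow = "1\u202f000\u202f000"  # grouped with NARROW NO-BREAK SPACE
    print(break_opportunities(thin))    # the number can be split across lines
    print(break_opportunities(narrow))  # no break inside the number
```

Under these assumptions, a number grouped with U+2009 offers a break after every separator, while the same number grouped with U+202F stays on one line.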

Ideally, we should adhere to "official" guidelines. And as has been stated
before, Duden is de jure non-normative, but de facto, it very much is.

That's a historical accident because a government body once referred to it as »maßgeblich in allen Zweifelsfällen« (“authoritative if there is any doubt”), but that referred to orthography and was before there was an official, government-issued list of spellings. When the official word list was finally published in 1996, Duden lost any claims to authority (and apparently removed it from their marketing materials).

Regarding number formatting (mainly in typesetting), my 1996 edition of the short Duden volume suggests that it is at least partially descriptive (»hat sich eingebürgert«, “has become established”). In this light, the omission of the dot as the separator (which was common at the time) looks like a mistake. Despite a decade or more of widespread use of word processors, it only has guidelines for typewriting, and does not address the matter of breaking spaces in numbers that arises in word processors.

Thanks,
Florian

