This is the mail archive of the libc-locales@sources.redhat.com mailing list for the GNU libc locales project.


Re: Handling numbers input/output in glibc

From: Behdad Esfahbod <behdad at cs dot toronto dot edu>
To: Bruno Haible <bruno at clisp dot org>
Cc: libc-alpha at sources dot redhat dot com, Hamed Malek <hamed at bamdad dot org>, Roozbeh Pournader <roozbeh at sharif dot edu>, Markus Kuhn <mgk25 at cl dot cam dot ac dot uk>, libc-locales at sources dot redhat dot com
Date: Tue, 2 Mar 2004 03:12:05 -0500
Subject: Re: Handling numbers input/output in glibc
References: <200402022033.23379.bruno@clisp.org>

On Mon, 2 Feb 2004, Bruno Haible wrote:

> Behdad Esfahbod wrote on 2004-01-10:
> > Problem statement:  In Persian (fa_IR) locale, we like to read
> > and write numbers with Persian numerals (U+06F0..U+06F9).
>
> To this I'd like to add the important additional explanation that you
> made on 2004-01-06:
>
> > The border between which numbers should be written with local
> > digits, which with latin digits, is not quite clear.  For example
> > in Persian we write every number with Persian digits, but I can
> > see how we may write a price with US dollar currency sign with
> > Latin digits.  Or Arab people may have their own desires about
> > which numbers they would like to see in their local digits, which
> > not.  So the decision better be left to each translation team,
>
> The solution that is implemented for this is:
>   - The application developer uses gettext() around all format strings that
>     contain "%d".
>   - gettext() looks up the translation in the Persian message catalog. It
>     may contain "%Id" instead of "%d".
>   - printf substitutes outdigits for those numbers that are output with "%Id".
>
> This should be sufficient, shouldn't it?

Yes.  That solves the problem effectively.  So the digit output
problem is solved now.

Just one more thing, about the digit input problem:
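
For illustration, here is a minimal sketch of the output path Bruno
describes.  The function names and the "%d files" message are made up
for this example.  It assumes a glibc system (the "I" flag is a glibc
extension) and a program that has already called setlocale(LC_ALL, "")
and bound a text domain; without a catalog, gettext() just returns the
msgid, and in the C locale "%Id" prints ASCII digits.

```c
#include <assert.h>
#include <libintl.h>
#include <stddef.h>
#include <stdio.h>

/* Formats a file count through the message catalog.  The msgid uses
   plain "%d"; a Persian translation may supply "%Id" instead, in which
   case printf emits the locale's outdigits.  Hypothetical helper. */
int format_file_count(char *buf, size_t size, int count)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, gettext("%d files"), count);
}

/* The same call with the "I" flag written out, as a translated format
   string would contain it.  glibc-specific; hypothetical helper. */
int format_with_outdigits(char *buf, size_t size, int count)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%Id files", count);
}
```

The application source never mentions "%Id"; only the translation
does, which is what leaves the choice to each translation team.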

Right now, iswdigit and scanf("%Id") both consult the "digit"
tag in the locale definition.  So if you define two sets of decimal
digits under the "digit" tag in your locale, scanf("%Id") (but not
scanf("%d")) will parse either set as numeric data.  But since the
C99 standard defines iswdigit to accept only ASCII digits, the
glibc locales define only ASCII digits under the "digit" tag.  So
all of the internationalization code behind scanf("%Id") currently
has no effect.
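
A small sketch of the current state, assuming a glibc locale whose
"digit" tag lists only ASCII digits (the helper name is made up for
this example): iswdigit accepts "1" and "2" but not the Persian digits
U+06F1 and U+06F2.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Counts the characters in s that iswdigit() accepts.
   Hypothetical helper, for illustration only. */
size_t count_iswdigit(const wchar_t *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != L'\0'; s++)
        if (iswdigit((wint_t)*s))
            n++;
    return n;
}
```

Calling count_iswdigit(L"12\u06F1\u06F2") should count only the two
ASCII digits under that assumption.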

What I propose is either:

  * Change the code for iswdigit (and probably isdigit) so that it
follows the C99 standard and accepts only ASCII digits, regardless
of what the locale lists under "digit".  Then we can define all
Unicode digit sets under the "digit" tag in the glibc locales, and
scanf("%Id") would work as expected.

Or:

  * If the above is not acceptable, define another tag, parallel
to "digit", for scanf("%Id") to use.

So the point is to make scanf("%Id") work as expected (that is,
internationalized) without breaking the standards compliance of
iswdigit.
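
To make the intended behavior concrete, here is a rough sketch of the
kind of parsing scanf("%Id") would do in fa_IR under either proposal.
This is not glibc code: the function names are made up, and the digit
set is hard-coded to ASCII plus U+06F0..U+06F9 rather than read from
the locale.  Note that it never calls iswdigit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <wchar.h>

/* Maps a character to its decimal value, or -1 if it is not a digit.
   Accepts ASCII 0-9 and the Persian digits U+06F0..U+06F9. */
static int digit_value(wchar_t wc)
{
    if (wc >= L'0' && wc <= L'9')
        return (int)(wc - L'0');
    if (wc >= 0x06F0 && wc <= 0x06F9)
        return (int)(wc - 0x06F0);
    return -1;
}

/* Parses a non-negative decimal number written in either digit set.
   Returns false on empty input or on any non-digit character.
   No overflow check in this sketch. */
bool parse_local_digits(const wchar_t *s, long *out)
{
    long value = 0;

    assert(s != NULL && out != NULL);
    if (*s == L'\0')
        return false;
    for (; *s != L'\0'; s++) {
        int d = digit_value(*s);
        if (d < 0)
            return false;
        value = value * 10 + d;
    }
    *out = value;
    return true;
}
```

With this sketch, L"\u06F1\u06F2\u06F3" and L"123" both parse to 123,
while iswdigit stays ASCII-only as C99 requires.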

behdad

[snip]
> Bruno
