This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: Is it OK to write ASCII strings directly into locale source files?
* Carlos O'Donell:
>>> However, I caution against throwing away the compatibility of our locales
>>> with POSIX, which doesn't seem to allow UTF-8 in the specification.
>>
>> It does, to some extent:
>>
>> | A character in the portable character set can be represented by the
>> | character itself, in which case the value of the character is
>> | implementation-defined. (Implementations may allow other characters
>> | to be represented as themselves, but such locale definitions are not
>> | portable.)
>>
>> You'll need a very hostile interpretation to say that this doesn't
>> allow multi-byte character sequences in localedef input.
>
> I see what you're saying, which is that we are *still* POSIX compliant,
> but not portable?
Right, and I think that's okay because the glibc locales are for
glibc.
> I assume we are focusing on the "()" text, which provides some kind of escape
> hatch outside of the portable character set and allows us to use UTF-8?
Exactly.
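
To make the portable/non-portable distinction concrete, here is a rough
sketch (not part of localedef) that flags characters in a locale source line
that fall outside the portable character set. It assumes an ASCII-based
encoding, where that set is NUL, the controls BEL through CR, and the
printable range from space to '~'. The names PORTABLE, non_portable_chars
and is_portable are made up for this illustration.

```python
# Portable character set code points, assuming an ASCII-based encoding:
# NUL, the control characters BEL..CR (0x07-0x0D), and SPACE..'~' (0x20-0x7E).
PORTABLE = frozenset([0x00]) | frozenset(range(0x07, 0x0E)) | frozenset(range(0x20, 0x7F))


def non_portable_chars(line):
    """Return the characters in `line` that lie outside the portable set."""
    return [ch for ch in line if ord(ch) not in PORTABLE]


def is_portable(line):
    """True if every character in `line` is in the portable character set."""
    return not non_portable_chars(line)


if __name__ == "__main__":
    # Hypothetical locale source lines, used only as sample input.
    for sample in ['abday "Sun";"Mon"', 'mon "M\u00e4rz"']:
        print(repr(sample), is_portable(sample), non_portable_chars(sample))
```

A line that passes this check is portable in the POSIX sense; a line that
fails it may still be accepted by an implementation, per the parenthetical
quoted above.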
>> But I found this in the guts of localedef:
>>
>> /* The standards leave it up to the implementation to decide
>> what to do with character which stand for themself. We
>> could jump through hoops to find out the value relative to
>> the charmap and the repertoire map, but instead we leave
>> it up to the locale definition author to write a better
>> definition. We assume here that every character which
>> stands for itself is encoded using ISO 8859-1. Using the
>> escape character is allowed. */
>>
>> So we currently hard-code ISO 8859-1 (not UTF-8) to avoid the
>> bootstrapping problem.
>
> We could just assume UTF-8, but yes, it looks like this needs a little bit
> more looking into.
Yes, and we don't have a real bootstrapping problem: while we have a
charmap file for UTF-8, we also have a separate UTF-8 implementation
in iconv/gconv, and we could use that to break the loop.
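
As an illustration of why the hard-coded ISO 8859-1 assumption matters, the
sketch below shows what happens when UTF-8 bytes are read one byte per
character as ISO 8859-1. This is plain Python, not localedef code, and
as_localedef_sees_it is a hypothetical helper name.

```python
def as_localedef_sees_it(text):
    """Encode `text` as UTF-8, then decode the bytes as ISO 8859-1.

    This mimics (as an assumption, for illustration only) a reader that
    treats every byte of a UTF-8 source file as one ISO 8859-1 character.
    """
    return text.encode("utf-8").decode("iso-8859-1")


if __name__ == "__main__":
    # Pure ASCII survives unchanged, because ASCII is a subset of both encodings.
    print(as_localedef_sees_it("March"))
    # A non-ASCII character becomes one ISO 8859-1 character per UTF-8 byte.
    print(as_localedef_sees_it("M\u00e4rz"))
```

ASCII-only strings come through unchanged, which is why restricting locale
sources to the portable character set is safe today, while a two-byte UTF-8
sequence such as the one for U+00E4 turns into two separate characters.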
> Either way, I support using the portable character set today, and that's
> a step forward.
Agreed.