This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: Improved check-localedef script
On Fri, Aug 4, 2017 at 5:19 AM, Mike FABIAN <mfabian@redhat.com> wrote:
> Mike FABIAN <mfabian@redhat.com> wrote:
>
>> The ca_ES source file is not ASCII, it has
>>
>> % català
>> lang_name "<U0063><U0061><U0074><U0061><U006C><U00E0>"
>>
>> So maybe I could just convert the file to UTF-8
>> and change “% Charset: ISO-8859-1” into “% Charset: UTF-8”
>> to get rid of the check-localedef warning.
>
> Actually the file is already UTF-8 encoded, the “català”
> in the comment above is already in UTF-8.
I thought "% Charset: ..." meant the character set _that would be used
for the locale_, not the character set that the locale definition was
encoded in. So the script tries to convert all of the strings _into_
that character set, on the assumption that they need to be
representable there. I'm going to post a revised script that takes
this information from localedata/SUPPORTED instead.
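To make the check described above concrete, here is a rough sketch (not the actual check-localedef code): it expands the <Uxxxx> escapes in a string and then tries to encode the result in the target character set. The helper names decode_ucs_escapes and representable are made up for illustration.

```python
import re

# Matches the <Uxxxx> escapes used in locale definition strings.
_UCS_ESCAPE = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def decode_ucs_escapes(text):
    """Replace each <Uxxxx> escape with the character it names."""
    return _UCS_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


def representable(text, charset):
    """Return True if every character of text can be encoded in charset."""
    try:
        text.encode(charset)
    except UnicodeEncodeError:
        return False
    return True


# The lang_name value quoted from ca_ES above.
lang_name = decode_ucs_escapes("<U0063><U0061><U0074><U0061><U006C><U00E0>")
print(lang_name, representable(lang_name, "iso-8859-1"))
```

A string that fails this test for the locale's intended character set is what such a check would warn about.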
Without exception, the existing locale definition files are either
pure 7-bit ASCII or they are UTF-8. However, non-ASCII UTF-8
characters currently appear only in comments.
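A survey like that could be done along these lines. This is only a sketch: the function names are hypothetical, and it assumes that a comment is any line whose first non-blank character is the comment character, taken here to be "%".

```python
def classify_encoding(data):
    """Classify raw file bytes as 'ascii', 'utf-8', or 'other'."""
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "other"


def non_ascii_outside_comments(data, comment_char="%"):
    """Return 1-based numbers of non-comment lines with non-ASCII characters.

    Assumes data is valid UTF-8 and that comments are whole lines
    starting with comment_char (an assumption of this sketch).
    """
    bad = []
    for lineno, line in enumerate(data.decode("utf-8").splitlines(), start=1):
        if line.lstrip().startswith(comment_char):
            continue
        if any(ord(ch) > 127 for ch in line):
            bad.append(lineno)
    return bad


sample = '% catal\u00e0\nlang_name "<U0063><U0061>"\n'.encode("utf-8")
print(classify_encoding(sample), non_ascii_outside_comments(sample))
```

An empty list from the second function means every non-ASCII character in the file sits inside a comment line.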
zw