This is the mail archive of the newlib@sourceware.org mailing list for the newlib project.



Re: [PATCH] CJK ambiguous width for non-Unicode charsets


On 18 November 2010 11:03, Corinna Vinschen wrote:
> On Nov 17 21:34, Andy Koppe wrote:
>> On 16 November 2010 17:58, Corinna Vinschen wrote:
>> > On Nov  9 22:06, Andy Koppe wrote:
>> >> The attached small patch affects character widths as reported by
>> >> wcwidth(). It addresses an obscure issue.
>> >>[...]
>> >>     * libc/locale/locale.c: Fix ambiguous width to one for singlebyte
>> >>     charsets and two for non-Unicode multibyte charsets.
>> >
>> > This appears to make a lot of sense.  Would you mind to enhance your
>> > patch slightly to fix also the description in the locale.c
>> > documentation?  There's a related paragraph starting with "This
>> > implementation also supports a single modifier, <<"cjknarrow">>..."
>>
>> Sorry, I hadn't seen that. Amended patch attached.
>>
>>     * libc/locale/locale.c (loadlocale): Fix width of CJK ambiguous
>>     characters to 1 for singlebyte charsets and 2 for non-Unicode
>>     multibyte charsets. Change documentation accordingly.
>
> Thank you.  Applied with a minor change.  @ is a special character
> in the docs and has to be doubled ("@@") to be treated literally.
> I just removed it entirely since the @ is not part of the modifier
> itself.

Thanks.

In further testing I realised that the cjknarrow modifier wasn't
implemented for "C.<charset>" locales (since previously there was no
point in that). Patch attached to make it work.

	* libc/locale/locale.c (loadlocale): Recognise the "cjknarrow"
	modifier on "C.<charset>" locales too.

Here's a small test for this:

$ cat width.c
#include <wchar.h>
#include <locale.h>
#include <stdio.h>

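int main(void) {
  /* Pick up the locale from the environment (LANG etc.). */
  setlocale(LC_CTYPE, "");
  /* Print the name of the locale that is actually in effect. */
  puts(setlocale(LC_CTYPE, 0));
  /* U+00A1 is a CJK ambiguous-width character. */
  puts(wcwidth(0xA1) == 1 ? "narrow" : "wide");
}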

$ cc width.c

$ ./a
C.UTF-8
narrow

$ LANG=C.GBK ./a
C.GBK
wide

$ LANG=C.GBK@cjknarrow ./a
C.GBK@cjknarrow
narrow

$ LANG=ja_JP.UTF-8 ./a
ja_JP.UTF-8
wide

$ LANG=ja_JP.UTF-8@cjknarrow ./a
ja_JP.UTF-8@cjknarrow
narrow

$ LANG=de_DE.UTF-8 ./a
de_DE.UTF-8
narrow

Andy

Attachment: ambiwidth3.patch
Description: Binary data

