This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.



Re: EUC-JP and the Yen sign


From: "Martin v. Loewis" <martin@loewis.home.cs.tu-berlin.de>
Date: Mon, 16 Oct 2000 00:30:13 +0200
> > Open Group in Japan published advisory documentations:
> > http://www.opengroup.or.jp/jvc/cde/appendix.html
> > it also said that 0x5C is yen sign.
> 
> For those of us not fluent in Japanese, can you please explain the
> tables in http://www.opengroup.or.jp/jvc/cde/ucs-conv.html#ch3_1_2?
> There is one table saying that eucJP-open character 0x5C relates to
> U+00A5, and another saying that it relates to U+005C.
>
> Also, can you explain the relevance of the eucJP-*open* character name
> designation? The OpenGroup registry
> (ftp://ftp.opengroup.org/pub/code_set_registry/cs_registry1.2h)
> only knows of eucJP:1993; it comments on that character set
> 
> Comments
>         Implementation of the EUC (Extended UNIX Codes) encoding
>         method, with ISO 646:1991 IRV assigned to CS0, JIS X0208:1990
>         assigned to CS1, JIS X0201:1976 assigned to CS2, and
>         JIS X0212:1990 assigned to CS3.
> end
> 
> which, to me, says that the IRV is used for 05/12 (i.e. reverse
> solidus).

See the section "Detailed naming of code set" in
http://www.opengroup.or.jp/jvc/cde/sjis-euc-e.html

The Open Group named its standard eucJP "eucJP-open" because even the
companies that had joined the Open Group did not share the same
character map.
(See http://www.opengroup.or.jp/jvc/cde/euc-e.html)
However, none of the Unices I know of supports a locale or charset
called "eucJP-open". The name exists only as the Open Group's label for
its own definition of eucJP.

> Furthermore,
> http://www.y-adagio.com/public/standards/tr_xml_jpf/kaisetsu.htm lists
> a number of eucJP variants; it appears that x-eucjp-unicode-0.9,
> x-eucjp-jisx0221-1995, x-eucjp-open-19970715-ms all map character 5C
> to U+005C, whereas x-eucjp-open-19970715-0201 is listed as mapping it
> to U+00A5.

OK. Here is my translation of section 3.1 from Japanese into English:

   3.1 The range from 0x20 to 0x7E ([US-ASCII] or [JIS X 0201])
   
   x-eucjp-unicode-0.9, x-eucjp-jisx0221-1995, x-eucjp-open-19970715-ms,
   and x-eucjp-open-19970715-ascii define that the range from 0x20 to
   0x7E is translated as [US-ASCII], in accordance with Japanese EUC.
   The only exception is x-eucjp-open-19970715-0201, which defines the
   translation rules below in accordance with [JIS X 0201].
   
   Table 3.1 x-eucjp-open-19970715-0201
       Code Value in EUC         Translation To 
    0x5C(REVERSE SOLIDUS)        U+00A5(YEN SIGN)
    0x7E(TILDE)                  U+203E(OVERLINE)

Returning to the Open Group document,
http://www.opengroup.or.jp/jvc/cde/ucs-conv-e.html
see section 3.1.2 (b) and (c):

3.1.2 Code Set Conversion Rules

(snip)

   b.Of the conversion specified in JIS X 0221, the conversion rules when
     it is used in conjunction with JIS X 0201. 
     In this case, the conversion of yen sign and backslash are performed 
     as follows. 
                       eucJP-open           UCS
                          0x5C         YEN SIGN (0x00A5)

(snip)

   c.Of the conversion specified by JIS X 0221, conversion rules when it is
     used in conjunction with ASCII. 
     In this case, the conversion of yen sign and backslash are performed
     as follows. 
                       eucJP-open           UCS
                          0x5C       REVERSE SOLIDUS (0x005C)

x-eucjp-open-19970715-"0201" corresponds to 3.1.2 (b).
However, G0 of EUC-JP designates ASCII, not JIS X 0201.
So 3.1.2 (c) is the appropriate conversion rule to use.
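
To make the difference between the two rules concrete, here is a rough
sketch in Python. It is only an illustration, not how glibc's converter is
written. The names decode_g0 and JISX0201_OVERRIDES are invented for this
example; the code points are the ones from Table 3.1 and from 3.1.2 (b)
and (c) quoted above.

```python
# Illustration only: hypothetical helper names, not a real library API.
# Overrides that apply when the 0x20..0x7E range is read as JIS X 0201
# (Table 3.1 / rule 3.1.2 (b)) instead of ASCII (rule 3.1.2 (c)).
JISX0201_OVERRIDES = {
    0x5C: 0x00A5,  # YEN SIGN instead of REVERSE SOLIDUS
    0x7E: 0x203E,  # OVERLINE instead of TILDE
}


def decode_g0(byte, use_jisx0201=False):
    """Map one EUC-JP byte in the range 0x20..0x7E to a Unicode character.

    use_jisx0201=False follows rule (c): the byte is taken as ASCII.
    use_jisx0201=True follows rule (b): 0x5C and 0x7E are remapped.
    """
    if not 0x20 <= byte <= 0x7E:
        raise ValueError("byte 0x%02X is outside 0x20..0x7E" % byte)
    if use_jisx0201:
        return chr(JISX0201_OVERRIDES.get(byte, byte))
    return chr(byte)


if __name__ == "__main__":
    for rule, flag in (("(c) ASCII", False), ("(b) JIS X 0201", True)):
        ch = decode_g0(0x5C, use_jisx0201=flag)
        print("rule %s: 0x5C -> U+%04X" % (rule, ord(ch)))
```

Only two bytes differ between the rules; every other byte in the range
maps to the same code point either way.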

Read http://www.opengroup.or.jp/jvc/cde/ucs-conv-e.html
section 3.1.1 (4), "The Yen Sign problem".
It explains why this problem occurs.

> It may be clear to you; to me, it is not.

Ah, it is an unresolved problem.
However, further discussion of this issue is not appropriate on this list.

Regards,
-- GOTO Masanori
