This is the mail archive of the xsl-list@mulberrytech.com mailing list .



Behaviour difference when outputting Shift_JIS


Hip hei!

I'm outputting Shift_JIS encoded documents from my XSLT engine and trying to
use Unicode characters in the Private Use and CJK Compatibility Ideographs
areas, more precisely U+F89F - U+F9AF. See the example processing below:

[c:\temp]type test.xsl
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="html"
            encoding="Shift_JIS"
            doctype-public="-//W3C//DTD Compact HTML 1.0 Draft//EN" />

<xsl:template match="/">
  <html>
    <head>
      <title />
    </head>
    <body><xsl:text>&#xF89F;&#xF9AF;</xsl:text></body>
  </html>
</xsl:template>

</xsl:stylesheet>

[c:\temp]msxml test.xsl test.xsl
<!DOCTYPE html PUBLIC "-//W3C//DTD Compact HTML 1.0 Draft//EN"><html>
<head>
<META http-equiv="Content-Type" content="text/html; charset=Shift_JIS">
<title></title>
</head>
<body>&#63647;&#63919;</body>
</html>

[c:\temp]xalan_dist -in test.xsl -xsl test.xsl -q
<!DOCTYPE HTML PUBLIC "-//W3C//DTD Compact HTML 1.0 Draft//EN">
<html>
    <head>
        <title>
        </title>
    </head>
    <body>??</body>
</html>

So, as you can see, when I use MSXML3, the characters I mentioned before
come out as numeric character references, but when I use Xalan (or Oracle's
processor), they come out as question marks (when viewed with a Japanese
text editor or other Shift-JIS enabled programs). Which one is the correct
behaviour?
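
To make the two behaviours concrete outside of any XSLT processor, here is a
rough sketch in Python. It assumes that Python's built-in shift_jis codec is a
fair stand-in for the Shift_JIS tables the processors use (they may well
differ), and encode_two_ways is just a hypothetical helper name. It encodes the
same two characters once with unencodable characters turned into numeric
character references and once with them replaced by question marks:

```python
# Illustration only: this is NOT how MSXML3 or Xalan are implemented.
# Assumption: Python's "shift_jis" codec has no mapping for U+F89F or U+F9AF.

def encode_two_ways(text, encoding="shift_jis"):
    """Encode text twice, using two different fallbacks for unencodable characters."""
    # Fallback 1: emit a decimal numeric character reference (&#NNNNN;).
    as_references = text.encode(encoding, errors="xmlcharrefreplace")
    # Fallback 2: substitute a question mark for each unencodable character.
    as_question_marks = text.encode(encoding, errors="replace")
    return as_references, as_question_marks


references, question_marks = encode_two_ways("\uf89f\uf9af")
print(references)      # b'&#63647;&#63919;'
print(question_marks)  # b'??'
```

The first result matches what MSXML3 printed above and the second matches the
Xalan output, so the difference looks like a choice of fallback rather than a
difference in the Shift_JIS mapping itself.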

Jarno - Happy Happy Joy Joy Division




 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
