This is the mail archive of the xsl-list@mulberrytech.com mailing list.
Re: Apache Xalan 2.2 for Java problems with Unicode
- To: <xsl-list at lists dot mulberrytech dot com>
- Subject: Re: [xsl] Apache Xalan 2.2 for Java problems with Unicode
- From: "Rob Lugt" <roblugt at elcel dot com>
- Date: Thu, 9 Aug 2001 12:41:45 +0100
- References: <000601c120ba$c7bd1c70$e564a8c0@pandora>
- Reply-To: xsl-list at lists dot mulberrytech dot com
Jamie King wrote
> I'm trying to transform an XML file (encoded in UTF-8) using Apache's Xalan
> 2.2 package for Java. It gives me the following exception:
>
> javax.xml.transform.TransformerException: An invalid XML character (Unicode:
> 0xfc) was found in the element content of the document.
>
> Has anyone experienced this? Unicode 0xFC is a lowercase 'u' with an umlaut
> (ü). It works fine when I remove those characters. Is there a way to set
> the encoding for the Transformer object in Java or something like that?
Jamie, I don't have experience with Apache's Xalan, so I'm unable to test my
hypothesis...
You are correct that Unicode U+00FC is a valid XML character. I'd be
surprised if the XML parser you are using (Xerces?) complained about this.
Is it possible that the error message is misleading and the encoding of your
input file is wrong? You say the file is encoded in UTF-8. The UTF-8
representation of U+00FC is the two-byte sequence 0xC3 0xBC. Is this what you
have in your file? If the file instead contains the single byte 0xFC, it is
probably encoded in ISO-8859-1 rather than UTF-8, which could explain the error.
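One way to check this is sketched below. It is only an illustration using standard JDK classes, not anything from Xalan or Xerces, and the class and method names (Utf8Check, toHex, isValidUtf8) are made up for the example. It prints the bytes that U+00FC becomes in UTF-8 and in ISO-8859-1, and tests whether a byte sequence decodes as well-formed UTF-8.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of Xalan or Xerces.
final class Utf8Check {

    // Render bytes as space-separated upper-case hex, e.g. "C3 BC".
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // True if the bytes form well-formed UTF-8. The decoder is told to
    // report malformed input instead of silently replacing it.
    static boolean isValidUtf8(byte[] data) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String uUmlaut = "\u00FC";
        byte[] asUtf8 = uUmlaut.getBytes(StandardCharsets.UTF_8);
        byte[] asLatin1 = uUmlaut.getBytes(StandardCharsets.ISO_8859_1);

        System.out.println("UTF-8:      " + toHex(asUtf8));
        System.out.println("ISO-8859-1: " + toHex(asLatin1));
        System.out.println("UTF-8 bytes valid as UTF-8:      " + isValidUtf8(asUtf8));
        System.out.println("ISO-8859-1 bytes valid as UTF-8: " + isValidUtf8(asLatin1));
    }
}
```

To apply the same check to a real document, the bytes could be read with java.nio.file.Files.readAllBytes and passed to isValidUtf8. A false result would suggest the file is not actually UTF-8, whatever its XML declaration says.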
Regards
~Rob
--
Rob Lugt
ElCel Technology
http://www.elcel.com/
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list