This is the mail archive of the xsl-list@mulberrytech.com mailing list .



Re: translating between character sets


Matthias O. Will wrote:
> My input DTD has encoding UTF-8, and my output DTD encodes according to
> ISO-8859-1. So, in the input, I use entities for umlauts, which I want
> to be umlauts in the output. Example:
> 
> input	output
> --------------
> &auml;	ä
> &ouml;	ö
> &uuml;	ü
> &szlig;	ß
> 
> How would I achieve this mapping?

Why don't you use ISO 10646-1:1993 (~ Unicode) character references?
That is, your input column would be
 &#228;
 &#246;
 &#252;
 &#223;

But if you must use entity references, declare the entities in the DTD for
the XML document that uses them, using these declarations:
  http://www.oasis-open.org/cover/xml-ISOents.txt

Example:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE myData SYSTEM "http://www.oasis-open.org/cover/xml-ISOents.txt">
<myData>
   ...somewhere in here will be &auml; &ouml; etc. ...
</myData>

Or (better this way; just declare what you need, and don't fetch over a
network):

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE myData [
  <!ENTITY auml    "&#228;" >
  <!ENTITY ouml    "&#246;" >
  <!ENTITY uuml    "&#252;" >
  <!ENTITY szlig   "&#223;" >
]>
<myData>
   ...somewhere in here will be &auml; &ouml; etc. ...
</myData>

The XML parser will replace the entities with the characters you want. The
*output* you get from an XSLT processor acting on these characters in an
XML document depends on the processor, but if you put

 <xsl:output method="xml" version="1.0" encoding="iso-8859-1"/>

in the stylesheet, you should get the literal iso-8859-1 bytes for the
characters, assuming you've copied them to the result tree. If the output
method must be "html" then you will probably get entity references in the
output.
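
To make the xsl:output suggestion concrete, here is a minimal sketch of a
complete stylesheet, assuming all you want is to copy the input to the
result tree unchanged and re-encode it. The identity template is only an
illustration; your real stylesheet will have its own templates.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Ask the processor to serialize the result tree as ISO-8859-1 XML -->
  <xsl:output method="xml" version="1.0" encoding="iso-8859-1"/>

  <!-- Identity template (illustrative): copy every node and attribute,
       so the characters the parser produced from &auml; etc. reach
       the result tree unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Run against either example document above, a processor that honors
xsl:output should write ä, ö, ü and ß as single ISO-8859-1 bytes. Any
character that has no ISO-8859-1 representation should come out as a
numeric character reference instead.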

   - Mike
____________________________________________________________________
Mike J. Brown, software engineer at         My XML/XSL resources:
webb.net in Denver, Colorado, USA           http://www.skew.org/xml/


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
