This is the mail archive of the xsl-list@mulberrytech.com mailing list.
Re: XSL and international characters
Marcin Kłos wrote at 4 Dec 2001 14:28:50 +0100:
> Original character was %C5%82 and the result was Å - one character and
> ‚ - second character :(
Your character is one byte in a single-byte encoding such as
ISO-8859-2, but UTF-8 represents it using two bytes (0xC5 0x82).
UTF-8 is a variable-length encoding, and characters may be represented
with up to four bytes (depending on the code point).
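To make the variable length concrete, here is a minimal Python sketch
that counts the UTF-8 bytes for a few sample characters. The sample
characters are my own choice for illustration; only U+0142 (the
character behind %C5%82) comes from this thread.

```python
# Sample code points chosen for illustration; only U+0142 is from the thread.
samples = ["A", "\u0142", "\u20ac", "\U0001F600"]

# Map each character to the number of bytes UTF-8 needs for it.
lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, n in lengths.items():
    # Show the code point, the byte count, and the bytes in hex.
    print(f"U+{ord(ch):04X} -> {n} byte(s): {ch.encode('utf-8').hex()}")
```

The byte count grows with the code point: one byte for ASCII, two for
U+0142, and more for characters further up the range.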
It's a fluke of the UTF-8 design that, for many Latin-1 characters,
looking at the UTF-8 representation in a Latin-1 system appears to
show the desired character plus random junk.
If you looked at the output with a UTF-8-aware viewer, you'd only see
the one character.
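The effect can be reproduced with a short Python sketch. It assumes
the viewer was using Windows-1252 rather than strict Latin-1, since
that is the code page in which byte 0x82 shows up as a low quotation
mark; the variable names are only illustrative.

```python
from urllib.parse import unquote_to_bytes

# The escaped bytes from the original message.
raw = unquote_to_bytes("%C5%82")

# Read as UTF-8, the two bytes form a single character, U+0142.
correct = raw.decode("utf-8")

# Read one byte at a time (Windows-1252 assumed here), the same bytes
# become two unrelated characters, U+00C5 and U+201A.
garbled = raw.decode("cp1252")

print(len(correct), [f"U+{ord(c):04X}" for c in correct])
print(len(garbled), [f"U+{ord(c):04X}" for c in garbled])
```

The bytes are identical in both cases; only the encoding the viewer
assumes is different.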
Regards,
Tony Graham
------------------------------------------------------------------------
XML Technology Center - Dublin mailto:tony.graham@sun.com
Sun Microsystems Ireland Ltd Phone: +353 1 8199708
Hamilton House, East Point Business Park, Dublin 3 x(70)19708
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list