This is the mail archive of the xsl-list@mulberrytech.com mailing list .



Re: XSL and international characters


Marcin Kłos wrote at 4 Dec 2001 14:28:50 +0100:
 > Original character was %C5%82 and the result was Å - one character and
 > ‚ - second character :(

Your character is a single byte (0xB3) in ISO-8859-2, but it is being
represented as two bytes (0xC5 0x82) in UTF-8.

UTF-8 is a variable-length encoding, and characters may be represented
with up to four bytes (depending on the code point).
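
As a rough illustration of the variable length, the following Python
sketch encodes four sample characters and counts the bytes. The sample
characters are my own choice, not anything from the original report;
only U+0142 is the character under discussion.

```python
# Sample code points chosen for illustration (an assumption, not taken
# from the original report): one from each UTF-8 length class.
samples = ["A", "\u0142", "\u20ac", "\U0001F600"]

# Number of bytes UTF-8 needs for each sample character.
lengths = [len(ch.encode("utf-8")) for ch in samples]

for ch, n in zip(samples, lengths):
    hex_bytes = " ".join(f"{b:02X}" for b in ch.encode("utf-8"))
    print(f"U+{ord(ch):04X} -> {n} byte(s): {hex_bytes}")
```

The second sample, U+0142, comes out as the two bytes C5 82, which
matches the %C5%82 in the report.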

It's a fluke of the UTF-8 design that, for many Latin-1 characters,
looking at the UTF-8 representation in a Latin-1 system appears to
show the desired character plus random junk.

If you looked at the output with a UTF-8-aware viewer, you'd only see
the one character.
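
A small Python sketch of the same effect: it decodes the two bytes
from the report once as UTF-8 and once with single-byte encodings.
That the reporter's viewer was using Windows-1252 is only my guess,
based on the second character shown; in strict ISO-8859-1 the byte
0x82 is a control character rather than a visible glyph.

```python
from urllib.parse import unquote_to_bytes

# The percent-escaped value quoted in the message, as raw bytes.
raw = unquote_to_bytes("%C5%82")

# The same two bytes interpreted under three different encodings.
as_utf8 = raw.decode("utf-8")         # what a UTF-8-aware viewer shows
as_cp1252 = raw.decode("cp1252")      # assumed encoding of the reporter's viewer
as_latin1 = raw.decode("iso-8859-1")  # strict Latin-1, for comparison

for label, text in [("utf-8", as_utf8),
                    ("cp1252", as_cp1252),
                    ("iso-8859-1", as_latin1)]:
    points = " ".join(f"U+{ord(c):04X}" for c in text)
    print(f"{label:<10} -> {len(text)} character(s): {points}")
```

Only the UTF-8 interpretation yields a single character; both
single-byte interpretations yield two, starting with U+00C5.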

Regards,


Tony Graham
------------------------------------------------------------------------
XML Technology Center - Dublin                mailto:tony.graham@sun.com
Sun Microsystems Ireland Ltd                       Phone: +353 1 8199708
Hamilton House, East Point Business Park, Dublin 3            x(70)19708

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

