This is the mail archive of the
xsl-list@mulberrytech.com
mailing list .
RE: 8bit ascii encoding
- From: "Andrew Welch" <awelch at piper-group dot com>
- To: <xsl-list at lists dot mulberrytech dot com>
- Date: Fri, 23 Aug 2002 11:56:10 +0100
- Subject: RE: [xsl] 8bit ascii encoding
- Reply-to: xsl-list at lists dot mulberrytech dot com
> You noticed I'd gone then:-)
Yeah... anywhere nice?
> Possibly the feature you are seeing is that if you call msxsl via some
> methods it ignores the requested output encoding and always
> uses utf16.
I'm using Saxon 6.5.2...
> The <meta> element has to specify the encoding that the document is
> in. If you change the specified encoding without changing the encoding
> then you are dead.
ha.. nice. After some testing it seems that char references display
fine, while the characters themselves do not (after the utf-8 string has
gone through the ActiveX control). This is because the refs are simply a
string of single-byte ascii chars that get converted by IE, while the
non-ascii chars are multi-byte in utf-8 and are therefore being displayed
as two single chars. I think the reason IE isn't picking up that each of
these chars is two bytes (utf-8) is that the BOM is getting overwritten
or messed up by the ActiveX control. We can't simply write one in either,
because one of the restrictions when hosting IE is that you can only
write to the <body> element (unless anyone can tell me different...)
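To illustrate the effect described above, here is a minimal Python sketch
(not the actual JNI/ActiveX code). It assumes the receiving side reads
each byte as one character, using cp1252 as a stand-in for whatever
single-byte interpretation IE ends up applying; the sample string is made
up for the example.

```python
import html

text = "caf\u00e9"  # ends with U+00E9 (e with acute accent)

# In UTF-8 the accented character becomes two bytes (0xC3 0xA9).
utf8_bytes = text.encode("utf-8")

# Assumption: the consumer treats every byte as one character.
misread = utf8_bytes.decode("cp1252")

# A numeric character reference is plain ASCII, so the same
# byte-per-character misreading leaves it intact.
ref_markup = "caf&#233;"
ref_misread = ref_markup.encode("utf-8").decode("cp1252")

# An HTML consumer can still resolve the reference afterwards.
resolved = html.unescape(ref_misread)

print(len(utf8_bytes), repr(misread), repr(resolved))
```

The raw character comes out as two wrong characters, while the reference
survives the trip and resolves to the intended one.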
So I guess I have two options...
1. persevere in trying to get IE to treat the output as two-byte chars
2. pass all char refs through to the output unescaped, and let IE
resolve them...
Are there any methods available for Saxon (Aelfred as well ;) to
disable-output-escaping across the board - parser and processor?
Is this the best option?
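For what option 2 amounts to on the output side, here is a rough Python
sketch, assuming the goal is output in which every non-ascii character
is written as a numeric reference. The helper name to_ascii_with_refs is
made up for the example and has nothing to do with Saxon's or Aelfred's
API.

```python
def to_ascii_with_refs(markup: str) -> bytes:
    """Encode markup as ASCII, writing non-ASCII chars as &#N; references."""
    # 'xmlcharrefreplace' is a standard Python codec error handler.
    return markup.encode("ascii", "xmlcharrefreplace")


sample = "<p>na\u00efve \u20ac5</p>"  # contains U+00EF and U+20AC
out = to_ascii_with_refs(sample)

# Every byte is below 0x80, so a byte-per-character consumer cannot
# split any character in two.
all_single_byte_safe = all(b < 0x80 for b in out)
print(out, all_single_byte_safe)
```

Markup characters such as < and > are left alone; only characters
outside ascii are replaced.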
Cheers
andrew
> -----Original Message-----
> From: David Carlisle [mailto:davidc@nag.co.uk]
> Sent: 23 August 2002 10:58
> To: xsl-list@lists.mulberrytech.com
> Subject: Re: [xsl] 8bit ascii encoding
>
>
>
> > Hi David, you came back just in time... :)
>
> You noticed I'd gone then:-)
>
> > Yes, I tried this - the theory is that because the reference
> > cannot be escaped using the current encoding it will get passed
> > through to the output unchanged, right?
>
> Not really: it isn't a reference by the time XSLT sees it, as the
> XML parser will have reported the reference as a character.
> The theory is that in the XML and HTML output methods, any characters
> not in the output encoding will be output as references.
>
> > However, the project I'm working on hosts IE through a JNI ActiveX
> > control. The result of the transform goes through the JNI string
> > manipulation code (in C), which then gets to IE. At which point the
> > encoding seems to be wrong.
>
> Possibly the feature you are seeing is that if you call msxsl via some
> methods it ignores the requested output encoding and always
> uses utf16, because some flavour of microsoft string object is always
> utf16. Other methods of calling msxsl do use the requested encoding.
> (Someone more microsoft-literate may fill in the details;-)
>
> > Do you think that the BOM is going to be affected by this,
> Yes BOM is a utf16 thing (to tell whether you are big or little
> endian). So if the file is or is not utf16 you should or should not
> get a BOM at the beginning.
>
> > as the C code
> > is outputting single byte chars, or does that not matter to IE
> > provided the <meta> encoding states utf-8?
>
> The <meta> element has to specify the encoding that the document is
> in. If you change the specified encoding without changing the encoding
> then you are dead.
>
> David
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list