This is the mail archive of the xsl-list@mulberrytech.com mailing list .



RE: MSXML Whitespace handling


Andy,

Thanks again for taking the time to reply.

>4. Stylesheet validation takes place during the XSLT compile phase, during a
>DOM walk performed by the XSL processor.  Remember that the DOM cannot
>validate the incoming stylesheet on load (or attempt to apply XSLT
>whitespace rules), since at that point it does not know that the incoming
>XML is a stylesheet.  In fact, it may not even be a valid stylesheet on load
>-- the user might modify it after loading it into the DOM in order to make
>it a valid stylesheet (for example, dynamically adding a named template that
>is referenced elsewhere in the stylesheet).

I think we're talking at cross-purposes due to a misunderstanding on my
part.  I was talking about *XML-level validation* of the stylesheet, not
*XSLT-level validation*.  In other words, identifying which elements should
have their whitespace preserved on the basis of the DTD/schema/namespace of
the XML document.

I was assuming that MS DOM did XML-level validation, and had some built-in
recognition for standard namespaces, like XSLT and HTML.  I thought that
tweaking the rules that MS DOM used for validating XSLT-type XML documents
might go part way to solving the problem.  Thinking about it, I'm not
surprised to be wrong on this point.

Given that MS DOM is not going to change, at least in the near future, the
practical problem is this: in the normal state of affairs we simply want to
open an XML document in IE and have it display the result of its
transformation according to the stylesheet specified in the xml-stylesheet
processing instruction.  We don't want to be fiddling around with scripts
that load the source and stylesheet documents into DOMs with the required
whitespace handling.  Documenting the utility of preserveWhiteSpace is all
very well, but in these situations it's not very useful.

I've been trying to find workarounds that we can use in this situation.
[Andy, they're probably not particularly interesting to you, but others
following this thread might find them useful.]

The basic workaround is, obviously, to declare xml:space="preserve" on
every xsl:text element that holds significant whitespace.  This works, but
it could be a real pain if you have a stylesheet with lots of xsl:text
elements in it.  Preprocessing the stylesheet to add in these xml:space
attributes (rather than adding them manually) is an option.
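
A minimal sketch of such a preprocessing step, written as an XSLT 1.0
identity transform that is run over the stylesheet itself.  It assumes the
preprocessing is done by a processor (or DOM) that has not already stripped
the whitespace-only text inside xsl:text; it marks every xsl:text element
rather than only those holding significant whitespace, which should be
harmless:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy every node and attribute through unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- For each xsl:text in the input stylesheet, copy its attributes and
       then set xml:space="preserve" (replacing any existing xml:space). -->
  <xsl:template match="xsl:text">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:attribute name="xml:space">preserve</xsl:attribute>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The output of this transform is the stylesheet you would then reference
from the xml-stylesheet processing instruction.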

My next thought was to define the xml:space attribute on xsl:text within
the internal DTD:

<?xml version="1.0"?>
<!DOCTYPE xsl:stylesheet [
<!ELEMENT xsl:text (#PCDATA)>
<!ATTLIST xsl:text
  xml:space (default|preserve) #FIXED 'preserve'>
]>
<xsl:stylesheet ...>
...
</xsl:stylesheet>

However, it seems that MS DOM or MS XML doesn't take account of this
particular default attribute value.  This is somewhat confusing as
declaring default attribute values for other elements in the XSLT namespace
works perfectly, e.g.:

  <!ELEMENT xsl:value-of EMPTY>
  <!ATTLIST xsl:value-of
    select CDATA '.'>

(Of course there's no legislation to say that these declarations should be
used at all - it's just that it's confusing that they're not used
consistently.)

As is documented in the MS XML SDK, declaring whitespace entities won't
work -  they will be stripped.  However, it is possible to declare entities
such as:

<!ENTITY newline
  "<xsl:text xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
             xml:space='preserve'>
</xsl:text>">

(You have to declare the 'xsl' namespace inside the entity because the entity
content is interpreted without a context.)  You can then use &newline; within
your stylesheet to insert a newline that will be preserved in your output.
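
As an illustration, a complete stylesheet using the entity might look like
the sketch below.  The 'item' element name and the text output method are
assumptions made up for the example, not anything from the thread:

```xml
<?xml version="1.0"?>
<!DOCTYPE xsl:stylesheet [
<!-- The entity expands to an xsl:text element holding a single newline. -->
<!ENTITY newline
  "<xsl:text xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
             xml:space='preserve'>
</xsl:text>">
]>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical source element: write each item's value, then a newline. -->
  <xsl:template match="item">
    <xsl:value-of select="."/>&newline;
  </xsl:template>

</xsl:stylesheet>
```

Because the xml:space attribute travels with the entity, each use of
&newline; carries its own whitespace protection and the templates
themselves stay uncluttered.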

Of course none of these workarounds enable MS XSL to deal properly with
xsl:preserve-space and xsl:strip-space, which rely on the whitespace in the
source document/tree being available to the XSLT processor.  If you need to
use these, it looks as though you either have to preprocess the source
document based on the stylesheet, adding in xml:space attributes as
appropriate, or use the scripting option to load the page with
preserveWhiteSpace=true.

Cheers,

Jeni

Dr Jeni Tennison
Epistemics Ltd * Strelley Hall * Nottingham * NG8 6PE
tel: 0115 906 1301 * fax: 0115 906 1304 * email: jeni.tennison@epistemics.co.uk


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list