This is the mail archive of the xsl-list@mulberrytech.com mailing list.
XHTML to HTML 4
- To: "XSL - Mulberry (E-mail)" <XSL-List at mulberrytech dot com>
- Subject: XHTML to HTML 4
- From: Brian Dupras <briand at centera dot com>
- Date: Thu, 6 Apr 2000 08:51:30 -0600
- Reply-To: xsl-list at mulberrytech dot com
I've got an XHTML document similar to this fragment:
<html:table border="0" cellpadding="0" cellspacing="5px">
<html:tr>
<html:td>
<html:span class="StoryHeadline">b</html:span>
<html:br class="hack"/>
<html:span class="StoryTeaser">b</html:span>
<html:br class="hack" clear="left"/>
<html:br class="hack" clear="left"/>
<html:span class="StoryBody">this is a
paragraph:&lt;p&gt;paragraph text one&lt;p&gt;paragraph text two</html:span>
</html:td>
</html:tr>
</html:table>
Note that the <html:span> contains marked-up text such as &lt;p&gt; (the XML
representation of the text <p>).
I need to convert this document to HTML 4, including:
- Convert all <html:* attr="value"> elements to <* attr="value">
- Minimize certain empty elements, such as the line break:
    <html:br class="hack" clear="left"/> => <br class="hack" clear="left">
- Convert all embedded text entities to non-escaped text:
    &lt;p&gt;paragraph => <p>paragraph
I've accomplished these goals with the following XSLT, which basically does
everything the hard way. However, I would like to find a more elegant
solution. Any suggestions? Pardon the long lines - I wanted the output to
flow just right and could not get the results I wanted by ignoring white
space. Other than the root template there are only two templates, and they
are almost identical. The first handles the singular (empty) HTML tags; the
second handles everything else.
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:html="http://www.w3.org/1999/xhtml"
exclude-result-prefixes="html"
>
<xsl:output method="text" />
<xsl:template match="/">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="html:br | html:area | html:base | html:basefont |
html:bgsound | html:button | html:col | html:colgroup | html:embed | html:hr
| html:img | html:input | html:isindex | html:keygen | html:link | html:meta
| html:object | html:plaintext | html:spacer |
html:wbr"><![CDATA[<]]><xsl:value-of
select="substring-after(name(.),':')"/><xsl:for-each
select="@*"><xsl:value-of select="' '"/><xsl:value-of
select="name(.)"/>="<xsl:value-of
select="."/>"</xsl:for-each><![CDATA[>]]><xsl:apply-templates/></xsl:template>
<xsl:template match="html:*"><![CDATA[<]]><xsl:value-of
select="substring-after(name(.),':')"/><xsl:for-each
select="@*"><xsl:value-of select="' '"/><xsl:value-of
select="name(.)"/>="<xsl:value-of
select="."/>"</xsl:for-each><![CDATA[>]]><xsl:apply-templates/><![CDATA[</]]><xsl:value-of
select="substring-after(name(.),':')"/><![CDATA[>]]></xsl:template>
</xsl:stylesheet>
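For comparison, a shorter variant might look like the sketch below. It is untested and rests on two assumptions: that the XSLT 1.0 processor serializes its own output (rather than handing a result tree to another application), and that it supports the optional disable-output-escaping attribute, which the XSLT 1.0 Recommendation does not require processors to implement. The idea is to let the html output method write empty elements such as br without an end tag, instead of listing them by hand, and to use local-name() so the stylesheet does not depend on the prefix chosen in the source document.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:html="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="html">

  <!-- The html output method writes empty HTML elements (br, hr, img, ...)
       without an end tag. indent="no" keeps the source white space as-is. -->
  <xsl:output method="html" indent="no"/>

  <!-- Rebuild every XHTML element in no namespace under its local name,
       copying its attributes unchanged. -->
  <xsl:template match="html:*">
    <xsl:element name="{local-name()}">
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>

  <!-- Assumption: the processor honors disable-output-escaping, so that
       escaped markup in text (&lt;p&gt;) is written out as literal tags. -->
  <xsl:template match="text()">
    <xsl:value-of select="." disable-output-escaping="yes"/>
  </xsl:template>

</xsl:stylesheet>
```

Because xsl:element is used instead of literal result elements, the html namespace declaration should not be copied into the output. If the processor ignores disable-output-escaping, the embedded markup would stay escaped, in which case the text-method approach above remains the fallback.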
Brian Dupras
Centera Information Systems, Inc.
phone 303.381.4420 (direct)
phone 303.939.0200 (operator)
fax 303.939.0111
web http://www.centera.com
email briand@centera.com
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list