This is the mail archive of the xsl-list@mulberrytech.com mailing list.



XHTML to HTML 4


I've got an XHTML document similar to this fragment:

 <html:table border="0" cellpadding="0" cellspacing="5px">
    <html:tr>
       <html:td>
          <html:span class="StoryHeadline">b</html:span>
          <html:br class="hack"/>
          <html:span class="StoryTeaser">b</html:span>
          <html:br class="hack" clear="left"/>
          <html:br class="hack" clear="left"/>
          <html:span class="StoryBody">this is a
paragraph:&lt;p&gt;paragraph text one&lt;p&gt;paragraph text two</html:span>
       </html:td>
    </html:tr>
 </html:table>

Note that the <html:span> contains escaped markup such as &lt;p&gt; (the XML
representation of the text <p>), not real child elements.



I need to convert this document to HTML 4, which involves:
	- Converting every <html:* attr="value"> element to <* attr="value">
	- Minimizing empty elements such as the line break:
            <html:br class="hack" clear="left"/>  =>  <br class="hack" clear="left">
	- Converting all escaped markup embedded in text to unescaped text:
            &lt;p&gt;paragraph  =>  <p>paragraph

I've accomplished these goals with the following XSLT, which basically does
everything the hard way: it uses the text output method and writes each tag
out by hand.  However, I would like to find a more elegant solution.  Any
suggestions?  Pardon the long lines - I wanted the output to flow just right
and wasn't getting the results I wanted by ignoring white space.  There are
only two templates other than the root, and they are almost identical.  The
first is for empty HTML elements (those written without an end tag).  The
second is for everything else.

<?xml version="1.0"?>

<xsl:stylesheet version="1.0"
	xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
	xmlns:html="http://www.w3.org/1999/xhtml"
	exclude-result-prefixes="html"
	>
	
    <xsl:output method="text" />

<xsl:template match="/">
	<xsl:apply-templates/>
</xsl:template>

<xsl:template match="html:br | html:area | html:base | html:basefont |
html:bgsound | html:button | html:col | html:colgroup | html:embed | html:hr
| html:img | html:input | html:isindex | html:keygen | html:link | html:meta
| html:object | html:plaintext | html:spacer |
html:wbr"><![CDATA[<]]><xsl:value-of
select="substring-after(name(.),':')"/><xsl:for-each
select="@*"><xsl:value-of select="' '"/><xsl:value-of
select="name(.)"/>="<xsl:value-of
select="."/>"</xsl:for-each><![CDATA[>]]><xsl:apply-templates/></xsl:template>
<xsl:template match="html:*"><![CDATA[<]]><xsl:value-of
select="substring-after(name(.),':')"/><xsl:for-each
select="@*"><xsl:value-of select="' '"/><xsl:value-of
select="name(.)"/>="<xsl:value-of
select="."/>"</xsl:for-each><![CDATA[>]]><xsl:apply-templates/><![CDATA[</]]><xsl:value-of
select="substring-after(name(.),':')"/><![CDATA[>]]></xsl:template>

</xsl:stylesheet>
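
For comparison, here is a minimal sketch of a shorter route to the same three
goals.  It is not a tested replacement for the stylesheet above.  It assumes an
XSLT 1.0 processor that supports the optional disable-output-escaping
attribute, and it swaps the text output method for the html output method,
which is specified to leave the end tag off empty HTML 4 elements such as br
and hr.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:html="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="html">

  <!-- The html output method omits end tags for empty HTML 4 elements.
       indent="no" asks the serializer not to add white space of its own. -->
  <xsl:output method="html" indent="no"/>

  <!-- Rebuild every XHTML element under its local name, in no namespace,
       and carry its attributes over unchanged. -->
  <xsl:template match="html:*">
    <xsl:element name="{local-name()}">
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>

  <!-- Write text without escaping, so that &lt;p&gt; in the source reaches
       the output as <p>.  Assumption: the processor honours
       disable-output-escaping, which is an optional feature. -->
  <xsl:template match="text()">
    <xsl:value-of select="." disable-output-escaping="yes"/>
  </xsl:template>

</xsl:stylesheet>
```

One difference to be aware of: with this sketch the serializer decides which
elements are empty, so elements such as button and object, which are listed in
the first template above but are not empty in HTML 4, would get an end tag.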


Brian Dupras
Centera Information Systems, Inc.
phone 303.381.4420 (direct)
phone 303.939.0200 (operator)
fax	303.939.0111
web	http://www.centera.com
email	briand@centera.com


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
