This is the mail archive of the xsl-list@mulberrytech.com mailing list.
Re: Transforming an incorrectly structured document...
- To: xsl-list at mulberrytech dot com
- Subject: Re: Transforming an incorrectly structured document...
- From: Wendell Piez <wapiez at mulberrytech dot com>
- Date: Mon, 04 Dec 2000 14:20:18 +0000
- Reply-To: xsl-list at mulberrytech dot com
Ben--
Since this is becoming a FAQ (how to get structure out of a flat document),
I thought I'd post my general levitating-a-flat-structure stylesheet. Note:
its design supports only regular structures (that is, flat files whose
nesting is "correct" although only implicit, with no header level skipped).
If you need a more general-purpose solution, one could be based on the same
principles, but the key declarations would have to be modified.
Getting the "formatting" (that is, the transformation of element types)
along with the conversion (the levitation) is not hard, only a matter of
adapting and supplementing these templates.
I hope it helps --
Wendell
INPUT FILE:
<levels>
<h1>Header 1</h1>
<stuff>Content under 1</stuff>
<h2>header 1.1</h2>
<stuff>Content under 1.1</stuff>
<stuff>More content under 1.1</stuff>
<h2>header 1.2</h2>
<stuff>Content under 1.2</stuff>
<h2>header 1.3</h2>
<stuff>Content under 1.3</stuff>
<h3>header 1.3.1</h3>
<stuff>Content under 1.3.1</stuff>
<h4>header 1.3.1.1</h4>
<stuff>Content under 1.3.1.1</stuff>
<h5>header 1.3.1.1.1</h5>
<stuff>Content under 1.3.1.1.1</stuff>
<h1>Header 2</h1>
<stuff>Content under 2</stuff>
<h2>header 2.1</h2>
<stuff>Content under 2.1 </stuff>
<h2>header 2.2</h2>
<stuff>Content under 2.2</stuff>
<h3>header 2.2.1</h3>
<stuff>Content under 2.2.1</stuff>
<h2>header 2.3</h2>
<stuff>Content under 2.3</stuff>
</levels>
STYLESHEET:
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- this key should match any 'stuff-level' elements you have such
       as paragraphs, lists etc. -->
  <xsl:key name="stuffchildren" match="stuff"
    use="generate-id((..|preceding-sibling::h1|preceding-sibling::h2|
                      preceding-sibling::h3|preceding-sibling::h4|
                      preceding-sibling::h5)[last()])"/>

  <xsl:key name="h2children" match="h2"
    use="generate-id(preceding-sibling::h1[1])"/>

  <xsl:key name="h3children" match="h3"
    use="generate-id(preceding-sibling::h2[1])"/>

  <xsl:key name="h4children" match="h4"
    use="generate-id(preceding-sibling::h3[1])"/>

  <xsl:key name="h5children" match="h5"
    use="generate-id(preceding-sibling::h4[1])"/>

  <xsl:template match="levels">
    <xsl:apply-templates select="key('stuffchildren', generate-id())"/>
    <xsl:apply-templates select="h1"/>
  </xsl:template>

  <xsl:template match="h1">
    <section level="1">
      <head>
        <xsl:apply-templates/>
      </head>
      <xsl:apply-templates select="key('stuffchildren', generate-id())"/>
      <xsl:apply-templates select="key('h2children', generate-id())"/>
    </section>
  </xsl:template>

  <xsl:template match="h2">
    <section level="2">
      <head>
        <xsl:apply-templates/>
      </head>
      <xsl:apply-templates select="key('stuffchildren', generate-id())"/>
      <xsl:apply-templates select="key('h3children', generate-id())"/>
    </section>
  </xsl:template>

  <xsl:template match="h3">
    <section level="3">
      <head>
        <xsl:apply-templates/>
      </head>
      <xsl:apply-templates select="key('stuffchildren', generate-id())"/>
      <xsl:apply-templates select="key('h4children', generate-id())"/>
    </section>
  </xsl:template>

  <xsl:template match="h4">
    <section level="4">
      <head>
        <xsl:apply-templates/>
      </head>
      <xsl:apply-templates select="key('stuffchildren', generate-id())"/>
      <xsl:apply-templates select="key('h5children', generate-id())"/>
    </section>
  </xsl:template>

  <xsl:template match="h5">
    <section level="5">
      <head>
        <xsl:apply-templates/>
      </head>
      <xsl:apply-templates select="key('stuffchildren', generate-id())"/>
    </section>
  </xsl:template>

  <xsl:template match="stuff">
    <data><xsl:apply-templates/></data>
  </xsl:template>

</xsl:stylesheet>
RESULT (using SAXON):
<?xml version="1.0" encoding="utf-8"?>
<section level="1">
<head>Header 1</head>
<data>Content under 1</data>
<section level="2">
<head>header 1.1</head>
<data>Content under 1.1</data>
<data>More content under 1.1</data>
</section>
<section level="2">
<head>header 1.2</head>
<data>Content under 1.2</data>
</section>
<section level="2">
<head>header 1.3</head>
<data>Content under 1.3</data>
<section level="3">
<head>header 1.3.1</head>
<data>Content under 1.3.1</data>
<section level="4">
<head>header 1.3.1.1</head>
<data>Content under 1.3.1.1</data>
<section level="5">
<head>header 1.3.1.1.1</head>
<data>Content under 1.3.1.1.1</data>
</section>
</section>
</section>
</section>
</section>
<section level="1">
<head>Header 2</head>
<data>Content under 2</data>
<section level="2">
<head>header 2.1</head>
<data>Content under 2.1 </data>
</section>
<section level="2">
<head>header 2.2</head>
<data>Content under 2.2</data>
<section level="3">
<head>header 2.2.1</head>
<data>Content under 2.2.1</data>
</section>
</section>
<section level="2">
<head>header 2.3</head>
<data>Content under 2.3</data>
</section>
</section>
Hope that helps!
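
A SKETCH FOR LESS REGULAR INPUT:

The note about modifying the key declarations can be illustrated roughly
as follows. This is only a minimal sketch, assuming just three header
levels and assuming the one irregularity to tolerate is an h3 that sits
directly under an h1 with no h2 in between. The h3 key then looks for the
nearest preceding h1 or h2 rather than the nearest h2, and the h1 template
pulls in its h2 and h3 children together; the union keeps them in document
order. Headers that precede the first h1 are still dropped.

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- unchanged in principle: content belongs to the nearest preceding
       header, or to the parent when no header precedes it -->
  <xsl:key name="stuffchildren" match="stuff"
    use="generate-id((..|preceding-sibling::h1|preceding-sibling::h2|
                      preceding-sibling::h3)[last()])"/>

  <xsl:key name="h2children" match="h2"
    use="generate-id(preceding-sibling::h1[1])"/>

  <!-- changed: an h3 belongs to the nearest preceding h1 OR h2,
       whichever comes later in the document -->
  <xsl:key name="h3children" match="h3"
    use="generate-id((preceding-sibling::h1|preceding-sibling::h2)[last()])"/>

  <xsl:template match="levels">
    <xsl:apply-templates select="key('stuffchildren', generate-id())"/>
    <xsl:apply-templates select="h1"/>
  </xsl:template>

  <xsl:template match="h1">
    <section level="1">
      <head><xsl:apply-templates/></head>
      <xsl:apply-templates select="key('stuffchildren', generate-id())"/>
      <!-- changed: the union is processed in document order, so an h3
           that precedes the first h2 stays ahead of it -->
      <xsl:apply-templates
        select="key('h2children', generate-id()) |
                key('h3children', generate-id())"/>
    </section>
  </xsl:template>

  <xsl:template match="h2">
    <section level="2">
      <head><xsl:apply-templates/></head>
      <xsl:apply-templates select="key('stuffchildren', generate-id())"/>
      <xsl:apply-templates select="key('h3children', generate-id())"/>
    </section>
  </xsl:template>

  <xsl:template match="h3">
    <section level="3">
      <head><xsl:apply-templates/></head>
      <xsl:apply-templates select="key('stuffchildren', generate-id())"/>
    </section>
  </xsl:template>

  <xsl:template match="stuff">
    <data><xsl:apply-templates/></data>
  </xsl:template>

</xsl:stylesheet>
```

On regular input this should behave like the stylesheet above (limited to
three levels); the same change would have to be repeated for each deeper
level that may be skipped.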
At 01:08 PM 12/1/00 -0800, you wrote:
>...The XML is as follows:
>
><div>
> <p />
> <h1>Overview</h1>
> <p>Use Slater guards when a riser is subject to minor damage, such as in a
>walkway. Use split casings and/or protection posts when a riser is subject to
>heavy damage, such as in a</p>
> <h1>Factors</h1>
> <h2>What to look for</h2>
> <p>lane or docking bay.</p>
> <p>Slater guards and split casings come in 1 meter lengths, but may be cut or
>joined as necessary.</p>
> <p>Order the split casing attaching lugs separately, and weld them on.</p>
>
>The ideal would be to structure the document as follows:
>
><div><h1>Overview</h1><p>Use Slater guards when a riser is subject to minor
>damage, such as in a walkway. Use split casings and/or protection posts when a
>riser is subject to heavy damage, such as in a</p></div>
>
><div><h1>Factors</h1><h2>What to look for</h2><p>lane or docking
>bay.</p><p>Slater guards and split casings come in 1 meter lengths, but may be
>cut or joined as necessary.</p><p>Order the split casing attaching lugs
>separately, and weld them on.</p></div>
...
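
For the document quoted above, where the goal is only to wrap each h1 and
everything up to the next h1 in its own div, a single key may be enough.
What follows is a minimal sketch, not a tested solution: it assumes the
input is a complete, well-formed div (the quoted fragment is cut off) and
that anything before the first h1, such as the empty p, may be dropped.
The key name under-h1 is made up for illustration.

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- every child of the div that is not itself an h1, indexed by the
       nearest preceding h1; children with no preceding h1 are indexed
       under the empty string and are never retrieved -->
  <xsl:key name="under-h1" match="div/*[not(self::h1)]"
    use="generate-id(preceding-sibling::h1[1])"/>

  <!-- the original flat div: visit only its h1 children -->
  <xsl:template match="div">
    <xsl:apply-templates select="h1"/>
  </xsl:template>

  <!-- each h1 opens a new div holding the h1 itself followed by
       everything indexed under it, in document order -->
  <xsl:template match="h1">
    <div>
      <xsl:copy-of select="."/>
      <xsl:copy-of select="key('under-h1', generate-id())"/>
    </div>
  </xsl:template>

</xsl:stylesheet>
```

Like the RESULT shown earlier, the output has no single root element; if a
well-formed document is needed, put a wrapper element around the
apply-templates in the div template.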
======================================================================
Wendell Piez mailto:wapiez@mulberrytech.com
Mulberry Technologies, Inc. http://www.mulberrytech.com
17 West Jefferson Street Direct Phone: 301/315-9635
Suite 207 Phone: 301/315-9631
Rockville, MD 20850 Fax: 301/315-8285
----------------------------------------------------------------------
Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================