This is the mail archive of the xsl-list@mulberrytech.com mailing list .



Re: Transforming an incorrectly structured document...


Ben--

Since this is becoming a FAQ (how to get structure out of a flat document), 
I thought I'd post my general levitating-a-flat-structure stylesheet. Note 
that its design supports only regular structures, that is, flat files whose 
nesting is "correct" although only implicit: every h2 comes somewhere after 
an h1, every h3 after an h2, and so on. If you need a more general-purpose 
solution, one could be based on the same principles, but the key 
declarations would have to be modified.

Getting the "formatting" (that is, the transformation of element types) 
along with the conversion (the levitation) is not hard: it is only a matter 
of adapting and supplementing the templates below.

I hope it helps --
Wendell

INPUT FILE:
<levels>
<h1>Header 1</h1>
<stuff>Content under 1</stuff>
<h2>header 1.1</h2>
<stuff>Content under 1.1</stuff>
<stuff>More content under 1.1</stuff>
<h2>header 1.2</h2>
<stuff>Content under 1.2</stuff>
<h2>header 1.3</h2>
<stuff>Content under 1.3</stuff>
<h3>header 1.3.1</h3>
<stuff>Content under 1.3.1</stuff>
<h4>header 1.3.1.1</h4>
<stuff>Content under 1.3.1.1</stuff>
<h5>header 1.3.1.1.1</h5>
<stuff>Content under 1.3.1.1.1</stuff>
<h1>Header 2</h1>
<stuff>Content under 2</stuff>
<h2>header 2.1</h2>
<stuff>Content under 2.1 </stuff>
<h2>header 2.2</h2>
<stuff>Content under 2.2</stuff>
<h3>header 2.2.1</h3>
<stuff>Content under 2.2.1</stuff>
<h2>header 2.3</h2>
<stuff>Content under 2.3</stuff>
</levels>

STYLESHEET:
<xsl:stylesheet version="1.0"
                 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="yes"/>

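<!-- this key should match any 'stuff-level' elements you have, such
      as paragraphs, lists etc. Each one is indexed by the ID of the
      last node, in document order, among its parent and its preceding
      sibling headers: that is, by its nearest preceding header, or by
      its parent if no header precedes it. -->
<xsl:key name="stuffchildren" match="stuff"
   use="generate-id((..|preceding-sibling::h1|preceding-sibling::h2|preceding-sibling::h3|preceding-sibling::h4|preceding-sibling::h5)[last()])"/>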

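<!-- each header is indexed by the ID of the nearest preceding header
      one rank up; this is why only regular structures are supported -->
<xsl:key name="h2children" match="h2"
   use="generate-id(preceding-sibling::h1[1])"/>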

<xsl:key name="h3children" match="h3"
   use="generate-id(preceding-sibling::h2[1])"/>

<xsl:key name="h4children" match="h4"
   use="generate-id(preceding-sibling::h3[1])"/>

<xsl:key name="h5children" match="h5"
   use="generate-id(preceding-sibling::h4[1])"/>

<xsl:template match="levels">
   <xsl:apply-templates select="key('stuffchildren', generate-id())"/>
   <xsl:apply-templates select="h1"/>
</xsl:template>

<xsl:template match="h1">
   <section level="1">
     <head>
       <xsl:apply-templates/>
     </head>
     <xsl:apply-templates select="key('stuffchildren', generate-id())"/>
     <xsl:apply-templates select="key('h2children', generate-id())"/>
   </section>
</xsl:template>

<xsl:template match="h2">
   <section level="2">
     <head>
       <xsl:apply-templates/>
     </head>
     <xsl:apply-templates select="key('stuffchildren', generate-id())"/>
     <xsl:apply-templates select="key('h3children', generate-id())"/>
   </section>
</xsl:template>

<xsl:template match="h3">
   <section level="3">
     <head>
       <xsl:apply-templates/>
     </head>
     <xsl:apply-templates select="key('stuffchildren', generate-id())"/>
     <xsl:apply-templates select="key('h4children', generate-id())"/>
   </section>
</xsl:template>

<xsl:template match="h4">
   <section level="4">
     <head>
       <xsl:apply-templates/>
     </head>
     <xsl:apply-templates select="key('stuffchildren', generate-id())"/>
     <xsl:apply-templates select="key('h5children', generate-id())"/>
   </section>
</xsl:template>

<xsl:template match="h5">
   <section level="5">
     <head>
       <xsl:apply-templates/>
     </head>
     <xsl:apply-templates select="key('stuffchildren', generate-id())"/>
   </section>
</xsl:template>

<xsl:template match="stuff">
   <data><xsl:apply-templates/></data>
</xsl:template>

</xsl:stylesheet>

RESULT (using SAXON):
<?xml version="1.0" encoding="utf-8"?>
<section level="1">
    <head>Header 1</head>
    <data>Content under 1</data>
    <section level="2">
       <head>header 1.1</head>
       <data>Content under 1.1</data>
       <data>More content under 1.1</data>
    </section>
    <section level="2">
       <head>header 1.2</head>
       <data>Content under 1.2</data>
    </section>
    <section level="2">
       <head>header 1.3</head>
       <data>Content under 1.3</data>
       <section level="3">
          <head>header 1.3.1</head>
          <data>Content under 1.3.1</data>
          <section level="4">
             <head>header 1.3.1.1</head>
             <data>Content under 1.3.1.1</data>
             <section level="5">
                <head>header 1.3.1.1.1</head>
                <data>Content under 1.3.1.1.1</data>
             </section>
          </section>
       </section>
    </section>
</section>
<section level="1">
    <head>Header 2</head>
    <data>Content under 2</data>
    <section level="2">
       <head>header 2.1</head>
       <data>Content under 2.1 </data>
    </section>
    <section level="2">
       <head>header 2.2</head>
       <data>Content under 2.2</data>
       <section level="3">
          <head>header 2.2.1</head>
          <data>Content under 2.2.1</data>
       </section>
    </section>
    <section level="2">
       <head>header 2.3</head>
       <data>Content under 2.3</data>
    </section>
</section>
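
For less regular input, where a level may be skipped (an h3 directly under
an h1, say), the key declarations might be modified along the following
lines. This is an untested sketch, not the stylesheet that produced the
result above. The shared key name 'subheads' and the single combined header
template are assumptions made for illustration. The idea is to index each
header by the nearest preceding header of any higher rank, rather than only
the rank immediately above it.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="yes"/>

<!-- unchanged: content is indexed by its nearest preceding header,
     or by its parent if no header precedes it -->
<xsl:key name="stuffchildren" match="stuff"
   use="generate-id((..|preceding-sibling::h1|preceding-sibling::h2|preceding-sibling::h3|preceding-sibling::h4|preceding-sibling::h5)[last()])"/>

<!-- 'subheads' (a made-up name) stands in for the four per-rank keys.
     Several xsl:key declarations may share one name; each header is
     indexed by the nearest preceding header of any higher rank. -->
<xsl:key name="subheads" match="h2"
   use="generate-id(preceding-sibling::h1[1])"/>
<xsl:key name="subheads" match="h3"
   use="generate-id((preceding-sibling::h1|preceding-sibling::h2)[last()])"/>
<xsl:key name="subheads" match="h4"
   use="generate-id((preceding-sibling::h1|preceding-sibling::h2|preceding-sibling::h3)[last()])"/>
<xsl:key name="subheads" match="h5"
   use="generate-id((preceding-sibling::h1|preceding-sibling::h2|preceding-sibling::h3|preceding-sibling::h4)[last()])"/>

<xsl:template match="levels">
   <xsl:apply-templates select="key('stuffchildren', generate-id())"/>
   <xsl:apply-templates select="h1"/>
</xsl:template>

<!-- one template for every rank; the level attribute is taken from
     the element name with its leading 'h' removed -->
<xsl:template match="h1|h2|h3|h4|h5">
   <section level="{substring(local-name(), 2)}">
     <head>
       <xsl:apply-templates/>
     </head>
     <xsl:apply-templates select="key('stuffchildren', generate-id())"/>
     <xsl:apply-templates select="key('subheads', generate-id())"/>
   </section>
</xsl:template>

<xsl:template match="stuff">
   <data><xsl:apply-templates/></data>
</xsl:template>

</xsl:stylesheet>
```

On regular input this should give the same nesting as the stylesheet above,
since the nearest higher-ranked header is then always the one a rank up.
Headers that appear before the first h1 are still not picked up, because the
levels template selects only h1.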

Hope that helps!

At 01:08 PM 12/1/00 -0800, you wrote:

>...The XML is as follows:
>
><div>
>   <p />
>   <h1>Overview</h1>
>   <p>Use Slater guards when a riser is subject to minor damage, such as in a
>walkway. Use split casings and/or protection posts when a riser is subject to
>heavy damage, such as in a</p>
>   <h1>Factors</h1>
>   <h2>What to look for</h2>
>   <p>lane or docking bay.</p>
>   <p>Slater guards and split casings come in 1 meter lengths, but may be
>cut or joined as necessary.</p>
>   <p>Order the split casing attaching lugs separately, and weld them on.</p>
>
>The ideal would be to structure the document as follows:
>
><div><h1>Overview</h1><p>Use Slater guards when a riser is subject to minor
>damage, such as in a walkway. Use split casings and/or protection posts when a
>riser is subject to heavy damage, such as in a</p></div>
>
><div><h1>Factors</h1><h2>What to look for</h2><p>lane or docking
>bay.</p><p>Slater guards and split casings come in 1 meter lengths, but may be
>cut or joined as necessary.</p><p>Order the split casing attaching lugs
>separately, and weld them on.</p></div>
...
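
Adapted to the input quoted above, the same key-based approach might look
like the sketch below. It is untested, and it assumes that the outer div is
closed in the full document and that its children are only h1, h2 and p
elements. The key name 'under-h1' is made up for the example. Since the
desired output groups only by h1, a single key is enough here, and the
elements are copied rather than renamed.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" indent="yes"/>

<!-- index every h2 and p by the ID of the nearest preceding h1;
     'under-h1' is a made-up key name -->
<xsl:key name="under-h1" match="h2|p"
   use="generate-id(preceding-sibling::h1[1])"/>

<!-- the original wrapper is dropped; only its h1 children are visited,
     so anything before the first h1 (such as the empty p) is discarded -->
<xsl:template match="div">
   <xsl:apply-templates select="h1"/>
</xsl:template>

<!-- each h1 opens a new div holding the h1 itself plus the h2 and p
     elements that follow it, up to the next h1 -->
<xsl:template match="h1">
   <div>
     <xsl:copy-of select="."/>
     <xsl:copy-of select="key('under-h1', generate-id())"/>
   </div>
</xsl:template>

</xsl:stylesheet>
```

The output is a sequence of sibling div elements with no single root; if one
root is needed, wrap the apply-templates instruction in the div template in
an element of your choice.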

======================================================================
Wendell Piez                            mailto:wapiez@mulberrytech.com
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
   Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
