This is the mail archive of the xsl-list@mulberrytech.com mailing list.
Some more stuff about selecting unique elements
- To: XSL List <xsl-list at lists dot mulberrytech dot com>
- Subject: [xsl] Some more stuff about selecting unique elements
- From: Joerg Pietschmann <joerg dot pietschmann at zkb dot ch>
- Date: Tue, 10 Apr 2001 14:19:39 +0200
- Organization: ZKB
- Reply-To: xsl-list at lists dot mulberrytech dot com
Hello,
the "Confused about preceding-sibling..." post prompted the
questions at the end of this post.
I have an XML document similar to:
<level0>
  <level1>
    <level2>
      <stuff>1</stuff>
      <stuff>2</stuff>
      <stuff>3</stuff>
    </level2>
    <level2>
      <stuff>3</stuff>
      <stuff>4</stuff>
      <stuff>5</stuff>
    </level2>
  </level1>
  <level1>
    <level2>
      <stuff>2</stuff>
      <stuff>4</stuff>
      <stuff>6</stuff>
    </level2>
    <level2>
      <stuff>4</stuff>
      <stuff>6</stuff>
      <stuff>8</stuff>
    </level2>
  </level1>
</level0>
The levelN elements represent some kind of context. I want to copy
the structure while dropping the level2 elements and any stuff elements
that duplicate an earlier one within the same level1 context. Duplicates
across different level1 contexts are kept:
<level0>
  <level1>
    <stuff>1</stuff>
    <stuff>2</stuff>
    <stuff>3</stuff>
    <stuff>4</stuff>
    <stuff>5</stuff>
  </level1>
  <level1>
    <stuff>2</stuff>
    <stuff>4</stuff>
    <stuff>6</stuff>
    <stuff>8</stuff>
  </level1>
</level0>
After some experiments, the following XSL seems to achieve this:
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="ASCII"/>
  <!-- Generic rule: recreate every element and process its children. -->
  <xsl:template match="*">
    <xsl:element name="{name()}">
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>
  <!-- level1: skip level2 and keep only the first stuff per value. -->
  <xsl:template match="level1">
    <level1>
      <xsl:apply-templates select=".//stuff[
          not(.=preceding::stuff[
            generate-id(current())
            =generate-id(ancestor::level1)])]"/>
    </level1>
  </xsl:template>
</xsl:stylesheet>
Explanation: select the descendant stuff elements for which no preceding
stuff element has both the same content and the same level1 ancestor.
Here current() is the level1 element being processed, so the inner
predicate keeps only those preceding stuff elements that sit inside it.
Well, what I dislike about the preceding axis is performance: imagine
an XML file with some hundred or thousand level1 elements. I haven't
checked it with my file containing some 500+ level1 elements, because
for other reasons it is convenient for me to have a batch process that
splits the file into small files each containing one level1 element,
processes them, and merges them in a third step.
What optimisations do XSL processors (read: Saxon) perform while processing
this XSL? Are there other solutions (in pure XSLT 1.0) to the problem that
are better suited to already implemented optimisations? Would it help to
use an xsl:key for selecting the preceding stuff elements with the same
ancestor?
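To illustrate the xsl:key idea, here is a minimal, untested sketch of what
such a stylesheet might look like. The key name stuff-by-level1 and the "|"
separator are made up for this example, and whether it actually runs faster
depends on how the processor implements keys:

```xml
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="ASCII"/>

  <!-- Hypothetical key: index each stuff element by the id of its
       level1 ancestor combined with its own string value. -->
  <xsl:key name="stuff-by-level1" match="stuff"
      use="concat(generate-id(ancestor::level1), '|', .)"/>

  <!-- Same generic copy template as in the stylesheet above. -->
  <xsl:template match="*">
    <xsl:element name="{name()}">
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>

  <xsl:template match="level1">
    <level1>
      <!-- Keep a stuff element only if it is the first node (in
           document order) that the key returns for this level1
           and this value. -->
      <xsl:apply-templates select=".//stuff[
          generate-id() = generate-id(
            key('stuff-by-level1',
                concat(generate-id(current()), '|', .))[1])]"/>
    </level1>
  </xsl:template>
</xsl:stylesheet>
```

On the sample input this should select the same stuff elements as the
preceding-axis version, since the first node per (level1, value) pair is
exactly the one with no equal predecessor inside the same level1.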
I suppose in XSLT 1.1 (2.0), where RTFs are gone, it would be prudent to
construct a copy of the stuff elements that descend from the given level1
element and select from there, as I think the preceding axis will then work
on the node-set with the copies only. Take the following snippet as an
illustration:
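<xsl:template match="level1">
  <!-- Temporary tree holding copies of this level1's stuff elements. -->
  <xsl:variable name="stuff">
    <xsl:for-each select=".//stuff">
      <xsl:copy-of select="."/>
    </xsl:for-each>
  </xsl:variable>
  <level1>
    <!-- $stuff is the root of the temporary tree, so step down to
         its stuff children before filtering. -->
    <xsl:apply-templates select="$stuff/stuff[not(.=preceding::stuff)]"/>
  </level1>
</xsl:template>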
Is this correct? Would this work as expected (with some slack, as it is
obviously untested)? Could it be expected to perform better than the
solution above?
<grin/>
Regards
J.Pietschmann
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list