This is the mail archive of the xsl-list@mulberrytech.com mailing list.
Re: xsl
- To: Tom Melkonian <melkonian at entelos dot com>
- Subject: Re: [xsl] xsl
- From: Jeni Tennison <mail at jenitennison dot com>
- Date: Wed, 28 Mar 2001 07:38:04 +0100
- CC: "'xsl-list at lists dot mulberrytech dot com'" <xsl-list at lists dot mulberrytech dot com>
- Organization: Jeni Tennison Consulting Ltd
- References: <F8F3E22A3C207348A963006D534282162E1182@hippocrates.entelos.com>
- Reply-To: xsl-list at lists dot mulberrytech dot com
Hi Tom,
> I am trying to find out how to eliminate duplicate hits in a search
> results list which is contained in XML data:
>
> <Result id="100" name="tom" />
> <Result id="100" name="tom" />
> <Result id="100" name="tom" />
> <Result id="100" name="tom" />
Is it the id or the name that indicates a unique hit? I'll assume
it's the id.
XSLT doesn't have any great built-in distinct() function (although
there are extension functions like saxon:distinct() that you could
use) so you're *probably* going to be better off addressing the
problem in the search engine rather than using XSLT to do it.
Having said that, you can pick only the unique Result elements by
going through the list of them and only choosing those that don't have
a preceding sibling with the same id:
Result[not(preceding-sibling::Result/@id = @id)]
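To show where that expression might sit, here is a minimal sketch of
a complete stylesheet. Your sample doesn't show the parent of the
Result elements, so the Results wrapper element is purely my
assumption; substitute whatever your document really uses.

```xml
<?xml version="1.0"?>
<!-- Sketch only: assumes the Result elements are children of a
     hypothetical <Results> wrapper element. -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes" />

  <xsl:template match="Results">
    <Results>
      <!-- Copy a Result only if no earlier sibling has the same id.
           Inside the predicate, @id on the right-hand side is the id
           of the Result being tested. -->
      <xsl:copy-of
        select="Result[not(preceding-sibling::Result/@id = @id)]" />
    </Results>
  </xsl:template>

</xsl:stylesheet>
```

With your four identical Result elements inside such a wrapper, only
the first one should be copied to the output.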
If the Result elements are sorted already for you, such that all the
Results with the same id are grouped together, then you can use the
more efficient:
Result[not(preceding-sibling::Result[1]/@id = @id)]
This is more efficient because it only checks the
immediately-preceding Result element rather than going through all the
preceding siblings.
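As a sketch, the sorted variant drops into a stylesheet the same way.
Again, the Results wrapper element is an assumption of mine, not
something from your data.

```xml
<?xml version="1.0"?>
<!-- Sketch only: assumes a hypothetical <Results> wrapper, and that
     Result elements with the same id are adjacent in the source. -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes" />

  <xsl:template match="Results">
    <Results>
      <!-- preceding-sibling::Result[1] is the nearest earlier Result.
           For the very first Result it is empty, the comparison is
           false, and so the first Result is always kept. -->
      <xsl:copy-of
        select="Result[not(preceding-sibling::Result[1]/@id = @id)]" />
    </Results>
  </xsl:template>

</xsl:stylesheet>
```

Note that the grouping has to be there in the source document itself.
Sorting with xsl:sort inside an xsl:for-each doesn't help here: it
changes the order in which the nodes are processed, but the
preceding-sibling:: axis still follows the order in the source.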
If they're not sorted and you have a lot of Result elements, then you
may want to use the Muenchian method. This involves setting up a key
to index into the Result elements by their id:
<xsl:key name="results-by-id" match="Result" use="@id" />
You can then get all the Result elements with an id of 100, for
example, with:
key('results-by-id', '100')
And you can get the unique results by testing each Result to see
whether it is the first in the list you get when you use the key to
get Result elements with its ID, either using generate-id():
Result[generate-id() = generate-id(key('results-by-id', @id)[1])]
or through set logic:
Result[count(.|key('results-by-id', @id)[1]) = 1]
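The Muenchian method is usually quicker than using preceding-sibling::,
especially on long lists, because each lookup goes through the key
rather than rescanning the earlier siblings. It takes up more memory,
though, because the processor has to build an index (typically a
hashtable) for the key.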
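Putting the pieces of the Muenchian method together, a minimal sketch
of a whole stylesheet might look like the following. The Results
wrapper element is once more my assumption; the key name is the one
declared above.

```xml
<?xml version="1.0"?>
<!-- Sketch only: assumes the Result elements are children of a
     hypothetical <Results> wrapper element. -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes" />

  <!-- Index every Result element in the document by its id. -->
  <xsl:key name="results-by-id" match="Result" use="@id" />

  <xsl:template match="Results">
    <Results>
      <!-- Keep a Result only if it is the first node the key returns
           for its own id. The set-logic test
           count(.|key('results-by-id', @id)[1]) = 1
           could be used in the predicate instead. -->
      <xsl:for-each
        select="Result[generate-id() =
                       generate-id(key('results-by-id', @id)[1])]">
        <xsl:copy-of select="." />
      </xsl:for-each>
    </Results>
  </xsl:template>

</xsl:stylesheet>
```

One thing to watch: the key covers every Result in the document, not
just the siblings under one parent, so if Result elements appear in
several places the duplicates are removed document-wide.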
I hope that helps,
Jeni
---
Jeni Tennison
http://www.jenitennison.com/
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list