This is the mail archive of the xsl-list@mulberrytech.com mailing list.



RE: Removing duplicates not preceding vs. keys


> a while ago I asked something about removing duplicates.
> Most of the answers I got concerned either using
>   <xsl:for-each select="//SPEECH[not(.=preceding::SPEECH)]">
> or
> <xsl:key name="sortKey" match="value" use="var" />
> 
> both of 'em work fine, but can anybody tell which one is more 
> favourable and why?

The "preceding" solution typically has O(n*n) performance: it involves
comparing each SPEECH with each SPEECH that precedes it, so as the number of
items doubles, elapsed time increases by a factor of four.

The "key" solution typically has O(n log(n)) performance: it involves
adding each item to an index and looking up each item in that index. So when
the number of items doubles, elapsed time increases by a factor of only, say,
2.1.
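
To see where figures like 4 and 2.1 come from, here is the arithmetic worked out, assuming the costs are exactly proportional to n*n and n log(n) (base-2 logs are used for convenience; constant factors and lower-order terms are ignored, and real processors will differ):

```latex
% Doubling the input from n to 2n items
\[
  \frac{(2n)^2}{n^2} = 4
  \qquad\text{("preceding" solution)}
\]
\[
  \frac{2n \log_2(2n)}{n \log_2 n}
  = \frac{2\,(\log_2 n + 1)}{\log_2 n}
  = 2\left(1 + \frac{1}{\log_2 n}\right)
  \qquad\text{("key" solution)}
\]
% Example: n = 2^{20} (about a million items) gives 2(1 + 1/20) = 2.1,
% while n = 2^{10} = 1024 gives 2(1 + 1/10) = 2.2.
```

So the "key" ratio is always a little over 2, and it creeps closer to 2 as the file grows.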

That means that the "preceding" solution may be faster for small files, but
as the files get bigger, the "key" solution will win.
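
For comparison with the "preceding" expression quoted above, here is a minimal sketch of how the "key" approach is usually written for the same SPEECH elements. The key name "speech-by-value" and the plain-text output are my own choices for illustration, not taken from the original question:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Index every SPEECH element by its string value.
       The key name is hypothetical; any name will do. -->
  <xsl:key name="speech-by-value" match="SPEECH" use="."/>

  <xsl:template match="/">
    <!-- key() returns the matching nodes in document order, so a SPEECH
         is kept only when it is the first node indexed under its value. -->
    <xsl:for-each select="//SPEECH[generate-id() =
                          generate-id(key('speech-by-value', .)[1])]">
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Each SPEECH costs one index lookup here rather than a scan of everything before it, which is where the difference in scaling comes from, assuming the processor implements keys with a real index.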

Of course, this is all based on assumptions on how the implementations work:
assumptions that are reasonable, but not necessarily true of all products.

Mike Kay


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
