This is the mail archive of the xsl-list@mulberrytech.com mailing list.



Re: Re: A question about the expressive power and limitations of XPath 2.0


Hi Dimitre,

> I thought that you could achieve concatenation by a simple:
>
>  <xsl:value-of select="$sequence" separator="''"/>
>
> or am I wrong? (of course, this is not "pure XPath")

No, you're right (well, the separator has to be just separator="" -
it's an attribute value template, not an expression - but yes).

OK, let me think... a higher-order distinct function is an example.
You have structured identifiers of the form "group.subgroup" and you
want to return a unique set of nodes based on the "group" part of the
identifier (note that Mike said they were discussing how to support
this already, so perhaps there'll be a new 'distinct' clause added to
the for expression to solve it). A recursive solution would be:

<xsl:function name="my:distinct">
  <xsl:param name="nodes" type="node*" select="()" />
  <xsl:param name="distinct" type="node*" select="()" />
  <!-- add the first node only if no node already in $distinct
       shares its "group" prefix -->
  <xsl:variable name="new-distinct"
    select="if ($nodes[1] and
                not(some $n in ($distinct)
                    satisfies (substring-before($n/@id, '.') =
                               substring-before($nodes[1]/@id, '.'))))
            then ($distinct | $nodes[1])
            else $distinct" />
  <!-- recurse on the rest of the nodes (everything after the first) -->
  <xsl:result select="if ($nodes)
                      then my:distinct($nodes[position() > 1],
                                       $new-distinct)
                      else $distinct" />
</xsl:function>
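
For comparison, here's a rough sketch of the same accumulator recursion
in Python rather than XSLT. It's only an illustration: it works on plain
identifier strings instead of nodes, and the names group_of and
distinct_by_group are made up for the example.

```python
def group_of(identifier):
    """Return the 'group' part of a 'group.subgroup' identifier.

    Assumption: an identifier with no '.' counts as its own group
    (substring-before() would return an empty string instead).
    """
    return identifier.split(".", 1)[0]


def distinct_by_group(ids, distinct=()):
    """Keep the first identifier seen for each group, recursing on the tail."""
    if not ids:
        return list(distinct)
    head, tail = ids[0], ids[1:]
    # Has some identifier with the same group already been kept?
    seen = any(group_of(d) == group_of(head) for d in distinct)
    return distinct_by_group(tail, distinct if seen else (*distinct, head))


print(distinct_by_group(["a.1", "a.2", "b.1", "a.3", "b.2", "c.1"]))
# prints ['a.1', 'b.1', 'c.1']
```

The second argument plays the same role as the $distinct parameter:
it carries the result built so far into the next recursive call.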

Hmm... or a simple reverse function:

<xsl:function name="my:reverse">
  <xsl:param name="items" type="item*" select="()" />
  <xsl:param name="reversed" type="item*" select="()" />
  <!-- move the first item to the front of $reversed, then recurse
       on the remaining items -->
  <xsl:result select="if ($items)
                      then my:reverse($items[position() > 1],
                                      ($items[1], $reversed))
                      else $reversed" />
</xsl:function>

I don't think you can currently do reverse even if you have a sort()
function or clause of some kind, because you can't get information
about the position of the item you're processing from within the
return clause of a for expression.
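
To make the point about positions concrete, here's a small Python sketch
(not XPath, and the function names are invented for illustration). The
first function is the accumulator recursion; the second shows how reverse
becomes a plain loop once each item's position is available inside the
loop body, which is the piece the for expression lacks.

```python
def reverse(items, reversed_so_far=()):
    """Accumulator recursion: push the head onto the front of the result."""
    if not items:
        return list(reversed_so_far)
    return reverse(items[1:], (items[0], *reversed_so_far))


def reverse_by_position(items):
    """With the position i in hand, pick the mirrored item directly."""
    n = len(items)
    return [items[n - 1 - i] for i, _ in enumerate(items)]


print(reverse(["a", "b", "c"]))              # prints ['c', 'b', 'a']
print(reverse_by_position(["a", "b", "c"]))  # prints ['c', 'b', 'a']
```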

>> Are you after examples that indicate the shortfallings in the
>> regular expression syntax, the match() or replace() functions as
>> defined or something more general that illustrates that regular
>> expressions can't be used to process every kind of string?
>
> I just want to be sure that in case I decided to propose something,
> it would be based on solid cases that nobody could say could be
> easily solved doing this or that from XPath 2.0.

If you want a solid use case, I'd use David's example. Anything I come
up with will just be toy examples. A more accessible version of
David's example (arguably) would be the fairly common situation where
you get a source document where a snippet of HTML content is embedded
within a CDATA section within the XML:

  <description>
    <![CDATA[
      <img src="bionicle.gif" width=30 height=50>
      This product (<a href="details.html">details</a>) is great.
    ]]>
  </description>

You need to parse the HTML content into XHTML. To make the task doable
as an example, the embedded HTML can contain a restricted set of
elements from HTML - img, a, b and i. Naturally, b and i can nest
inside a and each other. (I expect that a stylesheet that did this
would be a really popular module!)
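
As a rough picture of what the end result should look like, here's a
Python sketch that leans on the standard library's HTML parser rather
than on regular expressions (so it sidesteps the XPath problem instead
of solving it). The XhtmlWriter class and to_xhtml function are
hypothetical names, and the sketch assumes the tags are already properly
nested: it quotes attribute values and closes img, but doesn't repair
unbalanced markup.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

ALLOWED = {"img", "a", "b", "i"}  # the restricted element set
EMPTY = {"img"}                   # elements that have no end tag in HTML


class XhtmlWriter(HTMLParser):
    """Re-serialise a restricted HTML snippet as well-formed XHTML text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            return  # silently drop start tags outside the restricted set
        # Quote every attribute value; a valueless attribute repeats its name.
        attr_text = "".join(
            f" {name}={quoteattr(name if value is None else value)}"
            for name, value in attrs
        )
        self.out.append(f"<{tag}{attr_text}{' /' if tag in EMPTY else ''}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED and tag not in EMPTY:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(escape(data))  # re-escape &, < and > in text


def to_xhtml(html_text):
    writer = XhtmlWriter()
    writer.feed(html_text)
    writer.close()  # flush any text still buffered by the parser
    return "".join(writer.out)


print(to_xhtml('<img src="bionicle.gif" width=30 height=50>'))
# prints <img src="bionicle.gif" width="30" height="50" />
```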

Subtasks from this are:

Creating a regular expression that matches start and end tags in
content (the regular expression syntax doesn't include backreferences,
so you can't check that the name in the end tag is the same as the
name in the start tag). Examples:

  a. Here's some <b>bold <i>and italic</i></b> text.
  b. Here's some <i>italic <b>and bold</b></i> text.

The regular expression should pull out the following:

  a. <b>bold <i>and italic</i></b>
  b. <i>italic <b>and bold</b></i>
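
To show what backreferences would buy you here, this is a sketch using
Python's re module, purely for illustration (the pattern and the
outermost_element name are assumptions of the example, not anything in
the XPath drafts). The \1 forces the end tag to carry the same name as
the start tag.

```python
import re

# (b|i) captures the element name; \1 requires the same name in the end tag.
PAIRED = re.compile(r"<(b|i)>.*</\1>")


def outermost_element(text):
    """Return the first <b>...</b> or <i>...</i> span, or None if none pairs up."""
    m = PAIRED.search(text)
    return m.group(0) if m else None


print(outermost_element("Here's some <b>bold <i>and italic</i></b> text."))
# prints <b>bold <i>and italic</i></b>
print(outermost_element("Here's some <i>italic <b>and bold</b></i> text."))
# prints <i>italic <b>and bold</b></i>
```

Even so, it's only a partial answer: the greedy .* runs to the last
matching end tag, so two sibling b elements in one string would come
back as a single span.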

Doing a replace that creates element structure rather than strings
(replace() doesn't do this at the moment). Because of this, you have
to use match(), but match() doesn't give you access to subexpression
matches.

A simple example is if you have dates in the format:

  13/1/02

and you want to create:

  <date day="13" month="01" year="2002" />

This is achievable, but you end up running the same match three times
- once to get the day, once to get the month, once to get the year.
(Plus, I should note, there's no parse-date() function at the moment -
if there were it would be easier.)
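
For contrast, here's how the same conversion looks when a single match
hands back its subexpressions, sketched in Python as an illustration
only. The date_element name is made up, and mapping a two-digit year
onto 20xx is an assumption taken from the one example above.

```python
import re

# One group each for day, month and two-digit year.
DATE = re.compile(r"(\d{1,2})/(\d{1,2})/(\d{2})")


def date_element(text):
    """Build the <date/> element from one match, or return None if no match."""
    m = DATE.fullmatch(text.strip())
    if m is None:
        return None
    day, month, year = m.groups()  # all three pieces from a single match
    # Assumption: two-digit years belong to the 2000s.
    return (f'<date day="{int(day):02d}" month="{int(month):02d}" '
            f'year="{2000 + int(year)}" />')


print(date_element("13/1/02"))
# prints <date day="13" month="01" year="2002" />
```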

These are the kinds of issues that David and I are trying to work
through at the moment - if you have some ideas we'd be really glad to
hear them.

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

