This is the mail archive of the xsl-list@mulberrytech.com mailing list.
Re: XPath grammar questions
- From: Dimitre Novatchev <dnovatchev at yahoo dot com>
- To: xsl-list at lists dot mulberrytech dot com
- Date: Sun, 17 Mar 2002 10:16:05 -0800 (PST)
- Subject: [xsl] Re: XPath grammar questions
- Reply-to: xsl-list at lists dot mulberrytech dot com
Hi Sean,
> I've written the XPath parser three times already; this
> fourth time, I broke down and just implemented a lexer (more or less)
> conforming to the XPath grammar. It works more or less properly, but
> I have a couple of places where it breaks down, and if there are any
> XPath gurus who can tell me how I'm misunderstanding the XPath spec,
> I'd appreciate the feedback.
>
> The first case is in a path submitted by Tobias Reif, that
> originated, as I recall, from someone on this list:
>
> *[* and not(*/node()) and not(*[not(@style)]) and not(*/@style !=
> */@style)]
>
> Specifically, it's the 'not(*/node())' that I'm having trouble with.
> The XPath spec states that:
>
> not( boolean ) -> boolean
>
> This would imply that '*/node()' evaluates to a boolean. However, it
> also states that paths such as:
>
> ancestor::node()
>
> evaluates to a set of matching nodes. Further, I had assumed that
> the path:
>
> */node()
>
> by itself would also result in a set of nodes.
>
> I have a group of theories about this, but I'm not quite grokking the
> intent of XPath. I don't see how the same path should evaluate to
> two different results. In any case, there have been a number of
> successful implementations of XPath, so I know I'm missing something.
From the spec:
http://www.w3.org/TR/xpath#section-Boolean-Functions
"The boolean function converts its argument to a boolean as follows:
- a number is true if and only if it is neither positive or negative zero
  nor NaN
- a node-set is true if and only if it is non-empty
- a string is true if and only if its length is non-zero
- an object of a type other than the four basic types is converted to a
  boolean in a way that is dependent on that type

Function: boolean not(boolean)
The not function returns true if its argument is false, and false
otherwise."
What this means is that '*/node()' does evaluate to a node-set, just as
you assumed. When a node-set is passed as the argument to not(), it is
first converted to a boolean using the rules for the boolean() function
above.
So:
not(expression) = not(boolean(expression))
In this specific case, not(node-set) is true only if the node-set is
empty.
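
As a rough illustration of that conversion, here is a small Python sketch.
It is not taken from any real XPath implementation: the names
xpath_boolean and xpath_not are made up for this example, and a node-set
is simply modelled as a Python list.

```python
import math


def xpath_boolean(value):
    """Convert a value to a boolean following the XPath 1.0 boolean() rules."""
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        # A number is true unless it is zero (positive or negative) or NaN.
        return not (value == 0 or math.isnan(value))
    if isinstance(value, str):
        # A string is true if its length is non-zero.
        return len(value) > 0
    if isinstance(value, (list, tuple, set, frozenset)):
        # Assumption: a node-set is modelled as a plain Python collection.
        return len(value) > 0
    raise TypeError(f"no boolean conversion sketched for {type(value).__name__}")


def xpath_not(value):
    """not(expression) behaves like not(boolean(expression))."""
    return not xpath_boolean(value)


# An empty node-set converts to false, so not() of it is true.
print(xpath_not([]))             # True
# A non-empty node-set converts to true, so not() of it is false.
print(xpath_not(["<child/>"]))   # False
```

The path itself always yields a node-set; the boolean only appears because
not() converts its argument.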
> The second (and at this point, more critical) problem I'm having is
> with function names. Take:
>
> [normalize-space(@name)='x']
>
> If you follow the grammar, the evaluation is:
>
> Predicate->Expr->OrExpr->AndExpr->EqualityExpr->RelationalExpr->
> AdditiveExpr
>
> at which point it matches the rule:
>
> AdditiveExpr:: AdditiveExpr '-' MultiplicativeExpr
>
> where you effectively have "normalize" "-" "space(@name)='x'". What
> my code does at this point is hang; 'normalize' gets caught in an
> endless, recursive evaluation loop. The only way I think I can solve
> this at this point is for checking for endless recursion.
>
There's another rule -- for QName:
http://www.w3.org/TR/REC-xml-names/
and it uses NCName for the prefix and the local part.
The rule for NCName is:
[4] NCName ::= (Letter | '_') (NCNameChar)*
    /* An XML Name, minus the ":" */
[5] NCNameChar ::= Letter | Digit | '.' | '-' | '_' | CombiningChar
                 | Extender
So "-" is a legitimate character in any NCName (anywhere after the
first character), and therefore in any QName.
The rule for function names uses QName as well.
To work correctly, a lexical analyzer should match names greedily --
that is, it should return the longest string that matches a particular
rule.
In your case, the lexer should return a single NCName for
"normalize-space", and not NCName "-" NCName.
A related common mistake in XSLT/XPath is to write an expression such as
$var-4 and then complain that it was not parsed and evaluated as
$var - 4. The hyphen is taken as part of the variable name unless
whitespace precedes it.
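
To make the greedy matching concrete, here is a minimal Python sketch of
such a tokenizer. It rests on several assumptions: NCNAME below is an
ASCII-only approximation of the real NCName production (which also allows
non-ASCII letters, combining characters and extenders), the token names
are invented for this example, and the remaining disambiguation rules of
the XPath lexical structure are not handled.

```python
import re

# Assumption: ASCII-only approximation of NCName. Note that '-' and '.'
# are allowed after the first character.
NCNAME = r"[A-Za-z_][A-Za-z0-9._-]*"
QNAME = rf"{NCNAME}(?::{NCNAME})?"

TOKEN_RE = re.compile(
    rf"""
    (?P<WS>\s+)
  | (?P<LITERAL>"[^"]*"|'[^']*')
  | (?P<NUMBER>\d+(?:\.\d*)?|\.\d+)
  | (?P<VARREF>\${QNAME})
  | (?P<NAME>{QNAME})
  | (?P<OP>!=|<=|>=|//|::|\.\.|[-+*/=<>|()\[\],.@])
    """,
    re.VERBOSE,
)


def tokenize(expr):
    """Split an expression into (kind, text) tokens, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(expr):
        m = TOKEN_RE.match(expr, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}: {expr[pos]!r}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


# The name is consumed greedily, so the '-' never becomes an operator.
print(tokenize("normalize-space(@name)='x'")[0])  # ('NAME', 'normalize-space')
# Without whitespace, the whole of "var-4" is the variable name.
print(tokenize("$var-4"))    # [('VARREF', '$var-4')]
print(tokenize("$var - 4"))  # [('VARREF', '$var'), ('OP', '-'), ('NUMBER', '4')]
```

Because the name pattern uses a greedy quantifier, the tokenizer never
has to consider the AdditiveExpr rule for "normalize-space" at all.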
Cheers,
Dimitre Novatchev.
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list