GNOME Bugzilla – Bug 695699
libxml 2.9.0 XPath evaluation issue
Last modified: 2016-07-27 15:35:15 UTC
libxml 2.9.0 behavior when evaluating xpath expressions involving `position()` is inconsistent with 2.8.0 behavior, and the behavior of other major XML libraries.

Given an XML document:

    <root>
      <foo>1</foo>
      <foo>2</foo>
      <foo>3</foo>
      <foo>4</foo>
    </root>

when I run the following on 2.8.0, I get consistent behavior:

    ./testXPath -i xpath-test.xml "//foo[position() = 1]"
    Object is a Node Set :
    Set contains 1 nodes:
    1 ELEMENT foo

    ./testXPath -i xpath-test.xml "//*[position() = 1 and self::foo]"
    Object is a Node Set :
    Set contains 1 nodes:
    1 ELEMENT foo

However, when I run it on 2.9.0, I get inconsistent behavior:

    ./testXPath -i xpath-test.xml "//foo[position() = 1]"
    Object is a Node Set :
    Set contains 1 nodes:
    1 ELEMENT foo

    ./testXPath -i xpath-test.xml "//*[position() = 1 and self::foo]"
    Object is a Node Set :
    Set contains 0 nodes:

This appears to be an off-by-one (or perhaps the initial child text node is being counted) as

    ./testXPath -i xpath-test.xml "//*[position() = 2 and self::foo]"
    Object is a Node Set :
    Set contains 1 nodes:
    1 ELEMENT foo
As explained on the list, //* also picks up the /root node, and the result is sorted in document order, so /root comes first in the list. This doesn't seem to be a bug in the current version. Daniel
Thank you for your response. So, just to be clear, the 2.9.0 behavior is considered correct, and the 2.8.0 (and pre-2.8.0) behavior is incorrect?
yes it was a bug in older versions
Hmm. Sorry to bug you all on this again, but I don't understand why this isn't a bug in 2.9.0.

Given this document:

    <root>
      <set>
        <foo>1</foo>
        <foo>2</foo>
        <foo>3</foo>
        <foo>4</foo>
      </set>
      <set>
        <foo>5</foo>
        <foo>6</foo>
        <foo>7</foo>
        <foo>8</foo>
      </set>
    </root>

if I evaluate the query "//foo[position()=3]" on any of the following implementations:

- chrome browser (using $x)
- libxml 2.8.x and earlier
- javax
- saxon 9.4

then I get

    <foo>3</foo>
    <foo>7</foo>

but with libxml 2.9.0 I get:

    <foo>3</foo>

As best as I can tell, the definition of "context position" is the node's position relative to its parent, which is why two nodes should match "position()=3". Any insights here?
Since //foo[position()=3] actually means /descendant-or-self::node()/child::foo[position()=3], then yes, it seems we introduced an optimization bug there, but it is different from the original report. Daniel
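For readers following along, here is a minimal Python sketch of the expansion described above. It does not use libxml2 at all: it walks the tree with the standard library's xml.etree.ElementTree and applies the predicate by hand. The function names select_nth_child_per_parent and select_nth_overall are made up for illustration, and the document node itself is not modelled, so treat this as an approximation of the XPath semantics rather than a reference implementation. The second function shows a different reading, (//foo)[3], where position is counted over the whole node set; the single <foo>3</foo> reported for 2.9.0 happens to look like that reading, although the thread does not say that this is what the optimization actually did.

```python
import xml.etree.ElementTree as ET

# The two-<set> document from the report above.
DOC = """<root>
  <set><foo>1</foo><foo>2</foo><foo>3</foo><foo>4</foo></set>
  <set><foo>5</foo><foo>6</foo><foo>7</foo><foo>8</foo></set>
</root>"""


def select_nth_child_per_parent(root, tag, n):
    """Approximate /descendant-or-self::node()/child::TAG[position()=n].

    Hypothetical helper. Simplification: only element nodes are used as
    context nodes, so the document node is never considered.
    """
    matches = []
    # Step 1: descendant-or-self, visited in document order.
    for context_node in root.iter():
        # Step 2: child::TAG, evaluated relative to this context node only.
        children = [child for child in context_node if child.tag == tag]
        # The predicate is applied per context node, so position()
        # restarts at 1 under every parent.
        if len(children) >= n:
            matches.append(children[n - 1])
    return matches


def select_nth_overall(root, tag, n):
    """Approximate (//TAG)[position()=n]: one position count for the whole set."""
    all_nodes = list(root.iter(tag))
    return all_nodes[n - 1:n]


root = ET.fromstring(DOC)
per_parent = [e.text for e in select_nth_child_per_parent(root, "foo", 3)]
overall = [e.text for e in select_nth_overall(root, "foo", 3)]
print(per_parent)  # ['3', '7']
print(overall)     # ['3']
```

The per-parent version returns both the third foo of the first set and the third foo of the second set, which matches what the other implementations listed in the previous comment return.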
Thanks for the response. I appreciate your patience.
(In reply to comment #5)
> since //foo[position()=3] actually means
> /descendant-or-self::node()/child::foo[position()=3]
> then yes it seems we introduced an optimization bug there,
> but it is different from the original report,

Does that mean this should be filed as a separate bug?
No, I will keep track of it here, the bug title is fine, Daniel
*** Bug 703580 has been marked as a duplicate of this bug. ***
Sorry, this was my fault. Fixed in commit b4bcba23: https://git.gnome.org/browse/libxml2/commit/?id=b4bcba23f64b71105514875f165a63d4cc720609
*** Bug 707476 has been marked as a duplicate of this bug. ***
Unfortunately this *fix* triggers another issue, which I have commented on here: https://bugzilla.redhat.com/show_bug.cgi?id=1357971