GNOME Bugzilla – Bug 705413
Apparent bug in xpath handling of namespace declarations
Last modified: 2021-07-05 13:27:16 UTC
Created attachment 250772 [details] Bug demo code
Many of the pointer fields of nodes of type XML_NAMESPACE_DECL in a nodeset returned by evaluating an XPath expression such as "//namespace::*" appear to be corrupted. The attached example code illustrates the problem: xmlNodeDump works as expected for a nodeset derived from the XPath "//*", but segfaults with "//namespace::*". Furthermore, such nodesets appear to contain two occurrences of each actual namespace declaration in the document (each of which has a different "doc" pointer, neither of which matches the actual parent doc address).
If this is not a bug, an explanation of correct usage should be added to the documentation, since the observed behaviour is so unexpected. This bug was first reported against libxml2 2.7.6 (see bug report 636420), and I have just confirmed that it still exists in 2.9.1.
Created attachment 250828 [details] [review] Allocate xmlNode for case future wrong usage of xpath result
When an xmlNs is allocated for the result, a function may try to access the common xmlNode fields, which will contain garbage.
Try using:
- xmlNodeDump(buffer, node->doc, node, 0, 1);
+ xmlNodeDump(buffer, doc, node, 0, 1);
because xmlNs nodes do not have a doc field (before the patch, reading it returned garbage).
I explained the issue with libxml2 namespace node return values there: https://mail.gnome.org/archives/xml/2010-November/msg00032.html W.r.t. dumping the namespace node: before accessing any field of a node, the node type must be checked. Since your code dereferences without checking the type first, it crashes; Denis's suggestion is right. On the other hand, I don't think that changing the allocation is the right strategy. Daniel
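The rule described above (inspect the type tag before reading any other field) can be sketched with simplified stand-ins. The `DemoNode`, `DemoNs` and `DemoDoc` types and the `demo_type_of` and `demo_doc_of` helpers below are hypothetical and are not part of libxml2; they only imitate the fact that both real structs carry the type tag as their second member, while the namespace struct is smaller and has no doc member. This is a sketch under those assumptions, not the library's implementation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins, NOT the real libxml2 definitions. */
typedef enum { DEMO_ELEMENT_NODE = 1, DEMO_NAMESPACE_DECL = 18 } DemoType;

typedef struct DemoDoc { const char *url; } DemoDoc;

/* Imitates a tree node: the type tag is the second member. */
typedef struct DemoNode {
    void *_private;
    DemoType type;
    const char *name;
    struct DemoNode *parent;
    DemoDoc *doc;
} DemoNode;

/* Imitates a namespace declaration: same tag position, no doc member. */
typedef struct DemoNs {
    struct DemoNs *next;
    DemoType type;
    const char *href;
    const char *prefix;
} DemoNs;

/* The tag must sit at the same offset for the check below to be valid. */
_Static_assert(offsetof(DemoNode, type) == offsetof(DemoNs, type),
               "type tag must share one offset");

/* Reads only the type tag, without touching any other field. */
static DemoType demo_type_of(const void *item) {
    DemoType t;
    memcpy(&t, (const char *)item + offsetof(DemoNode, type), sizeof t);
    return t;
}

/* Returns the owning document of a tree node. For a namespace item
   (or NULL) it returns the caller-supplied fallback instead of reading
   a doc field that does not exist. */
static DemoDoc *demo_doc_of(const void *item, DemoDoc *fallback) {
    if (item == NULL || demo_type_of(item) == DEMO_NAMESPACE_DECL)
        return fallback;
    return ((const DemoNode *)item)->doc;
}
```

With real libxml2 types the same shape applies: compare node->type against XML_NAMESPACE_DECL first, and pass the document pointer you already hold rather than node->doc when the item is a namespace.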
Yes, I agree; it's more of a hack than a solution. Could the result sets instead hold a list of some abstract struct that can represent both xmlNode and xmlNs?
Thanks for the helpful responses.
Denis: Your suggested change from node->doc to doc does indeed fix the segfault, but the resulting code produces an empty string as the dumped value for node 0 of the namespace node set. It's not obvious which fields of the node to check to predict when the dumped value will be empty (for example, the content pointer is not null for the node that produces an empty string). Any ideas?
Daniel: By "Since your code dereferences without checking the type first, it crashes", do you mean that one should not attempt to dereference node fields (such as node->doc) without first checking node->type to determine whether the node is of type XML_NAMESPACE_DECL? As I stated in the initial report, this behaviour is sufficiently non-obvious that it couldn't hurt to provide at least a brief explanation in the documentation.
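One possible workaround for the empty dump, offered only as a sketch: once the type check has identified a namespace item, skip the dump call and print the declaration from its prefix and href. The `format_ns_decl` helper below is hypothetical (not a libxml2 function), takes plain C strings rather than libxml2 types, and does not escape the attribute value. It assumes, as in libxml2, that a NULL prefix denotes the default namespace.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: writes xmlns="href" or xmlns:prefix="href"
   into out. Returns the number of characters written (excluding the
   terminating NUL), or -1 if href is NULL or the buffer is too small.
   No escaping of href is attempted. */
static int format_ns_decl(char *out, size_t cap,
                          const char *prefix, const char *href) {
    int n;

    if (out == NULL || cap == 0 || href == NULL)
        return -1;

    if (prefix == NULL || prefix[0] == '\0')
        n = snprintf(out, cap, "xmlns=\"%s\"", href);        /* default namespace */
    else
        n = snprintf(out, cap, "xmlns:%s=\"%s\"", prefix, href);

    /* snprintf reports the length it wanted; treat truncation as failure. */
    if (n < 0 || (size_t)n >= cap)
        return -1;
    return n;
}
```

In a caller working with a real namespace item, the two string arguments would come from the item's prefix and href fields after casting to the namespace type.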
Definitely, you should check the type first. Where in the documentation do you need this written? Daniel
That's a reasonable question. If xmlBufNodeDump doesn't warrant its own example under Libxml2 set of examples > Tree, then at least a sentence pointing out the need to check the type should be added to the interface documentation (at http://xmlsoft.org/html/libxml-tree.html#xmlBufNodeDump in the online docs).
*** Bug 636420 has been marked as a duplicate of this bug. ***
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org. As part of that, we are mass-closing older open tickets in bugzilla.gnome.org which have not seen updates for a longer time (resources are unfortunately quite limited so not every ticket can get handled). If you can still reproduce the situation described in this ticket in a recent and supported software version, then please follow https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines and create a new ticket at https://gitlab.gnome.org/GNOME/libxml2/-/issues/ Thank you for your understanding and your help.