GNOME Bugzilla – Bug 701330
Crash after xmlReplaceNode within different xmlDocs
Last modified: 2021-07-05 13:27:02 UTC
Created attachment 245702: Buggy main.cpp (compile with g++)

I'm trying to use xmlReplaceNode across different XML documents. If I create the document structure manually (as in main-no-dict.cpp), no crash occurs. But if I use a parser and XPath expressions to locate the target nodes, the program crashes during the xmlFreeDoc call. The call stack looks like this (valgrind output):

==20310== Invalid free() / delete / delete[]
==20310==    at 0x4C270BD: free (vg_replace_malloc.c:366)
==20310==    by 0x4E85018: xmlFreeNodeList (tree.c:3669)
==20310==    by 0x4E84F4A: xmlFreeNodeList (tree.c:3641)
==20310==    by 0x4E84D77: xmlFreeDoc (tree.c:1239)
==20310==    by 0x401112: main (main.cpp:50)

I dug around a little. The problem seems to lie in the xmlDict structure: no dictionary data is moved when the node is replaced.
Created attachment 245703: Non-crashing main.cpp
Created attachment 245704: Makefile (use "make ok" or "make crash"; valgrind will be used)
I'm using libxml2-2.8.0.
When you create a document with xmlNewDoc() etc., it doesn't create a dictionary for the document by default. When you parse a serialized XML file, the parser creates a dictionary by default.

The simplest fix in your situation is to ask the parser not to use a dictionary, which can be done by passing the XML_PARSE_NODICT option when parsing. You will lose some space, but the dictionary won't get in the way.

The other option is to make sure all documents that exchange fragments use the same dictionary (a dict can be used by multiple documents; it is reference counted). That's a bit harder to set up. For example, if mydict is the existing dictionary of a first document and you want to parse another one:

    pctxt = xmlNewParserCtxt();
    if (pctxt == NULL)
        return(NULL);
    /* drop the dictionary the new context created for itself */
    if ((mydict != NULL) && (pctxt->dict != NULL)) {
        xmlDictFree(pctxt->dict);
        pctxt->dict = NULL;
    }
    /* share the existing dictionary and take a reference on it */
    if (mydict != NULL) {
        pctxt->dict = mydict;
        xmlDictReference(pctxt->dict);
    }

Then use xmlCtxtReadDoc(), xmlCtxtReadMemory(), etc. with the given pctxt parser context.

Daniel
Yep. I've come up with a temporary workaround for this:

    nodeCopy = xmlCopyNode(node2, 1);   /* 1 = recursive copy */
    xmlAddNextSibling(node1, nodeCopy);
    xmlUnlinkNode(node2);
    xmlFreeNode(node2);

This seems to work well, but it is somewhat suboptimal. I'm using my own C++ wrappers and therefore can't get access to the other document's dictionary. Disabling the dictionary seems to be a very memory-consuming solution. Are you going to fix this in a future version (by implementing a dictMove of some kind)?
The problem is that if a document was created with a dictionary, then most of the strings in the document are allocated there, and we can't really change this once the document is created (or rather, that would be a very expensive operation), so just copying the fragment you want is simpler. Note that during the copy operation, none of the new strings are allocated from a dictionary anymore.

A new API could make sense though, something like:

    xmlDictPtr xmlNodeGetDict(xmlNodePtr node);
    int xmlNodeSwitchDict(xmlNodePtr, xmlDictPtr new);

or something along those lines.

Daniel
Sounds reasonable to me. Alternatively, you could just change the xmlReplaceNode logic to copy the new node and remove the original (removal will require dictionary lookups anyway).
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org. As part of that, we are mass-closing older open tickets in bugzilla.gnome.org which have not seen updates for a longer time (resources are unfortunately quite limited so not every ticket can get handled). If you can still reproduce the situation described in this ticket in a recent and supported software version, then please follow https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines and create a new ticket at https://gitlab.gnome.org/GNOME/libxml2/-/issues/ Thank you for your understanding and your help.