GNOME Bugzilla – Bug 508662
dereference of possibly NULL Pointer
Last modified: 2008-04-01 08:00:37 UTC
Please describe the problem: in xpath.c I see this code: xmlXPathObjectPtr xmlXPathNewNodeSetList(xmlNodeSetPtr val) { xmlXPathObjectPtr ret; int i; if (val == NULL) ret = NULL; else if (val->nodeTab == NULL) ret = xmlXPathNewNodeSet(NULL); else { ret = xmlXPathNewNodeSet(val->nodeTab[0]); for (i = 1; i < val->nodeNr; ++i) xmlXPathNodeSetAddUnique(ret->nodesetval, val->nodeTab[i]); } return (ret); } The problem with it is that xmlXPathNewNodeSet can return NULL in the case that xmlMalloc returns an out-of-memory condition. However, in the for() loop, the pointer is dereferenced without first checking that case. This could cause the application to crash. This code should probably be patched with a "if (ret != NULL) for (...)". I noticed this problem by skimming through the code. I have not investigated if this is a more general problem. Steps to reproduce: Actual results: Expected results: Does this happen every time? Other information:
xmlXPathNodeSetCreate, which could return NULL on xmlMalloc failure, is also seen to be called and then its result value dereferenced without checking.
Out of memory condition are edge case, they should be fixed, but this cannot be flagged as 'Critical' unless you have a clear reproducer. For #2 where ? Please provide contextual patches that will speed things up ! Daniel
Created attachment 102594 [details] [review] prooposed patch for xpath.c in the original report
Created attachment 102595 [details] [review] incomplete patch for the bug reported in #2
OK, both patches attached. Note that the second patch does not attempt to fix all the problems with the called function -- I got tired about half a dozen of them. Note that I found these problems while investigating a crasher issue for PostgreSQL, on which we use the libxml2 library for our (partial) SQL/XML support; see http://thread.gmane.org/gmane.comp.db.postgresql.bugs/15854/focus=15889 Our question is in case of out-of-memory failure: is it better to return NULL to libxml2 and let it clean up for itself, or should we trap the error ourselves? We are already using our own memory allocator: http://developer.postgresql.org/cvsweb.cgi/pgsql/src/backend/utils/adt/xml.c?rev=1.64;content-type=text%2Fx-cvsweb-markup so this is not difficult for us. But doing this could mean letting libxml2 internal state remain corrupted because of not giving it the chance to clean it up. However, it seems to me as libxml2 is not at all tidy about how to deal with out-of-memory, so if left to its own devices it could cause serious trouble (== crashes) in the database system. Hence we're deciding that it is better to slip control from under libxml2 and try to clean up as best as possible. _Hopefully_ that will cause no serious harm.
Any word on this? It is a real problem. You cannot wave it away just because I don't have "a clear reproducer".
Just found out your comment #6... no it's not that i drop the issue, it's that i have a day to day job on something else, a big pile of bugs and for reactive answer we use the list, bugzilla is for archival of problem to make sure it's caught if possible before next release... if you have a better way or know how to pack 48 hours into a day, I'm all ears ! But let's focus on the issue. I understand the issue, and the patch looks fine, except a set1 turned into a val1 which made the resulting module not compile at all, did you tried to compile after the change ? That part looks fishy in the second part: @@ -3852,7 +3856,9 @@ xmlNodePtr n1, n2; if (set1 == NULL) - set1 = xmlXPathNodeSetCreate(NULL); + set1 = xmlXPathNodeSetCreate(NULL); + if (val1 == NULL) + return (NULL); initNbSet1 = set1->nodeNr; for (i = 0;i < set2->nodeNr;i++) { using the -p flag to diff helps making sure it's not a error at the patch level... I applied both patches they will be in SVN head shortly, marking bug as FIXEd even if this may be considered an incomplete patch. For out of memory failure mode, it completely depend on your model. The problem if you catch it and short circuit libxml2 is that you may leak the data allocated by libxml2 routines (assuming a setjmp/longjmp kind of handling). Or you could wait in the allocator until some data is available again. Really depend on the kind of application, context etc ... i don't see why you need XPath for an XML export, I would do a minimal export from the database process and do any kind of filtering and XML processing outside of the process based on the XML flow. By doing a XML tree (needed for XPath) for database data you're really getting the risk of exploding the max system memory, I would never do that within the database process itself, really (or limit XPath to a streamable subset which could be handled at a SAX like level see http://xmlsoft.org/html/libxml-pattern.html ) But bugzilla is *really* not a proper place for design discussions, use the mailing-list ! Daniel Daniel