GNOME Bugzilla – Bug 495995
Double-free of nodeset / incorrect local RVT usage
Last modified: 2007-11-13 21:18:11 UTC
Email dump commences: I have found a double free of a nodeset in global variables. Here is a truncated stack of the problem. The values in parenthesis refer to variables in the attached testcase. xsltEvalGlobalVariables xsltEvalGlobalVariable (xml) xsltCopyOf xsltEvalGlobalVariable (tokenized) exsltStrTokenizeFunction The str:tokenize function creates and registers a local RVT [1]. The tokenized result is saved in that RVT. The global variable evaluation function then saves the computed value of xml in the global eval structure. However, when the xsl:copy-of function exits, it releases the RVT, as it does not match the previous local RVT [2]. This causes the nodeset generated by str:tokenize to be free'd, even though the global variable hash still has a value for it, and will continue to use it. This will be a problem at two points of execution: 1/ If the variable is used again. 2/ During destruction of the globals hash. Attached is a test case that exhibits this problem. Run it against itself, preferably in valgrind, to see the issues. An example run (truncated for clarity): $ sudo valgrind xsltproc testcase.xsl testcase.xsl Invalid read of size 4 at 0x544BC05: xmlXPathNodeSetMerge by 0x54524A0: xmlXPathObjectCopy by 0x4B3B5E1: xsltXPathVariableLookup by 0x5456787: (within /usr/lib64/libxml2.so.2.6.30) by 0x5455CBE: (within /usr/lib64/libxml2.so.2.6.30) by 0x5455C2B: (within /usr/lib64/libxml2.so.2.6.30) by 0x545735D: (within /usr/lib64/libxml2.so.2.6.30) by 0x545B466: (within /usr/lib64/libxml2.so.2.6.30) by 0x545B638: xmlXPathCompiledEval by 0x4B4AEDB: xsltCopyOf by 0x4B48267: (within /usr/lib64/libxslt.so.1.1.22) by 0x4B3B09B: (within /usr/lib64/libxslt.so.1.1.22) Address 0x5D35EF0 is 8 bytes inside a block of size 120 free'd at 0x4A1F87E: free by 0x54230A6: xmlFreeNodeList by 0x4B3A70C: xsltReleaseRVT by 0x4B47F76: (within /usr/lib64/libxslt.so.1.1.22) by 0x4B48287: (within /usr/lib64/libxslt.so.1.1.22) by 0x4B3B09B: (within /usr/lib64/libxslt.so.1.1.22) by 0x542751E: xmlHashScanFull by 0x542756B: xmlHashScan by 0x4B3A510: xsltEvalGlobalVariables by 0x4B4CED4: (within /usr/lib64/libxslt.so.1.1.22) by 0x40227C: (within /usr/bin/xsltproc) by 0x402BA4: (within /usr/bin/xsltproc) Versions used: $ xsltproc -V Using libxml 20630, libxslt 10122 and libexslt 813 xsltproc was compiled against libxml 20630, libxslt 10122 and libexslt 813 libxslt 10122 was compiled against libxml 20630 libexslt 813 was compiled against libxml 20630 Thanks in advance for your help in fixing this! -Jake Goulding [1] libexslt/strings.c:72-74 [2] libxslt/transform.c:2588
Created attachment 98943 [details] Testcase stylesheet run with 'xsltproc testcase.xsl testcase.xsl'
The underlying problem is the fact that exslt functions such as str:tokenize (and others!) are using temporary RVT's which are "cleaned up" after evaluation. This logic works well most of the time, but when a global variable needs to evaluate another global variable things get twisted... A simplified test case (which fails under valgrind) is: <?xml version="1.0" ?> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0" xmlns:str="http://exslt.org/strings" extension-element-prefixes="str" > <xsl:variable name="tokenized" select="str:tokenize('a,b', ',')" /> <xsl:variable name="xml"> <xsl:copy-of select="$tokenized[1]" /> </xsl:variable> <xsl:template match="/" /> In this instance, the failure is during final cleanup of global variables. I'm looking into it further, but it may take a while to fully understand how to fix it. Any suggestions are welcome :-).
I think I understood enough to fix this specific problem. I added a call to xsltExtensionInstructionResultRegister() to mark the results of both str:tokenize and str:split, which should prevent their deletion when local RVT's are cleaned up. Fixed code (libexslt/strings.c) is in SVN; both your test case and my (reduced) test run OK under valgrind. Please try it out on your original problem stylesheet, and close the bug when if you are confident the problem is solved.
Bill, thanks for making this patch! We'll try it out on our local error and let you know if it is fixed. I'm wondering though, you have modified only str:tokenize and str:split. Should other extensions functions (like exsl:node-set) which call xsltRegisterLocalRVT also call the function you added above? Additionally, we have a large number of our own XSL extensions. Should those that return an RVT also call this function? Thanks again!
I had exactly that same concern when I read through the libexslt source modules, and I really can't give you a definitive answer. Your sample stylesheet used exsl:node-set, and it obviously wasn't necessary to add in an explicit call in that particular instance. The comments in the xsltExtensionInstructionResultRegister function seem to indicate it's only necessary when the results are "treated like a function return" (note that, before my change, it was only used in functions.c). Basically, the function just sets a flag in the RVT to prevent it's automatic deletion by the garbage collection routines (see transform.c:2130). Note also that running xsltproc under valgrind on a specific stylesheet is a sufficient test to prove whether a user's extension required it (assuming that, in fact, the extension has been exercised). If you can spare the processing time for running such a test, I would be very interested in the results.
Bill, we've confirmed that this fixes the problem in our application. We'll make sure to do some testing on our own extension functions and will let you know what happens.
Thinking a little further on when xsltExtensionInstructionResultRegister() is necessary within an external function, I believe it isn't dependent upon whether an RVT is involved, but rather whether xsltRegisterLocalRVT() has been called (since that's what triggers the "garbage collection" that caused this bug). I should point out that most of the code involved is not written by me, and my "thoughts", "beliefs" and "understandings" are solely based upon reading through the code and running it under the gdb debugger. I highly encourage others to confirm all of this, and let me know if there are other interpretations :-).
Bill, taking your comment to heart and then checking the code for exsl:node-set, I tried valgrinding the following stylesheet and got the same error. There's only an error when exsl:node-set is called on a string. Looks like dyn:map may have a similar problem... <?xml version="1.0" ?> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0" xmlns:exsl="http://exslt.org/common" > <xsl:variable name="tokenized" select="exsl:node-set('foo')" /> <xsl:variable name="xml"> <xsl:copy-of select="$tokenized[1]" /> </xsl:variable> <xsl:template match="/" /> </xsl:stylesheet>
Yes, you are correct - the node-set only fails on a string. The dyn:map seems a bit more complex; I've added code to both of those modules (libexslt/[common.c,dynamic.c]) which (I hope) will fix it. Fixed code is in SVN. Thanks!