GNOME Bugzilla – Bug 746048
parsing an unclosed comment can result in `Conditional jump or move depends on uninitialised value(s)` and unsafe memory access
Last modified: 2017-08-14 05:32:34 UTC
Created attachment 299127
suggested patch from Francois Chagnon

The following code, when compiled and linked against libxml2 master (I'm on 02b252d), will result in valgrind errors.

```
#include <string.h>
#include <libxml/HTMLparser.h>
#include <libxml/HTMLtree.h>

int main(int argc, char** argv) {
  // Nokogiri::HTML::fragment("<!-- ")
  htmlDocPtr doc ;
  int options = HTML_PARSE_RECOVER | HTML_PARSE_NOERROR | HTML_PARSE_NOWARNING | HTML_PARSE_NONET ;

  // Unclosed comment with one character of body (kept for comparison, unused).
  const char* HTMLFRAG_GOOD = "<html><body><!-- x" ;
  // Unclosed comment with nothing after "<!--"; this is the input parsed below.
  const char* HTMLFRAG_BAD = "<html><body><!--" ;
  const char* HTMLFRAG = HTMLFRAG_BAD ;

  xmlInitParser();
  doc = htmlReadMemory(HTMLFRAG, (int) strlen(HTMLFRAG), NULL, "UTF-8", options);
  xmlFreeDoc(doc);
  return 0;
}
```

There are 13 different error contexts reported by valgrind on my system, but all are variations on "Conditional jump or move depends on uninitialised value(s)", for example:

```
==32338== Conditional jump or move depends on uninitialised value(s)
==32338==    at 0x43C377: htmlCurrentChar (HTMLparser.c:440)
==32338==    by 0x43CF4A: htmlParseComment (HTMLparser.c:3250)
==32338==    by 0x44012E: htmlParseContentInternal (HTMLparser.c:4559)
==32338==    by 0x440C0E: htmlParseDocument (HTMLparser.c:4735)
==32338==    by 0x443741: htmlDoRead (HTMLparser.c:6707)
==32338==    by 0x402B39: main (in /home/miked-public/code/oss/nokogiri/issue-xxx-libxml/repro)
```

This issue was originally reported by Florian Weingarten. A patch was suggested by Francois Chagnon, which prevents the errors seen by valgrind on my system and thus might be of use to you. I have attached it as `read-past-buffer.diff`.

Thanks for your time and consideration,
-mike
Possibly also worth noting, for prioritization purposes: the original reporter was seeing real unsafe memory access (not simply a use of uninitialized pointers) in a production system. I suspect there may be a DoS vector here, but I couldn't produce one on my local system in a reasonable amount of time.
Hum, I need to look at this. Thanks for raising this, and for the patch :-) ! Daniel
Dear All, I'm the person who found this bug: https://hackerone.com/reports/57125 In Shopify, I was able to see a previous HTTP request, which included an email address, session_hash, and even the last 4 digits of a credit card number. I reported this to Shopify on the 8th of March, but I didn't know at the time that it was caused by this issue. Just FYI. Thanks, Jun
Dohh it seems I dropped the ball ... Looking at it ! Daniel
I think the fundamental problem is that a lot of the code assumes 0-terminated buffers, and the HTML parser there forgot to check for the terminator when reading the first 2 characters of the comment. As a result I ended up with a rather different patch: https://git.gnome.org/browse/libxml2/commit/?id=e724879d964d774df9b7969fc846605aa1bac54c Thanks for the report and the initial patch, sorry it took so long! Daniel
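
The kind of check described above can be sketched in isolation. This is a simplified illustration, not the code from the linked commit: `comment_body_length` is a hypothetical helper that is not part of libxml2, and it assumes the input is a plain 0-terminated byte string positioned just after `<!--`. The point it shows is that each of the first two characters is tested for the terminator before the scanner advances past it, so the main loop never starts beyond the end of the buffer.

```c
/*
 * Hypothetical helper (not libxml2 API): `s` points just after "<!--" in a
 * 0-terminated buffer. Returns the length of the comment body, or -1 if the
 * comment is never closed by "-->".
 */
static int comment_body_length(const char *s)
{
    const char *p = s;
    char q, r, cur;

    /* First character: stop here if the input ends right after "<!--". */
    q = *p;
    if (q == '\0')
        return -1;
    p++;

    /* Second character: same check before advancing again. */
    r = *p;
    if (r == '\0')
        return -1;
    p++;

    /* Slide a three-character window (q, r, cur) looking for "-->". */
    cur = *p;
    while (cur != '\0') {
        if (q == '-' && r == '-' && cur == '>')
            return (int)(p - s) - 2;   /* exclude the two '-' of "-->" */
        q = r;
        r = cur;
        p++;
        cur = *p;
    }
    return -1;   /* hit the terminator without finding "-->" */
}
```

With this sketch, the two inputs from the reproduction (an empty remainder after `<!--`, and `" x"`) both return -1 without reading past the terminator. Dropping either early check would let `p` step over the 0 byte, which is the kind of over-read valgrind reports above.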
Great, thank you!
This has been assigned CVE-2015-8710 by MITRE.