GNOME Bugzilla – Bug 760263
Heap use-after-free in htmlParsePubidLiteral and htmlParseSystemLiteral
Last modified: 2016-08-22 20:24:06 UTC
Created attachment 318411 [details] Minimized crasher Hi, The following heap use-after-free can be observed on an ASAN build of Libxml2 stable release 2.9.3: $ ./xmllint --html c00_id00_min.html ================================================================= ==22783== ERROR: AddressSanitizer: heap-use-after-free on address 0xb4900116 at pc 0x8507fac bp 0xbfb10308 sp 0xbfb102fc READ of size 1 at 0xb4900116 thread T0 #0 0x8507fab in memcpy /usr/include/i386-linux-gnu/bits/string3.h:51 #1 0x8507fab in xmlStrndup ~/libxml2-2.9.3/xmlstring.c:50 #2 0x804d369 in htmlParsePubidLiteral ~/libxml2-2.9.3/HTMLparser.c:2841 #3 0x804d369 in htmlParseExternalID ~/libxml2-2.9.3/HTMLparser.c:3108 #4 0x804d369 in htmlParseDocTypeDecl ~/libxml2-2.9.3/HTMLparser.c:3439 #5 0x83153bf in htmlParseContentInternal ~/libxml2-2.9.3/HTMLparser.c:4585 #6 0x831d5ae in htmlParseDocument ~/libxml2-2.9.3/HTMLparser.c:4769 #7 0x833c8d6 in htmlDoRead ~/libxml2-2.9.3/HTMLparser.c:6741 #8 0x833c8d6 in htmlReadFile ~/libxml2-2.9.3/HTMLparser.c:6799 #9 0x80592f5 in parseAndPrintFile ~/libxml2-2.9.3/xmllint.c:2248 #10 0x8051352 in main ~/libxml2-2.9.3/xmllint.c:3759 #11 0xb5f13a82 in __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287 #12 0x80555c0 in _start (~/libxml2-2.9.3/libxml293/bin/xmllint+0x80555c0) 0xb4900116 is located 22 bytes inside of 8194-byte region [0xb4900100,0xb4902102) freed by thread T0 here: #0 0xb61059b4 in __interceptor_realloc (/usr/lib/i386-linux-gnu/libasan.so.0+0x169b4) #1 0x85150b2 in xmlBufGrowInternal ~/libxml2-2.9.3/buf.c:486 #2 0x85150b2 in xmlBufGrow ~/libxml2-2.9.3/buf.c:515 previously allocated by thread T0 here: #0 0xb6105854 in malloc (/usr/lib/i386-linux-gnu/libasan.so.0+0x16854) #1 0x8511a90 in xmlBufCreateSize ~/libxml2-2.9.3/buf.c:172 SUMMARY: AddressSanitizer: heap-use-after-free /usr/include/i386-linux-gnu/bits/string3.h:51 ?? Shadow bytes around the buggy address: 0x3691ffd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x3691ffe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x3691fff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x36920000: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x36920010: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa =>0x36920020: fd fd[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd 0x36920030: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x36920040: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x36920050: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x36920060: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x36920070: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Heap righ redzone: fb Freed Heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack partial redzone: f4 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 ASan internal: fe ==22783== ABORTING Aborted Found with american fuzzy lop, written by lcamtuf. The Libxml2 HTML parser is used by PHP in DOMDocument::load(), but it does not seem to affect the current PHP release in so far what was tested. This particular issue seems is limited to a denial of service in ASAN build only. A minimized crasher that works for Libxml2-2.9.3 ASAN build is attached as c00_id00_min.html.
Also affects a similar path involving htmlParseSystemLiteral, with info-discl capacity. Please restrict the report, thanks !
Created attachment 322923 [details] [review] Proposed Patch v1
Hi, thanks for the work ! A parallel bug exists in htmlParseSystemLiteral(), should be reproducible by changing the string "PUBLIC" in the crasher to "SYSTEM". This seems not touched by the patch. Please let me know if you cannot reproduce, I will get the original crasher for this bug. There could be other vectors other tham these two though.
Created attachment 322935 [details] Minimized crasher for the parallel bug htmlParseSystemLiteral
$ $GITXML --html c03_id00_min.html c03_id00_min.html:1: HTML parser error : Misplaced DOCTYPE declaration 0<!DOCTYPE r000 SYSTEM '00000000000000000000000000000000000000000000000000000000 ^ ================================================================= ==13334== ERROR: AddressSanitizer: heap-use-after-free on address 0xb4a00118 at pc 0x850872c bp 0xbfde2c08 sp 0xbfde2bfc READ of size 1 at 0xb4a00118 thread T0 #0 0x850872b in memcpy /usr/include/i386-linux-gnu/bits/string3.h:51 #1 0x850872b in xmlStrndup /home/weilei/libxml2-0f84ee239f2fb076231386f1d1ce8dec541df28e/xmlstring.c:50 #2 0x804b9dd in htmlParseSystemLiteral /home/weilei/libxml2-0f84ee239f2fb076231386f1d1ce8dec541df28e/HTMLparser.c:2792 #3 0x804c381 in htmlParseExternalID /home/weilei/libxml2-0f84ee239f2fb076231386f1d1ce8dec541df28e/HTMLparser.c:3094 #4 0x804c381 in htmlParseDocTypeDecl /home/weilei/libxml2-0f84ee239f2fb076231386f1d1ce8dec541df28e/HTMLparser.c:3439 #5 0x831458f in htmlParseContentInternal /home/weilei/libxml2-0f84ee239f2fb076231386f1d1ce8dec541df28e/HTMLparser.c:4585 #6 0x831be86 in htmlParseDocument /home/weilei/libxml2-0f84ee239f2fb076231386f1d1ce8dec541df28e/HTMLparser.c:4769 #7 0x833b2bb in htmlDoRead /home/weilei/libxml2-0f84ee239f2fb076231386f1d1ce8dec541df28e/HTMLparser.c:6741 #8 0x833b2bb in htmlReadFile /home/weilei/libxml2-0f84ee239f2fb076231386f1d1ce8dec541df28e/HTMLparser.c:6799 #9 0x8059367 in parseAndPrintFile /home/weilei/libxml2-0f84ee239f2fb076231386f1d1ce8dec541df28e/xmllint.c:2248 #10 0x8050fdd in main /home/weilei/libxml2-0f84ee239f2fb076231386f1d1ce8dec541df28e/xmllint.c:3759 #11 0xb5f90a82 in __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287 #12 0x80553e0 in _start (/home/weilei/fuzz_home/_xml_fuzz/bin/xmllint+0x80553e0) 0xb4a00118 is located 24 bytes inside of 8194-byte region [0xb4a00100,0xb4a02102) freed by thread T0 here: #0 0xb61829b4 in __interceptor_realloc (/usr/lib/i386-linux-gnu/libasan.so.0+0x169b4) #1 0x85159bf in xmlBufGrowInternal /home/weilei/libxml2-0f84ee239f2fb076231386f1d1ce8dec541df28e/buf.c:486 #2 0x85159bf in xmlBufGrow /home/weilei/libxml2-0f84ee239f2fb076231386f1d1ce8dec541df28e/buf.c:515 previously allocated by thread T0 here: #0 0xb6182854 in malloc (/usr/lib/i386-linux-gnu/libasan.so.0+0x16854) #1 0x851230c in xmlBufCreateSize /home/weilei/libxml2-0f84ee239f2fb076231386f1d1ce8dec541df28e/buf.c:172 SUMMARY: AddressSanitizer: heap-use-after-free /usr/include/i386-linux-gnu/bits/string3.h:51 ?? Shadow bytes around the buggy address: 0x3693ffd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x3693ffe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x3693fff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x36940000: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x36940010: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa =>0x36940020: fd fd fd[fd]fd fd fd fd fd fd fd fd fd fd fd fd 0x36940030: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x36940040: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x36940050: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x36940060: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x36940070: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Heap righ redzone: fb Freed Heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack partial redzone: f4 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 ASan internal: fe ==13334== ABORTING Aborted
(In reply to Wei Lei from comment #3) > Hi, thanks for the work ! > > A parallel bug exists in htmlParseSystemLiteral(), should be reproducible by > changing the string "PUBLIC" in the crasher to "SYSTEM". This seems not > touched by the patch. Please let me know if you cannot reproduce, I will get > the original crasher for this bug. > > There could be other vectors other tham these two though. Thanks! Working on the fix to SYSTEM as well. We are able to reproduce with the alternate test case. I'm not sure we know the code base well enough to find every instance of this code pattern. Maybe Daniel knows?
Created attachment 323287 [details] [review] Proposed Patch v2 Adds test case and fix for htmlParseSystemLiteral().
Comment on attachment 323287 [details] [review] Proposed Patch v2 >From b34b8dc69637f0a2b6c1bad533dd01b37e86efba Mon Sep 17 00:00:00 2001 >From: Pranjal Jumde <pjumde@apple.com> >Date: Wed, 2 Mar 2016 15:52:24 -0800 >Subject: [PATCH] Bug 760263: Heap use-after-free in htmlParsePubidLiteral and > htmlParseSystemiteral <https://bugzilla.gnome.org/show_bug.cgi?id=760263> Typo: "htmlParseSystemiteral" => "htmlParseSystemLiteral"
David, Pranjal, when applying patch v2 on to of the current head I'm having issues with "make check" as the errors I get don't seems to be stable: ## SAX2 callbacks regression tests ## XML push regression tests ## HTML regression tests Result for ./test/HTML/760263-2.html failed File ./test/HTML/760263-2.html generated an error Result for ./test/HTML/760263.html failed File ./test/HTML/760263.html generated an error ## Push HTML regression tests ## HTML SAX regression tests Got a difference for ./test/HTML/760263-2.html File ./test/HTML/760263-2.html generated an error Got a difference for ./test/HTML/760263.html File ./test/HTML/760263.html generated an error ## Valid documents regression tests ## Validity checking regression tests ## Streaming validity checking regression tests I don't really understand where this comes from ATM: thinkpad2:~/XML -> ./xmllint --html ./test/HTML/760263-2.html > res 2> res2 thinkpad2:~/XML -> diff res result/HTML/760263-2.html thinkpad2:~/XML -> diff res2 result/HTML/760263-2.html.err thinkpad2:~/XML -> ./xmllint --push --html ./test/HTML/760263-2.html > res 2> res2 thinkpad2:~/XML -> diff res result/HTML/760263-2.html thinkpad2:~/XML -> diff res2 result/HTML/760263-2.html.err thinkpad2:~/XML -> If you could look at this this week, I would rather have the tests when pushing the new release next w.e. thanks ! Daniel
(In reply to Daniel Veillard from comment #9) > David, Pranjal, > > when applying patch v2 on to of the current head I'm having issues with > "make check" as the errors I get don't seems to be stable: > > ## SAX2 callbacks regression tests > ## XML push regression tests > ## HTML regression tests > Result for ./test/HTML/760263-2.html failed > File ./test/HTML/760263-2.html generated an error > Result for ./test/HTML/760263.html failed > File ./test/HTML/760263.html generated an error > ## Push HTML regression tests > ## HTML SAX regression tests > Got a difference for ./test/HTML/760263-2.html > File ./test/HTML/760263-2.html generated an error > Got a difference for ./test/HTML/760263.html > File ./test/HTML/760263.html generated an error > ## Valid documents regression tests > ## Validity checking regression tests > ## Streaming validity checking regression tests > > > I don't really understand where this comes from ATM: > thinkpad2:~/XML -> ./xmllint --html ./test/HTML/760263-2.html > res 2> res2 > thinkpad2:~/XML -> diff res result/HTML/760263-2.html > thinkpad2:~/XML -> diff res2 result/HTML/760263-2.html.err > thinkpad2:~/XML -> ./xmllint --push --html ./test/HTML/760263-2.html > res > 2> res2 > thinkpad2:~/XML -> diff res result/HTML/760263-2.html > thinkpad2:~/XML -> diff res2 result/HTML/760263-2.html.err > thinkpad2:~/XML -> > > > If you could look at this this week, I would rather have the tests > when pushing the new release next w.e. When developing these fixes, we have an internal tree that we commit the fixes to as we go, so it's possible that these test results depend on some previous fix. In addition to that, though, I do try to build and test each patch individually on a plain libxml2.git repository, so I'm a little surprised these tests are causing issues. One thing you might try is to apply the "runtest -u" patch for Bug 611807 which creates more definitive test results than xmllint (since it just tells runtest to always write out its results when run with the "-u" switch). Then compare the results to xmllint, or (better yet) simply run "runtests -u" once, then "runtests" to make sure the results match for all the various parsing modes. I suppose it's possible that high-bit characters could have been mangled somehow in the patch. I can try sending you a zip file with the patch if that would help.
Tried that that didn't work: thinkpad2:~/XML -> find result -name 760263\* -exec rm {} \; thinkpad2:~/XML -> ./runtest -u ## XML regression tests ## XML regression tests on memory ## XML entity subst regression tests ## XML Namespaces regression tests ## Error cases regression tests ## Error cases stream regression tests ## Reader regression tests ## Reader entities substitution regression tests ## Reader on memory regression tests ## Walker regression tests ## SAX1 callbacks regression tests ## SAX2 callbacks regression tests ## XML push regression tests ## HTML regression tests ## Push HTML regression tests ## HTML SAX regression tests ## Valid documents regression tests ## Validity checking regression tests ## Streaming validity checking regression tests ## Streaming validity error checking regression tests ## General documents valid regression tests ## XInclude regression tests ## XInclude xmlReader regression tests ## XInclude regression tests stripping include nodes ## XInclude xmlReader regression tests stripping include nodes ## XPath expressions regression tests ## XPath document queries regression tests ## XPointer document queries regression tests ## xml:id regression tests ## URI parsing tests ## URI base composition tests ## Path URI conversion tests ## Schemas regression tests ## Relax-NG regression tests ## Relax-NG streaming regression tests ## Pattern regression tests ## C14N with comments regression tests ## C14N without comments regression tests ## C14N exclusive without comments regression tests ## C14N 1.1 without comments regression tests ## Catalog and Threads regression tests Total 2963 tests, no errors thinkpad2:~/XML -> ./runtest ## XML regression tests ## XML regression tests on memory ## XML entity subst regression tests ## XML Namespaces regression tests ## Error cases regression tests ## Error cases stream regression tests ## Reader regression tests ## Reader entities substitution regression tests ## Reader on memory regression tests ## Walker regression tests ## SAX1 callbacks regression tests ## SAX2 callbacks regression tests ## XML push regression tests ## HTML regression tests Result for ./test/HTML/760263-2.html failed File ./test/HTML/760263-2.html generated an error Result for ./test/HTML/760263.html failed File ./test/HTML/760263.html generated an error ## Push HTML regression tests .... Total 2963 tests, 2 errors, 0 leaks thinkpad2:~/XML -> I added more debugging to runtest : ## HTML regression tests file result/HTML/760263-2.html is 64037 bytes, result is 127958 bytes Result for ./test/HTML/760263-2.html failed in result/HTML/760263-2.html File ./test/HTML/760263-2.html generated an error file result/HTML/760263.html is 64038 bytes, result is 127959 bytes Result for ./test/HTML/760263.html failed in result/HTML/760263.html File ./test/HTML/760263.html generated an error ## Push HTML regression tests ## Push HTML regression tests ## HTML SAX regression tests ## Valid documents regression tests I'm puzzled because Push test works, both generate their output from the built tree using htmlDocDumpMemory() but in the case of non push the size returned is 127958 instead of 64037 bytes. I looked at the content, the output is correct except padded with zeroes after byte 64037 . I think ./runtest -u writes the file twice, once in normal mode, then in push mode which is why the final content of the result file is correct. If I don't find out the culprit I will push the patch without the added tests for the release, and that will need to be sorted out asynchronously. at least that doesn't sound a security issue. Daniel
Fixed by commit 11ed4a7a90d5ce156a18980a4ad4e53e77384852: <https://git.gnome.org/browse/libxml2/commit/?id=11ed4a7a90d5ce156a18980a4ad4e53e77384852>
Removing security group as commit is public