After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 760263 - (CVE-2016-1837) Heap use-after-free in htmlParsePubidLiteral and htmlParseSystemLiteral
(CVE-2016-1837)
Heap use-after-free in htmlParsePubidLiteral and htmlParseSystemLiteral
Status: RESOLVED FIXED
Product: libxml2
Classification: Platform
Component: general
git master
Other All
: Normal normal
: ---
Assigned To: Daniel Veillard
libxml QA maintainers
Depends on:
Blocks:
 
 
Reported: 2016-01-07 11:40 UTC by Wei Lei
Modified: 2016-08-22 20:24 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Minimized crasher (62.50 KB, text/html)
2016-01-07 11:40 UTC, Wei Lei
  Details
Proposed Patch v1 (191.91 KB, patch)
2016-03-03 00:10 UTC, David Kilzer
none Details | Review
Minimized crasher for the parallel bug (62.50 KB, text/html)
2016-03-03 01:59 UTC, Wei Lei
  Details
Proposed Patch v2 (382.78 KB, patch)
2016-03-07 14:54 UTC, David Kilzer
none Details | Review

Description Wei Lei 2016-01-07 11:40:49 UTC
Created attachment 318411 [details]
Minimized crasher

Hi,

The following heap use-after-free can be observed on an ASAN build of Libxml2 stable release 2.9.3:

$ ./xmllint --html c00_id00_min.html

=================================================================
==22783== ERROR: AddressSanitizer: heap-use-after-free on address 0xb4900116 at pc 0x8507fac bp 0xbfb10308 sp 0xbfb102fc
READ of size 1 at 0xb4900116 thread T0
    #0 0x8507fab in memcpy /usr/include/i386-linux-gnu/bits/string3.h:51
    #1 0x8507fab in xmlStrndup ~/libxml2-2.9.3/xmlstring.c:50
    #2 0x804d369 in htmlParsePubidLiteral ~/libxml2-2.9.3/HTMLparser.c:2841
    #3 0x804d369 in htmlParseExternalID ~/libxml2-2.9.3/HTMLparser.c:3108
    #4 0x804d369 in htmlParseDocTypeDecl ~/libxml2-2.9.3/HTMLparser.c:3439
    #5 0x83153bf in htmlParseContentInternal ~/libxml2-2.9.3/HTMLparser.c:4585
    #6 0x831d5ae in htmlParseDocument ~/libxml2-2.9.3/HTMLparser.c:4769
    #7 0x833c8d6 in htmlDoRead ~/libxml2-2.9.3/HTMLparser.c:6741
    #8 0x833c8d6 in htmlReadFile ~/libxml2-2.9.3/HTMLparser.c:6799
    #9 0x80592f5 in parseAndPrintFile ~/libxml2-2.9.3/xmllint.c:2248
    #10 0x8051352 in main ~/libxml2-2.9.3/xmllint.c:3759
    #11 0xb5f13a82 in __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287
    #12 0x80555c0 in _start (~/libxml2-2.9.3/libxml293/bin/xmllint+0x80555c0)
0xb4900116 is located 22 bytes inside of 8194-byte region [0xb4900100,0xb4902102)
freed by thread T0 here:
    #0 0xb61059b4 in __interceptor_realloc (/usr/lib/i386-linux-gnu/libasan.so.0+0x169b4)
    #1 0x85150b2 in xmlBufGrowInternal ~/libxml2-2.9.3/buf.c:486
    #2 0x85150b2 in xmlBufGrow ~/libxml2-2.9.3/buf.c:515
previously allocated by thread T0 here:
    #0 0xb6105854 in malloc (/usr/lib/i386-linux-gnu/libasan.so.0+0x16854)
    #1 0x8511a90 in xmlBufCreateSize ~/libxml2-2.9.3/buf.c:172
SUMMARY: AddressSanitizer: heap-use-after-free /usr/include/i386-linux-gnu/bits/string3.h:51 ??
Shadow bytes around the buggy address:
  0x3691ffd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x3691ffe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x3691fff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x36920000: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x36920010: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x36920020: fd fd[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x36920030: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x36920040: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x36920050: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x36920060: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x36920070: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:     fa
  Heap righ redzone:     fb
  Freed Heap region:     fd
  Stack left redzone:    f1
  Stack mid redzone:     f2
  Stack right redzone:   f3
  Stack partial redzone: f4
  Stack after return:    f5
  Stack use after scope: f8
  Global redzone:        f9
  Global init order:     f6
  Poisoned by user:      f7
  ASan internal:         fe
==22783== ABORTING
Aborted

Found with american fuzzy lop, written by lcamtuf.

The Libxml2 HTML parser is used by PHP in DOMDocument::load(), but it does not seem to affect the current PHP release in so far what was tested. This particular issue seems is limited to a denial of service in ASAN build only. A minimized crasher that works for Libxml2-2.9.3 ASAN build is attached as c00_id00_min.html.
Comment 1 Wei Lei 2016-01-07 15:58:08 UTC
Also affects a similar path involving htmlParseSystemLiteral, with info-discl capacity. Please restrict the report, thanks !
Comment 2 David Kilzer 2016-03-03 00:10:21 UTC
Created attachment 322923 [details] [review]
Proposed Patch v1
Comment 3 Wei Lei 2016-03-03 00:57:22 UTC
Hi, thanks for the work !

A parallel bug exists in htmlParseSystemLiteral(), should be reproducible by changing the string "PUBLIC" in the crasher to "SYSTEM". This seems not touched by the patch. Please let me know if you cannot reproduce, I will get the original crasher for this bug.

There could be other vectors other tham these two though.
Comment 4 Wei Lei 2016-03-03 01:59:29 UTC
Created attachment 322935 [details]
Minimized crasher for the parallel bug

htmlParseSystemLiteral
Comment 5 Wei Lei 2016-03-03 02:01:19 UTC
$ $GITXML --html c03_id00_min.html 
c03_id00_min.html:1: HTML parser error : Misplaced DOCTYPE declaration
0<!DOCTYPE r000 SYSTEM '00000000000000000000000000000000000000000000000000000000
 ^
=================================================================
==13334== ERROR: AddressSanitizer: heap-use-after-free on address 0xb4a00118 at pc 0x850872c bp 0xbfde2c08 sp 0xbfde2bfc
READ of size 1 at 0xb4a00118 thread T0
    #0 0x850872b in memcpy /usr/include/i386-linux-gnu/bits/string3.h:51
    #1 0x850872b in xmlStrndup /home/weilei/libxml2-0f84ee239f2fb076231386f1d1ce8dec541df28e/xmlstring.c:50
    #2 0x804b9dd in htmlParseSystemLiteral /home/weilei/libxml2-0f84ee239f2fb076231386f1d1ce8dec541df28e/HTMLparser.c:2792
    #3 0x804c381 in htmlParseExternalID /home/weilei/libxml2-0f84ee239f2fb076231386f1d1ce8dec541df28e/HTMLparser.c:3094
    #4 0x804c381 in htmlParseDocTypeDecl /home/weilei/libxml2-0f84ee239f2fb076231386f1d1ce8dec541df28e/HTMLparser.c:3439
    #5 0x831458f in htmlParseContentInternal /home/weilei/libxml2-0f84ee239f2fb076231386f1d1ce8dec541df28e/HTMLparser.c:4585
    #6 0x831be86 in htmlParseDocument /home/weilei/libxml2-0f84ee239f2fb076231386f1d1ce8dec541df28e/HTMLparser.c:4769
    #7 0x833b2bb in htmlDoRead /home/weilei/libxml2-0f84ee239f2fb076231386f1d1ce8dec541df28e/HTMLparser.c:6741
    #8 0x833b2bb in htmlReadFile /home/weilei/libxml2-0f84ee239f2fb076231386f1d1ce8dec541df28e/HTMLparser.c:6799
    #9 0x8059367 in parseAndPrintFile /home/weilei/libxml2-0f84ee239f2fb076231386f1d1ce8dec541df28e/xmllint.c:2248
    #10 0x8050fdd in main /home/weilei/libxml2-0f84ee239f2fb076231386f1d1ce8dec541df28e/xmllint.c:3759
    #11 0xb5f90a82 in __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287
    #12 0x80553e0 in _start (/home/weilei/fuzz_home/_xml_fuzz/bin/xmllint+0x80553e0)
0xb4a00118 is located 24 bytes inside of 8194-byte region [0xb4a00100,0xb4a02102)
freed by thread T0 here:
    #0 0xb61829b4 in __interceptor_realloc (/usr/lib/i386-linux-gnu/libasan.so.0+0x169b4)
    #1 0x85159bf in xmlBufGrowInternal /home/weilei/libxml2-0f84ee239f2fb076231386f1d1ce8dec541df28e/buf.c:486
    #2 0x85159bf in xmlBufGrow /home/weilei/libxml2-0f84ee239f2fb076231386f1d1ce8dec541df28e/buf.c:515
previously allocated by thread T0 here:
    #0 0xb6182854 in malloc (/usr/lib/i386-linux-gnu/libasan.so.0+0x16854)
    #1 0x851230c in xmlBufCreateSize /home/weilei/libxml2-0f84ee239f2fb076231386f1d1ce8dec541df28e/buf.c:172
SUMMARY: AddressSanitizer: heap-use-after-free /usr/include/i386-linux-gnu/bits/string3.h:51 ??
Shadow bytes around the buggy address:
  0x3693ffd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x3693ffe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x3693fff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x36940000: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x36940010: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x36940020: fd fd fd[fd]fd fd fd fd fd fd fd fd fd fd fd fd
  0x36940030: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x36940040: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x36940050: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x36940060: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x36940070: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:     fa
  Heap righ redzone:     fb
  Freed Heap region:     fd
  Stack left redzone:    f1
  Stack mid redzone:     f2
  Stack right redzone:   f3
  Stack partial redzone: f4
  Stack after return:    f5
  Stack use after scope: f8
  Global redzone:        f9
  Global init order:     f6
  Poisoned by user:      f7
  ASan internal:         fe
==13334== ABORTING
Aborted
Comment 6 David Kilzer 2016-03-03 19:40:08 UTC
(In reply to Wei Lei from comment #3)
> Hi, thanks for the work !
> 
> A parallel bug exists in htmlParseSystemLiteral(), should be reproducible by
> changing the string "PUBLIC" in the crasher to "SYSTEM". This seems not
> touched by the patch. Please let me know if you cannot reproduce, I will get
> the original crasher for this bug.
> 
> There could be other vectors other tham these two though.

Thanks!  Working on the fix to SYSTEM as well.  We are able to reproduce with the alternate test case.

I'm not sure we know the code base well enough to find every instance of this code pattern.  Maybe Daniel knows?
Comment 7 David Kilzer 2016-03-07 14:54:33 UTC
Created attachment 323287 [details] [review]
Proposed Patch v2

Adds test case and fix for htmlParseSystemLiteral().
Comment 8 David Kilzer 2016-05-03 22:15:10 UTC
Comment on attachment 323287 [details] [review]
Proposed Patch v2

>From b34b8dc69637f0a2b6c1bad533dd01b37e86efba Mon Sep 17 00:00:00 2001
>From: Pranjal Jumde <pjumde@apple.com>
>Date: Wed, 2 Mar 2016 15:52:24 -0800
>Subject: [PATCH] Bug 760263: Heap use-after-free in htmlParsePubidLiteral and
> htmlParseSystemiteral <https://bugzilla.gnome.org/show_bug.cgi?id=760263>

Typo:  "htmlParseSystemiteral" => "htmlParseSystemLiteral"
Comment 9 Daniel Veillard 2016-05-09 02:05:56 UTC
David, Pranjal,

when applying patch v2 on to of the current head I'm having issues with
"make check" as the errors I get don't seems to be stable:

## SAX2 callbacks regression tests
## XML push regression tests
## HTML regression tests
Result for ./test/HTML/760263-2.html failed
File ./test/HTML/760263-2.html generated an error
Result for ./test/HTML/760263.html failed
File ./test/HTML/760263.html generated an error
## Push HTML regression tests
## HTML SAX regression tests
Got a difference for ./test/HTML/760263-2.html
File ./test/HTML/760263-2.html generated an error
Got a difference for ./test/HTML/760263.html
File ./test/HTML/760263.html generated an error
## Valid documents regression tests
## Validity checking regression tests
## Streaming validity checking regression tests


I don't really understand where this comes from ATM:
thinkpad2:~/XML -> ./xmllint --html ./test/HTML/760263-2.html > res 2> res2
thinkpad2:~/XML -> diff res result/HTML/760263-2.html
thinkpad2:~/XML -> diff res2 result/HTML/760263-2.html.err
thinkpad2:~/XML -> ./xmllint --push --html ./test/HTML/760263-2.html > res 2> res2
thinkpad2:~/XML -> diff res result/HTML/760263-2.html
thinkpad2:~/XML -> diff res2 result/HTML/760263-2.html.err
thinkpad2:~/XML -> 


  If you could look at this this week, I would rather have the tests
when pushing the new release next w.e.

   thanks !

Daniel
Comment 10 David Kilzer 2016-05-17 18:02:29 UTC
(In reply to Daniel Veillard from comment #9)
> David, Pranjal,
> 
> when applying patch v2 on to of the current head I'm having issues with
> "make check" as the errors I get don't seems to be stable:
> 
> ## SAX2 callbacks regression tests
> ## XML push regression tests
> ## HTML regression tests
> Result for ./test/HTML/760263-2.html failed
> File ./test/HTML/760263-2.html generated an error
> Result for ./test/HTML/760263.html failed
> File ./test/HTML/760263.html generated an error
> ## Push HTML regression tests
> ## HTML SAX regression tests
> Got a difference for ./test/HTML/760263-2.html
> File ./test/HTML/760263-2.html generated an error
> Got a difference for ./test/HTML/760263.html
> File ./test/HTML/760263.html generated an error
> ## Valid documents regression tests
> ## Validity checking regression tests
> ## Streaming validity checking regression tests
> 
> 
> I don't really understand where this comes from ATM:
> thinkpad2:~/XML -> ./xmllint --html ./test/HTML/760263-2.html > res 2> res2
> thinkpad2:~/XML -> diff res result/HTML/760263-2.html
> thinkpad2:~/XML -> diff res2 result/HTML/760263-2.html.err
> thinkpad2:~/XML -> ./xmllint --push --html ./test/HTML/760263-2.html > res
> 2> res2
> thinkpad2:~/XML -> diff res result/HTML/760263-2.html
> thinkpad2:~/XML -> diff res2 result/HTML/760263-2.html.err
> thinkpad2:~/XML -> 
> 
> 
>   If you could look at this this week, I would rather have the tests
> when pushing the new release next w.e.

When developing these fixes, we have an internal tree that we commit the fixes to as we go, so it's possible that these test results depend on some previous fix.

In addition to that, though, I do try to build and test each patch individually on a plain libxml2.git repository, so I'm a little surprised these tests are causing issues.

One thing you might try is to apply the "runtest -u" patch for Bug 611807 which creates more definitive test results than xmllint (since it just tells runtest to always write out its results when run with the "-u" switch).  Then compare the results to xmllint, or (better yet) simply run "runtests -u" once, then "runtests" to make sure the results match for all the various parsing modes.

I suppose it's possible that high-bit characters could have been mangled somehow in the patch.  I can try sending you a zip file with the patch if that would help.
Comment 11 Daniel Veillard 2016-05-22 03:39:31 UTC
Tried that that didn't work:

thinkpad2:~/XML -> find result -name 760263\* -exec rm {} \;
thinkpad2:~/XML -> ./runtest -u
## XML regression tests
## XML regression tests on memory
## XML entity subst regression tests
## XML Namespaces regression tests
## Error cases regression tests
## Error cases stream regression tests
## Reader regression tests
## Reader entities substitution regression tests
## Reader on memory regression tests
## Walker regression tests
## SAX1 callbacks regression tests
## SAX2 callbacks regression tests
## XML push regression tests
## HTML regression tests
## Push HTML regression tests
## HTML SAX regression tests
## Valid documents regression tests
## Validity checking regression tests
## Streaming validity checking regression tests
## Streaming validity error checking regression tests
## General documents valid regression tests
## XInclude regression tests
## XInclude xmlReader regression tests
## XInclude regression tests stripping include nodes
## XInclude xmlReader regression tests stripping include nodes
## XPath expressions regression tests
## XPath document queries regression tests
## XPointer document queries regression tests
## xml:id regression tests
## URI parsing tests
## URI base composition tests
## Path URI conversion tests
## Schemas regression tests
## Relax-NG regression tests
## Relax-NG streaming regression tests
## Pattern regression tests
## C14N with comments regression tests
## C14N without comments regression tests
## C14N exclusive without comments regression tests
## C14N 1.1 without comments regression tests
## Catalog and Threads regression tests
Total 2963 tests, no errors
thinkpad2:~/XML -> ./runtest 
## XML regression tests
## XML regression tests on memory
## XML entity subst regression tests
## XML Namespaces regression tests
## Error cases regression tests
## Error cases stream regression tests
## Reader regression tests
## Reader entities substitution regression tests
## Reader on memory regression tests
## Walker regression tests
## SAX1 callbacks regression tests
## SAX2 callbacks regression tests
## XML push regression tests
## HTML regression tests
Result for ./test/HTML/760263-2.html failed
File ./test/HTML/760263-2.html generated an error
Result for ./test/HTML/760263.html failed
File ./test/HTML/760263.html generated an error
## Push HTML regression tests
....
Total 2963 tests, 2 errors, 0 leaks
thinkpad2:~/XML ->

I added more debugging to runtest :

## HTML regression tests
file result/HTML/760263-2.html is 64037 bytes, result is 127958 bytes
Result for ./test/HTML/760263-2.html failed in result/HTML/760263-2.html
File ./test/HTML/760263-2.html generated an error
file result/HTML/760263.html is 64038 bytes, result is 127959 bytes
Result for ./test/HTML/760263.html failed in result/HTML/760263.html
File ./test/HTML/760263.html generated an error
## Push HTML regression tests
## Push HTML regression tests
## HTML SAX regression tests
## Valid documents regression tests

  I'm puzzled because Push test works, both generate their output from the
built tree using htmlDocDumpMemory() but in the case of non push the
size returned is 127958 instead of 64037 bytes.

  I looked at the content, the output is correct except padded with zeroes
after byte 64037 . I think ./runtest -u writes the file twice, once in normal
mode, then in push mode which is why the final content of the result file
is correct.

  If I don't find out the culprit I will push the patch without the added
tests for the release, and that will need to be sorted out asynchronously.
at least that doesn't sound a security issue.

Daniel
Comment 13 Olav Vitters 2016-08-22 20:24:06 UTC
Removing security group as commit is public