After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 756263 - Buffer overead with XML parser in xmlNextChar, causes segfault when compiled with ASAN
Buffer overead with XML parser in xmlNextChar, causes segfault when compiled ...
Status: RESOLVED FIXED
Product: libxml2
Classification: Platform
Component: general
git master
Other Linux
: Normal normal
: ---
Assigned To: Daniel Veillard
libxml QA maintainers
Depends on:
Blocks:
 
 
Reported: 2015-10-08 21:12 UTC by Hugh Davenport
Modified: 2015-12-14 21:18 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Input file causing crash when compiled with ASAN (20 bytes, application/octet-stream)
2015-10-08 21:21 UTC, Hugh Davenport
  Details
Patch to fix (1.26 KB, patch)
2015-10-08 21:22 UTC, Hugh Davenport
none Details | Review

Description Hugh Davenport 2015-10-08 21:12:59 UTC
This was found on git master, commit cf77e60515045bdd66f2c59c69a06e603b470eae

When compiled with ASAN, it gives the following crash

 $ ./xmllint ~/fuzzing/xml-tmin/id\:000000*                                                                                                                          
/root/fuzzing/xml-tmin/id:000000,sig:06,src:002541,op:flip1,pos:19:1: parser error : Input is not proper UTF-8, indicate encoding !
Bytes: 0x80 0x5B 0x3C 0x21
<!DOCTYPE[<!ENTITYJ
         ^
/root/fuzzing/xml-tmin/id:000000,sig:06,src:002541,op:flip1,pos:19:1: parser error : xmlParseDocTypeDecl : no DOCTYPE name !
<!DOCTYPE[<!ENTITYJ
         ^
/root/fuzzing/xml-tmin/id:000000,sig:06,src:002541,op:flip1,pos:19:1: parser error : DOCTYPE improperly terminated
<!DOCTYPE[<!ENTITYJ
         ^
/root/fuzzing/xml-tmin/id:000000,sig:06,src:002541,op:flip1,pos:19:1: parser error : Space required after '<!ENTITY'
<!DOCTYPE[<!ENTITYJ
                   ^
/root/fuzzing/xml-tmin/id:000000,sig:06,src:002541,op:flip1,pos:19:1: parser error : Space required after the entity name
<!DOCTYPE[<!ENTITYJ
                    ^
/root/fuzzing/xml-tmin/id:000000,sig:06,src:002541,op:flip1,pos:19:1: parser error : Entity value required
<!DOCTYPE[<!ENTITYJ
                    ^
/root/fuzzing/xml-tmin/id:000000,sig:06,src:002541,op:flip1,pos:19:1: parser error : Space required before 'NDATA'
<!DOCTYPE[<!ENTITYJ
                    ^
/root/fuzzing/xml-tmin/id:000000,sig:06,src:002541,op:flip1,pos:19:1: parser error : xmlParseEntityDecl: entity J not terminated
<!DOCTYPE[<!ENTITYJ
                    ^
/root/fuzzing/xml-tmin/id:000000,sig:06,src:002541,op:flip1,pos:19:1: parser error : internal error: xmlParseInternalSubset: error detected in Markup declaration


^
/root/fuzzing/xml-tmin/id:000000,sig:06,src:002541,op:flip1,pos:19:1: parser error : DOCTYPE improperly terminated

^
=================================================================
==18412== ERROR: AddressSanitizer: global-buffer-overflow on address 0x000000c136a1 at pc 0x457370 bp 0x7ffd5ee33770 sp 0x7ffd5ee33768
READ of size 1 at 0x000000c136a1 thread T0
    #0 0x45736f in xmlNextChar /root/libxml2/parserInternals.c:535
    #1 0x4fd936 in xmlParseInternalSubset /root/libxml2/parser.c:8447
    #2 0x52717b in xmlParseDocument /root/libxml2/parser.c:10836
    #3 0x55b5b6 in xmlDoRead /root/libxml2/parser.c:15324
    #4 0x55b5b6 in xmlReadFile /root/libxml2/parser.c:15386
    #5 0x4183b7 in parseAndPrintFile /root/libxml2/xmllint.c:2401
    #6 0x40de46 in main /root/libxml2/xmllint.c:3759
    #7 0x7f920fcc7ec4 (/lib/x86_64-linux-gnu/libc.so.6+0x21ec4)
    #8 0x40fad9 in _start (/root/libxml2/xmllint+0x40fad9)
0x000000c136a1 is located 0 bytes to the right of global variable '*.LC25 (parser.c)' (0xc136a0) of size 1
  '*.LC25 (parser.c)' is ascii string ''
0x000000c136a1 is located 63 bytes to the left of global variable '*.LC26 (parser.c)' (0xc136e0) of size 40
  '*.LC26 (parser.c)' is ascii string 'Name %s is not XML Namespace compliant
'
SUMMARY: AddressSanitizer: global-buffer-overflow /root/libxml2/parserInternals.c:529 xmlNextChar
Shadow bytes around the buggy address:
  0x00008017a680: f9 f9 f9 f9 00 00 00 00 02 f9 f9 f9 f9 f9 f9 f9
  0x00008017a690: 00 00 00 07 f9 f9 f9 f9 07 f9 f9 f9 f9 f9 f9 f9
  0x00008017a6a0: 00 00 00 00 00 00 00 03 f9 f9 f9 f9 00 00 02 f9
  0x00008017a6b0: f9 f9 f9 f9 00 07 f9 f9 f9 f9 f9 f9 00 00 02 f9
  0x00008017a6c0: f9 f9 f9 f9 00 01 f9 f9 f9 f9 f9 f9 00 00 00 02
=>0x00008017a6d0: f9 f9 f9 f9[01]f9 f9 f9 f9 f9 f9 f9 00 00 00 00
  0x00008017a6e0: 00 f9 f9 f9 f9 f9 f9 f9 00 02 f9 f9 f9 f9 f9 f9
  0x00008017a6f0: 07 f9 f9 f9 f9 f9 f9 f9 06 f9 f9 f9 f9 f9 f9 f9
  0x00008017a700: 00 00 00 00 00 00 f9 f9 f9 f9 f9 f9 05 f9 f9 f9
  0x00008017a710: f9 f9 f9 f9 00 00 00 01 f9 f9 f9 f9 06 f9 f9 f9
  0x00008017a720: f9 f9 f9 f9 00 00 00 00 02 f9 f9 f9 f9 f9 f9 f9
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:     fa
  Heap righ redzone:     fb
  Freed Heap region:     fd
  Stack left redzone:    f1
  Stack mid redzone:     f2
  Stack right redzone:   f3
  Stack partial redzone: f4
  Stack after return:    f5
  Stack use after scope: f8
  Global redzone:        f9
  Global init order:     f6
  Poisoned by user:      f7
  ASan internal:         fe
==18412== ABORTING
Aborted

The testfile I used was generated with afl-fuzz, and then minimized. It's hex representation is as follows:

 $ xxd -g 1 ~/fuzzing/xml-tmin/id\:000000*
0000000: 3c 21 44 4f 43 54 59 50 45 80 5b 3c 21 45 4e 54  <!DOCTYPE.[<!ENT
0000010: 49 54 59 4a                                      ITYJ


I git bisected it back to the following commit, which is meant to stop the parser on an unterminated entity

commit a7dfab7411cbf545f359dd3157e5df1eb0e7ce31
Author: Daniel Veillard <veillard@redhat.com>
Date:   Mon Feb 23 11:17:35 2015 +0800

    Stop parsing on entities boundaries errors
    
    For https://bugzilla.gnome.org/show_bug.cgi?id=744980
    
    There are times, like on unterminated entities that it's preferable to
    stop parsing, even if that means less error reporting. Entities are
    feeding the parser on further processing, and if they are ill defined
    then it's possible to get the parser to bug. Also do the same on
    Conditional Sections if the input is broken, as the structure of
    the document can't be guessed.

diff --git a/parser.c b/parser.c
index a8d1b67..bbe97eb 100644
--- a/parser.c
+++ b/parser.c
@@ -5658,6 +5658,7 @@ xmlParseEntityDecl(xmlParserCtxtPtr ctxt) {
        if (RAW != '>') {
            xmlFatalErrMsgStr(ctxt, XML_ERR_ENTITY_NOT_FINISHED,
                    "xmlParseEntityDecl: entity %s not terminated\n", name);
+           xmlStopParser(ctxt);
        } else {
            if (input != ctxt->input) {
                xmlFatalErrMsg(ctxt, XML_ERR_ENTITY_BOUNDARY,


Running it under gdb, with a breakpoint on xmlStopParser, and then showing the backtrace there, and when it crashed:

 $ gdb -ex "break xmlStopParser" -ex "run" -ex "backtrace" -ex "continue" -ex "backtrace" --args ./xmllint ~/fuzzing/xml-tmin/id\:000000*                            
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./xmllint...done.
Breakpoint 1 at 0x4e6ae6: xmlStopParser. (5 locations)
Starting program: /root/libxml2/xmllint /root/fuzzing/xml-tmin/id:000000,sig:06,src:002541,op:flip1,pos:19
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
/root/fuzzing/xml-tmin/id:000000,sig:06,src:002541,op:flip1,pos:19:1: parser error : Input is not proper UTF-8, indicate encoding !
Bytes: 0x80 0x5B 0x3C 0x21
<!DOCTYPE[<!ENTITYJ
         ^
/root/fuzzing/xml-tmin/id:000000,sig:06,src:002541,op:flip1,pos:19:1: parser error : xmlParseDocTypeDecl : no DOCTYPE name !
<!DOCTYPE[<!ENTITYJ
         ^
/root/fuzzing/xml-tmin/id:000000,sig:06,src:002541,op:flip1,pos:19:1: parser error : DOCTYPE improperly terminated
<!DOCTYPE[<!ENTITYJ
         ^
/root/fuzzing/xml-tmin/id:000000,sig:06,src:002541,op:flip1,pos:19:1: parser error : Space required after '<!ENTITY'
<!DOCTYPE[<!ENTITYJ
                   ^
/root/fuzzing/xml-tmin/id:000000,sig:06,src:002541,op:flip1,pos:19:1: parser error : Space required after the entity name
<!DOCTYPE[<!ENTITYJ
                    ^
/root/fuzzing/xml-tmin/id:000000,sig:06,src:002541,op:flip1,pos:19:1: parser error : Entity value required
<!DOCTYPE[<!ENTITYJ
                    ^
/root/fuzzing/xml-tmin/id:000000,sig:06,src:002541,op:flip1,pos:19:1: parser error : Space required before 'NDATA'
<!DOCTYPE[<!ENTITYJ                                                                                                                                              [50/9271]
                    ^
/root/fuzzing/xml-tmin/id:000000,sig:06,src:002541,op:flip1,pos:19:1: parser error : xmlParseEntityDecl: entity J not terminated
<!DOCTYPE[<!ENTITYJ
                    ^

Breakpoint 1, xmlParseEntityDecl (ctxt=ctxt@entry=0x603c0000fc80) at parser.c:5672
5672                xmlStopParser(ctxt);
  • #0 xmlParseEntityDecl
    at parser.c line 5672
  • #1 xmlParseMarkupDecl
    at parser.c line 6954
  • #2 xmlParseInternalSubset
    at parser.c line 8420
  • #3 xmlParseDocument
    at parser.c line 10836
  • #4 xmlDoRead
    at parser.c line 15324
  • #5 xmlReadFile
    at parser.c line 15386
  • #6 parseAndPrintFile
    at xmllint.c line 2401
  • #7 main
    at xmllint.c line 3759
  • #0 __GI_raise
    at ../nptl/sysdeps/unix/sysv/linux/raise.c line 56
  • #1 __GI_abort
    at abort.c line 89
  • #2 ??
    from /usr/lib/x86_64-linux-gnu/libasan.so.0
  • #3 ??
    from /usr/lib/x86_64-linux-gnu/libasan.so.0
  • #4 ??
    from /usr/lib/x86_64-linux-gnu/libasan.so.0
  • #5 __asan_report_error
    from /usr/lib/x86_64-linux-gnu/libasan.so.0
  • #6 __asan_report_load1
    from /usr/lib/x86_64-linux-gnu/libasan.so.0
  • #7 xmlNextChar
    at parserInternals.c line 535
  • #8 xmlParseInternalSubset
    at parser.c line 8447
  • #9 xmlParseDocument
    at parser.c line 10836
  • #10 xmlDoRead
    at parser.c line 15324
  • #11 xmlReadFile
    at parser.c line 15386
  • #12 parseAndPrintFile
    at xmllint.c line 2401
  • #13 main
    at xmllint.c line 3759


This shows that there the parser is running through xmlParseMarkupDecl, and then instate is set to XML_PARSER_EOF (in xmlStopParser), but looking at the source for xmlParseMarkupDecl, on line 6990, the instate is changed to XML_PARSER_DTD. This then causes an issue when the macro NEXT is called twice more, the first time it will read EOF, then the second time it will overread the buffer into global memory space, causing a crash in ASAN. Without ASAN, there is potential to get input that could cause out of bounds memory to be returned to userspace through the use of libxml2, which could be used to cause denial of service attacks, or gain sensitive information.

The fix for this would be to check the instate in xmlParseMarkupDecl after it parses an entry, and if EOF is reached, then stop. A patch to do this is below:

diff --git a/parser.c b/parser.c
index a65e4cc..04b0e58 100644
--- a/parser.c
+++ b/parser.c
@@ -6970,6 +6970,15 @@ xmlParseMarkupDecl(xmlParserCtxtPtr ctxt) {
            xmlParsePI(ctxt);
        }
     }
+
+    /*
+     * If the parser eached end of file, then stop while we know it is failing.
+     * This stops the parser being put into a bad state later on when there is no
+     * more input.
+     */
+    if (ctxt->instate == XML_PARSER_EOF)
+        return;
+
     /*
      * This is only for internal subset. On external entities,
      * the replacement is done before parsing stage
Comment 1 Hugh Davenport 2015-10-08 21:21:50 UTC
Created attachment 312925 [details]
Input file causing crash when compiled with ASAN

Added minimized fuzzing input causing crash.
Comment 2 Hugh Davenport 2015-10-08 21:22:55 UTC
Created attachment 312926 [details] [review]
Patch to fix

Added a patch to fix this bug, as shown in the main description. For future bug reports, is this the best way to get you patches? Or should I push it to a public git server somewhere?

Cheers,

Hugh
Comment 3 Hugh Davenport 2015-10-08 21:26:02 UTC
Also, I couldn't see an option to specify this as a potential security issue. Let me know if there is a different process needed for further bugs (I've got a few in the pipeline).

Cheers,

Hugh
Comment 4 Hugh Davenport 2015-10-08 21:46:19 UTC
Also if someone could point me to documentation on how to write test cases, and I'll make a patch for a test case as well :D.

Cheers,

Hugh
Comment 5 Daniel Veillard 2015-10-09 16:28:36 UTC
Hi Hugh

sorry been busy all day with other stuff,
simplest is to make the bug accessible only to libxml2 developpers until
the analysis is made. At first glance the patch sounds good but I would
need to really get into the code.

For tests, well they take input in test/ subdirs and check for output
in results/ it;'s a bit raw but works, it can be a bit complex to generate

For future possible security bugs, I suppose you can toggle the "accessible
only to libxml2 developpers" flag when creating the bug,

  half past midnight, can't look at it today though,

    thanks !

Daniel
Comment 6 Hugh Davenport 2015-10-09 17:32:02 UTC
Cool thanks. Will do that for the next few. Will try to get another one in today (6am here ;-)).
Comment 7 Hugh Davenport 2015-10-09 18:20:02 UTC
Hmm, there isn't an option to set it only for libxml2 devs when creating a bug, even under advanced options.
Comment 8 Hugh Davenport 2015-10-26 04:16:22 UTC
Hey Daniel,

Just seeing whether you've had a chance to look at this yet. Is there anything I can do to help?

Cheers,

Hugh
Comment 9 Daniel Veillard 2015-11-03 12:44:43 UTC
  Hi Hugh,

no thanks, patch looks fine, applied and pushed:
https://git.gnome.org/browse/libxml2/commit/?id=ab2b9a93ff19cedde7befbf2fcc48c6e352b6cbe

  thanks a lot !

Daniel
Comment 10 Hugh Davenport 2015-11-04 03:03:00 UTC
Hi Daniel,

Thanks for that. Any chance this report can become public now?

I've also got another bug 756372 that is currently public. I've been meaning to file some more with patches as well, when I get a free moment! 

Cheers,

Hugh
Comment 11 Daniel Veillard 2015-11-04 03:14:26 UTC
yes you are right, for 756372 I have that as part of the issues I'm looking at
right now. Will travel until Friday, but hopefully can close that one too soon.

  thanks !

Daniel
Comment 12 Gaurav 2015-11-16 08:44:42 UTC
Hi Hugh,
===================================================================
The testfile I used was generated with afl-fuzz, and then minimized. It's hex representation is as follows:

 $ xxd -g 1 ~/fuzzing/xml-tmin/id\:000000*
0000000: 3c 21 44 4f 43 54 59 50 45 80 5b 3c 21 45 4e 54  <!DOCTYPE.[<!ENT
0000010: 49 54 59 4a                                      ITYJ
===================================================================
Could you tell me steps to generate testfiles using afl-fuzz.
Thanks.
Comment 13 Gaurav 2015-11-19 02:19:12 UTC
CVE Assigned - CVE-2015-8241
Comment 14 Hugh Davenport 2015-11-19 02:33:03 UTC
Hi Gaurav,

Sorry I didn't see your reply earlier.

I used a wrapper called afl-gcc as my CC environment variable, and also set AFL_USE_ASAN=1 (so that it compiles with ASAN). I then built libxml as usual.

I then ran the afl-fuzz tool like so:

afl-fuzz -i ~/fuzzing/libxml2/xmllint/in -o ~/fuzzing/libxml2/xmllint/out -- ~/builds/libxml2/xmllint @@

~/fuzzing/libxml2/xmllint/in is a directory that contains a small xml file, and the output of the fuzzing goes into ~/fuzzing/libxml2/xmllint/out/. The @@ is a placeholder for a file.

After running for a while, the tool found some "crashes", which were inputs that caused xmllint to produce a segfault. In this case, it only does the segfault when compiled with ASAN, but that shows a memory issue (in this case buffer-overread).

Hope that helps, thanks for the CVE number as well.

Cheers,

Hugh
Comment 15 Gaurav 2015-11-19 02:51:05 UTC
I want to know file in how you generated input files [~/fuzzing/libxml2/xmllint/in] ?
Comment 16 Hugh Davenport 2015-11-19 02:59:11 UTC
The file was:

<a b="c">d</a>

Which came from testcases/others/xml/small_document.xml in the afl tarball

I may have also had the output of xmllint --auto as well, which is 

<?xml version="1.0"?>
<info>abc</info>

Cheers,

Hugh
Comment 17 Gaurav 2015-11-19 03:05:25 UTC
Keeping test xml file at location ./afl_test/in/ & running below command outputs as below is there something wrong ?

afl-fuzz -i ./afl_test/in/ -o ./afl_test/out/ -- ./xmllint @@
afl-fuzz 1.95b by <lcamtuf@google.com>
[+] You have 40 CPU cores and 1 runnable tasks (utilization: 2%).
[+] Try parallel jobs - see docs/parallel_fuzzing.txt.
[*] Checking core_pattern...

[-] Hmm, your system is configured to send core dump notifications to an
    external utility. This will cause issues due to an extended delay
    between the fuzzed binary malfunctioning and this information being
    eventually relayed to the fuzzer via the standard waitpid() API.

    To avoid having crashes misinterpreted as hangs, please log in as root
    and temporarily modify /proc/sys/kernel/core_pattern, like so:

    echo core >/proc/sys/kernel/core_pattern

[-] PROGRAM ABORT : Pipe at the beginning of 'core_pattern'
         Location : check_crash_handling(), afl-fuzz.c:6933
Comment 18 Hugh Davenport 2015-11-19 03:23:26 UTC
Hi Gaurav,

Ah yes, you need a bit more setup first time you use afl. You kinda just have to run all the commands it suggests, then eventually it should start running and you should see a screen similar to http://lcamtuf.coredump.cx/afl/afl_screen.png

Cheers,

Hugh
Comment 19 Gsunde Orangen 2015-11-22 23:05:41 UTC
(In reply to Gaurav from comment #13)
> CVE Assigned - CVE-2015-8241
The release notes mention: "CVE-2015-8242 Buffer overead with HTML parser in push mode"
Which one is correct (8241 or 8242), or is there a different, second bug?
Comment 20 Gsunde Orangen 2015-11-22 23:31:59 UTC
Apologies - found that second one (CVE-2015-8242): bug 756372.
So I suggest that CVE-2015-8241 should be listed in the release notes for 2.9.3, too; thanks!
Gsunde
Comment 21 Gaurav 2015-11-23 00:38:32 UTC
Yes this CVE should be mentioned against commit id - https://git.gnome.org/browse/libxml2/commit/?id=ab2b9a93ff19cedde7befbf2fcc48c6e352b6cbe
Comment 22 Alexander Pyhalov 2015-12-14 14:21:53 UTC
After applying this patch to libxml2 2.9.2 I see the following regression during tests:

Ran 2269 tests, 16 errors, 0 leaks
Total 2269 tests, 16 errors, 0 leaks
See runxmlconf.log for detailed output

diff to previous (successfull runxmlconf.log):

diff -u  ~/srcs/tests/oi-userland/components/libxml2/build/i86/runxmlconf.log  build/i86/runxmlconf.log 
--- /export/home/alp/srcs/tests/oi-userland/components/libxml2/build/i86/runxmlconf.log 2015-12-14 16:49:35.711325386 +0300
+++ build/i86/runxmlconf.log    2015-12-14 17:16:51.471941259 +0300
@@ -13,6 +13,9 @@
 
 
 ------------
+test valid-not-sa-004 : xmlconf/xmltest//valid/not-sa/004.xml failed to validate a valid document
+
+------------
 Skipping error test pr-xml-euc-jp 
 
 ------------
Comment 23 Hugh Davenport 2015-12-14 21:15:56 UTC
Hi,

I get 15 errors, but I don't think this is a regression from 2.9.2 as I get the same on a clean checkout of 2.9.2 without this patch, and no change (still errors) with the patch.

Cheers,

Hugh
Comment 24 Alexander Pyhalov 2015-12-14 21:18:08 UTC
Hi, with 2.9.3 I see 15 errors. With 2.9.2 and this patch - 16. As it's not reproducible in the latest version, it's not so important.