After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 321545 - xmlParseDocument crashes on deeply-nested xml
xmlParseDocument crashes on deeply-nested xml
Status: RESOLVED FIXED
Product: libxml2
Classification: Platform
Component: general
2.6.22
Other All
: Normal normal
: ---
Assigned To: Daniel Veillard
libxml QA maintainers
Depends on:
Blocks:
 
 
Reported: 2005-11-15 20:13 UTC by Ben Darnell
Modified: 2006-09-19 12:45 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Ben Darnell 2005-11-15 20:13:41 UTC
Please describe the problem:
On deeply-nested xml (say a 50KB file of nothing but "<p><p><p>..."),
xmlParseDocument overflows the stack and crashes due to recursive calls to
xmlParseElement and xmlParseContent.  The xmlParserMaxDepth limit is not
enforced on this code path.

Steps to reproduce:
#include <stdio.h>
#include <string.h>
#include <sys/resource.h>
#include <unistd.h>
#include "libxml/parser.h"
#include "libxml/parserInternals.h"

static void SetStackLimit(int limit) {
  struct rlimit stack_rlimit;
  getrlimit(RLIMIT_STACK, &stack_rlimit);
  if (stack_rlimit.rlim_cur > limit) {
    printf("limiting stack size to %d\n", limit);
    stack_rlimit.rlim_cur = limit;
    setrlimit(RLIMIT_STACK, &stack_rlimit);
  } else {
    printf("current stack limited to %d bytes, which is smaller than %d\n",
           (int)stack_rlimit.rlim_cur, limit);
  }
}

const char* CreateInputData(int depth) {
  char *result = malloc(depth * 3 + 1);
  int i;
  for (i = 0; i < depth * 3; i+=3) {
    result[i] = '<';
    result[i+1] = 'p';
    result[i+2] = '>';
  }
  result[depth * 3] = '\0';
  return result;
}

void Parse(const char *input_data) {
  xmlParserCtxt *ctxt = xmlCreateMemoryParserCtxt(
      input_data, strlen(input_data));
  xmlParseDocument(ctxt);
}

int main() {
  SetStackLimit(2048 * 1024);

  const char *input_data = CreateInputData(16000);

  Parse(input_data);

  return 0;
}


Actual results:
The test program segfaults (on linux/x86).

Expected results:
The test program should complete successfully.

Does this happen every time?
Yes.

Other information:
This patch fixes the problem, although I don't know if it's the best solution:

--- libxml2-2.6.22.orig/parser.c        2005-11-15 12:06:15.000000000 -0800
+++ libxml2-2.6.22/parser.c     2005-11-15 12:10:32.000000000 -0800
@@ -8414,6 +8414,14 @@
     xmlNodePtr ret;
     int nsNr = ctxt->nsNr;

+    if ((unsigned int) ctxt->nameNr > xmlParserMaxDepth) {
+        xmlFatalErrMsgInt(ctxt, XML_ERR_INTERNAL_ERROR,
+                          "Excessive depth in document: change
xmlParserMaxDepth = %d\n",
+                          xmlParserMaxDepth);
+        ctxt->instate = XML_PARSER_EOF;
+        return;
+    }
+
     /* Capture start position */
     if (ctxt->record_info) {
         node_info.begin_pos = ctxt->input->consumed +
Comment 1 Ben Darnell 2005-11-15 20:23:02 UTC
> Expected results:
> The test program should complete successfully.

By "successfully", of course I mean "run to completion without dying" - it
should still fail to parse this file when it hits the depth limit.
Comment 2 Ben Darnell 2005-12-07 08:26:40 UTC
The patch I proposed for parser.c doesn't work - for some inputs the parser will
go into an infinite loop (minimal example: thousands of "<p>"s followed by
"&#8127;")
Comment 3 Daniel Veillard 2005-12-07 08:50:52 UTC
I'm a bit busy ATM, this looks like a potential serious problem though.
I will try to get this fixed during at the end of the month. I hope it
can wait till then.

Daniel
Comment 4 Ben Darnell 2006-09-18 19:45:47 UTC
Ping.  Any updates on this?  I have a workaround that has been working
for me, but it would be nice to have an official fix.  My workaround
is a small modification to the patch I proposed last time (diff is to
2.6.26 version of parser.c).  Note that I'm not sure that this patch
is completely correct, only that it's been working for me.

--- parser.c.orig       2006-09-12 18:32:47.000000000 -0700
+++ parser.c    2006-09-12 18:26:06.000000000 -0700
@@ -8349,7 +8349,8 @@
 xmlParseContent(xmlParserCtxtPtr ctxt) {
    GROW;
    while ((RAW != 0) &&
-          ((RAW != '<') || (NXT(1) != '/'))) {
+          ((RAW != '<') || (NXT(1) != '/')) &&
+           (ctxt->instate != XML_PARSER_EOF)) {
       const xmlChar *test = CUR_PTR;
       unsigned int cons = ctxt->input->consumed;
       const xmlChar *cur = ctxt->input->cur;
@@ -8442,6 +8443,14 @@
    xmlNodePtr ret;
    int nsNr = ctxt->nsNr;

+    if ((unsigned int) ctxt->nameNr > xmlParserMaxDepth) {
+        xmlFatalErrMsgInt(ctxt, XML_ERR_INTERNAL_ERROR,
+                          "Excessive depth in document: change
xmlParserMaxDepth = %d\n",
+                          xmlParserMaxDepth);
+        ctxt->instate = XML_PARSER_EOF;
+        return;
+    }
+
    /* Capture start position */
    if (ctxt->record_info) {
        node_info.begin_pos = ctxt->input->consumed +

Thanks,
-Ben
Comment 5 Daniel Veillard 2006-09-19 12:45:15 UTC
I think that got fixed in some ways, or I'm unable to reproduce with
the current CVS head:

(gdb) r
Starting program: /u/veillard/XML/tst
Reading symbols from shared object read from target memory...done.
Loaded system supplied DSO at 0xd41000
[Thread debugging using libthread_db enabled]
[New Thread -1209034544 (LWP 24537)]
limiting stack size to 2097152
Entity: line 1: parser error : Excessive depth in document: change xmlParserMaxDepth = 1024
<p><p><p><p><p><p><p><p><p><p><p><p><p><p><p><p><p><p><p><p><p><p><p><p><p><p><p                                                                               ^
Program exited normally.
(gdb)

I don't see a segfault myself, hum ... I also tried to compile and link against
the installed library on Fedora Core (Linux/i386) without seing the problem.

Now that said, the patch looks just sane to me and it's probably better to be
safe than sorry and make the check twice. So I applied the patch (manually)
to CVS head, ran verious regression tests and since it looks fine, commited it.

 So fixed, thanks a lot !

Daniel