GNOME Bugzilla – Bug 427622
make xmlParserMaxDepth configurable at runtime
Last modified: 2021-07-05 13:21:12 UTC
It would be nice to have xmlParserMaxDepth configurable at runtime, so that people who want to use very deep XML files can do so. (We got a report on Debian from libgdome2-ocaml users whose files have a depth of almost 2000...)
I want to confirm this bug and show how to reproduce it easily. The Ubuntu repositories contain sword module... sword-text-kjv includes an XML version of the KJV Bible. Use the command mod2osis KJV to extract a 23 MB XML file that reproduces the bug. (The mod2osis command-line utility is in the libsword6 package, also in the Ubuntu repositories.)

To force a crash, try parsing the kjv.osis file. I'm using php5, which uses libxml2. The following short PHP script, run from the PHP command-line environment, triggers this bug reliably; PHP wraps the original libxml2 error, which indicates that this parameter is too small. (I'm using PHP 5.1.6, but that should not matter here; any 5.1.X will use libxml2.)

---------------------
#!/usr/bin/php
<?php
// Set $in to whatever location you put kjv.osis
// $in = '/tmp/kjv.osis';
// PHP does not expand '~' in paths, so build the home path explicitly.
$in = getenv('HOME') . '/kjv.osis';
$reader = new XMLReader();
$reader->open($in);
while ($reader->read()) {
    switch ($reader->nodeType) {
        default:
            // do nothing, just waiting for a crash.
            break;
    }
}
?>
---

Clearly, very deep XML files are being distributed with current Linux systems, and the depth limit of 1024 is not big enough to handle them. I would recommend a quick fix of raising it to, say, 4096 to cover normal use, and I would also recommend adding a parameter somewhere to set it.
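To check whether a given file really nests deeper than the 1024 limit discussed here, the element depth can be measured without libxml2. The following is a minimal sketch using Python's standard `xml.parsers.expat` module. It assumes the limit counts element nesting, and the helper names (`max_element_depth`, `nested_document`) are hypothetical, chosen only for this illustration.

```python
import io
import xml.parsers.expat


def max_element_depth(stream):
    """Return the deepest element nesting level found in an XML byte stream."""
    depth = 0
    deepest = 0

    def start(name, attrs):
        # Entering an element: one level deeper.
        nonlocal depth, deepest
        depth += 1
        if depth > deepest:
            deepest = depth

    def end(name):
        # Leaving an element: back up one level.
        nonlocal depth
        depth -= 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.ParseFile(stream)
    return deepest


def nested_document(levels):
    """Build a synthetic document of `levels` nested <d> elements (test input only)."""
    return ("<d>" * levels + "</d>" * levels).encode("utf-8")


if __name__ == "__main__":
    DEFAULT_LIMIT = 1024  # the default depth quoted in this report
    for levels in (10, 1024, 2000):
        found = max_element_depth(io.BytesIO(nested_document(levels)))
        print(levels, found, "over limit" if found > DEFAULT_LIMIT else "within limit")
```

For a real file, something like `max_element_depth(open('/tmp/kjv.osis', 'rb'))` would report the depth to compare against the limit; the path is the one from the reproduction script above.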
I would argue about the sanity of whoever produced that XML... It has to be a recursive kind of structure, and I don't see how a normal text, no matter its size, can be that deep with a significant structure, I mean something which makes sense to a human. What is the DTD for that kind of document? How are you sure it's not an instance-building error or a logical error in the design? I'm not sure I want to add yet another global variable to libxml2. I'm not sure going from 1024 to 4096 would solve the problem in any way either.

Daniel
Actually this is fixed, because xmlParserMaxDepth is available through parserInternals.h. I have another improvement request on this: would a patch adding a per-parser-context depth limit be accepted? It is sometimes useful when we have several contexts and want to limit the depth for a particular one. This is a question for Daniel, obviously.
I'm not sure. On one hand, it's cleaner to use a per-parser-context limit than a global variable; on the other hand, it's yet another knob at the parser level. I think it's acceptable, though.

Daniel
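To illustrate the difference between a global limit and the per-parser-context limit proposed above, here is a small sketch in Python built on the standard `xml.parsers.expat` module. It is not libxml2's API: the class `DepthLimitedParser` and the exception `DepthLimitExceeded` are hypothetical names, and the only point is that each parser instance carries its own limit, so two parsers in one process can enforce different depths.

```python
import xml.parsers.expat


class DepthLimitExceeded(Exception):
    """Raised when a document nests deeper than this parser's own limit."""


class DepthLimitedParser:
    """Hypothetical wrapper: the depth limit lives on the instance, not in a global."""

    def __init__(self, max_depth=1024):
        self.max_depth = max_depth
        self._depth = 0
        self._parser = xml.parsers.expat.ParserCreate()
        self._parser.StartElementHandler = self._start
        self._parser.EndElementHandler = self._end

    def _start(self, name, attrs):
        self._depth += 1
        if self._depth > self.max_depth:
            # An exception raised in a handler aborts the parse and propagates.
            raise DepthLimitExceeded(
                "depth %d exceeds limit %d" % (self._depth, self.max_depth)
            )

    def _end(self, name):
        self._depth -= 1

    def parse(self, data):
        """Parse a complete document given as bytes."""
        self._parser.Parse(data, True)


if __name__ == "__main__":
    deep = b"<d>" * 2000 + b"</d>" * 2000  # synthetic input, about as deep as the reported files

    DepthLimitedParser(max_depth=4096).parse(deep)  # permissive context accepts it
    try:
        DepthLimitedParser(max_depth=1024).parse(deep)  # strict context rejects it
    except DepthLimitExceeded as exc:
        print("rejected:", exc)
```

The trade-off is the one raised in the discussion: the limit becomes one more setting on every parser, but changing it for one context no longer affects the others.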
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org. As part of that, we are mass-closing older open tickets in bugzilla.gnome.org which have not seen updates for a longer time (resources are unfortunately quite limited so not every ticket can get handled). If you can still reproduce the situation described in this ticket in a recent and supported software version, then please follow https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines and create a new ticket at https://gitlab.gnome.org/GNOME/libxml2/-/issues/ Thank you for your understanding and your help.