After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 595128 - Regression when using read callbacks with xmlReadIO breaking inkscape
Regression when using read callbacks with xmlReadIO breaking inkscape
Status: RESOLVED FIXED
Product: libxml2
Classification: Platform
Component: general
git master
Other All
: Normal critical
: ---
Assigned To: Daniel Veillard
libxml QA maintainers
Depends on:
Blocks:
 
 
Reported: 2009-09-14 07:20 UTC by Mike Hommey
Modified: 2009-09-15 18:34 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Mike Hommey 2009-09-14 07:20:16 UTC
With libxml2 2.7.4, this is what the startup of inkscape looks like:

/usr/share/inkscape/extensions/render_barcode.inx:1: parser error : XML
/ declaration allowed only at the start of the document
<?xml version="1.0"?>
     ^

** (inkscape:32006): CRITICAL **: Inkscape::Extension::Extension*
** Inkscape::Extension::build_from_reprdoc(Inkscape::XML::Document*,
** Inkscape::Extension::Implementation::Implementation*): assertion `doc !=
** NULL' failed

** (inkscape:32006): WARNING **: Unable to create extension from definition
** file /usr/share/inkscape/extensions/render_barcode.inx.

/usr/share/inkscape/templates/default.svg:1: parser error : XML declaration
/ allowed only at the start of the document
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
     ^

** (inkscape:32006): CRITICAL **: SPDesktop* sp_file_new(const
** Glib::ustring&): assertion `doc != NULL' failed

The reason for this happening is that only 4 bytes of input are in the input buffer when checking for <?xml<blank> in parser.c:10135. This, in turn, is caused by a change introduced in commit 7e385bd:

diff --git a/parser.c b/parser.c
index 0a10b0f..0d856b7 100644
--- a/parser.c
+++ b/parser.c
@@ -10109,8 +10109,9 @@ xmlParseDocument(xmlParserCtxtPtr ctxt) {
 
     /*
      * Check for the XMLDecl in the Prolog.
+     * do not GROW here to avoid the detected encoder to decode more
+     * than just the first line
      */
-    GROW;
     if ((CMP5(CUR_PTR, '<', '?', 'x', 'm', 'l')) && (IS_BLANK_CH(NXT(5)))) {
 
        /*

Since the buffer is not grown, and the input only has 4 bytes at this moment, the l and the following blank character can't be detected.

I don't exactly understand why this GROW has been removed here, since we possibly have not enough input for the test to be done properly, but the added comment leads to think there is some reason behind this, except it breaks the read callback case.

If it is not possible to add the GROW back, maybe more than 4 bytes should be read through the callback, then.

What do you think?
Comment 1 Daniel Veillard 2009-09-15 18:34:09 UTC
Patched in git ...

http://git.gnome.org/cgit/libxml2/commit/?id=9d3d141c412baa5c713ad3df48f1a4d179e
07b05

in practice if we didn't got the full first line it's a balance between
the risk of corrupting data by using the wrong encoding as in

https://bugzilla.gnome.org/show_bug.cgi?id=566012

and possibly failing to parse. Somehow I prefer a parsing failure
error which is easy to detect than a silent data corruption. I think
the code in HEAD is the right balance.

Daniel