Bug 760183 - REGRESSION (v2.9.3): XML push parser fails with bogus UTF-8 encoding error when multi-byte character in large CDATA section is split across buffer
Status: RESOLVED FIXED
Product: libxml2
Classification: Platform
Component: general
Version: git master
Hardware: Other All
Importance: Normal normal
Target Milestone: ---
Assigned To: Daniel Veillard
QA Contact: libxml QA maintainers
Depends on: 754947
Blocks:
 
 
Reported: 2016-01-05 21:36 UTC by David Kilzer
Modified: 2016-04-29 17:30 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Patch v1 (72.30 KB, patch)
2016-01-05 23:01 UTC, David Kilzer

Description David Kilzer 2016-01-05 21:36:12 UTC
When the XML push parser encounters a multi-byte UTF-8 character that is split across a buffer boundary, it now reports an encoding error instead of returning the characters parsed up to that point and letting the next parsing pass validate the multi-byte UTF-8 character.

This regressed with the fix for Bug 754947 in v2.9.3:

Heap-buffer overread in push mode, parser.c xmlParseTryOrFinish
<https://bugzilla.gnome.org/show_bug.cgi?id=754947>
<https://git.gnome.org/browse/libxml2/commit/?id=4a5d80aded1da94cd55294e7207109712201b75b>
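
To illustrate what "split across a buffer" means here, the following is a minimal sketch in plain C. It is not libxml2 code: `utf8_incomplete_tail` is a hypothetical helper name, and it only looks at lead/continuation bit patterns to report how many bytes at the end of a chunk belong to a character whose remaining bytes have not arrived yet.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of libxml2): return how many bytes at the
 * end of buf begin a multi-byte UTF-8 sequence that is cut off by the end
 * of the buffer. Returns 0 when the buffer ends on a character boundary.
 * Only bit patterns are inspected; the data is not fully validated.
 */
static size_t
utf8_incomplete_tail(const unsigned char *buf, size_t len)
{
    size_t back, need;

    assert(buf != NULL || len == 0);

    /* Look at no more than the last 4 bytes, scanning backwards. */
    for (back = 1; back <= len && back <= 4; back++) {
        unsigned char c = buf[len - back];

        if ((c & 0xC0) == 0x80)
            continue;                 /* continuation byte, keep looking */

        if ((c & 0x80) == 0x00)
            return 0;                 /* ASCII: nothing is pending */
        else if ((c & 0xE0) == 0xC0)
            need = 2;                 /* 110xxxxx: 2-byte sequence */
        else if ((c & 0xF0) == 0xE0)
            need = 3;                 /* 1110xxxx: 3-byte sequence */
        else if ((c & 0xF8) == 0xF0)
            need = 4;                 /* 11110xxx: 4-byte sequence */
        else
            return 0;                 /* invalid lead byte, not a split */

        /* The lead byte sits 'back' bytes from the end of the buffer. */
        return (back < need) ? back : 0;
    }
    return 0;
}
```

For example, a chunk ending in the first two bytes of a 3-byte character yields 2: those two bytes are not an error by themselves, they just need the next chunk before they can be validated.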
Comment 1 David Kilzer 2016-01-05 23:01:39 UTC
Created attachment 318298
Patch v1

* parser.c:
(xmlCheckCdataPush): Add 'complete' argument to describe whether the buffer passed in is the whole CDATA buffer, or if there is more data to parse.  If there is more data to parse, don't return a negative value for an invalid multi-byte UTF-8 character that is split between buffers.
(xmlParseTryOrFinish): Pass 'complete' argument to xmlCheckCdataPush() as appropriate.

* result/cdata-2-byte-UTF-8.xml: Added.
* result/cdata-2-byte-UTF-8.xml.rde: Added.
* result/cdata-2-byte-UTF-8.xml.rdr: Added.
* result/cdata-2-byte-UTF-8.xml.sax: Added.
* result/cdata-2-byte-UTF-8.xml.sax2: Added.
* result/cdata-3-byte-UTF-8.xml: Added.
* result/cdata-3-byte-UTF-8.xml.rde: Added.
* result/cdata-3-byte-UTF-8.xml.rdr: Added.
* result/cdata-3-byte-UTF-8.xml.sax: Added.
* result/cdata-3-byte-UTF-8.xml.sax2: Added.
* result/cdata-4-byte-UTF-8.xml: Added.
* result/cdata-4-byte-UTF-8.xml.rde: Added.
* result/cdata-4-byte-UTF-8.xml.rdr: Added.
* result/cdata-4-byte-UTF-8.xml.sax: Added.
* result/cdata-4-byte-UTF-8.xml.sax2: Added.
* result/noent/cdata-2-byte-UTF-8.xml: Added.
* result/noent/cdata-3-byte-UTF-8.xml: Added.
* result/noent/cdata-4-byte-UTF-8.xml: Added.
* test/cdata-2-byte-UTF-8.xml: Added.
* test/cdata-3-byte-UTF-8.xml: Added.
* test/cdata-4-byte-UTF-8.xml: Added.
- Add tests and results.  Only 'make Readertests XMLPushtests' fails prior to the fix.
Comment 2 Bruno G. 2016-04-28 23:18:29 UTC
This bug affects PHP since 5.5.32 and 5.6.18 (see https://bugs.php.net/bug.php?id=71805) and seems to have fallen by the wayside...  Any chance of having the patch applied in the near future?
Comment 3 David Kilzer 2016-04-29 17:30:10 UTC
This was fixed in 4f8606c13cb7f2684839f850b83de5ce647d3ca7.

<https://git.gnome.org/browse/libxml2/commit/?id=4f8606c13cb7f2684839f850b83de5ce647d3ca7>