GNOME Bugzilla – Bug 315617
serious mistake in function xmlCheckCdataPush, parser.c
Last modified: 2005-09-12 14:09:06 UTC
Code (line 9192):

    } else if ((c & 0xf0) == 0xe0) { /* 3-byte code, starts with 1110 */
        if (ix + 3 > len) return(ix);
        if (((utf[ix+1] & 0xc0) != 0x80) ||
            ((utf[ix+2] & 0xc0) != 0x80))
            return(-ix);
        codepoint = (utf[0] & 0xf) << 12; // not utf[0], but utf[ix+0]...
        codepoint |= (utf[1] & 0x3f) << 6;
        codepoint |= utf[2] & 0x3f;
        if (!xmlIsCharQ(codepoint)) return(-ix);
        ix += 3;
        // WE NEED TO VALIDATE utf[ix]~utf[ix+2], NOT THE CHARACTER utf[0]~utf[2]!!!

The same mistake is present in libxml2-2.6.21. This bug often makes libxml2 fail to parse a CDATA block.
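For reference, the indexing fix described above can be sketched as a standalone function. This is only an illustration, not the actual patch committed to CVS: `check_3byte_at`, its return convention (3 = valid, 0 = incomplete, -1 = invalid) and `is_xml_char` (a stand-in for `xmlIsCharQ`, following the XML 1.0 Char production) are all hypothetical names made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for xmlIsCharQ(): XML 1.0 Char production. */
static int is_xml_char(unsigned int c)
{
    return c == 0x9 || c == 0xA || c == 0xD ||
           (c >= 0x20 && c <= 0xD7FF) ||
           (c >= 0xE000 && c <= 0xFFFD) ||
           (c >= 0x10000 && c <= 0x10FFFF);
}

/*
 * Validate one 3-byte UTF-8 sequence starting at utf[ix].
 * Returns 3 if valid, 0 if the buffer ends mid-sequence, -1 if invalid.
 * (Assumed convention for this sketch; parser.c returns ix / -ix instead.)
 */
static int check_3byte_at(const unsigned char *utf, int len, int ix)
{
    unsigned int codepoint;

    assert(utf != NULL && ix >= 0);
    if (ix >= len)
        return 0;                       /* nothing to read yet */
    if ((utf[ix] & 0xf0) != 0xe0)
        return -1;                      /* not a 3-byte lead byte (1110xxxx) */
    if (ix + 3 > len)
        return 0;                       /* sequence is incomplete */
    if ((utf[ix + 1] & 0xc0) != 0x80 || (utf[ix + 2] & 0xc0) != 0x80)
        return -1;                      /* bad continuation byte */

    /* Every byte is read relative to ix, never from the buffer start. */
    codepoint  = (unsigned int)(utf[ix]     & 0x0f) << 12;
    codepoint |= (unsigned int)(utf[ix + 1] & 0x3f) << 6;
    codepoint |= (unsigned int)(utf[ix + 2] & 0x3f);

    return is_xml_char(codepoint) ? 3 : -1;
}
```

The only change that matters relative to the quoted code is that all three bytes are indexed from `ix`.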
Right, this is a bug, fixed in CVS. It doesn't appear to arise that often, as it was never reported before. To be complete and to be sure it is fixed, you should provide an example demonstrating the problem with xmllint --push. Note that using capitals is interpreted as yelling at me :-(. It's also not critical (it does not lead to a crash), though it is a serious problem (but again, never reported before). Daniel
Daniel, sorry for my rudeness. This problem appeared after I upgraded libxml2 from version 2.6.19 to 2.6.21. Since I also made many changes to my own code at the same time, it took me about an hour of debugging to find the cause. I felt a little unhappy at the moment I submitted this bug. Whether or not you can understand that, I apologize for the overstated description. This bug is not critical^^, and only arises after 2.6.20. Below is the example CDATA block: <![CDATA[@@@ "any 3-bytes utf8 characters here"]]> utblog
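One possible reading of why this example fails (an inference from the quoted code, not something stated in the thread): if the push buffer handed to the check starts with the bytes `@@@`, decoding from utf[0]~utf[2] yields code point 0, which is not a legal XML character, so the check rejects the chunk even though the real 3-byte character further along is fine. A minimal sketch of that arithmetic, where `decode3` is a hypothetical helper and the bytes E4 B8 AD (U+4E2D) are just a sample 3-byte character:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: apply the 3-byte UTF-8 bit arithmetic from the
 * quoted code to the three bytes starting at p, without any validation.
 */
static unsigned int decode3(const unsigned char *p)
{
    unsigned int cp;

    assert(p != NULL);
    cp  = (unsigned int)(p[0] & 0x0f) << 12;  /* low 4 bits of lead byte */
    cp |= (unsigned int)(p[1] & 0x3f) << 6;   /* 6 bits of 2nd byte */
    cp |= (unsigned int)(p[2] & 0x3f);        /* 6 bits of 3rd byte */
    return cp;
}
```

With the buffer `'@' '@' '@' 0xE4 0xB8 0xAD`, `decode3(buf)` mimics the buggy utf[0] indexing and gives 0, while `decode3(buf + 3)` mimics the corrected utf[ix] indexing and gives 0x4E2D.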
I know the frustration of debugging :-) That code was added in 2.6.20 to fix another bug, and as usual, if one adds new code, new bugs are likely to be added too :-\ Sorry for being a bit heavy, but I would prefer that you provide the example as an attachment to this bug; that way we are 100% sure this got fixed, and I will add it to the regression test suite. You can also fetch the CVS snapshot from ftp://xmlsoft.org/ to verify it's fixed, but I would prefer a test case. Thanks in advance, Daniel
Test XML is here: http://viewcvs.php.net/viewcvs.cgi/*checkout*/presentations/slides/i18nl10n/string-escape.xml?rev=1.1&content-type=text/plain
If you remove the newline after CDATA[ so that it shows as: [CDATA[<?php then it works. And yes, I did encounter this myself :)