GNOME Bugzilla – Bug 698550
Backward Compatibility break
Last modified: 2013-05-23 05:36:42 UTC
Created attachment 242122
XML

Hi,

With the latest version of libxml, I am not able to load the attached XML, whereas the same file was parsed successfully by libxml 2.6. If I put a space at the end of the tag name instead of a carriage return, it works properly in both versions.

Regards,
S.C.Harish
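A minimal sketch for reproducing the two cases described above. The attachment itself is not shown inline, so the bytes here are transcribed from the octal dump posted later in this report; the helper name `build_document` is made up for illustration. Python's bundled expat parser is used only as an independent check that both variants are well-formed XML; it says nothing about how libxml2 behaves.

```python
import xml.etree.ElementTree as ET

# Namespace URI and element name as they appear in the octal dump.
NS = "http://schemas.cordys.com/webapps/1.0/bpm/c8c8b82a-0ac0-3d19-01e2-bda74af9b826"
NAME = "入力メッセージ"


def build_document(after_name: str) -> bytes:
    """Hypothetical helper: rebuild the test file with a chosen separator
    between the root element name and its first attribute."""
    text = (
        f'<{NAME}{after_name}xmlns="{NS}">\r\n'
        f'\t<c8c:Ele\r\n\t\txmlns:c8c="{NS}"\r\n\t/>\r\n'
        f'</{NAME}>'
    )
    # The dump starts with a UTF-8 byte order mark (357 273 277).
    return b"\xef\xbb\xbf" + text.encode("utf-8")


# Variant matching the dump: CR LF TAB follows the element name.
crlf_variant = build_document("\r\n\t")
# Variant with the reported workaround: a single space instead.
space_variant = build_document(" ")

# Both variants are well-formed according to an independent parser.
crlf_root = ET.fromstring(crlf_variant)
space_root = ET.fromstring(space_variant)
print(crlf_root.tag == space_root.tag)
```

To test against libxml2 itself, each variant can be written to a file in binary mode and passed to `xmllint --noout`.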
I can reproduce this:

(gdb) b __xmlRaiseError
Breakpoint 1 at 0x412ccf: file error.c, line 460.
(gdb) r ../JapaneseElementTest.xml
Starting program: /home/veillard/XML/xmllint ../JapaneseElementTest.xml
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Breakpoint 1, __xmlRaiseError (schannel=0, channel=0, data=0x0, ctx=0x7a3500, nod=0x0, domain=1, code=76, level=XML_ERR_FATAL, file=0x0, line=0, str1=0x7aeb77 "\205\245力メッセージ\r", str2=0x7aebe4 "入力メッセージ", str3=0x0, int1=1, col=0, msg=0x54d348 "Opening and ending tag mismatch: %s line %d and %s\n") at error.c:460
460	    xmlParserCtxtPtr ctxt = NULL;
Missing separate debuginfos, use: debuginfo-install xz-libs-5.1.2-1alpha.fc17.x86_64 zlib-1.2.5-7.fc17.x86_64
(gdb)

Look how the two strings differ when serialized by gdb. The pointers are different too, whereas the two strings should be interned and both pointers should actually be the same...

Looking at an octal dump of the file, though:

thinkpad:~/XML -> od -c ../JapaneseElementTest.xml
0000000 357 273 277 < 345 205 245 345 212 233 343 203 241 343 203 203
0000020 343 202 273 343 203 274 343 202 270 \r \n \t x m l n
0000040 s = " h t t p : / / s c h e m a
0000060 s . c o r d y s . c o m / w e b
0000100 a p p s / 1 . 0 / b p m / c 8 c
0000120 8 b 8 2 a - 0 a c 0 - 3 d 1 9 -
0000140 0 1 e 2 - b d a 7 4 a f 9 b 8 2
0000160 6 " > \r \n \t < c 8 c : E l e \r \n
0000200 \t \t x m l n s : c 8 c = " h t t
0000220 p : / / s c h e m a s . c o r d
0000240 y s . c o m / w e b a p p s / 1
0000260 . 0 / b p m / c 8 c 8 b 8 2 a -
0000300 0 a c 0 - 3 d 1 9 - 0 1 e 2 - b
0000320 d a 7 4 a f 9 b 8 2 6 " \r \n \t /
0000340 > \r \n < / 345 205 245 345 212 233 343 203 241 343 203
0000360 203 343 202 273 343 203 274 343 202 270 >
0000373
thinkpad:~/XML ->

The opening and closing tags use the same byte sequence, so yes, it seems there is a bug there... Apparently it is in 2.9.0 too.

Thanks for the report!

Daniel
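The comparison above can be checked mechanically. The following is a small sketch, not part of libxml2: it decodes the octal values copied from the dump and compares them with `str1` as gdb printed it. The helper `from_octal` and the constant names are made up for illustration. The last check only describes what the gdb output shows (the name appears to start one byte late and to include the trailing `\r`); it makes no claim about the root cause in the parser.

```python
# Octal byte values copied from the od dump: the element name in the start
# tag (offsets 0000004-0000030) and in the end tag (offsets 0000345-0000371).
OPEN_NAME_OCTAL = (
    "345 205 245 345 212 233 343 203 241 343 203 203 "
    "343 202 273 343 203 274 343 202 270"
)
CLOSE_NAME_OCTAL = (
    "345 205 245 345 212 233 343 203 241 343 203 "
    "203 343 202 273 343 203 274 343 202 270"
)


def from_octal(dump: str) -> bytes:
    """Hypothetical helper: turn space-separated octal values into bytes."""
    return bytes(int(token, 8) for token in dump.split())


open_name = from_octal(OPEN_NAME_OCTAL)
close_name = from_octal(CLOSE_NAME_OCTAL)

# str1 exactly as gdb displayed it: two raw bytes, readable text, then CR.
str1 = b"\205\245" + "力メッセージ".encode("utf-8") + b"\r"

# The file uses identical bytes for both tag names.
print(open_name == close_name)
# Those bytes are valid UTF-8; print the decoded element name.
print(open_name.decode("utf-8"))
# str1 equals the name minus its first byte, plus the carriage return.
print(str1 == open_name[1:] + b"\r")
```

Dropping the first byte (octal 345) of the three-byte character 入 is what leaves the two undecodable bytes `\205\245` at the start of `str1`.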
This defect is present in the latest libxml2 as well. Any idea when a fix will be available?
Created attachment 244294
Test XML

I guess the parsing of this XML also fails for the same reason.
One of our Japanese customers is blocked by this bug. Is there any update on when the fix might be available?
Okay, found and fixed in git:

https://git.gnome.org/browse/libxml2/commit/?id=dcc19503193c71596278a252064a8ce66331b3cd

The patch is rather simple; it solves the issue and adds the example to the regression tests so that this can't break in the future. And the problem in comment 3 is indeed the same one.

Thanks!

Daniel
Thanks for the quick response :)