GNOME Bugzilla – Bug 169838
XML_PARSE_NOBLANKS difference regarding CRLF and LF
Last modified: 2009-08-15 18:40:50 UTC
Please describe the problem:
Parsing with XML_PARSE_NOBLANKS produces a different tree depending on whether the document uses LF or CRLF line endings. Given the following XML:

  <?xml version="1.0" encoding="iso-8859-1"?>
  <foo>
  <bar/>
  </foo>

the first variant has a space and only an LF after the tags:

  kbu@librax:/data/home/kbuchcik/gnomecvs/tests$ od -a test-lf.xml
  0000000 < ? x m l sp v e r s i o n = " 1
  0000020 . 0 " sp e n c o d i n g = " i s
  0000040 o - 8 8 5 9 - 1 " ? > cr nl < f o
  0000060 o > sp nl < b a r / > sp nl < / f o
  0000100 o >
  0000102

and the second variant has a space and a CRLF after the tags:

  kbu@librax:/data/home/kbuchcik/gnomecvs/tests$ od -a test-crlf.xml
  0000000 < ? x m l sp v e r s i o n = " 1
  0000020 . 0 " sp e n c o d i n g = " i s
  0000040 o - 8 8 5 9 - 1 " ? > cr nl < f o
  0000060 o > sp cr nl < b a r / > sp cr nl < /
  0000100 f o o >
  0000104

With the second (CRLF) variant, the resulting tree contains text nodes; with the LF variant it does not.

Steps to reproduce:
Compare the xmllint debug output for the two files I will attach:

  xmllint --noblanks --debug test-lf.xml
  xmllint --noblanks --debug test-crlf.xml

Actual results:

  kbu@librax:/data/home/kbuchcik/gnomecvs/tests$ xmllint --noblanks --debug test-lf.xml
  DOCUMENT
  version=1.0
  encoding=iso-8859-1
  URL=test-lf.xml
  standalone=true
    ELEMENT foo
      ELEMENT bar

  kbu@librax:/data/home/kbuchcik/gnomecvs/tests$ xmllint --noblanks --debug test-crlf.xml
  DOCUMENT
  version=1.0
  encoding=iso-8859-1
  URL=test-crlf.xml
  standalone=true
    ELEMENT foo
      TEXT
        content=
      ELEMENT bar
      TEXT
        content=

Expected results:

Does this happen every time?

Other information:
Created attachment 38508: test-lf.xml, with LF
Created attachment 38509: test-crlf.xml, with CRLF
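Since the attachments themselves are not reproduced here, the two test files can be rebuilt from the od dumps in the report. The following is a minimal Python sketch, not part of the original report: the helper names (build_variant, is_xml_blank, write_variants) are hypothetical, and the byte layout is taken directly from the dumps (note that both dumps show a CRLF after the XML declaration). The is_xml_blank check only applies the XML 1.0 whitespace definition (space, tab, CR, LF); it does not reproduce libxml2's own blank-node logic.

```python
from pathlib import Path

# XML declaration exactly as it appears in both od dumps (CRLF-terminated).
DECL = b'<?xml version="1.0" encoding="iso-8859-1"?>\r\n'

# Whitespace characters per the XML 1.0 "S" production: space, tab, CR, LF.
XML_WHITESPACE = b" \t\r\n"


def build_variant(eol: bytes) -> bytes:
    """Return the test document with a space plus `eol` after <foo> and <bar/>."""
    return DECL + b"<foo> " + eol + b"<bar/> " + eol + b"</foo>"


def is_xml_blank(text: bytes) -> bool:
    """True if `text` is non-empty and consists only of XML whitespace."""
    return len(text) > 0 and all(byte in XML_WHITESPACE for byte in text)


def write_variants(directory: str = ".") -> None:
    """Write test-lf.xml and test-crlf.xml (file names from the report)."""
    base = Path(directory)
    (base / "test-lf.xml").write_bytes(build_variant(b"\n"))
    (base / "test-crlf.xml").write_bytes(build_variant(b"\r\n"))


if __name__ == "__main__":
    write_variants()
    # The text between the tags is pure XML whitespace in both variants.
    print(is_xml_blank(b" \n"), is_xml_blank(b" \r\n"))
```

Running the script writes both files into the current directory, after which the two xmllint commands from the report can be run against them. The file sizes match the final offsets in the od dumps (octal 102 and 104).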
*** This bug has been marked as a duplicate of 166777 ***