GNOME Bugzilla – Bug 319279
DOS-style line endings (CR-LF) sometimes read as double newline
Last modified: 2006-01-09 14:39:49 UTC
Distribution/Version: Ubuntu Breezy

When reading text node contents with the xmlTextReader interface, some lines with DOS-style line endings (\r\n) become double newlines (\n\n) in the read string.

1. Using xmlTextReader, read the attached sample XML file, which consists of 6-digit numbers with \r\n end-of-lines.
2. Output the string value from the text node.

There will be an extra newline in the output between lines 000063 and 000064. Extending the file, this will repeat every 64 lines (at 000127, etc.).

Tested on Linux (Ubuntu Breezy) and Mac OS X; the problem is still present in CVS HEAD. Confirmed with raw C as well as Python and PHP 5.1.
Created attachment 53673 [details] Sample XML file with CR-LF newlines
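For readers without the attachment, a file of the same general shape can be generated with a few lines of Python. This is only a sketch: the root element name `numbers`, the file name, and the line count are assumptions, and the real attachment's markup may differ.

```python
import os
import tempfile


def write_sample(path, line_count=200):
    """Write 6-digit numbers, one per line, with CR-LF endings inside one element."""
    # The root element name "numbers" is hypothetical; the attachment may differ.
    body = "".join("%06d\r\n" % n for n in range(1, line_count + 1))
    data = ("<numbers>" + body + "</numbers>").encode("ascii")
    # Binary mode keeps the CR-LF pairs intact on every platform.
    with open(path, "wb") as handle:
        handle.write(data)
    return data


sample_path = os.path.join(tempfile.mkdtemp(), "crlf-sample.xml")
sample_bytes = write_sample(sample_path)
print(len(sample_bytes), "bytes,", sample_bytes.count(b"\r\n"), "CR-LF pairs")
```

Writing in binary mode matters here: text mode could translate the line endings on some platforms and hide the problem.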
Created attachment 53674 [details] Sample read & output using Python libxml2 module

Output from this Python program reading the sample file will include the extra newlines. An equivalent program in C or PHP 5.1 does the same.
Fixed in CVS (parser.c). Thanks for the report. Bill
This needs to be revisited: the fix is at the wrong level and apparently broke entity processing, cf. bug http://bugzilla.gnome.org/show_bug.cgi?id=326295 Daniel
Okay, now fixed in a different way, in xmlParseChunk(). Daniel
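The 64-line period reported above is consistent with a fixed-size input boundary: each line is 8 bytes (6 digits plus CR and LF), so 64 lines span 512 bytes. The following is a minimal sketch, not libxml2's code, of how normalizing line endings one chunk at a time could produce the symptom when a boundary falls between CR and LF. The 512-byte chunk size, the `<numbers>` wrapper, and all function names are assumptions made for illustration.

```python
def normalize_newlines(text):
    """XML end-of-line handling: CR-LF and lone CR both become LF."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def naive_chunked(data, chunk_size):
    """Normalize each chunk on its own (the hypothetical buggy approach)."""
    return "".join(
        normalize_newlines(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    )


def careful_chunked(data, chunk_size):
    """Hold back a trailing CR until the next chunk shows what follows it."""
    out, pending = [], ""
    for i in range(0, len(data), chunk_size):
        chunk = pending + data[i:i + chunk_size]
        pending = ""
        if chunk.endswith("\r"):
            chunk, pending = chunk[:-1], "\r"
        out.append(normalize_newlines(chunk))
    # A CR left over at end of input is a lone CR and becomes LF.
    out.append(normalize_newlines(pending))
    return "".join(out)


# Hypothetical sample: 6-digit numbers with CR-LF endings in a made-up wrapper.
lines = "".join("%06d\r\n" % n for n in range(1, 201))
sample = "<numbers>" + lines + "</numbers>"

expected = normalize_newlines(sample)
naive = naive_chunked(sample, 512)      # 512 is an assumed chunk size
careful = careful_chunked(sample, 512)

print("naive matches whole-document result:", naive == expected)
print("careful matches whole-document result:", careful == expected)
```

Under these assumptions the naive version emits an extra newline after 000063, 000127 and 000191, because each 512-byte boundary separates a CR from its LF. Carrying the trailing CR over to the next chunk avoids that; whether the actual fix in xmlParseChunk() works this way is not stated in this report.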