Bug 608333 - Different HTML and XML output for HTML and XML when xincluding a text file with \r\n end line markers
Status: RESOLVED NOTABUG
Product: libxml2
Classification: Platform
Component: general
Version: git master
Hardware: Other Windows
Importance: Normal normal
Target Milestone: ---
Assigned To: Daniel Veillard
QA Contact: libxml QA maintainers
Reported: 2010-01-28 12:10 UTC by boris
Modified: 2010-02-01 13:10 UTC

Description boris 2010-01-28 12:10:09 UTC
When a text file is included with 

<include href="text.txt" parse="text" xmlns="http://www.w3.org/2001/XInclude"/> 

and the text file uses \r\n line endings (as is common on Windows), the output generated by xmllint and xsltproc contains a &#13; character reference for every \r.

There is an exception, though: when xsltproc is used to generate HTML output (with a style sheet that contains <xsl:output method="html">), no &#13; characters are inserted. According to http://article.gmane.org/gmane.comp.gnome.lib.xslt/3917, libxml2 does indeed treat HTML output differently.

A style sheet that generates XHTML must contain <xsl:output method="xml">. Because XHTML is an XML grammar and libxml2 treats only HTML differently, XHTML output again contains these &#13; characters.

The question is whether it makes sense at all to generate &#13; characters when a text file is xincluded. Why is HTML treated differently from XML?

Because &#13; characters are a problem for at least one popular reading system for eBook files (ePub format), it might make sense not to generate them at all when text files are included. However, I don't know whether there are other use cases that require &#13; characters in XML output.
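
One way to avoid the &#13; characters without changing libxml2 is to make sure the included text file contains no \r in the first place. The following is only a minimal sketch of such a preprocessing step, not something proposed in this report; the function names and the idea of writing a converted copy are assumptions made for illustration.

```python
from pathlib import Path


def normalize_line_endings(data: bytes) -> bytes:
    """Convert \\r\\n and lone \\r line endings to \\n."""
    # Replace the two-byte sequence first so a \r\n pair does not
    # turn into two newlines.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def write_normalized_copy(src: Path, dst: Path) -> None:
    """Write a copy of src to dst with \\n-only line endings (hypothetical helper)."""
    dst.write_bytes(normalize_line_endings(src.read_bytes()))
```

The XInclude href would then point at the converted copy instead of the original text.txt.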

The problem was first discussed on the DocBook mailing list. Here is the message that made me turn to the libxml2 project: http://lists.oasis-open.org/archives/docbook/201001/msg00065.html
Comment 1 Daniel Veillard 2010-02-01 13:10:30 UTC
http://www.w3.org/TR/REC-xml/#sec-line-ends

The only way for \r\n to still be available after the resulting XML
document is parsed is to have the \r escaped. This is a mandatory
rule in XML parsing.

  Not a bug,

Daniel
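
The line-end rule cited in the comment above can be observed with any conforming XML parser. The sketch below uses Python's standard xml.etree.ElementTree (backed by expat) rather than libxml2, purely as an illustration; the helper name parsed_text and the sample documents are assumptions, not part of the report.

```python
import xml.etree.ElementTree as ET


def parsed_text(xml_source: str) -> str:
    """Return the text content of the root element after parsing."""
    return ET.fromstring(xml_source).text


# A literal \r\n in the document is normalized by the parser,
# so the \r is no longer present after parsing.
literal = parsed_text("<doc>line1\r\nline2</doc>")

# A \r written as the character reference &#13; is not subject to
# line-end normalization and is still present after parsing.
escaped = parsed_text("<doc>line1&#13;\nline2</doc>")

print(repr(literal))
print(repr(escaped))
```

This is why a serializer that wants the \r from an included text file to round-trip has to write it as &#13;.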