GNOME Bugzilla – Bug 353032
xsltproc: Strange treatment of ENTITY in internal DOCTYPE
Last modified: 2021-07-05 11:00:46 UTC
Please describe the problem:
The trick described at the bottom of http://www.xml.com/pub/a/2001/12/05/whitespace.html of encapsulating <xsl:text> in entities in order to prettify/lower the verbosity of xslt source does not work in xsltproc.

xsltproc --version:
Using libxml 20624, libxslt 10115 and libexslt 812
xsltproc was compiled against libxml 20622, libxslt 10115 and libexslt 812
libxslt 10115 was compiled against libxml 20622
libexslt 812 was compiled against libxml 20622

Steps to reproduce:
Run the following stylesheet:

<!DOCTYPE xsl:stylesheet [
<!ENTITY sp "<xsl:text> </xsl:text>">
<!ENTITY cr "<xsl:text>
</xsl:text>">
]>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="@*|node()">
<xsl:copy>
&cr;&sp;<xsl:apply-templates select="@*|node()"/>&sp;&cr;
</xsl:copy>
</xsl:template>
</xsl:stylesheet>

On the following xml:

<?xml version="1.0"?>
<root><tag1>ab<tag2>cddc</tag2>ba</tag1></root>

Actual results:
namespace error : Namespace prefix xsl on text is not defined
<xsl:text>
         ^
namespace error : Namespace prefix xsl on text is not defined
<xsl:text> </xsl:text>
         ^
<?xml version="1.0"?>
<root><text/><text/><tag1><text/><text/>ab<tag2><text/><text/>cddc<text/><text/></tag2>ba<text/><text/></tag1><text/><text/></root>

Expected results:
<?xml version="1.0"?>
<root>
 <tag1>
 ab<tag2>
 cddc 
</tag2>ba 
</tag1> 
</root>

Does this happen every time?
This does *not* happen if there is any non-whitespace character between the > and the whitespace in the entity definition. Instead, if x is the extra character, sp is replaced by <text>x </text> and cr is replaced with <text>x
</text> (<text>xlinefeed</text>) in the final output.

Other information:
Xalan version 1.10.0 using Xerces version 2.7.0 does this correctly (if disregarding the esthetic blemish of no newline after <?xml version="1.0" encoding="UTF-8"?> and no newline after </root>...)
Right, mixing entities and namespace usage without declaration in XML fragments may break libxml2 parsing. If you redeclare the xsl namespace on the elements in the entities, then it's likely to work. That "trick" is really ugly.

Daniel
(In reply to comment #1)
> Right mixing entities and namespace usage without declaration in
> XML fragments may break libxml2 parsing.
> If you redeclare the xsl namespace on the elements in the entities
> then it's likely to work. That "trick" is really ugly.

Changing the stylesheet to:

<!DOCTYPE xsl:stylesheet [
<!ENTITY xsl:sp "<xsl:text> </xsl:text>">
<!ENTITY xsl:cr "<xsl:text>
</xsl:text>">
]>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="@*|node()">
<xsl:copy>
&xsl:cr;&xsl:sp;<xsl:apply-templates select="@*|node()"/>&xsl:sp;&xsl:cr;
</xsl:copy>
</xsl:template>
</xsl:stylesheet>

That is, changing cr to xsl:cr and sp to xsl:sp throughout gives the exact same, wrong result. Did you mean to change something else?

The problem isn't the spurious <text>-nodes, the problem is the missing whitespace. The spurious <text>-nodes can be removed later, by sed if necessary.

As for the trick being ugly: if it prevents thousands of occurrences of
<xsl:text>
</xsl:text>
in the xslt-source, it is a good trick. You can't really tell at a glance what's after that <xsl:text>, can you? Just a newline, or also spaces and tabs?

Factoid: running the original stylesheet in saxon8-b also gives the desired result.
> Did you mean to change something else?

yes

<!ENTITY xsl:sp "<xsl:text xmlns:xsl='http://www.w3.org/1999/XSL/Transform'> </xsl:text>">

Daniel
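For reference, here is a sketch of the complete stylesheet with the suggestion applied to both entities, so that each <xsl:text> carries its own declaration of the xsl namespace. It has not been run against xsltproc here, so take it as an illustration of the suggested change rather than a confirmed fix. Two choices in it are assumptions: it keeps the original entity names sp and cr (the suggestion concerns the namespace declaration, not the names), and it uses single quotes around the namespace URI because the entity value itself is delimited by double quotes.

```xml
<!DOCTYPE xsl:stylesheet [
<!-- Each entity redeclares the xsl prefix on the element it contains, -->
<!-- so the fragment does not depend on a declaration outside the entity. -->
<!ENTITY sp "<xsl:text xmlns:xsl='http://www.w3.org/1999/XSL/Transform'> </xsl:text>">
<!ENTITY cr "<xsl:text xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
</xsl:text>">
]>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="@*|node()">
<xsl:copy>
&cr;&sp;<xsl:apply-templates select="@*|node()"/>&sp;&cr;
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
```

The template body is unchanged from the original report; only the two entity declarations differ.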
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org. As part of that, we are mass-closing older open tickets in bugzilla.gnome.org which have not seen updates for a longer time (resources are unfortunately quite limited so not every ticket can get handled). If you can still reproduce the situation described in this ticket in a recent and supported software version, then please follow https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines and create a new ticket at https://gitlab.gnome.org/GNOME/libxslt/-/issues/ Thank you for your understanding and your help.