Bug 353032 - xsltproc: Strange treatment of ENTITY in internal DOCTYPE
Status: RESOLVED OBSOLETE
Product: libxslt
Classification: Platform
Component: general
Version: unspecified
Hardware: Other All
Importance: Normal normal
Target Milestone: ---
Assigned To: Daniel Veillard
QA Contact: libxml QA maintainers
Depends on:
Blocks:
Reported: 2006-08-26 21:17 UTC by hanne.moa
Modified: 2021-07-05 11:00 UTC
See Also:
GNOME target: ---
GNOME version: ---

Description hanne.moa 2006-08-26 21:17:18 UTC
Please describe the problem:
The trick described at the bottom of
http://www.xml.com/pub/a/2001/12/05/whitespace.html
(wrapping <xsl:text> in entities to make XSLT source less verbose and easier to read) does not work in xsltproc.

xsltproc --version:
Using libxml 20624, libxslt 10115 and libexslt 812
xsltproc was compiled against libxml 20622, libxslt 10115 and libexslt 812
libxslt 10115 was compiled against libxml 20622
libexslt 812 was compiled against libxml 20622


Steps to reproduce:
Run the following stylesheet:

<!DOCTYPE xsl:stylesheet [
<!ENTITY sp "<xsl:text> </xsl:text>">
<!ENTITY cr "<xsl:text>
</xsl:text>"> ]>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="@*|node()">
<xsl:copy>
&cr;&sp;<xsl:apply-templates select="@*|node()"/>&sp;&cr;
</xsl:copy>
</xsl:template>
</xsl:stylesheet>

On the following xml:

<?xml version="1.0"?>
<root><tag1>ab<tag2>cddc</tag2>ba</tag1></root>

Actual results:
namespace error : Namespace prefix xsl on text is not defined
<xsl:text>
         ^
namespace error : Namespace prefix xsl on text is not defined
<xsl:text> </xsl:text>
         ^
<?xml version="1.0"?>
<root><text/><text/><tag1><text/><text/>ab<tag2><text/><text/>cddc<text/><text/></tag2>ba<text/><text/></tag1><text/><text/></root>

Expected results:
<?xml version="1.0"?>
<root>
 <tag1>
 ab<tag2>
 cddc 
</tag2>ba 
</tag1> 
</root>

Does this happen every time?
This does *not* happen if there is any non-whitespace character between the > and the whitespace in the entity definition. Instead, if x is the extra character, sp is replaced by <text>x </text> and cr is replaced with <text>x
</text> (<text>xlinefeed</text>) in the final output.

Other information:
Xalan version 1.10.0 using Xerces version 2.7.0 does this correctly (if disregarding the esthetic blemish of no newline after <?xml version="1.0" encoding="UTF-8"?> and no newline after </root>...)
Comment 1 Daniel Veillard 2006-08-27 07:25:26 UTC
Right, mixing entities and namespace usage without declaration in 
XML fragments may break libxml2 parsing.
If you redeclare the xsl namespace on the elements in the entities
then it's likely to work. That "trick" is really ugly.

Daniel
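
One way to picture the failure described in comment 1, offered as a rough analogy and not as libxml2's actual code path: if an entity's replacement text is parsed as a standalone fragment, nothing inside it binds the xsl prefix. The Python sketch below uses the standard library's expat-based ElementTree parser, not libxml2, and the helper name fragment_parses is hypothetical.

```python
import xml.etree.ElementTree as ET

XSLT_NS = "http://www.w3.org/1999/XSL/Transform"


def fragment_parses(fragment):
    """Return True if the fragment parses on its own with namespaces enabled."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Replacement text of &sp; as written in the report: the xsl prefix is unbound.
bare = "<xsl:text> </xsl:text>"
# The same text with the namespace redeclared on the element itself.
declared = f'<xsl:text xmlns:xsl="{XSLT_NS}"> </xsl:text>'

print(fragment_parses(bare))      # the standalone parse fails on the unbound prefix
print(fragment_parses(declared))  # the standalone parse succeeds
```

Redeclaring the namespace makes the fragment self-contained, so it no longer depends on where the entity reference happens to be expanded.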
Comment 2 hanne.moa 2006-08-28 10:13:39 UTC
(In reply to comment #1)
> Right, mixing entities and namespace usage without declaration in 
> XML fragments may break libxml2 parsing.
> If you redeclare the xsl namespace on the elements in the entities
> then it's likely to work. That "trick" is really ugly.

Changing the stylesheet to:

<!DOCTYPE xsl:stylesheet [
<!ENTITY xsl:sp "<xsl:text> </xsl:text>">
<!ENTITY xsl:cr "<xsl:text>
</xsl:text>"> ]>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="@*|node()">
<xsl:copy>
&xsl:cr;&xsl:sp;<xsl:apply-templates select="@*|node()"/>&xsl:sp;&xsl:cr;
</xsl:copy>
</xsl:template>
</xsl:stylesheet>

That is: changing cr to xsl:cr and sp to xsl:sp throughout, gives the exact same, wrong result. Did you mean to change something else?

The problem isn't the spurious <text>-nodes, the problem is the missing whitespace. The spurious <text>-nodes can be removed later, by sed if necessary.

As for the trick being ugly: if it prevents thousands of occurrences of
<xsl:text>
</xsl:text>
in the XSLT source, it is a good trick. You can't really tell at a glance what's after that <xsl:text>, can you? Just a newline, or also spaces and tabs?

Factoid: running the original stylesheet in saxon8-b also gives the desired result.
Comment 3 Daniel Veillard 2006-08-28 10:27:34 UTC
> Did you mean to change something else?

yes 

<!ENTITY xsl:sp "<xsl:text xmlns:xsl='http://www.w3.org/1999/XSL/Transform'> </xsl:text>">

Daniel
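
A hedged sketch of the workaround from comment 3, applied to the whole stylesheet and checked with Python's standard-library ElementTree (expat), not with xsltproc. It only confirms that the entities expand to elements in the XSLT namespace with their whitespace intact; it does not show what any particular libxslt version outputs. Two details are assumptions of this sketch: the inner namespace attribute uses single quotes so that it does not end the double-quoted entity value, and the entities are named sp and cr without a prefix, because Namespaces in XML does not allow colons in entity names. The helper name expanded_text_nodes is hypothetical.

```python
import xml.etree.ElementTree as ET

XSLT_NS = "http://www.w3.org/1999/XSL/Transform"

# Stylesheet from the report, with the xsl namespace redeclared inside each
# entity's replacement text (single quotes inside the double-quoted value).
STYLESHEET = """<!DOCTYPE xsl:stylesheet [
<!ENTITY sp "<xsl:text xmlns:xsl='http://www.w3.org/1999/XSL/Transform'> </xsl:text>">
<!ENTITY cr "<xsl:text xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
</xsl:text>"> ]>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="@*|node()">
<xsl:copy>
&cr;&sp;<xsl:apply-templates select="@*|node()"/>&sp;&cr;
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
"""


def expanded_text_nodes(stylesheet):
    """Return the text of every xsl:text element, in document order."""
    root = ET.fromstring(stylesheet)
    return [el.text for el in root.iter(f"{{{XSLT_NS}}}text")]


texts = expanded_text_nodes(STYLESHEET)
print([repr(t) for t in texts])
```

If the four xsl:text elements come back holding a newline, a space, a space, and a newline, the entity definitions themselves are sound, and any remaining difference would lie in how the XSLT processor handles them.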
Comment 4 GNOME Infrastructure Team 2021-07-05 11:00:46 UTC
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org.
As part of that, we are mass-closing older open tickets in bugzilla.gnome.org
which have not seen updates for a longer time (resources are unfortunately
quite limited so not every ticket can get handled).

If you can still reproduce the situation described in this ticket in a recent
and supported software version, then please follow
  https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines
and create a new ticket at
  https://gitlab.gnome.org/GNOME/libxslt/-/issues/

Thank you for your understanding and your help.