Bug 302529 - <script> contains cdata when output document has xhtml doctype (strict or transitional)
Status: RESOLVED NOTABUG
Product: libxslt
Classification: Platform
Component: general
Version: 1.1.14
Hardware: Other All
Importance: Normal normal
Target Milestone: ---
Assigned To: Daniel Veillard
QA Contact: libxml QA maintainers
Depends on:
Blocks:
 
 
Reported: 2005-04-30 10:23 UTC by Michael S.
Modified: 2006-01-12 12:30 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Michael S. 2005-04-30 10:23:45 UTC
When the input stylesheet specifies that the output has an xhtml strict or transitional doctype, <script> 
output nodes always contain a CDATA section:

mop:~/Desktop$ xsltproc default.xsl root.xml 
<?xml version="1.0"?>
<!DOCTYPE script SYSTEM "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<script type="text/javascript"><![CDATA[
document.write("hullo");
]]></script>
mop:~/Desktop$ cat default.xsl 
<?xml version="1.0"?> 

<xsl:stylesheet 
  version="1.0" 
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>

<xsl:output doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"/>

<xsl:template match="@*|node()|processing-instruction()">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()|processing-instruction()"/>
  </xsl:copy>
</xsl:template>

</xsl:stylesheet>

mop:~/Desktop$ cat root.xml 
<script type="text/javascript">
document.write("hullo");
</script>

(I don't know if in fact this is the correct behaviour, but both xalan and saxon do not emit CDATA.)
Comment 1 Daniel Veillard 2005-04-30 11:58:02 UTC
libxml2 is just following the serialization suggestion from the XHTML1 spec
  http://www.w3.org/TR/xhtml1/#h-4.8

on purpose, not a bug.

Daniel
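
For readers following the reference above: section 4.8 of the XHTML1 spec notes that script and style content is declared as #PCDATA, so < and & are treated as markup unless they are escaped or wrapped in a CDATA section. The sketch below illustrates the two serializations. It uses Python's standard xml.dom.minidom rather than libxml2, and the script body and the helper name serialize_script are made up for illustration.

```python
from xml.dom import minidom

# Hypothetical script body containing characters that are special in XML.
SCRIPT_BODY = "\nif (a < b && c) run();\n"


def serialize_script(body: str, use_cdata: bool) -> str:
    """Serialize a <script> element holding `body` as text or as CDATA."""
    doc = minidom.Document()
    script = doc.createElement("script")
    script.setAttribute("type", "text/javascript")
    if use_cdata:
        node = doc.createCDATASection(body)
    else:
        node = doc.createTextNode(body)
    script.appendChild(node)
    return script.toxml()


# A plain text node: the serializer has to escape < and &.
escaped = serialize_script(SCRIPT_BODY, use_cdata=False)
# A CDATA section: the script source is written out unchanged.
wrapped = serialize_script(SCRIPT_BODY, use_cdata=True)

print(escaped)
print(wrapped)
```

Both strings describe the same element content to an XML parser; only the CDATA form keeps the script source readable to a consumer that does not unescape entities.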
Comment 2 Michael S. 2005-04-30 14:50:58 UTC
Whilst it might in some circumstances be a good idea to escape <script> elements via cdata sections 
(though not in my case; it was confusing my browser, even though it otherwise understood xml), I don't 
think it makes sense for an *XSL processor* to be making the decision to escape individual elements.

This is because:

- There already exists a good, documented way to get this behaviour, should you want it (cdata-section-elements, for which there is no corresponding inverse)

- It's unexpected behaviour.  (Note that the output document above does not (apart from the doctype) 
even look like HTML, nor does it validate.)

- Other XSL processors do not automatically escape <script> elements.

- It makes the output doctype-specific.  (Are there any other doctype-specific behaviours?  What are 
they?)
Comment 3 Daniel Veillard 2005-05-01 08:51:47 UTC
XHTML1 *is* doctype specific and is only defined by a doctype-specific serialization.
XHTML1 also means that <br/> gets an extra space, as <br />, etc.
It is the libxml2 serializer which does this: either the serializer supports
XHTML1 or it does not, and libxml2/libxslt supports XHTML1.

Per xmllint, using the XHTML1 DTD, the document you pasted is valid following
XML-1.0 validity checks

paphio:~/XML -> xmllint --valid tst.xml
<?xml version="1.0"?>
<!DOCTYPE script SYSTEM "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<script type="text/javascript"><![CDATA[
document.write("hullo");
]]></script>
paphio:~/XML ->

  For XML-1.0 DTD validity, Relax-NG and XSD validity, a CDATA section is
always considered to be a text node (whose content is merged with
adjacent text/CDATA nodes). I see no reason why this specific change would
raise a validity error, and you didn't provide one; you are actually wrong,
it *is* valid per the XHTML1 DTD.

  I disagree with "It's unexpected behaviour": it is expected behaviour
for XHTML1 documents, as the XHTML1 spec says.

  "it was confusing my browser" means the browser is not XHTML1 compliant.
As a result the only option is to not feed it XHTML1.

  XHTML1 is the only spec supported by libxml2 which requires a specific
non-standard XML serialization. Not doing it would mean libxml2 would not
be XHTML1 compliant. This is the only spec I know of which requires
non-standard behaviour in that way.

Daniel
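
The point above, that a CDATA section is just a text node to an XML processor, can be checked with any XML parser. A minimal sketch, using Python's standard xml.etree.ElementTree rather than libxml2 (the helper name text_content is hypothetical), parses the two forms of the document from this report and compares the text the parser hands back:

```python
import xml.etree.ElementTree as ET

# The libxslt output from the report (DOCTYPE omitted for brevity).
WITH_CDATA = (
    '<script type="text/javascript"><![CDATA[\n'
    'document.write("hullo");\n'
    "]]></script>"
)

# The original input document, root.xml.
PLAIN = (
    '<script type="text/javascript">\n'
    'document.write("hullo");\n'
    "</script>"
)


def text_content(xml_source: str) -> str:
    """Return the character data of the root element after parsing."""
    return ET.fromstring(xml_source).text or ""


# The parser reports CDATA content as ordinary character data,
# so both documents yield the same text.
print(text_content(WITH_CDATA) == text_content(PLAIN))
```

The difference between the two documents is therefore only visible to consumers that read the serialized bytes without XML parsing, such as a browser handling the output as tag-soup HTML.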
Comment 4 Jop Brocker 2006-01-12 12:25:13 UTC
The XHTML1 spec states that the script and style elements are declared as having #PCDATA content. If this is not what you want, you should add <![CDATA[ within the script element.

It is NOT the job of the XSLT processor to start adding <![CDATA[ inside the script tags! If I want this, I should do this myself!

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:template name="header">
  <script>
   alert('hi');
  </script>
 </xsl:template>
</xsl:stylesheet>

should output:

  <script>
   alert('hi');
  </script>

NOT (as is the case with libxslt):

  <script><![CDATA[
   alert('hi');
  ]]></script>

If I wanted this, the XSL should be:

 <xsl:template name="header">
  <script>&lt;![CDATA[
   alert('hi');
  ]]&gt;</script>
 </xsl:template>

Anyway, for the time being we found a not-so-clean workaround:

<xsl:text disable-output-escaping="yes">
<![CDATA[
  <script>
    alert('hi');
  </script>
]]>
</xsl:text>