GNOME Bugzilla – Bug 302529
<script> contains cdata when output document has xhtml doctype (strict or transitional)
Last modified: 2006-01-12 12:30:07 UTC
When the input stylesheet specifies that the output has an xhtml strict or transitional doctype, <script> output nodes always contain a CDATA section:

mop:~/Desktop$ xsltproc default.xsl root.xml
<?xml version="1.0"?>
<!DOCTYPE script SYSTEM "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<script type="text/javascript"><![CDATA[
document.write("hullo");
]]></script>

mop:~/Desktop$ cat default.xsl
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
  <xsl:output doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"/>
  <xsl:template match="@*|node()|processing-instruction()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()|processing-instruction()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>

mop:~/Desktop$ cat root.xml
<script type="text/javascript">
document.write("hullo");
</script>

(I don't know if in fact this is the correct behaviour, but both xalan and saxon do not emit CDATA.)
libxml2 is just following the serialization suggestion from the XHTML1 spec (http://www.w3.org/TR/xhtml1/#h-4.8). This is on purpose, not a bug.

Daniel
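For context, the concern behind that section of the XHTML1 spec is that <script> content is #PCDATA, so "<" and "&" in script text have to be escaped unless the text sits in a CDATA section. The following is a minimal sketch of that effect, assuming Python's standard-library xml.etree.ElementTree as a stand-in serializer (it is not libxml2, and the script body is a made-up example):

```python
import xml.etree.ElementTree as ET

# Hypothetical script body containing characters that are markup in XML.
script = ET.Element("script", {"type": "text/javascript"})
script.text = "if (a < b && c) { run(); }"

# A generic XML serializer has to escape '<' and '&' in text content,
# because it does not wrap the text in a CDATA section.
serialized = ET.tostring(script, encoding="unicode")
print(serialized)
```

The printed markup carries entity references in place of the raw "<" and "&", which is the situation the CDATA wrapping is meant to avoid.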
Whilst it might in some circumstances be a good idea to escape <script> elements via cdata sections (though not in my case; it was confusing my browser, even though it otherwise understood xml), I don't think it makes sense for an *XSL processor* to be making the decision to escape individual elements. This is because:

- There already exists a good, documented way to get this behaviour, should you want it (cdata-section-elements, for which there is no corresponding inverse)
- It's unexpected behaviour. (Note that the output document above does not (apart from the doctype) even look like HTML, nor does it validate.)
- Other XSL processors do not automatically escape <script> elements.
- It makes the output doctype-specific. (Are there any other doctype-specific behaviours? What are they?)
XHTML1 *is* doctype specific and only defined by doctype specific serialization. XHTML1 also means that <br/> gets an extra space, as in <br />, etc. It is the libxml2 serializer which does this: either a serializer supports XHTML1 or it does not, and libxml2/libxslt supports XHTML1.

Per xmllint, using the XHTML1 DTD, the document you pasted is valid following XML-1.0 validity checks:

paphio:~/XML -> xmllint --valid tst.xml
<?xml version="1.0"?>
<!DOCTYPE script SYSTEM "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<script type="text/javascript"><![CDATA[
document.write("hullo");
]]></script>
paphio:~/XML ->

XML-1.0 DTD validity, Relax-NG validity and XSD validity all consider a CDATA section to be a text node (whose content is merged with adjacent text/CDATA nodes). I see no reason why this specific change would raise a validity error, and you didn't provide any. You are actually wrong: it *is* valid per the XHTML1 DTD.

I disagree with " It's unexpected behaviour.": it is expected behaviour for XHTML1 documents, as the XHTML1 spec says.

"it was confusing my browser" means the browser is not XHTML1 compliant. As a result the only option is to not feed it XHTML1.

XHTML1 is the only spec supported by libxml2 which requires a specific non-standard XML serialization. Not doing it would mean libxml2 would not be XHTML1 compliant. This is the only spec I know of which requires non-standard behaviour in that way.

Daniel
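The point about CDATA sections being treated as ordinary text can be checked with any XML parser. The following is a minimal sketch, assuming Python's standard-library xml.etree.ElementTree as a stand-in parser (it is not libxml2 and performs no DTD validation); the helper name script_text and the mixed-content document are made up for illustration:

```python
import xml.etree.ElementTree as ET

# The serialized output from the report, with and without the CDATA wrapper.
with_cdata = (
    '<script type="text/javascript">'
    '<![CDATA[ document.write("hullo"); ]]></script>'
)
plain = '<script type="text/javascript"> document.write("hullo"); </script>'

# Hypothetical document mixing plain text and a CDATA section.
mixed = "<script>a<![CDATA[ < b ]]>c</script>"


def script_text(document):
    """Return the text content of the root element as the parser reports it."""
    return ET.fromstring(document).text


# Both forms carry the same character data once parsed.
print(script_text(with_cdata) == script_text(plain))

# Adjacent text and CDATA content is merged into a single text value.
print(repr(script_text(mixed)))
```

A consumer that parses the output as XML therefore sees the same script text either way; the difference only shows up for consumers that read the serialized bytes some other way.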
The XHTML1 spec states that the script and style elements are declared as having #PCDATA content. If this is not what you want, you should add <![CDATA[ within the script element.

It is NOT the job of the XSLT processor to start adding <![CDATA[ inside the script tags! If I want this, I should do this myself!

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template name="header">
    <script>
      alert('hi');
    </script>
  </xsl:template>
</xsl:stylesheet>

should output:

<script>
alert('hi');
</script>

NOT (as is the case with libxslt):

<script><![CDATA[
alert('hi');
]]></script>

If I wanted this, the xsl should be:

<xsl:template name="header">
  <script><![CDATA[
    alert('hi');
  ]]></script>
</xsl:template>

Anyways, for the time being we found a not so clean work-around:

<xsl:text disable-output-escaping="yes">
<![CDATA[
<script>
alert('hi');
</script>
]]>
</xsl:text>