GNOME Bugzilla – Bug 377440
exclude-result-prefixes not honored (regression)
Last modified: 2013-03-04 12:43:56 UTC
Please describe the problem:
The exclude-result-prefixes attribute in one of my XSLT files is not honored. This worked with libxslt 1.1.15, but no longer works with libxslt 1.1.18 (Debian).

Steps to reproduce:
1. wget -q -O - http://www.mpfr.org/faq.html | xsltproc --nodtdattr faq.xsl -
   with the faq.xsl that will be attached.

Actual results:
I get in particular:
    <h:style xmlns:h="http://www.w3.org/1999/xhtml">test</h:style>

Expected results:
I should have got:
    <style>test</style>

Does this happen every time?
Yes.

Other information:
Created attachment 76920 [details] XSLT file faq.xsl
AFAICS exclude-result-prefixes isn't supposed to simply remove all of the given namespace prefixes. It's only needed to remove unneeded namespace declarations (e.g. for namespaces that are only used to look up certain nodes, but that don't appear in the output).

But I still think the current behaviour is a problem. Let's use another example to illustrate it. Stylesheet is:

    <?xml version="1.0" encoding="iso-8859-1"?>
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:html="http://www.w3.org/1999/xhtml">
      <xsl:output indent="yes"/>
      <xsl:template match="root">
        <html:div/>
      </xsl:template>
      <xsl:template match="/">
        <html xmlns="http://www.w3.org/1999/xhtml">
          <head>
            <title>Test</title>
          </head>
          <body>
            <xsl:apply-templates/>
          </body>
        </html>
      </xsl:template>
    </xsl:stylesheet>

Source XML is:

    <?xml version="1.0" encoding="iso-8859-1"?>
    <root/>

Running "xsltproc bug.xsl bug.xml" with libxslt 1.1.19 we get:

    <?xml version="1.0"?>
    <html xmlns="http://www.w3.org/1999/xhtml" xmlns:html="http://www.w3.org/1999/xhtml">
      <head>
        <title>Test</title>
      </head>
      <body>
        <html:div/>
      </body>
    </html>

While the result may be correct, there's a superfluous namespace declaration and prefix for the div element. This breaks non-XML-conformant HTML browsers, for example. Of course, one could simply use the HTML output method to fix this.
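To illustrate why the result "may be correct": a namespace-aware XML parser sees the same expanded element names whether or not the html prefix is used. The following is only a sketch using Python's standard xml.etree.ElementTree (it has nothing to do with libxslt itself). The two documents are trimmed-down versions of the output above, and the helper name expanded_names is made up for this example.

```python
import xml.etree.ElementTree as ET

# Trimmed-down version of the libxslt 1.1.19 output (prefixed div).
prefixed = (
    '<html xmlns="http://www.w3.org/1999/xhtml" '
    'xmlns:html="http://www.w3.org/1999/xhtml">'
    '<body><html:div/></body></html>'
)

# The same document with the div in the default namespace.
unprefixed = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    '<body><div/></body></html>'
)


def expanded_names(xml_text):
    """Return the {namespace}local-name tag of every element, in document order."""
    return [el.tag for el in ET.fromstring(xml_text).iter()]


# Both documents yield identical expanded names, so this prints True.
print(expanded_names(prefixed) == expanded_names(unprefixed))
```

So the difference only matters to consumers that are not namespace-aware, such as HTML parsers that treat "html:div" as an unknown tag name.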
I just realized that my example above produces the same result with libxslt 1.1.17. But if you specify exclude-result-prefixes="html" with 1.1.17, the superfluous namespace prefix disappears.
I had some time to look into this a little more and found that this regression is caused by the code in xsltGetSpecialNamespace in libxslt/namespaces.c. It's the #if block at line 552 (in libxslt 1.1.20):

    /*
    * Either no matching ns-prefix was found or the namespace is
    * shadowed.
    * Create a new ns-decl on the current result element.
    *
    * Hmm, we could also try to reuse an in-scope
    * namespace with a matching ns-name but a different
    * ns-prefix.
    * What has higher priority?
    *  1) If keeping the prefix: create a new ns-decl.
    *  2) If reusal: first lookup ns-names; then fallback
    *     to creation of a new ns-decl.
    * REVISIT: this currently uses case 1) although
    *  the old way was use xmlSearchNsByHref() and to let change
    *  the prefix.
    */
    #if 0
        ns = xmlSearchNsByHref(target->doc, target, nsName);
        if (ns != NULL)
            return(ns);
    #endif

Changing this to #if 1 enables the old behavior. I think libxslt should at least check if the default namespace of target matches nsName and simply use the default namespace in that case.
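The two strategies named in that comment (keep the prefix and create a new declaration, versus reuse any in-scope declaration with the same namespace name) can be modeled with a toy sketch. This is not libxslt code: the Element class and the functions in_scope and pick_prefix are hypothetical names invented here, and prefix clashes on the target itself are deliberately not modeled.

```python
from dataclasses import dataclass, field
from typing import Optional

XHTML = "http://www.w3.org/1999/xhtml"


@dataclass
class Element:
    # Maps prefix -> namespace name; the default namespace uses the key None.
    ns_decls: dict = field(default_factory=dict)
    parent: Optional["Element"] = None


def in_scope(el):
    """Collect in-scope declarations; nearer declarations shadow outer ones."""
    scope = {}
    while el is not None:
        for prefix, uri in el.ns_decls.items():
            scope.setdefault(prefix, uri)
        el = el.parent
    return scope


def pick_prefix(target, prefix, ns_name, reuse_by_name):
    """Return the prefix to use for ns_name on target (None = default namespace)."""
    scope = in_scope(target)
    if scope.get(prefix) == ns_name:
        return prefix  # the wanted prefix is already bound to ns_name
    if reuse_by_name:
        # Case 2: reuse any in-scope declaration with a matching namespace name.
        for other_prefix, uri in scope.items():
            if uri == ns_name:
                return other_prefix
    if prefix in target.ns_decls:
        raise ValueError("prefix clash on target; not modeled in this sketch")
    # Case 1: keep the prefix and create a new declaration on the target.
    target.ns_decls[prefix] = ns_name
    return prefix


html = Element({None: XHTML})

body_keep = Element(parent=html)
print(pick_prefix(body_keep, "html", XHTML, reuse_by_name=False))  # html

body_reuse = Element(parent=html)
print(pick_prefix(body_reuse, "html", XHTML, reuse_by_name=True))  # None
```

With reuse_by_name=False the model adds a second declaration for the same namespace, which corresponds to the xmlns:html seen in the output. With reuse_by_name=True it falls back to the default namespace that is already in scope and adds nothing.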
(In reply to comment #2)
> Of course, one could simply use the HTML output method to fix this.

This would not be a satisfactory solution, as the goal is to be readable by HTML parsers *and* XML parsers. The solution of using xmlns="http://www.w3.org/1999/xhtml" and avoiding prefixes for HTML elements works. Perhaps that's the correct way of doing it, but I'm not sure.
(In reply to comment #5)
> The solution of using
> xmlns="http://www.w3.org/1999/xhtml" and avoiding prefixes for HTML elements
> works. Perhaps that's the correct way of doing, but I'm not sure.

That's the best solution for now. If you have HTML nodes with a namespace prefix in a source document, you can write a simple transform that strips the namespace prefix.
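As a rough illustration of what such a prefix-stripping step has to achieve (shown here in Python's standard xml.etree.ElementTree rather than in XSLT, so this is not the transform itself): the element stays in the XHTML namespace, but that namespace is declared as the default one, so no prefix appears in the output. The input string is the problematic element from the original report.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"

# The prefixed element from the "Actual results" of the report.
source = '<h:style xmlns:h="http://www.w3.org/1999/xhtml">test</h:style>'

# Ask the serializer to emit the XHTML namespace as the default namespace.
# Note: register_namespace changes module-wide state.
ET.register_namespace("", XHTML)

result = ET.tostring(ET.fromstring(source), encoding="unicode")
print(result)  # <style xmlns="http://www.w3.org/1999/xhtml">test</style>
```

The namespace name is unchanged and only the prefix is gone, so the result remains usable by XML parsers as well as HTML parsers.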
FYI, xalan 1.10.0 under Debian has the same behavior as the current libxslt (for xalan, I just had to remove the DOCTYPE line from the HTML source document to avoid an error).

Moreover, stripping the namespace prefix is not sufficient: one also needs to make sure that the default namespace is correct (for XML parsers). I thought that exclude-result-prefixes was doing this, as this was how old libxslt versions behaved. But it is now clear that this is not what the XSLT spec specifies (in particular because one can exclude the default namespace with "#default"), as said in Comment #2.

It seems that the XSLT spec (latest version, from 16 November 1999) doesn't specify what the behavior should be when a prefix listed in exclude-result-prefixes is used in the output tree. There is nothing concerning exclude-result-prefixes in the errata either. I wondered whether this should be regarded as an error. But a note in the XSLT spec seems to imply that the only purpose of exclude-result-prefixes is to avoid "superfluous namespace declarations in the result tree", nothing like making sure that some prefix isn't used in the result tree. So, the current libxslt/xalan behavior can be regarded as what should be done.