After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 620102 - xsltApplyStripSpaces() modifies the *input* document
xsltApplyStripSpaces() modifies the *input* document
Status: RESOLVED OBSOLETE
Product: libxslt
Classification: Platform
Component: general
1.1.26
Other All
: Normal critical
: ---
Assigned To: Daniel Veillard
libxml QA maintainers
Depends on:
Blocks:
 
 
Reported: 2010-05-30 16:18 UTC by Stefan Behnel
Modified: 2021-07-05 11:00 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Stefan Behnel 2010-05-30 16:18:30 UTC
This bug was found while investigating a crash that a user reported in lxml.etree during XSLT processing.

https://bugs.launchpad.net/lxml/+bug/583249

The crashing stylesheet (appended to the bug above) uses <xsl:strip-space>, and valgrind reveals that a node was freed during processing that is still in use. I copied the section that frees that node from transform.c, it's in line 5743:

------------------------------
/**
 * xsltApplyStripSpaces:
 * @ctxt:  a XSLT process context
 * @node:  the root of the XML tree
 *
 * Strip the unwanted ignorable spaces from the input tree
 */
void
xsltApplyStripSpaces(xsltTransformContextPtr ctxt, xmlNodePtr node) {
    xmlNodePtr current;
#ifdef WITH_XSLT_DEBUG_PROCESS
    int nb = 0;
#endif


    current = node;
    while (current != NULL) {
        /*
         * Cleanup children empty nodes if asked for
         */
        if ((IS_XSLT_REAL_NODE(current)) &&
            (current->children != NULL) &&
            (xsltFindElemSpaceHandling(ctxt, current))) {
            xmlNodePtr delete = NULL, cur = current->children;

            while (cur != NULL) {
                if (IS_BLANK_NODE(cur))
                    delete = cur;

                cur = cur->next;
                if (delete != NULL) {
                    xmlUnlinkNode(delete);
                    xmlFreeNode(delete);   /**** <== *****/
                    delete = NULL;
#ifdef WITH_XSLT_DEBUG_PROCESS
                    nb++;
#endif
                }
            }
        }
------------------------------

This code seems to modify the input document in place. I can't see any reason why something like this should *ever* happen during XSLT processing, and I don't see any way to work around this in lxml.etree. If the node is in use, e.g. because it was found by a previously executed XPath selection, there is no way to prevent user code from crashing.
Comment 1 Nicolas Gregoire 2017-08-29 21:17:58 UTC
This bug (still unfixed) seems to be caused by lxml, not libxslt. Reproducing this crash using any other libxslt binding (Python, Perl, JavaScript) wasn't possible.
Comment 2 Stefan Behnel 2017-08-30 05:21:02 UTC
Well, the crash might be specific to lxml, because it makes certain assumptions about documents. That doesn't means it is *caused* by lxml. The cause is that libxslt modifies the *input* document in place when applying an XSLT. That is definitely a bug in libxslt, whether it results in a crash or not.

I'm sure there are ways to make other wrappers crash, too. Maybe it's more difficult to do than for lxml, but knowing the described behaviour of libxslt, I'm sure it's possible to write code also in Perl or JavaScript that crashes because of this. At least the input modification should be visible in all wrappers.

See line 6008 in current transform.c, that's where xsltApplyStripSpaces() is used to modify the input document before running the actual transform.
Comment 3 Nicolas Gregoire 2017-08-31 15:03:57 UTC
>  I'm sure it's possible to write code also in Perl or JavaScript that crashes because of this.

Please do it. Debugging anything related to lxml is a mess...
Comment 4 Stefan Behnel 2017-08-31 17:49:19 UTC
Do we actually agree that the current behaviour is wrong, or do you consider it acceptable behaviour that an XSLT transformation modifies the input document in place?
Comment 5 Stefan Behnel 2017-08-31 18:07:11 UTC
Specifically, do you consider the following behaviour correct?


In [1]: import libxml2 as lx, libxslt as xsl

In [2]: d = b"""<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"><xsl:strip-space elements="*" /><xsl:template match="/"><xsl:copy-of select="*" /></xsl:template></xsl:stylesheet>"""

In [3]: xs = lx.parseMemory(d, len(d))

In [4]: t = xsl.parseStylesheetDoc(xs)

In [5]: xml = lx.parseFile("tests/XSLTMark/trend.xml")

In [6]: c = xml.get_children()

In [7]: text = c.get_children()

In [8]: text.get_type(), text.get_name()
Out[8]: ('text', 'text')

In [9]: out = t.applyStylesheet(xml, {})

In [10]: text.get_type(), text.get_name()
Out[10]: ('element', 'val')


This was executed in the project root directory of libxslt, using the libxml2/libxslt Python bindings.
Comment 6 Nicolas Gregoire 2017-09-04 18:32:47 UTC
I agree that this behavior is highly surprising. But there's no crashes...
Comment 7 GNOME Infrastructure Team 2021-07-05 11:00:59 UTC
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org.
As part of that, we are mass-closing older open tickets in bugzilla.gnome.org
which have not seen updates for a longer time (resources are unfortunately
quite limited so not every ticket can get handled).

If you can still reproduce the situation described in this ticket in a recent
and supported software version, then please follow
  https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines
and create a new ticket at
  https://gitlab.gnome.org/GNOME/libxslt/-/issues/

Thank you for your understanding and your help.