Bug 165547 – Performance issue when matching text()|*

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 165547 - Performance issue when matching text()|*


Summary:	Performance issue when matching text()\|*


Status:	RESOLVED OBSOLETE

Product:	libxslt
Classification:	Platform
Component:	general
Version:	1.1.12
Hardware:	Other All

Importance:	Normal normal
Target Milestone:	---
Assigned To:	Daniel Veillard
QA Contact:	libxml QA maintainers

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2005-01-28 20:29 UTC by Jerome Pesenti
Modified:	2021-07-05 10:59 UTC

See Also:
GNOME target:	---
GNOME version:	---

Description Jerome Pesenti 2005-01-28 20:29:37 UTC

Please describe the problem:
We encountered a strange performance problem related
to the cost of building large nodesets.  In
particular, it seems that the cost of building a
nodeset is not linear in the size.

Steps to reproduce:
To demonstrate
this we created three XML files:

<?xml version="1.0" ?>
<topnode>
<inner-node/>
<inner-node/>
 ...
</topnode>

The first had 1,000 of the inner nodes (and the
corresponding newlines), the second 3,000 and the last
10,000.  In the tests, we use the following
stylesheet:

<?xml version="1.0" ?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform";
version="1.0"
>

<xsl:template match="topnode">
  <xsl:call-template name="nodes-only"/>
  <xsl:call-template name="text-only"/>
  <xsl:call-template name="nodes-or-text"/>
</xsl:template>

<xsl:template name="nodes-only">
  <xsl:value-of select="count(*)"/>
  <xsl:value-of select="'&#10;'"/>
</xsl:template>

<xsl:template name="text-only">
  <xsl:value-of select="count(text())"/>
  <xsl:value-of select="'&#10;'"/>
</xsl:template>

<xsl:template name="nodes-or-text">
  <xsl:value-of select="count(*|text())"/>
  <xsl:value-of select="'&#10;'"/>
</xsl:template>

</xsl:stylesheet>

>From the profiling using version 1.1.12, we see the
following times:

Template Name      1000 nodes    3000 nodes    10,000
nodes
nodes-or-text        4890          77834         
1870993
text-only            1274          12357          
213519
nodes-only             27            257            
1526

which corresponds to the following speed ratios

Template Name      1000 nodes    3000 nodes    10,000
nodes
nodes-or-text        1           15.9 times     382.6
times
text-only            1            9.7 times     167.6
times
nodes-only           1            9.5 times     56.5
times

This is definitely not scaling linearly.  And, a
second interesting point, is that count *|text() is 5
times slower than counting * and then counting text()
separately.

Regards,
Jerome Pesenti


Actual results:


Expected results:


Does this happen every time?


Other information:

Comment 1 Daniel Veillard 2005-03-30 12:22:57 UTC

"This is definitely not scaling linearly.  And, a
second interesting point, is that count *|text() is 5
times slower than counting * and then counting text()
separately."

 That's normal ! When you evaluate * | text() you must
return a sorted list, and sorting is not linear. 
There are a lot of optimizations possibles, but the basic
default processing won't be linear and that's normal !

 Discussions about optimizations should go on the list,
not in a bugzilla I will be the only one to look at. So
I'm closing it, followup on the list if interested

Daniel

Comment 2 Jerome Pesenti 2005-03-30 17:29:03 UTC

Daniel, the count(*|text()) is just the second point. The main point is that
even doing count(*) or count(text()) is NOT linear, it's even worse than
quadratic!!! Looks like a bug to me so I am reopening the bug.

Comment 3 kbuchcik 2006-05-19 10:06:39 UTC

I've tried to optimize the function xmlXPathCmpNodes() which is called by
xmlXPathNodeSetSort() (I'll commit the code later today). I'll
describe here the issues/thoughts/solutions I came accross.

The analysis of the expression "count(/foo/text())" revealed the
simplified XPath callstack shown below. Not that we have 2 XPATH_OP_SORT
here. I used a similar scenario as the one already reported (about 3000
elements and 3001 text-nodes):

xmlXPathRunEval  
  -->
  xmlXPathCompOpEval()
    XPATH_OP_SORT
    --> xmlXPathCompOpEval()
      XPATH_OP_FUNCTION
      --> xmlXPathCompOpEval()
        XPATH_OP_ARG
        --> xmlXPathCompOpEval()
          XPATH_OP_SORT
          --> xmlXPathCompOpEval()
            XPATH_OP_COLLECT
            --> xmlXPathCompOpEval()
              XPATH_OP_NODE
              --> xmlXPathNewNodeSet()
              --> valuePush()      
            --> xmlXPathNodeCollectAndTest():
              Here 3001 text-nodes are put into a node-set
          --> xmlXPathNodeSetSort()
            --> xmlXPathCmpNodes()
      --> xmlXPathCountFunction    

The sorting is responsible for the slowdown of the evaluation of
your scenario.
The reasons are:
1) Text-nodes (and all other non-element nodes) are not indexed.
2) All the text-nodes (3001) in this scenario are siblings.
   Together with 1) this results in a worst-case scenario for the
   comparison algorithm called by the sorting algorithm.

This means: that the node-comparison algorithm cannot use an index,
so it has to compare many nodes (in the worst case 3000) in
xmlXPathCmpNodes() for every call of this function.
I don't know exactly how the sorting algorithm works, but if we frankly
assume that xmlXPathCmpNodes() is called about 15000 times and has to
compare about 15000 nodes, then this sums up and takes much time.

On the other hand, for element-nodes, indexes can be used; such a
comparison will always be much faster. The worst-case scenario, as
described above for text-nodes, does not apply to sorting of nodes with
indexes.

An other question is why a sorting is performed at all for an expression
like "count(/foo/text())". First we just want to count the nodes; so
an internal optimization of the expression could recognize this fact and
would not sort. Further an expression like "/foo/text()" also need not be
sorted, since the returned nodes are put in document order into the
resulting node-set. So there's also room for optimization at
compilation time of XPath expressions.

Comment 4 kbuchcik 2006-05-19 10:35:27 UTC

To solve this wort-case scenario a similar technique as for attributes
was applied for text-nodes, cdata-section-nodes, comment-nodes and
processing-instruction-nodes: we'll resolve to the nearest element of
such nodes in order to use the indexes of those elements for faster
comparison. Note that the fix won't have an effect for non-indexed trees,
so always call xmlXPathOrderDocElems() (xpath.c) if you have big trees with
many siblings. Libxslt automatically creates indexes for input trees.

Current results 
----------------
- run in a VMWare with win
- run with the refactored code (defined XSLT_REFACTORED); but this
  should not have impact wrt to speed in this scenario

Basic structure of the scenario:
<foo>
  <bar/>
</foo>

1000 "bar" elements
-------------------
xsltproc --profile bug165547.xsl big-list-1000.xml

number               match                name      mode  Calls Tot 100us Avg

    0                            nodes-or-text                1   1130   1130
    1                               nodes-only                1     73     73
    2                                text-only                1     73     73
    3                  foo                                    1      2      2

                         Total                                4   1278

3000 "bar" elements
-------------------
xsltproc --profile bug165547.xsl big-list-3000.xml

number               match                name      mode  Calls Tot 100us Avg

    0                            nodes-or-text                1   9826   9826
    1                                text-only                1    219    219
    2                               nodes-only                1    197    197
    3                  foo                                    1      3      3

                         Total                                4  10245

10000 "bar" elements
-------------------
xsltproc --profile bug165547.xsl big-list-1
0000.xml
Max inodes: 4
Max LREs  : 2
number               match                name      mode  Calls Tot 100us Avg

    0                            nodes-or-text                1 593353 593353
    1                               nodes-only                1   1831   1831
    2                                text-only                1   1723   1723
    3                  foo                                    1      4      4

                         Total                                4 596911

Comment 5 kbuchcik 2006-05-19 11:26:57 UTC

The optimization is in CVS now, xpath.c, revisiion 1.304.

Comment 6 kbuchcik 2006-05-22 15:45:31 UTC

Eliminated sorting of node-sets if used for the XPath count() function.
Committed to CVS, xpath.c, revision 1.306

The union-expression "count(*|text())" still takes longer. This is due
to the fact that those 2 result node-sets need to be merged, so
(for 10000 "bar" elements) 10000 text-nodes need to be pointer-compared
with 10000 element-nodes, thus we get 10000000 comparisons and additional
node-type checks.

One (theoretical) way I see to optimize this is to rewrite the XPath
module to use a tree-traversal/node-testing/predicate-applying
mechanism which can evaluate each of the unioned expressions against
the same node, so we wouldn't have to merge. But I don't think we want
to do that ;-)

An other way to enhance this is to expand the xmlXPathObj to hold
more information. E.g. a field like nodeSetType would be used to
identify if a node-list consist of nodes of a specific type or of
mixed types.
For our scenario this would mean that we wouldn't need to take care
of duplicates during the merge, since one node-set consists of
element-nodes and the other of text-nodes.
(By the way, I would like the xmlXPathObj to hold a flag
indicating if a node-set was already sorted)

Current results 
----------------
- still run in a VMWare with win, memory-debug dll
- still run with the refactored code (defined XSLT_REFACTORED); but this
  should not have impact wrt to speed in this scenario

1000 "bar" elements
-------------------
xsltproc --profile bug165547.xsl big-list-1000.xml
number               match                name      mode  Calls Tot 100us Avg

    0                            nodes-or-text                1    754    754
    1                               nodes-only                1     43     43
    2                                text-only                1     33     33
    3                  foo                                    1      3      3

                         Total                                4    833

3000 "bar" elements
-------------------
xsltproc --profile bug165547.xsl big-list-3000.xml
number               match                name      mode  Calls Tot 100us Avg

    0                            nodes-or-text                1   8513   8513
    1                               nodes-only                1     75     75
    2                                text-only                1     65     65
    3                  foo                                    1      2      2

                         Total                                4   8655

10000 "bar" elements
--------------------
xsltproc --profile bug165547.xsl big-list-10000.xml
number               match                name      mode  Calls Tot 100us Avg

    0                            nodes-or-text                1 528952 528952
    1                               nodes-only                1    279    279
    2                                text-only                1    250    250
    3                  foo                                    1      5      5

                         Total                                4 529486

Comment 7 Stefan Sauer (gstreamer, gtkdoc dev) 2010-06-21 18:37:33 UTC

Is there still some work to do with regard to this bug/rfe?

Comment 8 GNOME Infrastructure Team 2021-07-05 10:59:34 UTC

GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org.
As part of that, we are mass-closing older open tickets in bugzilla.gnome.org
which have not seen updates for a longer time (resources are unfortunately
quite limited so not every ticket can get handled).

If you can still reproduce the situation described in this ticket in a recent
and supported software version, then please follow
  https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines
and create a new ticket at
  https://gitlab.gnome.org/GNOME/libxslt/-/issues/

Thank you for your understanding and your help.