Bug 543489 - slow xslt processing
Status: RESOLVED NOTGNOME
Product: gtk-doc
Classification: Platform
Component: general
Version: unspecified
Hardware: Other Linux
Importance: Normal enhancement
Target Milestone: ---
Assigned To: gtk-doc maintainers
QA Contact: gtk-doc maintainers
Duplicates: 592355
Depends on:
Blocks:
 
 
Reported: 2008-07-17 20:21 UTC by Allison Karlitskaya (desrt)
Modified: 2014-02-08 20:32 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Allison Karlitskaya (desrt) 2008-07-17 20:21:15 UTC
xsltproc is very slow -- the slowest part of building all of glib, for example.  it's also the only part that has to be done completely serially.

gtk-doc should somehow support parallel building so that it can use all my cores
Comment 1 Stefan Sauer (gstreamer, gtkdoc dev) 2008-07-21 11:29:00 UTC
There is some help coming. I could, e.g., cut down the build time for gtk's gtk-doc from 20 to about 9 minutes. Regarding the -j option, I take patches :)
Comment 2 Stefan Sauer (gstreamer, gtkdoc dev) 2009-01-07 07:21:48 UTC
Also to clarify, if you want xsltproc to use your cores, file a bug against libxslt please.
Comment 3 Stefan Sauer (gstreamer, gtkdoc dev) 2009-10-27 20:26:49 UTC
Right now there are bugs in the Makefile dependencies

./autogen.sh --enable-gtk-doc
make -j2

  ...
  make[2]: Entering directory `/xxx/docs'
  Making all in api
  make[3]: Entering directory `/xxx/docs/api'
  make[3]: *** No rule to make target `xxx-docs.sgml', needed by `html-build.stamp'.  Stop.
  make[3]: *** Waiting for unfinished jobs....
  make[3]: Leaving directory `/xxx/docs/api'
  make[2]: *** [all-recursive] Error 1
  make[2]: Leaving directory `/xxx/docs'
Comment 4 Stefan Sauer (gstreamer, gtkdoc dev) 2010-01-03 12:08:26 UTC
*** Bug 592355 has been marked as a duplicate of this bug. ***
Comment 5 Stefan Sauer (gstreamer, gtkdoc dev) 2010-01-03 20:38:28 UTC
This change fixes parallel builds for me - not sure if it is correct though

diff --git a/tests/gtk-doc.make b/tests/gtk-doc.make
index 61a9eac..9d59689 100644
--- a/tests/gtk-doc.make
+++ b/tests/gtk-doc.make
@@ -102,7 +102,7 @@ sgml-build.stamp: tmpl.stamp $(DOC_MODULE)-sections.txt $(srcdir)/tmpl/*.sgml $(
        gtkdoc-mkdb --module=$(DOC_MODULE) --source-dir=$(DOC_SOURCE_DIR) --output-format=xml --expand-content-files="$(expand_content_files)" --main-sgml-file=$(DOC_MAIN_SGML_FILE) $(MKDB_OPTIONS)
        @touch sgml-build.stamp
 
-sgml.stamp: sgml-build.stamp
+sgml.stamp $(DOC_MAIN_SGML_FILE): sgml-build.stamp
        @true
 
 #### html ####
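
A minimal sketch of why the added target could make the "No rule to make target" error go away. This is a toy model in Python, not gtk-doc or GNU make code. The target names come from the log and the diff above; the full prerequisite list of html-build.stamp and the helper `unresolved` are assumptions made for illustration. The model also assumes a clean tree where the main SGML file has not been generated yet, which is presumably the situation a parallel build hits when it looks at html-build.stamp before gtkdoc-mkdb has finished.

```python
# Toy model of make's prerequisite check, not gtk-doc or GNU make code.
# Target names follow the log and the diff; everything else is hypothetical.

def unresolved(target, rules, existing_files):
    """Return prerequisites of `target` that have neither a rule nor a file."""
    return [dep for dep in rules.get(target, [])
            if dep not in rules and dep not in existing_files]

# Before the patch: only sgml.stamp is declared as depending on sgml-build.stamp.
# The prerequisites of html-build.stamp are assumed from the error message.
before = {
    "sgml-build.stamp": [],
    "sgml.stamp": ["sgml-build.stamp"],
    "html-build.stamp": ["sgml.stamp", "xxx-docs.sgml"],
}

# After the patch: the main SGML file shares the rule with sgml.stamp.
after = dict(before)
after["xxx-docs.sgml"] = ["sgml-build.stamp"]

# Clean tree: nothing has been generated yet.
clean_tree = set()

print(unresolved("html-build.stamp", before, clean_tree))  # ['xxx-docs.sgml']
print(unresolved("html-build.stamp", after, clean_tree))   # []
```

Once the file exists on disk, the check also passes with the old rules, which would fit the observation that only parallel builds were affected.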
Comment 6 Allison Karlitskaya (desrt) 2010-01-04 05:40:07 UTC
when i filed this bug i was thinking more like some way of breaking up the XSL templates so that you can generate the .html output one file at a time.  that way you could run multiple html files in parallel.  it's probably a fairly substantial change.
Comment 7 Yeti 2010-01-04 09:50:06 UTC
> you can generate the .html output one file at a time

This would e.g. mean that you have to resolve all cross-references in the XML manually, find a way to correctly generate index, toc, etc. without processing the entire document.  That's not a `fairly substantial change', that's pretty insane.
Comment 8 Stefan Sauer (gstreamer, gtkdoc dev) 2010-01-04 13:15:28 UTC
Ryan, I have made some attempts regarding this already, but it is quite complex. I wanted to rebuild a single html page when the related xml file has changed, to avoid rebuilding all files. This can only work under some assumptions:
- one has to use the xi-include indexes (not the xsl-generated ones)
- there should be no autogenerated ids
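
A rough sketch of the per-page staleness check such a rebuild would need, written in Python for illustration. The function `stale_pages`, the page names and the one-xml-file-per-html-page mapping are hypothetical and not gtk-doc's actual layout. It looks at each page in isolation, so it is only meaningful under the two assumptions listed above.

```python
# Hypothetical staleness check; the one-to-one xml -> html page mapping
# is an assumption, not gtk-doc's actual output layout.

def stale_pages(xml_mtimes, html_mtimes):
    """Return page names whose XML source is newer than the HTML output,
    or whose HTML output does not exist yet."""
    stale = []
    for name, xml_time in sorted(xml_mtimes.items()):
        html_time = html_mtimes.get(name)
        if html_time is None or xml_time > html_time:
            stale.append(name)
    return stale

# Modification times as plain numbers (larger means newer).
xml_mtimes = {"gobject": 30, "gvalue": 10, "gtype": 25}
html_mtimes = {"gobject": 20, "gvalue": 20}

print(stale_pages(xml_mtimes, html_mtimes))  # ['gobject', 'gtype']
```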

Getting the makefile rules right is tricky too. Before we go there, it would still be good to poke the libxml/libxslt people more and tell them we are not satisfied with the performance. E.g. having a means to cache a parsed style sheet could help us (as we can't keep it loaded like webapps do).
Comment 9 Yeti 2010-01-04 13:26:53 UTC
There are many more assumptions.

If the title, indexable stuff (including Since status, added functions, ...), object hierarchy, ..., changes, it is not sufficient to [re]build a single HTML page.
Comment 10 Stefan Sauer (gstreamer, gtkdoc dev) 2010-01-04 13:34:15 UTC
David, if you change one header, gtkdoc-mkdb would rebuild:
- one or more xml/<section>.xml files
- possibly also indexes such as xml/{tree_index,api_index,...}.xml

In most cases it would be enough to rebuild the html files. If the hierarchy changes, gtkdoc-mkdb would change the xml files of objects that are affected (new prerequisite iface, new object in hierarchy).
Comment 11 Yeti 2010-01-04 13:49:32 UTC
The trouble is that at this moment you can use any inference and content generation mechanisms DocBook offers.  This will be lost.  The processing is inherently global and I don't think it is reasonable to fight this.  Consider also the single-page, PDF, man and other processing options...

The questions should be:

1) why can't xsltproc itself make use of multiple threads?

2) why does it take so long -- even compared to typical compilation of other DocBook documents?

We cannot do much with 1) but maybe we can do something with 2).
Comment 12 Stefan Sauer (gstreamer, gtkdoc dev) 2010-01-04 14:37:32 UTC
I will resubscribe to xml-devel list and try to get some answers.
Comment 13 Stefan Sauer (gstreamer, gtkdoc dev) 2010-03-05 10:24:21 UTC
I renamed the bug as imho there is nothing wrong with our makefile rules. Also a small status update:

libxml/libxslt profiling
- some functions show up high in the profile
- no easy candidates for optimizations
  - I tried adding G_LIKELY/UNLIKELY macros to libxml2, but it speeds things up by 1% if at all
- there is a *lot* of memcpy and strcmp as expected

docbook stylesheet
- it would be nice to have a means to generate a variant of the official stylesheets, with customizations preapplied

xslt compilers
- I tried some xslt compilers, but did not even succeed in building them, old crufty C++ code :/

multithreading
- after chunking, the outputs could be generated by multiple worker threads (one shared read-only source document, multiple output files). I've asked on the libxslt list - no reply.
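
The shape of the multithreading idea (one shared read-only document, one output per chunk) can be sketched in Python. This is only a structural illustration under stated assumptions: `render_chunk` and `render_all` are hypothetical names, the "document" is a plain dict rather than a libxml2 tree, and no libxslt API is involved. In a standard CPython build the threads would not give a real CPU speedup either, so the sketch shows the division of work, not a performance result.

```python
# Structural sketch only: render_chunk is a hypothetical stand-in for the
# per-chunk XSLT step, and the document is a plain dict, not a libxml2 tree.
from concurrent.futures import ThreadPoolExecutor
from types import MappingProxyType

def render_chunk(document, chunk_id):
    """Produce the output for one chunk from the shared, read-only document."""
    title = document[chunk_id]
    return chunk_id + ".html", "<h1>" + title + "</h1>"

def render_all(document, workers=4):
    """Render every chunk on a pool of worker threads; one output per chunk."""
    shared = MappingProxyType(document)  # read-only view shared by all workers
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(lambda cid: render_chunk(shared, cid), sorted(shared)))

outputs = render_all({"gobject": "GObject", "gtype": "GType"})
print(outputs["gtype.html"])  # <h1>GType</h1>
```

Because the workers only read the shared document and each writes its own output, no locking is needed in this model; whether the same holds inside libxslt is exactly the open question raised above.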
Comment 14 Stefan Sauer (gstreamer, gtkdoc dev) 2014-02-08 20:32:52 UTC
Closing this now. If we want faster doc builds, we need to speed up xsltproc (e.g. make the chunker multithreaded).