GNOME Bugzilla – Bug 647312
Streaming API slow on documents with many xml:id's
Last modified: 2021-07-05 13:22:24 UTC
I'm trying to parse documents with a large number of xml:id attributes using the streaming API. libxml2 is puzzlingly slow on them; my measurements suggest that the running time is quadratic, since each doubling of the input size multiplies the user time by roughly three to four:

$ sh make-long-xml.sh 200000 | ( time xmllint --stream --noout - )
real 0m2.027s
user 0m1.032s
sys 0m0.048s

$ sh make-long-xml.sh 400000 | ( time xmllint --stream --noout - )
real 0m4.335s
user 0m3.232s
sys 0m0.096s

$ sh make-long-xml.sh 800000 | ( time xmllint --stream --noout - )
real 0m12.786s
user 0m11.501s
sys 0m0.208s

$ sh make-long-xml.sh 1600000 | ( time xmllint --stream --noout - )
real 0m47.626s
user 0m45.691s
sys 0m0.620s
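One way to check the "quadratic" reading of the timings above is to estimate the growth exponent between consecutive runs. The following is a minimal sketch, assuming the running time behaves like t ~ n**k, where n is the argument passed to make-long-xml.sh; the helper name `scaling_exponents` is made up for illustration. An exponent near 1 would mean linear behaviour, an exponent near 2 quadratic.

```python
import math

# (argument passed to make-long-xml.sh, user CPU seconds) from the runs above
measurements = [
    (200_000, 1.032),
    (400_000, 3.232),
    (800_000, 11.501),
    (1_600_000, 45.691),
]


def scaling_exponents(points):
    """Estimate k for each consecutive pair of runs, assuming time ~ n**k."""
    exponents = []
    for (n1, t1), (n2, t2) in zip(points, points[1:]):
        # If t = c * n**k, then k = log(t2 / t1) / log(n2 / n1).
        exponents.append(math.log(t2 / t1) / math.log(n2 / n1))
    return exponents


for k in scaling_exponents(measurements):
    print(f"estimated exponent: {k:.2f}")
```

With the user times reported here the estimates rise from about 1.6 to about 2.0 as the input grows, which fits a quadratic term that increasingly dominates a linear baseline.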
Created attachment 230111: test-case producer

Attaching the script I used to produce the test cases.
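The attached script is not reproduced in this thread. For readers without access to it, a comparable input can be produced along the following lines. This is a hypothetical stand-in, not the attached make-long-xml.sh: the element names, the `id<N>` naming scheme, and the function name `write_long_xml` are all assumptions. The only property it aims to share with the report is a flat document containing many elements that each carry a unique xml:id.

```python
def write_long_xml(count, out):
    """Write a document with `count` elements, each with a unique xml:id.

    Hypothetical stand-in for the attached generator; the layout of the
    real test documents may differ.
    """
    out.write('<?xml version="1.0"?>\n')
    out.write("<root>\n")
    for i in range(count):
        # The "xml" prefix is predeclared, so no namespace declaration is needed.
        out.write(f'  <item xml:id="id{i}"/>\n')
    out.write("</root>\n")
```

Calling `write_long_xml(200000, sys.stdout)` from a small script and piping the result into `xmllint --stream --noout -` would mirror the invocations shown in the report, though timings from a differently shaped document are not directly comparable to the ones above.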
Interesting, I will have a look at it. Simple profiler output will probably show the culprit!

Daniel
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org. As part of that, we are mass-closing older open tickets in bugzilla.gnome.org which have not seen updates for a longer time (resources are unfortunately quite limited so not every ticket can get handled). If you can still reproduce the situation described in this ticket in a recent and supported software version, then please follow https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines and create a new ticket at https://gitlab.gnome.org/GNOME/libxml2/-/issues/ Thank you for your understanding and your help.