Bug 654530 – Improve CPU usage of DIDL-Lite parsing


Summary:	Improve CPU usage of DIDL-Lite parsing


Status:	RESOLVED OBSOLETE

Product:	gupnp-av
Classification:	Other
Component:	General
Version:	unspecified
Hardware:	Other Linux

Importance:	Normal enhancement
Target Milestone:	---
Assigned To:	GUPnP Maintainers
QA Contact:	GUPnP Maintainers

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2011-07-13 08:31 UTC by Jens Georg
Modified:	2021-05-17 17:04 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
Callgrind output (525.51 KB, application/octet-stream) 2012-07-28 12:00 UTC, Jens Georg	Details
Tool to Split large DIDL (853 bytes, text/x-vala) 2012-07-28 12:02 UTC, Jens Georg	Details
Tool that parses single didl snippets (873 bytes, text/x-vala) 2012-07-28 12:03 UTC, Jens Georg	Details

Description Jens Georg 2011-07-13 08:31:59 UTC

DIDL-Lite parsing seems to be unnecessarily heavyweight:

> I can't point at a particular issue. All I can say is that parsing
> DIDL-Lite takes a considerable amount of CPU cycles. It might help to
> use a SAX based parser instead of building the DOM, but that would be a
> major rewrite of gupnp-av and I am not sure if it's worth the effort.

Comment 1 Mark Ryan 2012-04-19 12:42:45 UTC

Looking at the code, the problem may be due to the implementation of GUPnPDIDLLiteObject and its sub-classes.  It may be that it is the GUPnPDIDLLiteObject methods used to retrieve object values that are slow, rather than the parsing code itself.

GUPnPDIDLLiteObjects do not store separate member variables for each of their values.  Instead they store this information in an in-memory XML tree.  Each time the user tries to read a property from the GUPnPDIDLLiteObject, a linear search of the XML node's children or its attributes must be performed, resulting in lots of strcmp calls.  If the item contains lots of properties, this is going to be slow, particularly if you try to read them all, which I guess would be a typical use case.  The penalty is heaviest if a given property does not exist, because the whole list is scanned before giving up, which again is probably quite typical.

It would probably be more efficient to extract all the values when the object is parsed and store them as member variables in the object.  The parser itself would be slightly slower, but any function called to retrieve item values would execute in constant time, i.e., the "object-available" callback would be much faster.  Setting values would be quicker as well.

This analysis is based on looking at the code.  I have no empirical evidence to back up my claims.  Also as Jens mentions, this would be a big change.
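
The trade-off described in this comment can be illustrated with a minimal, self-contained C sketch. It does not use gupnp-av or libxml2; the types Prop and CachedObject and both functions are hypothetical stand-ins, and the three property names are only examples. It contrasts a lookup that scans a node's children with strcmp on every read against a one-time extraction into member variables.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one child element of an object's XML node. */
typedef struct {
    const char *name;
    const char *value;
} Prop;

/* Tree-backed style: every read walks the children and compares names.
 * A property that does not exist costs a scan of the whole list. */
static const char *
lookup_linear (const Prop *props, size_t n_props, const char *name)
{
    assert (props != NULL || n_props == 0);
    for (size_t i = 0; i < n_props; i++) {
        if (strcmp (props[i].name, name) == 0)
            return props[i].value;
    }
    return NULL;
}

/* Cached style: values are copied into members once, at parse time,
 * so later reads are plain member accesses. */
typedef struct {
    const char *title;
    const char *creator;
    const char *album;
} CachedObject;

static CachedObject
cached_object_from_props (const Prop *props, size_t n_props)
{
    CachedObject obj = { NULL, NULL, NULL };

    assert (props != NULL || n_props == 0);
    for (size_t i = 0; i < n_props; i++) {
        if (strcmp (props[i].name, "dc:title") == 0)
            obj.title = props[i].value;
        else if (strcmp (props[i].name, "dc:creator") == 0)
            obj.creator = props[i].value;
        else if (strcmp (props[i].name, "upnp:album") == 0)
            obj.album = props[i].value;
    }
    return obj;
}
```

The cached variant pays for one pass over the children up front and holds one pointer per known property, which is the extra parse-time and memory cost weighed against faster reads.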

Comment 2 Jens Georg 2012-04-19 12:46:44 UTC

Oh. I thought I added my benchmark to this bug. An awful lot of time (~70% IIRC) really is spent in the XML parsing. I'll try to dig up the sample code I used and attach it here.

Comment 3 Zeeshan Ali 2012-04-19 14:00:37 UTC

Mark, thanks so much for your analysis. When I first wrote this code, I must admit that I was focused on keeping the memory footprint low while still using DOM. Unfortunately there are many string props involved, and keeping them in memory would most probably mean increasing our memory footprint significantly. But as Jens pointed out, retrieval of props isn't the biggest performance culprit here.

Comment 4 Mark Ryan 2012-04-19 15:43:49 UTC

I was thinking that property retrieval would slow down the parser if you were to register an "object-available" callback that retrieves lots of properties from the objects it is passed.  Since such callbacks are run before gupnp_didl_lite_parser_parse_didl completes, I thought that they might be the bottleneck.  But it seems from your and Jens' comments that this is not the case.
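
Why callback cost would count towards the parse call can be shown with a small model in plain C. This is only a sketch of synchronous callback dispatch: parse_objects, ObjectAvailableFunc, and count_object are hypothetical names, not the gupnp-av API, and the "objects" are just ID strings.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type, modelled on an "object-available" handler. */
typedef void (*ObjectAvailableFunc) (const char *object_id, void *user_data);

/* Hypothetical parser loop: each object is handed to the callback
 * synchronously, so this function returns only after every callback
 * has finished. Time spent in the callback is time spent "parsing". */
static size_t
parse_objects (const char *const *ids, size_t n_ids,
               ObjectAvailableFunc callback, void *user_data)
{
    size_t emitted = 0;

    assert (ids != NULL || n_ids == 0);
    for (size_t i = 0; i < n_ids; i++) {
        if (callback != NULL)
            callback (ids[i], user_data);
        emitted++;
    }
    return emitted;
}

/* Example handler: counts the objects it is given. */
static void
count_object (const char *object_id, void *user_data)
{
    (void) object_id;
    size_t *count = user_data;
    (*count)++;
}
```

A profile taken around such a call would attribute the handler's work to the parse function, so the two costs need to be separated before concluding which one dominates.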

Comment 5 Jens Georg 2012-04-20 10:39:57 UTC

That would indeed be an additional penalty on top.

Comment 6 Jens Georg 2012-07-28 12:00:09 UTC

Created attachment 219771 [details]
Callgrind output

Comment 7 Jens Georg 2012-07-28 12:02:16 UTC

Created attachment 219772 [details]
Tool to Split large DIDL

Used to split a large DIDL response, e.g. from a search(upnp:class derivedFrom "object.item"), into small files.

Comment 8 Jens Georg 2012-07-28 12:03:32 UTC

Created attachment 219773 [details]
Tool that parses single didl snippets

Takes the DIDL snippet files and just runs gupnp_didl_lite_parser_parse_didl on them.

Comment 9 GNOME Infrastructure Team 2021-05-17 17:04:33 UTC

-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/gupnp-av/-/issues/1.