After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 599870 - [API] micro-optimisation: add GST_DEBUG_FUNCPTR_FULL
[API] micro-optimisation: add GST_DEBUG_FUNCPTR_FULL
Status: RESOLVED OBSOLETE
Product: GStreamer
Classification: Platform
Component: gstreamer (core)
git master
Other Linux
: Normal enhancement
: NONE
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2009-10-28 01:52 UTC by Tim-Philipp Müller
Modified: 2009-12-02 00:59 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Add GST_DEBUG_FUNCPTR_FULL() and use it in core (39.23 KB, patch)
2009-10-28 01:52 UTC, Tim-Philipp Müller
none Details | Review

Description Tim-Philipp Müller 2009-10-28 01:52:01 UTC
Created attachment 146401 [details] [review]
Add GST_DEBUG_FUNCPTR_FULL() and use it in core

Edward mentioned the other day that GST_DEBUG_FUNCPTR showed up in his list
when benchmarking decodebin2 prerolling, so that got me curious.

Turns out that in the course of playbin2 prerolling to play a video we are
re-registering about 1100 function names/pointers we've already registered.
Almost all of these come from gstpad.c and gstghostpad.c, and every attempt
will incur a hash table lookup protected by a mutex. Not sure if contention was
actually observed or not, but it's conceivable and is likely to be more
noticable in more complex dyanamic pipelines (buzztard etc.?).

Enter GST_DEBUG_FUNCPTR_FULL() which will shortcut the registration based on a
local (static) gboolean.

Admittedly not very elegant, but shortcuts 99% of the re-registration attempts
already, just by using it in gstpad.c and gstghostpad.c.

Not entirely sure if it's worth it at all, absolute numbers are still fairly
small after all, or if there's a nicer way or if it should just be kept
internal, but I thought I'd put it out there for other people to comment on.
Comment 1 Sebastian Dröge (slomo) 2009-10-28 06:25:32 UTC
Makes sense... letting instantiation block on a global mutex is probably not the best idea :)

Changing that mutex to a RW lock and letting _gst_debug_register_funcptr() first take a R lock to check if it already is inserted and only take a W lock if it isn't might be a cleaner fix though.
Comment 2 Edward Hervey 2009-10-28 07:32:30 UTC
Review of attachment 146401 [details] [review]:

::: gst/gstinfo.h
@@ +1398,2 @@
 #define GST_DEBUG_FUNCPTR(ptr) (ptr)
+#define GST_DEBUG_FUNCPTR_FULL(ptr,inited) (ptr)

When debugging is disabled, shouldn't that 'inited' boolean be removed somehow (since it won't be used) ?
Comment 3 Edward Hervey 2009-10-28 07:45:05 UTC
(In reply to comment #1)
> Makes sense... letting instantiation block on a global mutex is probably not
> the best idea :)
> 
> Changing that mutex to a RW lock and letting _gst_debug_register_funcptr()
> first take a R lock to check if it already is inserted and only take a W lock
> if it isn't might be a cleaner fix though.

  Use a rwlock will be even worse, since the glib rwlock isn't a native implementation, but instead takes two regular locks... so best case scenario you'd be now taking two lock, and worst case scenario four :)
Comment 4 Sebastian Dröge (slomo) 2009-10-28 07:56:24 UTC
Yes, I'm talking about sane RW locks :) Also, didn't you write a patch to let the RW locks use the pthread RW locks because of this? :P
Comment 5 Tim-Philipp Müller 2009-10-28 08:53:34 UTC
> When debugging is disabled, shouldn't that 'inited' boolean be removed somehow
> (since it won't be used) ?

I guess so. I wasn't going to litter the code with #ifndef #endif though unless a compiler actually
complained though. Filed it in the bridges-to-cross-when-we-get-there-section for now.
Comment 6 Tim-Philipp Müller 2009-10-28 08:58:59 UTC
> Changing that mutex to a RW lock and letting _gst_debug_register_funcptr()
> first take a R lock to check if it already is inserted and only take a W lock
> if it isn't might be a cleaner fix though.

Depends on whether the issue was primarily contention (doubtful?) or the hash table lookups I guess.
Comment 7 Sebastian Dröge (slomo) 2009-10-28 09:13:38 UTC
(In reply to comment #4)
> Yes, I'm talking about sane RW locks :) Also, didn't you write a patch to let
> the RW locks use the pthread RW locks because of this? :P

Also you could use #ifdef HAVE_PTHREAD a pthread RW lock implementation inside gstinfo.c for that hash table, use GStaticRWMutex if GLib new enough and contains your patch and otherwise continue to use the broken normal mutex otherwise.

(In reply to comment #6)
> Depends on whether the issue was primarily contention (doubtful?) or the hash
> table lookups I guess.

I was expecting that contention was the problem here. Hash table lookups should be rather fast in this case, it should more or less just be

somearray[(funcptr % somevalue)] and check if the pointers are equal (*)

If there are multiple function pointers mapped to the same bucket this would just be another (very few) indirections through a linked list (chained hash table implementation) or some more of (*) (open addressing).



Another source of overhead would be the static mutex implementation btw, Edward said it's very inefficient too... so maybe a GMutex instead of a GStaticMutex would help too.
Comment 8 Edward Hervey 2009-10-28 10:24:33 UTC
(In reply to comment #6)
> > Changing that mutex to a RW lock and letting _gst_debug_register_funcptr()
> > first take a R lock to check if it already is inserted and only take a W lock
> > if it isn't might be a cleaner fix though.
> 
> Depends on whether the issue was primarily contention (doubtful?) or the hash
> table lookups I guess.

both actually.

For every single element/pad/object that you create that uses those decorators... you'll be:
(1) introducing extra contention, what's worse not once but many times (for every GST_DEBUG_FUNCPTR call)
(2) using g_hash_table ... when you shouldn't need to

Maybe we could adjust the macro to be a G_STMT with the volatile inside.
Comment 9 David Schleef 2009-10-28 19:44:15 UTC
Thinking back to my non-existent CS days, shouldn't the ideal hash table lookup in this case be lockless?  (Or rather, lockless check followed by lock;recheck;register;unlock.)  If this is not the case, let's fix glib.
Comment 10 Sebastian Dröge (slomo) 2009-10-28 20:22:25 UTC
(In reply to comment #9)
> Thinking back to my non-existent CS days, shouldn't the ideal hash table lookup
> in this case be lockless?  (Or rather, lockless check followed by
> lock;recheck;register;unlock.)  If this is not the case, let's fix glib.

Well, GHashTable simply isn't threadsafe and you need to do the locking from the outside :(
Comment 11 Sebastian Dröge (slomo) 2009-11-06 15:27:13 UTC
(In reply to comment #8)
> (In reply to comment #6)
> > > Changing that mutex to a RW lock and letting _gst_debug_register_funcptr()
> > > first take a R lock to check if it already is inserted and only take a W lock
> > > if it isn't might be a cleaner fix though.
> > 
> > Depends on whether the issue was primarily contention (doubtful?) or the hash
> > table lookups I guess.

> Maybe we could adjust the macro to be a G_STMT with the volatile inside.

That's probably the best solution... any reason not to do that? :)
Comment 12 Tim-Philipp Müller 2009-11-06 17:02:13 UTC
> > Maybe we could adjust the macro to be a G_STMT with the volatile inside.
> 
> That's probably the best solution... any reason not to do that? :)

I didn't know it was possible to do that in a portable way? (afaik this would rely on gcc extensions)

(Also, while it's cleaner API wise it's really a bit ugly under the hood, with it adding 5-20 static booleans per function where just a single one would do just as well in practice...)

FWIW, my plan was to just put the macro into gstprivate.h for now and commit the gst/*pad* bits, which takes care of 90% of the calls and presumably contention already, unless we can make the macro suggestion work in a portable way.