After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 737009 - signal handler lookup doesn't scale
signal handler lookup doesn't scale
Status: RESOLVED FIXED
Product: glib
Classification: Platform
Component: gobject
unspecified
Other Linux
: Normal normal
: ---
Assigned To: gtkdev
gtkdev
Depends on:
Blocks: 732290
 
 
Reported: 2014-09-20 03:34 UTC by Matthias Clasen
Modified: 2015-05-27 18:41 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Add a performance test for signal connection (7.33 KB, patch)
2014-09-20 04:46 UTC, Matthias Clasen
none Details | Review
Add a global signal handler table (4.71 KB, patch)
2014-09-20 05:09 UTC, Matthias Clasen
none Details | Review
Add a performance test for signal connection (8.28 KB, patch)
2014-09-27 20:16 UTC, Matthias Clasen
committed Details | Review
Add a global signal handler table (6.04 KB, patch)
2014-09-27 20:17 UTC, Matthias Clasen
committed Details | Review

Description Matthias Clasen 2014-09-20 03:34:13 UTC
GTK+ has a singleton object (the screens GtkStyleCascade) with a signal (-gtk-private-changed) that every widget's style context connects to. For reasonably-sized applications, that can be several thousand handlers (I've seen 25000 handlers in polari). And although g_signal_handler_disconnect() takes a handler ID, it is not O(1), but iterates a list. In the case of polari, this occasionally causes the application to freeze for multiple seconds, as widgets get destroyed, causing their style contexts to disconnect from the cascade signal.
Comment 1 Allison Karlitskaya (desrt) 2014-09-20 03:49:51 UTC
best first idea: return the Handler* as the signal id

problem: gulong on 64bit windows is smaller than a void*

also: probably some people are using uint to store handler IDs (because they didn't read the docs).


next idea: make handler IDs globally unique 32bit integer (on the assumption that there is not likely to be more than 4bil handler per process)

we could then have one big global hash table to allow O(1) lookup of signal handlers by ID which would keep overhead low.
Comment 2 Matthias Clasen 2014-09-20 04:46:20 UTC
Created attachment 286676 [details] [review]
Add a performance test for signal connection

This test checks the performance of connecting and disconnecting
many handlers. Various cases are checked: disconnect in the same
order, in the inverse order, at random. Connect to one signal on
a single object, to two signals on the same object, or to the same
signal on two different objects.
Comment 3 Matthias Clasen 2014-09-20 04:47:11 UTC
Sample output:

[mclasen@localhost tests]$ ./signal-handler -m perf --verbose
GTest: random seed: R02S979ba26b25918140580df96472d965dd
GTest: run: /signal/handler/connect-many
(MINPERF:connected 50000 handlers in  0.061 seconds)
GTest: result: OK
GTest: run: /signal/handler/disconnect-many-ordered
(MINPERF:disconnected 50000 handlers in  0.015 seconds)
GTest: result: OK
GTest: run: /signal/handler/disconnect-many-inverse
(MINPERF:disconnected 50000 handlers in  6.134 seconds)
GTest: result: OK
GTest: run: /signal/handler/disconnect-many-random
(MINPERF:disconnected 50000 handlers in  4.263 seconds)
GTest: result: OK
GTest: run: /signal/handler/disconnect-2-signals
(MINPERF:disconnected 50000 handlers in  8.975 seconds)
GTest: result: OK
GTest: run: /signal/handler/disconnect-2-objects
(MINPERF:disconnected 50000 handlers in  4.378 seconds)
GTest: result: OK
Comment 4 Matthias Clasen 2014-09-20 05:09:38 UTC
Created attachment 286679 [details] [review]
Add a global signal handler table

This is a quick-and-dirty patch to add a global table for
signal handlers.
Comment 5 Matthias Clasen 2014-09-20 05:10:39 UTC
Test output with this patch:

[mclasen@localhost tests]$ ./signal-handler -m perf --verbose
GTest: random seed: R02Sf08eaafaf5fd398e90aa0e485bfa56ee
GTest: run: /signal/handler/connect-many
(MINPERF:connected 50000 handlers in  0.073 seconds)
GTest: result: OK
GTest: run: /signal/handler/disconnect-many-ordered
(MINPERF:disconnected 50000 handlers in  0.018 seconds)
GTest: result: OK
GTest: run: /signal/handler/disconnect-many-inverse
(MINPERF:disconnected 50000 handlers in  0.017 seconds)
GTest: result: OK
GTest: run: /signal/handler/disconnect-many-random
(MINPERF:disconnected 50000 handlers in  0.030 seconds)
GTest: result: OK
GTest: run: /signal/handler/disconnect-2-signals
(MINPERF:disconnected 50000 handlers in  0.030 seconds)
GTest: result: OK
GTest: run: /signal/handler/disconnect-2-objects
(MINPERF:disconnected 50000 handlers in  0.030 seconds)
GTest: result: OK
Comment 6 Matthias Clasen 2014-09-27 20:16:54 UTC
Created attachment 287261 [details] [review]
Add a performance test for signal connection
Comment 7 Matthias Clasen 2014-09-27 20:17:36 UTC
Created attachment 287262 [details] [review]
Add a global signal handler table
Comment 8 Matthias Clasen 2014-09-27 20:19:35 UTC
Here is another iteration of these patches. Changes:
1) Add a test for blocking/unblocking signal handlers
2) Use a hash set to safe some space
3) Use the hash in lookup_handler(), so more places benefit from the constant time lookup
Comment 9 Matthias Clasen 2014-09-28 14:15:43 UTC
One thing to point out: at least on 64bit, storing the signal id in the Handler struct does not require extra space - there was a 4-byte hole in the struct anyway.
Comment 10 Matthias Clasen 2014-09-29 21:12:59 UTC
just to give some idea of the size of the global table, here's some numbers for launching a few apps:

bloatpad:               780 signal handlers
gedit:                ~4000 signal handlers
gtk3-widget-factory:  ~8000 signal handlers
gimp:                ~46000 signal handlers
Comment 11 Allison Karlitskaya (desrt) 2014-11-21 04:05:38 UTC
Review of attachment 287262 [details] [review]:

After some reading and trying to come up with better solutions, I think this is probably more or less the best that we can do with the current API.

Two other possibilities to consider:

1) turn all Handler structs into a gigantic array and use array position as handler ID, using a random approach to find empty slots and growing the array if we fail after a certain number of tries.  We could save a fair bit of memory by doing this by replacing the various linked list pointers with integer offsets and removing the sequence number field (since it would be apparent from array position).  This approach has two main downfalls.  The first is that we reissue handler IDs (which may cause trouble with existing code that tries to remove handlers twice) and the second is that we risk wasting a lot of memory in the case that the user connects a million handlers and then disconnects all but the last one.

2) use the handler struct pointer directly as its ID.  This wouldn't work on W64 because long is smaller than void* there and we would still need a table to validate incoming "ids" for actually existing.  We also will end up reissuing ids sometimes.



A better API might be to introduce several signal connect flags allowing the user to specify their intent with respect to disconnection:

 - disconnect by id

 - disconnect by user_data

 - disconnect by func

and to only insert handlers into fast-lookup tables depending on the user's preferred method.  In this way we could also make other forms of disconnection very fast without wasting memory in the case that they will not be used.  In the event that the user gives no flags then we don't add any fast-lookup entries.
Comment 12 Matthias Clasen 2015-05-27 18:41:38 UTC
Attachment 287261 [details] pushed as 8a97dc5 - Add a performance test for signal connection
Attachment 287262 [details] pushed as 916297b - Add a global signal handler table