GNOME Bugzilla – Bug 744458
Diagnostic warning if sync GIO calls take too long
Last modified: 2016-06-16 20:38:03 UTC
As discussed on desktop-devel-list (DDL), perhaps we could add a diagnostic warning to GIO sync calls, which would be emitted:
1. if G_ENABLE_DIAGNOSTIC is set,
2. for each GIO sync call (e.g. g_input_stream_read()) made from the main thread,
3. if that call blocks the main thread for too long.
What counts as ‘too long’? More than one GTK+ frame (about 17 ms at 60 Hz)?
The idea here is to help track down sync I/O operations which are blocking the main thread. There is also the possibility (still being discussed on DDL) of modifying the GIO docs to discourage people from writing code that way in the first place, but that’s a separate issue.
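A minimal sketch of the check proposed above, not GIO code: it times one blocking call and prints a warning if the call exceeds a threshold. The 17 ms threshold is the figure floated above; `run_with_block_warning()`, `monotonic_us()` and `sleep_ms()` are hypothetical names, and plain POSIX `clock_gettime()` stands in for whatever clock GLib would actually use. The checks for the diagnostics environment variable and for being on the main thread are left out.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Assumed threshold: one frame at 60 Hz, rounded up to 17 ms. */
#define BLOCK_WARN_THRESHOLD_US 17000

typedef void (*BlockingFunc) (void *user_data);

/* Monotonic time in microseconds (stand-in for a GLib clock). */
static int64_t
monotonic_us (void)
{
  struct timespec ts;

  clock_gettime (CLOCK_MONOTONIC, &ts);
  return (int64_t) ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}

/* Runs fn (user_data), warns on stderr if it took longer than the
 * threshold, and returns the elapsed time in microseconds. */
static int64_t
run_with_block_warning (const char *name, BlockingFunc fn, void *user_data)
{
  int64_t start, elapsed;

  assert (fn != NULL);

  start = monotonic_us ();
  fn (user_data);
  elapsed = monotonic_us () - start;

  if (elapsed > BLOCK_WARN_THRESHOLD_US)
    fprintf (stderr, "warning: %s blocked for %lld us (threshold %d us)\n",
             name, (long long) elapsed, BLOCK_WARN_THRESHOLD_US);

  return elapsed;
}

/* Hypothetical slow operation: sleeps for the given number of
 * milliseconds, standing in for a blocking read. */
static void
sleep_ms (void *user_data)
{
  long ms = *(long *) user_data;
  struct timespec req = { ms / 1000, (ms % 1000) * 1000000L };

  nanosleep (&req, NULL);
}
```

Calling `run_with_block_warning ("read", sleep_ms, &ms)` with `ms` set to 30 would print the warning; a real implementation would presumably route it through the GLib logging functions instead of `stderr`.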
Colin also suggested limiting the warnings to those emitted after the first GMainContext iteration.
Not sure we can limit it to after the first GTK+ paint, since that would require some kind of feedback path from GTK+ to GLib/GIO.
There's no reason this should be specific to GIO... you don't want any other sync calls blocking your main loop either.
It also shouldn't be specific to gtk; you don't want clutter-based apps to block either. I think in general, if you are using a GMainContext, then you shouldn't be blocking it.
I think someone wrote a patch to GMainContext to do this once before, but I can't find it now...
Benjamin had one at some point, I believe.
FWIW, I never managed to use SystemTap with jhbuild-built libraries.
(In reply to Dan Winship from comment #2)
> There's no reason this should be specific to GIO... you don't want any other
> sync calls blocking your main loop either.
> It also shouldn't be specific to gtk; you don't want clutter-based apps to
> block either. I think in general, if you are using a GMainContext, then you
> shouldn't be blocking it.
> I think someone wrote a patch to GMainContext to do this once before, but I
> can't find it now...
The approach of warning when a specific GSource takes too long is alright, but I suspect we would end up seeing a lot of anonymous GIdleSources in there from GTask’s complete_in_idle_cb().
I think there’s an important difference between instrumenting the GMainContext and instrumenting the *_sync() functions themselves. With the latter, it should be a lot easier to find the call site which is causing the problem; whereas with the former all you see is that some sync function is being called somewhere in a stack below a standard GSource dispatch.
I could be overestimating how deeply people nest their sync calls below other function calls, and the two approaches could actually be equally useful, but I think they are fundamentally different.
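To illustrate the difference in what each approach can report, here is a rough sketch. All the names are hypothetical, and the macro assumes the instrumentation can be expanded at the caller, which a check inside the real *_sync() functions could not do directly (there you would rely on a backtrace from the warning instead).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record of the most recent slow operation. */
typedef struct {
  const char *what;  /* source name or function name */
  const char *file;  /* NULL if the call site is unknown */
  int line;
} SlowReport;

static SlowReport last_report;

static void
report_slow (const char *what, const char *file, int line)
{
  last_report.what = what;
  last_report.file = file;
  last_report.line = line;

  if (file != NULL)
    fprintf (stderr, "slow: %s at %s:%d\n", what, file, line);
  else
    fprintf (stderr, "slow: %s (call site unknown)\n", what);
}

/* Main-context-level instrumentation: only the dispatched source is known. */
static void
report_slow_dispatch (const char *source_name)
{
  report_slow (source_name, NULL, 0);
}

/* Call-level instrumentation: the report names the calling file and line. */
#define REPORT_SLOW_SYNC_CALL(func_name) \
  report_slow ((func_name), __FILE__, __LINE__)
```

With the first form, a slow read hidden behind an idle callback shows up only under the idle source's name; with the second, the report points at the line which made the call.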
As discussed in various places, this is best handled using SystemTap. We now have a fairly complete set of SystemTap probes for the main context and GSources (bug #759813). Dunfell (https://github.com/pwithnall/dunfell) can visualise the main context, and should make main context dispatches which take a long time stand out clearly. Given backtrace data from SystemTap (which Dunfell doesn't currently capture or present, but should), this should make debugging long *_sync() calls fairly straightforward, just by finding the main context dispatch which they block.