GNOME Bugzilla – Bug 588139
Support notifications when main loop callbacks take too long
Last modified: 2013-11-23 20:45:37 UTC
I did quite a neat hack yesterday that I think is so useful that it should somehow add up supported in glib. The motivation for this patch was work on GtkEntryCompletion. When typing text into the GtkEntry, I wanted to see the completion update while typing and not delayed, but no matter how well I optimized the code and removed hotspots as shown by sysprof, entering text still caused noticable stuttering. So I wondered if maybe some main loop callbacks can take a long time and delay other callbacks for too long. This is where the following hack came in:
Created attachment 138105 [details] [review] Add a hack to print time elapsed in main loop callbacks This is a very useful hack to figure out which functions are responsible for a "laggy", "unresponsive" or "stuttering". It prints lines like 590ms priority 110 gtk_container_idle_sizer() [gtkcontainer.c:1332] which takes you that the callback gtk_container_idle_sizer() from file gtkcontainer.c line 1332 took 590ms to execute. Everything that takes more than 50ms is questionable, everything that takes longer than 100ms is noticable provided the user interacts with the application (like when typing or dragging with the mouse). To find offenders, compile glib with this patch applied, redirect the output of the app to a file and sort -n it.
The above patch works very well for what I've done, buthas a few things that are not so nice: - There is no good way to get a good textual representation for GSource emissions. This is not trivial, because a lot of them are indirect and the stuff glib knows about them is pretty useless. Important examples here are the gdk event dispatching source (which should likely print the GdkEvent it's emitting), the indirection done by gdk_threads_add_(idle|timeout) or async completion callbacks. I would suggest having API like this: typedef char * (* GSourceDescribeFunc) (GSource *); void g_source_set_describe_func (GSource *, GSourceDescribeFunc); char *g_source_describe (GSource *); and making use of this in gtk and gio. - I need a good way to lookup function names. The patch above makes use of the lookup_symbol() function from cairo, but it would be better to have it inside glib, in particular with the above API suggestions. See also bug 588141 - There needs to be a way to activate the output of this as it can be quite verbose. It's probably best to use G_DEBUG=callback_times? - It would be nice if there was a way to set the time in ms from which the printout triggers. Currently it's hardcoded to only print stuff that takes longer than 1ms. This is quite noisy but that noise is occasionally useful.
I'd previously written a GTK+ module which inserted itself as an idle handler and checked that the time between invocations wasn't more than a few ms, so I'd love to see something like this land in glib.
> - There needs to be a way to activate the output of this as it can be quite > verbose. It's probably best to use G_DEBUG=callback_times? Yeah, I think we want to have this conditionally compiled in and triggered by some debug flag as you propose. > - It would be nice if there was a way to set the time in ms from which the > printout triggers. Currently it's hardcoded to only print stuff that takes > longer than 1ms. This is quite noisy but that noise is occasionally useful. Using an env var for that seems perfectly fine, I think.
Benjamin, did you consider adding G_DEBUG support to your patch?
I think I decided at some point that the proper fix would be to provide this functionality with gdb and/or systemtap because that's a lot more useful than just printing some random string to stderr. And then I stopped thinking about this. So no, I didn't think about that any further.
I stuck a g_source_set_name to bug 606044.
*** Bug 688246 has been marked as a duplicate of this bug. ***
Created attachment 228982 [details] [review] Optionally emit warnings whenever the main loop is blocked in dispatch() This is intended to be a minimal-overhead version of this functionality: there are a couple of extra calls to g_get_monotonic_time(), but there are no extra locks, there's no code to look up symbols or do backtraces, it only looks at the default main-context (the one that would be returned by g_main_context_default()), and there's no I/O unless a dispatch() has already taken too long. The idea is that this can be used to verify whether attempts to reduce main-thread blocking have worked, without the profiling overhead perturbing the timings itself! --- This doesn't catch a long check() or prepare() either; that could be added. It also isn't configurable at runtime. To meet the goal of not affecting timings too much itself, the threshold would have to be set up early (glib_init()?) and stored in a global variable or something.
Created attachment 228983 [details] [review] better version of my previous patch, using GSources' names
*** Bug 708178 has been marked as a duplicate of this bug. ***
Created attachment 255063 [details] [review] gmain: add a time budget warning mechanism Adding my contribution to the bug :) I wanted to investigate where GnomeShell is blocking during its animations. Obviously just printing the name of the symbol called by gmain.c is not enough, because you could be blocking deep down the call stack. This patch uses unix timers to trigger a signal when a given allocated time budget has been spent and saves the backtrace when the signal is fired. The backtrace is then printed on the console after the dispatch is done. This patch also takes care of pausing the unix timer whenever you run a mainloop within a mainloop with the same context. This avoids printing backtraces if you're running a DBus call or GtkDialog with inner mainloop.
Review of attachment 255063 [details] [review]: Maybe use a single #ifndef G_OS_WIN32 and define a bunch of static functions + macros, on the else of the #ifndef define empty macros and place the macros invocations in g_main_dispatch() so that this part is not cluttered with #ifdefs ::: glib/gmain.c @@ +3049,3 @@ + ts->tv_nsec = (ms - (ts->tv_sec * 1000)) * 1000 * 1000; +} + needs an #ifndef G_OS_WIN32 @@ +3080,3 @@ GSList current_source_link; + struct itimerspec its; + GSource *saved_source = NULL; needs an #ifndef G_OS_WIN32 @@ +5177,3 @@ + + free (symbols); + I wish we had a nicer backtrace impl in glib :/
See bug 710741 for a slightly different take on this
I've committed the trace points now, so you can use systemtap to find out how long your callbacks take.