Bug 747166 - Feature request: error on high GUI latency / GUI hang (like Android's ANR)
Status: RESOLVED OBSOLETE
Product: gtk+
Classification: Platform
Component: Debugging
Version: 3.16.x
Hardware: Other Linux
Importance: Normal enhancement
Target Milestone: ---
Assigned To: gtk-bugs
QA Contact: gtk-bugs
Depends on:
Blocks:
 
 
Reported: 2015-04-01 10:23 UTC by Chris Bainbridge
Modified: 2018-04-15 00:08 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Chris Bainbridge 2015-04-01 10:23:36 UTC
One of the most useful features of Android from a GUI latency perspective is the "Application Not Responding" (ANR) dialog, which is opened when the period between calls to the GUI event processing thread exceeds some timeout, i.e. a hung GUI is treated as an error.

http://developer.android.com/training/articles/perf-anr.html

This helps to catch errors where blocking calls are made on the GUI thread and the application becomes unresponsive.

Example where this would have caught a serious error earlier: https://bugzilla.gnome.org/show_bug.cgi?id=737022
Comment 1 Emmanuele Bassi (:ebassi) 2015-04-01 19:21:40 UTC
We also have bug 588139 about sources in the main loop blocking, and getting notification for those cases.

It's unlikely that the toolkit will be able to spawn a dialog for that on its own: if the main loop is blocked then control won't re-enter into the toolkit, and we cannot use a separate thread to act as a watchdog without going into thread safety issues, with two threads trying to access GDK resources at the same time.

The window manager/display server is in a better position for that, but right now the specification of the ping protocol is not exactly stellar, with WMs updating timestamps only in specific cases, instead of intercepting all pointer/touch events and checking if the application is responsive via ping.
Comment 2 Matthias Clasen 2015-04-02 12:20:58 UTC
I think it would be interesting to show information like missed frames or blocking times in the inspector, but that would be after-the-fact.
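
As a rough illustration of the after-the-fact reporting idea, here is a minimal sketch (plain Python, not GTK or inspector code) that times each callback dispatched by a toy loop and keeps the ones that ran longer than a threshold. The names `run_and_time`, `quick` and `blocking`, and the 16 ms threshold (roughly one frame at 60 Hz), are assumptions made for the example.

```python
import time

def run_and_time(callbacks, threshold_s=0.016):
    """Run each callback once; return (name, seconds) for those over threshold_s."""
    slow = []
    for cb in callbacks:
        start = time.monotonic()
        cb()
        elapsed = time.monotonic() - start
        if elapsed > threshold_s:
            slow.append((cb.__name__, elapsed))
    return slow

def quick():
    pass  # a well-behaved handler that returns immediately

def blocking():
    time.sleep(0.05)  # stands in for a blocking call made on the GUI thread

report = run_and_time([quick, blocking, quick])
for name, seconds in report:
    print(f"{name} blocked the loop for {seconds * 1000:.1f} ms")
```

The report only becomes available once the slow callback has returned, which is the "after-the-fact" limitation mentioned above.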
Comment 3 Chris Bainbridge 2015-04-06 16:24:01 UTC
> It's unlikely that the toolkit will be able to spawn a dialog for that on its
> own: if the main loop is blocked then control won't re-enter into the toolkit,
> and we cannot use a separate thread to act as a watchdog without going into
> thread safety issues, with two threads trying to access GDK resources at the
> same time.

It might be possible to use a timer to interrupt the UI thread or start a separate process? Although that might interfere with any timers that the rest of the program has set.

> The window manager/display server is in a better position for that, but right
> now the specification of the ping protocol is not exactly stellar, with WMs
> updating timestamps only in specific cases, instead of intercepting all
> pointer/touch events and checking if the application is responsive via ping.

That is how Chrome OS does it: _NET_WM_PING with a 20 second timeout, then send SIGABRT, which starts the crash reporter:
https://chromium.googlesource.com/chromiumos/platform/window_manager/+/a6531b90974718192287c292e64a73b5bcd60629/chrome_watchdog.cc
https://code.google.com/p/chromium/issues/detail?id=188057
Comment 4 Chris Bainbridge 2015-04-06 16:32:55 UTC
> I think it would be interesting to show information like missed frames or
> blocking times in the inspector, but that would be after-the-fact.

The Chrome WM reports a histogram showing the last 60 seconds of ping response times before it aborts the hung process. That could be useful for developers.
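
To make the histogram idea concrete, here is a minimal sketch of a rolling record of ping response times. It is not taken from the Chrome OS code: the class name `PingHistory`, the 100 ms bucket width and the sample values are all assumptions for the example; only the 60 second window comes from the description above. Timestamps are passed in explicitly so the behaviour is easy to follow.

```python
from collections import Counter, deque

class PingHistory:
    """Keep ping response times from the last window_s seconds (hypothetical helper)."""

    def __init__(self, window_s=60.0, bucket_ms=100):
        self.window_s = window_s
        self.bucket_ms = bucket_ms
        self.samples = deque()  # (timestamp_s, response_ms), oldest first

    def record(self, now_s, response_ms):
        self.samples.append((now_s, response_ms))
        # Drop samples that have fallen out of the window.
        while self.samples and self.samples[0][0] < now_s - self.window_s:
            self.samples.popleft()

    def histogram(self):
        """Return {bucket_start_ms: count}, ordered by bucket."""
        counts = Counter()
        for _, ms in self.samples:
            counts[int(ms // self.bucket_ms) * self.bucket_ms] += 1
        return dict(sorted(counts.items()))

history = PingHistory()
history.record(0.0, 12)     # expires once the clock passes 60 s
history.record(30.0, 40)
history.record(61.0, 250)
history.record(62.0, 1900)  # the kind of slow reply seen just before a hang
hist = history.histogram()
print(hist)
```

A watchdog could dump `hist` alongside the crash report so a developer can see whether the process degraded gradually or stopped responding all at once.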
Comment 5 Emmanuele Bassi (:ebassi) 2015-04-06 20:05:45 UTC
(In reply to Chris Bainbridge from comment #3)
> > It's unlikely that the toolkit will be able to spawn a dialog for that on its
> > own: if the main loop is blocked then control won't re-enter into the toolkit,
> > and we cannot use a separate thread to act as a watchdog without going into
> > thread safety issues, with two threads trying to access GDK resources at the
> > same time.
> 
> It might be possible to use a timer to interrupt the UI thread or start a
> separate process? Although that might interfere with any timers that the
> rest of the program has set.

No, it's not possible. GTK does not work like the Android UI toolkit, with different threads running different bits of the toolkit: the GTK main loop is running everything inside the same main thread that called gtk_main(). If something is blocking the main loop, another thread may occasionally check in, but it can't access resources on the main thread, and if it tried to interrupt the main thread it would terminate the application as well.

> > The window manager/display server is in a better position for that, but right
> > now the specification of the ping protocol is not exactly stellar, with WMs
> > updating timestamps only in specific cases, instead of intercepting all
> > pointer/touch events and checking if the application is responsive via ping.
> 
> That is how Chrome OS does it. _NET_WM_PING with a 20 second timeout, send
> SIGABRT, which starts the crash reporter:
> https://chromium.googlesource.com/chromiumos/platform/window_manager/+/a6531b90974718192287c292e64a73b5bcd60629/chrome_watchdog.cc
> https://code.google.com/p/chromium/issues/detail?id=188057

That's pretty much how it works on GNOME and other environments. The current common strategy is to only send a _NET_WM_PING if the user is trying to interact with the frame holding the server side decoration, or when the key focus is switched to the application's window, but with the advent of client side decorations and non-reparenting compositors the _NET_WM_PING should be sent every time the user interacts with the window itself, since the WM knows when that happens. Wayland compositors have a similar 'ping' protocol.
Comment 6 Chris Bainbridge 2015-04-06 21:08:29 UTC
(In reply to Emmanuele Bassi (:ebassi) from comment #5)

> No, it's not possible. GTK does not work like the Android UI toolkit, with
> different threads running different bits of the toolkit: the GTK main loop
> is running everything inside the same main thread that called gtk_main(). If
> something is blocking the main loop, another thread may occasionally check
> in, but it can't access resources on the main thread, and if it tried to
> interrupt the main thread it would terminate the application as well.

Sorry, I meant use a POSIX timer to interrupt the process? It shouldn't require any extra threads. The process is not actually uninterruptible, it's just busy doing something else, because the programmer forgot not to do long operations on the UI thread.
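
A minimal sketch of what I mean, in Python rather than C and without GTK: the loop re-arms a one-shot interval timer before each dispatch, so SIGALRM is only delivered if a callback runs past the timeout. The names `run_loop`, `on_alarm` and `hangs` and the 50 ms timeout are assumptions for the example. It uses ITIMER_REAL, of which there is only one per process, so it would clash with any other code using that timer. In C the handler would also be limited to async-signal-safe calls.

```python
import signal
import time

hangs = []  # monotonic timestamps at which the timer expired mid-callback

def on_alarm(signum, frame):
    # Runs in the main thread when a callback outlived the timeout.
    hangs.append(time.monotonic())

def run_loop(callbacks, timeout_s):
    previous = signal.signal(signal.SIGALRM, on_alarm)
    try:
        for cb in callbacks:
            # Re-arm the one-shot timer before each dispatch.
            signal.setitimer(signal.ITIMER_REAL, timeout_s)
            cb()
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # disarm
        signal.signal(signal.SIGALRM, previous)  # restore the old handler

def quick():
    pass

def blocking():
    time.sleep(0.2)  # stands in for a long operation on the UI thread

run_loop([quick, blocking, quick], timeout_s=0.05)
print(f"detected {len(hangs)} hang(s)")
```

Here the handler only records the event; a real implementation would have to decide what can safely be done from a signal handler, for example aborting so that a crash reporter picks it up.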

> That's pretty much how it works on GNOME and other environments. The current
> common strategy is to only send a _NET_WM_PING if the user is trying to
> interact with the frame holding the server side decoration, or when the key
> focus is switched to the application's window, but with the advent of client
> side decorations and non-reparenting compositors the _NET_WM_PING should be
> sent every time the user interacts with the window itself, since the WM
> knows when that happens. Wayland compositors have a similar 'ping' protocol.

So it sounds like that might be the best way to proceed, if people think it would be useful to have something like this. The implementation in Chrome OS looks simple: there is no monitoring of window interactions, just a periodic callback which sends a ping and sets a timeout. Even that would be enough to catch an error like the one in gparted.
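
The periodic ping with a timeout could be sketched like this. It is a toy model, not the Chrome OS or any window manager implementation: two queues stand in for the _NET_WM_PING messages and their replies, a thread stands in for the client, and the names `client`, `watchdog` and all the timing values are assumptions for the example.

```python
import queue
import threading
import time

def client(pings, pongs, stall_after):
    """Answer pings; after stall_after replies, block like a hung main loop."""
    answered = 0
    while True:
        serial = pings.get()
        if answered >= stall_after:
            time.sleep(0.5)  # hypothetical blocking call on the UI thread
        pongs.put(serial)
        answered += 1

def watchdog(pings, pongs, interval_s, timeout_s, max_pings):
    """Ping periodically; return the serial of the first unanswered ping, or None."""
    for serial in range(max_pings):
        pings.put(serial)
        try:
            pongs.get(timeout=timeout_s)
        except queue.Empty:
            return serial  # no reply within the timeout: treat as hung
        time.sleep(interval_s)
    return None

pings, pongs = queue.Queue(), queue.Queue()
threading.Thread(target=client, args=(pings, pongs, 2), daemon=True).start()
hung_at = watchdog(pings, pongs, interval_s=0.01, timeout_s=0.1, max_pings=5)
print(f"client stopped responding at ping {hung_at}")
```

What happens on timeout is left out; in the Chrome OS case described in comment 3 the window manager sends SIGABRT so that the crash reporter runs.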
Comment 7 Matthias Clasen 2018-02-10 05:16:16 UTC
We're moving to gitlab! As part of this move, we are moving bugs to NEEDINFO if they haven't seen activity in more than a year. If this issue is still important to you and still relevant with GTK+ 3.22 or master, please reopen it and we will migrate it to gitlab.
Comment 8 Matthias Clasen 2018-04-15 00:08:29 UTC
As announced a while ago, we are migrating to gitlab, and bugs that haven't seen activity in the last year or so will not be migrated, but closed out in bugzilla.

If this bug is still relevant to you, you can open a new issue describing the symptoms and how to reproduce it with gtk 3.22.x or master in gitlab:

https://gitlab.gnome.org/GNOME/gtk/issues/new