GNOME Bugzilla – Bug 687364
first entry to overview blocks compositor for a moment
Last modified: 2015-04-27 15:57:02 UTC
+++ This bug was initially created as a clone of Bug #687362 +++ I'm using an instrumented GLib with access()-based "trace points" similar to [ACCESS], with gnome-shell under strace - so my numbers are likely to be larger than they would be in real life. The first time the Shell enters the overview, it does not get back to the GLib main loop for a moment (~1.6 seconds, with the instrumentation I mentioned), after performing all this I/O inside dispatch(). During this time, the compositor can't render frames. Main thread: * several X11 (?) round trips * stat() on each icon directory (/usr/share/pixmaps/gnome, /usr/share/icons/gnome, /home/smcv/.icons/hicolor, etc.) * starting several image-loading threads * stat() on index.theme in some of the locations I mentioned * mutex contention with the image-loading threads Image-loading threads: * actually loading the images, obviously * mutex contention with each other and the main thread [ACCESS]: http://people.gnome.org/~federico/news-2006-03.html#timeline-tools
Some possible ways this could be addressed (some of these are probably unworkable, I'm erring on the side of "suggest it so people can reject it"): * Make the application-chooser a separate process with a (transparent?) X11 window, with a special case in compositing so it's only composited into the overview rather than being treated as an application window? (Would this mean that libgnome-menus could live in that process, and not in the main Shell process? I haven't dug into how gnome-menus is used yet.) * stat() things in a secondary thread so these syscalls can't block the main loop? * Put all icon loading on a thread (probably that same secondary thread?) and make it synchronous, rather than using the async GIO functions that end up in a generic thread pool, so there's less lock contention? * Make a loading "wishlist" for the secondary thread and hand it off atomically, so there's only one unlock/lock handshake? * Lazily load application icons in a background thread while the Shell is otherwise idle, so that opening the overview is quicker? (Not necessarily particularly useful: I would expect that typical Shell users would start their login session by opening the overview to look for a document or application anyway, at which point the Shell is waiting for the icons.) * Eagerly load application icons while Shell starts up? (This directly conflicts with the goal of making Shell startup quicker.) Bug #671406 is related: if as much I/O as possible was happening in a background thread or process, the compositor thread could display a spinner or something.
Created attachment 227838 [details] strace log while opening the overview for the first time The format is according to strace -ttt -f: {process (thread) ID} {timestamp} {syscall} = {result} (I should have remembered to use -T as well, I'll do that next time.) Process 1739 is the main thread. "Process" 1768 is the GDBus worker thread. access("START: ..."), access("END: ...") and access("MARK: ...") syscalls are from the GLib instrumentation I added in order to diagnose sources of blocking; they wouldn't happen "in real life". You can see the main thread doing one dispatch() from lines 7 to 14, a second from lines 19 to 32, the one I'm concerned about from lines 37 to 1755, and some less eventful dispatch() calls after that. The backtrace after line 1755 indicates where the offending GSource was added: the GSource itself is gdk_event_source_dispatch(). The ioctls on fd 10 are DRI, presumably. I think fd 6 is X11. fd 8 is the GWakeup for the GDBus thread. fd 7 is certainly GDBus; fds 20, 21, 30, 37 are probably GDBus too. fds 13, 25 are D-Bus in the main thread - probably the session and system buses for dbus-glib.
Created attachment 228129 [details] strace from run 2 Here's another run of essentially the same thing. I had no windows open and did not search for any applications; I did have the extra instrumentation in GLib, and both sysprof and strace attached to gnome-shell. Profiling highlights (timings are non-cumulative): 6.58% additional_selector_matches_style() 5.43% add_matched_properties() 3.43% sel_matches_style_real() 3.43% st_theme_node_get_type()
Created attachment 228130 [details] matching sysprof from run 2
(In reply to comment #0) > The first time the Shell enters the overview, it does not get back to the GLib > main loop for a moment (~1.6 seconds, with the instrumentation I mentioned) With my patches from Bug #687465, this is down to 833ms.
Created attachment 228805 [details] strace with patches from #687465
Created attachment 228806 [details] corresponding sysprof
(In reply to comment #0) > I'm using an instrumented GLib with access()-based "trace points" similar to > [ACCESS], with gnome-shell under strace - so my numbers are likely to be > larger than they would be in real life. There is indeed considerable tracing overhead there. Unfortunately, without the higher-overhead tracing, I can't see what bad things are going on in the main thread... With more restrained instrumentation (the minimum needed to make GLib log long dispatch() calls as a g_warning(), no strace, no sysprof), opening the overview for the first time typically blocks for one long dispatch() and one even longer dispatch(), followed by 1-2 dispatch()es close to the 20ms reporting threshold. Average of 3 runs each, without and with the patches from Bug #687465: Long dispatch: 141ms (std.dev 3ms) down to 61ms (std.dev. 2ms) Longer dispatch: 189ms (std.dev 4ms) down to 156ms (std.dev. 3ms) (In reply to comment #1) > * Put all icon loading on a thread (probably that same secondary thread?) > and make it synchronous, rather than using the async GIO functions > that end up in a generic thread pool, so there's less lock contention? I tried this with the same profiling, and got a mean of 61, 157 ms (across 3 runs) for the first two long dispatches (std.dev: 2, 2 respectively). So it's the same timing as the thread-pool, or possibly fractionally slower, making this only worth doing if it simplifies the implementation (but, equally, not worth avoiding if it makes something else easier).
*** Bug 687849 has been marked as a duplicate of this bug. ***
*** Bug 693446 has been marked as a duplicate of this bug. ***
*** Bug 691789 has been marked as a duplicate of this bug. ***
i otherwise think this was a regression in 3.6 compared to 3.4. i never saw that in 3.4 and i see it at every single login in 3.6.
This problem is much less pronounced, to the point that i can say it has disappeared for me in 3.8, on the same two computers on which I had a big problem in 3.6.
Simon, are you still seeing this in 3.10?
Sorry, I'm still on Debian's Shell 3.8.4-5, and not actively working on this issue. To reproduce: hard-disk based laptop, Alt+F2, "r", wait for hard disk activity to cease, press Windows key. On 3.8, first entry to the overview does still seem to pause frame output for ~ half a second - I don't see the animated transition from normal window positions to their overview positions, just a pause in output followed by the windows abruptly "teleporting". The second and subsequent entries to the overview are noticeably smoother, with enough frame rate for the animated transition to look good. Has something specific been done to improve this between 3.8 and 3.10, or are you hoping that it has improved "accidentally"?
(In reply to comment #15) > Has something specific been done to improve this between 3.8 and 3.10, or are > you hoping that it has improved "accidentally"? https://bugzilla.gnome.org/show_bug.cgi?id=712727 https://bugzilla.gnome.org/show_bug.cgi?id=712753
Successive Shell versions generally get better, and I am not actively working on improving this. There is probably still room for improvement (the test I described on Comment #15 still doesn't animate smoothly the first time in Debian's Shell 3.14), but this bug doesn't have a whole lot of value without someone actively working on tracking down individual causes in a more current version; resolving OBSOLETE on that basis.