GNOME Bugzilla – Bug 684639
GtkSpinner animation eats too much CPU
Last modified: 2018-05-02 15:32:00 UTC
When restoring an epiphany session with GTK+ 3.5.18, 100% CPU is used, and pages never actually finish loading.
Downgrading to GTK+ 3.5.16 (the previous built version for Fedora) seems to fix the problem.
Can you find out where it's spinning?
X is the one using 100% CPU, and the backtraces look like:
Some additional information:
GDK_RENDERING=image fixes it, so our animations are hitting a GPU driver issue.
The GPU in question is Intel and it seems to be F18 only, my F17 is fine.
I've heard people claiming F18 shipping debug kernels might contribute, but I lack enough clue about the graphics stack to have any guesses in that area.
Problem seems fixed by using a non-debug kernel.
I don't think this is fixed as is; we still have no rate limiting in the client when rendering is expensive. I.e. the client will keep pushing new redraw commands every 40 msec or whatever, potentially building up such a large queue of drawing ops in the X server that it will not be able to "catch up" with the client. This essentially locks up X, as not even the WM will be able to do anything.
The correct approach is imho to XSync after drawing so that we never animate until the current frame has been drawn. This way we avoid all kinds of issues similar to this particular Intel rendering issue.
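The queue build-up described above can be illustrated with a toy model. This is only a sketch: `simulate` and its parameters are hypothetical names, the 40 msec interval comes from the comment above, and the 100 msec render cost is an assumed figure, not a measurement. It involves no real X11 calls; the `sync_after_draw` flag merely stands in for blocking on XSync after each frame.

```python
def simulate(frames, frame_interval_ms, render_cost_ms, sync_after_draw):
    """Return the peak number of frames queued in the (modelled) X server."""
    client_time = 0.0     # time at which the client submits the next frame
    server_free_at = 0.0  # time at which the server has drained its queue
    pending = []          # completion times of frames not yet rendered
    peak = 0
    for _ in range(frames):
        # Frames the server has finished by now leave the queue.
        pending = [t for t in pending if t > client_time]
        # The server renders frames strictly one after another.
        start = max(client_time, server_free_at)
        server_free_at = start + render_cost_ms
        pending.append(server_free_at)
        peak = max(peak, len(pending))
        if sync_after_draw:
            # XSync-like behaviour: wait until this frame has been processed.
            client_time = max(client_time + frame_interval_ms, server_free_at)
        else:
            # No rate limiting: keep animating on the timer regardless.
            client_time += frame_interval_ms
    return peak


if __name__ == "__main__":
    print("peak queue without sync:", simulate(100, 40, 100, False))
    print("peak queue with sync:   ", simulate(100, 40, 100, True))
```

In this model the unsynchronized client lets the queue grow for as long as the animation runs, while the synchronized client never has more than one frame outstanding, at the cost of a lower frame rate.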
Created attachment 225473 [details] [review]
Make process_all_updates draw synchronously
By calling XSync in _gdk_x11_display_after_process_all_updates we
effectively make gdk rendering sync, which avoids problems with the
client animations running faster than the Xserver rendering, thus
filling up the X rendering pipes and essentially "locking up" the
Xserver (i.e. you can't even close the offending window because the
WM is starved too).
I verified this worked by making GtkSpinner paint multiple times on my
Intel driver (which has some issue making this rendering slow atm),
and without this patch I get severe lag where even window dragging
stops for 5 seconds when I drag the mouse around. However, with the
patch everything is smooth.
CCing owen for a quick check.
XSync under normal circumstances will not flush the GPU render queue (it is likely to have that side-effect under a compositor though, alas). The round-trip will serialise with the slow fallbacks that the driver would appear to be hitting. The question is whether you really want to work around a driver bug by impacting all users of your toolkit.
If you want to perform some ratelimiting, insert an event in _gdk_x11_display_after_process_all_updates() and block expose events until it is returned (or better until the previous one is returned). (To insert such events into the command stream without using xcb, create a dummy window and use SendEvent and hookup a custom callback to increment the seqno.)
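The bookkeeping behind that suggestion can be sketched as follows. This is a hedged illustration, not GDK code: `FrameThrottle` and its methods are hypothetical names, and the actual marker event (e.g. SendEvent on a dummy window) is left out. `max_in_flight=1` corresponds to blocking until the marker for the current frame is returned, `max_in_flight=2` to the "until the previous one is returned" variant.

```python
class FrameThrottle:
    """Track marker events so that only a bounded number of frames are in flight."""

    def __init__(self, max_in_flight=1):
        self.max_in_flight = max_in_flight
        self.next_serial = 0   # serial to attach to the next frame's marker
        self.last_acked = -1   # highest marker serial that has come back

    def can_draw(self):
        # Frames submitted so far minus frames whose marker has returned.
        in_flight = self.next_serial - (self.last_acked + 1)
        return in_flight < self.max_in_flight

    def frame_submitted(self):
        # Call after all drawing for a frame; returns the marker's serial.
        serial = self.next_serial
        self.next_serial += 1
        return serial

    def marker_returned(self, serial):
        # Call when the marker event arrives back from the server.
        self.last_acked = max(self.last_acked, serial)


if __name__ == "__main__":
    throttle = FrameThrottle(max_in_flight=1)
    serial = throttle.frame_submitted()
    print("may draw before marker returns:", throttle.can_draw())
    throttle.marker_returned(serial)
    print("may draw after marker returns: ", throttle.can_draw())
```

Unlike a blocking XSync, the client stays free to handle other events while `can_draw()` is false; only expose processing is held back.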
We already have _gdk_x11_roundtrip_async() to do a roundtrip. What's the practical difference though? We'd block less on high latency links, but even with an async version the "visual" latency would be essentially the same, would it not? The only difference would be if rendering is slow *and* calculating the next frame is also slow. Then doing the XSync() async will parallelize the two.
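A back-of-the-envelope version of that argument, as a sketch only: `frame_period_ms` is a hypothetical helper, the timings are made-up inputs, and round-trip latency is ignored. With a blocking sync the client computes the next frame only after the server has finished, so the costs add up; with an async round trip the two overlap, so the slower one dominates.

```python
def frame_period_ms(render_ms, compute_ms, overlap):
    """Approximate time per frame for a blocking (overlap=False) or async sync."""
    if overlap:
        # Client computes the next frame while the server is still rendering.
        return max(render_ms, compute_ms)
    # Client waits for the server, then computes the next frame.
    return render_ms + compute_ms


if __name__ == "__main__":
    for render_ms, compute_ms in [(100, 1), (100, 100)]:
        blocking = frame_period_ms(render_ms, compute_ms, overlap=False)
        parallel = frame_period_ms(render_ms, compute_ms, overlap=True)
        print(f"render={render_ms} compute={compute_ms}: "
              f"blocking={blocking} ms, async={parallel} ms")
```

When only rendering is slow the two periods are nearly identical; the gap becomes significant only when both costs are large, which matches the point made in the comment.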
As long as we're doing the synchronization once per frame and at the end of all drawing, I think the practical difference between XSync() and doing some fancy async thing is pretty small - we generally don't want to start processing events or updating animations for the next frame until we know that the previous frame "completed".
In terms of whether XSync() a) flushes the GPU buffers to the kernel b) waits for rendering to complete - I'd say let's fix the problem that we're having - requests queueing up before processing by the X server - and not worry about problems we're not having at the moment - drawing piling up as GPU command buffers inside the X server or piling up in GPU pipeline. There will be a more comprehensive framework for this in 3.8.
The patch looks fine to me.
FWIW, this also makes gnome-boxes unusable for me.
This also seems to negatively impact gnome-documents when there's a spinner animation running.
(In reply to comment #8)
> XSync under normal circumstances will not flush the GPU render queue (it is
> likely to have that side-effect under a compositor though, alas). The
> round-trip will serialise with the slow fallbacks that the driver would appear
> to be hitting.
If the issue is that we're hitting slow fallbacks, can we fix that instead? I'd be curious to figure out what's happening here.
Created attachment 225526 [details]
cairo trace of simplified spinner rendering
This cairo trace is from a simplified version of the spinner rendering (rectangles, not circles) that still seems to exhibit extremely bad performance:
[ # ]  backend  test        min(s)  median(s)  stddev.  count
[  0]  xcb      spin.27363  1.402   1.410      0.44%    5/6
[  0]  xlib     spin.27363  1.743   1.746      0.09%    4/6
[ # ]  image: pixman 0.26.2
[  0]  image    spin.27363  0.066   0.066      0.05%    4/6
[ # ]  image16: pixman 0.26.2
[  0]  image16  spin.27363  0.068   0.068      0.20%    6/6
Looks like it tricks UXA into doing lots and lots of short rendering operations, with the cost of each magnified by the debug kernel. SNA in this case is on par with image, which is not surprising, as it decides to keep the workload on the CPU because it is too small to justify the overhead of setting up the GPU.
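To put the trace numbers in proportion, the median times reported above can be compared directly. This is plain arithmetic on the posted figures; the dictionary and function names are only for illustration.

```python
# Median times in seconds, copied from the cairo trace results above.
median_s = {"xcb": 1.410, "xlib": 1.746, "image": 0.066, "image16": 0.068}


def slowdown_vs_image(backend):
    """How many times slower a backend is than the pure-CPU image backend."""
    return median_s[backend] / median_s["image"]


if __name__ == "__main__":
    for backend in ("xcb", "xlib"):
        print(f"{backend}: {slowdown_vs_image(backend):.1f}x slower than image")
```

On these figures the X backends come out roughly 21 to 26 times slower than rendering the same trace in software.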
I did some testing on remote Gtk+ with the XSync patch on a LAN with artificial 50 msec outgoing latency on both machines. Interactivity-wise there was not much difference in the "static" case.
In the case with animations (I tried the animating background example in gtk3-demo) there were obvious differences. Without XSync the animation was very smooth in that it sent each frame. However, this caused interactivity problems as the pipe was full of rendering ops or something; for instance, editing text was more laggy. With XSync the animation made much larger jumps, but interactivity was better.
Note that this example is slightly artificial in that the bandwidth is very high. In a more common high-latency case the bandwidth would also be lower, probably causing even more interactivity issues in the non-XSync case, but not noticeably affecting the XSynced case (assuming each "frame" is not *that* large).
I'd say go for it now and if someone finds obvious problems with it we can revisit the patch. (And of course, if someone sends a better patch.)
Comment on attachment 225473 [details] [review]
Make process_all_updates draw synchronously
Attachment 225473 [details] pushed as 83c66c9 - Make process_all_updates draw synchronously
I just implemented gradient transitions in git master. This has the side effect of improving the performance for this case a lot, because we avoid the expensive cross-fade code and instead compute the cross-faded gradient in advance.
(Also, new feature in a stable series, I always wanted to do that!)
I updated gtk3 in my F18 virtual box with fallback mode to 3.6.1 and the gtk3-demo -> Spinner still eats too much CPU, about 8%, thus I do not think this is fixed.
One more observation with gtk3-3.6.1: Evolution uses gcr-prompt for password prompts in such a way that on "Continue" it keeps the password prompt open, which shows the spinner in front of the "Continue" button, even though the whole dialog is disabled. In gnome-shell, when I get into this spinning Continue state, while evolution is testing the password against the server, the CPU goes high again. Hence I'm reopening this.
Let's not be so vague. Where is the time being spent? What are your acceptance criteria?
I'm not sure what's vague in comment #21; I gave the application and such. Nonetheless, I have an easier reproducer:
a) boot to gnome-shell
b) run gtk3-demo->Spinner, or Images, or other demos where anything animates
gnome-shell and Xorg (according to 'top') are using the CPU most,
just behind them is gtk3-demo, with almost 10% of CPU
My acceptance criterion is simple: gtk3-3.4.x did the same animation (from the user's point of view) with no noticeable CPU usage, while 3.6.x behaves insanely here. From my point of view, animations are nice, but they should be below IDLE in the priority queue, so that an application that needs IDLE can get to it before the animation, because the animation is only there to show that the UI thread is not dead/blocked, and for nothing else. In fact, the animation can block delivery of IDLE, if I understand it correctly.
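The priority concern can be illustrated with a toy main loop. This is a sketch under stated assumptions, not GLib itself: `run_iterations` is a hypothetical helper, and the model assumes the worst case where the animation source is ready on every iteration (a real frame source is timer-driven and is not always ready). The numeric priorities mirror GLib/GDK, where a lower number means higher priority and an iteration dispatches only the ready sources with the best priority.

```python
G_PRIORITY_DEFAULT_IDLE = 200  # priority used by g_idle_add()
GDK_PRIORITY_REDRAW = 120      # G_PRIORITY_HIGH_IDLE (100) + 20
G_PRIORITY_LOW = 300           # below idle


def run_iterations(sources, iterations):
    """sources: list of (name, priority, is_ready). Returns dispatched names."""
    dispatched = []
    for _ in range(iterations):
        ready = [(prio, name) for name, prio, is_ready in sources if is_ready()]
        if not ready:
            continue
        # Only sources at the best (numerically lowest) ready priority run.
        best = min(prio for prio, _ in ready)
        dispatched.extend(name for prio, name in ready if prio == best)
    return dispatched


if __name__ == "__main__":
    always = lambda: True
    redraw_first = [("animation", GDK_PRIORITY_REDRAW, always),
                    ("idle", G_PRIORITY_DEFAULT_IDLE, always)]
    idle_first = [("animation", G_PRIORITY_LOW, always),
                  ("idle", G_PRIORITY_DEFAULT_IDLE, always)]
    print("animation at redraw priority:", run_iterations(redraw_first, 3))
    print("animation below idle:        ", run_iterations(idle_first, 3))
```

In this worst-case model an always-ready animation at redraw priority starves the idle callback completely, while moving it below idle priority lets idle work run first.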
This is within a virtual machine of Fedora 18 installation, run in Fedora 17.
I've installed gtk3-3.6.1-1.fc18.x86_64 now, on a real machine, and gtk3-demo->Spinners eats about 4-8% of CPU when animating those two spinners. On the other hand, the Tree View->List store doesn't suffer from this issue (there is one spinner animating). The Pixbufs demo is also mostly fine (I see occasional jumps to 4%, but nothing constant).
I experienced high CPU consumption when using an active spinner as well.
My gtk3 version:
With an active spinner the CPU consumption of my GTK app jumps up to 11%, from 0%-1%. I decided to use a suitable animated GIF as the spinner instead, which needs only 1-2%.
Active spinners are also a pain when putting them into UI files in Anjuta. Anjuta's CPU consumption then jumps to about 60% (from 1% before that).
I think the frame clock and size limiting the spinner have addressed this problem.
(In reply to Matthias Clasen from comment #26)
> I think the frame clock and size limiting the spinner have addressed this
No, it didn't. I still see high CPU usage with gtk3-demo->Spinner with installed gtk3-3.22.26-2.fc27.x86_64.
I can share a little test program, which adds multiple GtkSpinners to a window, where this can be easily reproduced. Some CPU usage is expected when it comes to 100s of spinners in the window (frankly, I do not expect any application to need so many of them at one time), but it's good for measuring the performance. I've even been able to reproduce a delay in delivery of a g_idle_add() callback when there were about 125 GtkSpinners spinning in the window (when I was lucky it was only a delay, for example 9 seconds; other times the callback was left stuck in the queue and never delivered at all), but I've not been able to reproduce it consistently across desktop environments (I tried Plasma, MATE, GNOME on Wayland and GNOME on X.org (where the last had disabled spinning completely for some reason)). The test application also contains an ESpinner, which is used in Evolution and which doesn't cause such significant CPU usage (comparing 25 spinners of one or the other type, not both at the same time). The ESpinner uses the old method of "busy" animation, which has its own pros and cons compared to GtkSpinner, but it's sufficient for Evolution itself.
See also http://bugzilla.gnome.org/show_bug.cgi?id=732199 .
and https://gitlab.gnome.org/GNOME/gtk/issues/166 to have it stop animating when hidden, which would, if not resolve, at least mask this problem in suitable cases
fwiw, with `gtk3-demo` on win32, simply running the Spinner test and starting the 2 in there brings CPU usage from 0 to a reliable 2~3% (i5-3210M) - better than above figures, I guess, but probably still a bit excessive for all it's doing.
Milan: You could define what you mean by "high CPU usage" and what you would consider 'not high'. It's vague what the problem is now, especially given that its effect is probably (at least artificially) reduced by higher-performance CPUs being more common today.
"and https://gitlab.gnome.org/GNOME/gtk/issues/166 to have it stop animating when hidden, which would, if not resolve, at least mask this problem in suitable cases"
note this wouldn't help the anaconda case, as it's visible there. anaconda doesn't really want anything fancy, just some kind of simple animation to indicate 'stuff is happening', but they want to use something standardized that they don't have to maintain.
(In reply to Daniel Boles from comment #30)
> Milan: You could define what you mean by "high CPU usage" and what you would
> consider 'not high'. It's vague what the problem is now, especially given
> that its effect is probably (at least artificially) reduced by
> higher-performance CPUs being more common today.
Running a GNOME shell session on Wayland with installed:
in a virtual machine, with a terminal in it where I ran `top` and, in a different tab, gtk3-demo->Spinner, where two spinners are spinning. `top` shows gnome-shell at more than 30% of CPU, some kworker at 6.3% and gtk3-demo also at around 6% of CPU.
There are only two spinners, that should not be visible in `top` at all.
When I stop the Spinner demo gnome-shell goes down to ~1% of CPU and gtk3-demo is at 0%.
Using my test application with a different animation (not GtkSpinner), also with two animated squares of 16x16, I see gnome-shell at around 10-20% of CPU and the test application at less than 2% of CPU. The gnome-shell usage is still quite high from my point of view, which may or may not be related to gnome-shell itself, but at least the application itself uses less than 1/3 of what the similar gtk3-demo functionality (GtkSpinner) does.
Maybe testing this in a virtual machine is not ideal; I see some performance degradation in F28 just from moving the mouse in gdm and gnome-shell. I do not know.
-- GitLab Migration Automatic Message --
This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.
You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/gtk/issues/408.