GNOME Bugzilla – Bug 732180
Excessive CPU usage due to GtkSpinner
Last modified: 2014-06-30 07:32:54 UTC
So this one is proving a bit slippery to pin down, but I'm seeing it quite consistently and Liam Quin says he's seeing it too, so let's file a bug and hopefully we can figure out what's going on.
I noticed that on my main box, which runs Fedora Rawhide, after the system's been running for a while, htop shows unusually high CPU usage - usually summing to about 100% or higher for one core. The usage is attributed to Xorg, gnome-shell, and Evolution, but quitting Evolution immediately causes Xorg, gnome-shell and everything else to drop to normal idle levels.
I captured a few seconds of strace output from each of the three processes while the system was in this state, and I'll attach that. I'm not sure what else would be useful - please advise!
Created attachment 279145 [details]
a few seconds of strace of affected components while bug is in effect
I've stopped using evolution for now since imap has stopped working; trying to work out if that's just a Mageia-related breakage. But in the meantime I can't reproduce the problem simply because evolution doesn't do any tasks now.
When it was happening, for me it seemed to be related to whenever evolution's little round "busy" symbols appeared on the status bar - although often there were/are blank areas on the status bar that seem to correspond to active tasks that aren't getting displayed there - e.g. if the status bar is divided into six regions.
Additional background -
Things improved somewhat when I changed the number of active connections from 5 to 2 on each of my two main IMAP mailboxes - and also evolution got tied up / unresponsive less often, but I was still running evolution --force-shutdown every few minutes (previously it had been dumping core).
(not sure I need all three, looks like a packaging error)
setting http_proxy didn't seem to help with any of these (probably unrelated) issues.
Seeing tons of EAGAIN errors from recvmsg() calls in all three processes, which might be relevant. Possibly a busy loop somewhere, but I can't tell where it's originating or why the socket is busy.
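For context, EAGAIN from recvmsg() on a non-blocking socket only means there was nothing left to read at that moment, so a flood of them in strace points at something waking the process far too often rather than at a socket error. A minimal sketch of that behaviour, assuming a plain Unix socketpair stands in for the real X11 connection (the helper name is made up for illustration):

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>   /* only used by the check in the usage note */
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if recvmsg() on an empty non-blocking
 * socket fails with EAGAIN/EWOULDBLOCK, 0 if it does not, and -1 if the
 * socket pair could not be set up. */
int recv_on_empty_socket_would_block(void)
{
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return -1;

    /* Put the reading end into non-blocking mode. */
    int flags = fcntl(fds[0], F_GETFL, 0);
    if (flags == -1 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    char buf[64];
    struct iovec iov = { .iov_base = buf, .iov_len = sizeof buf };
    struct msghdr msg;
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    /* Nothing was written to fds[1], so there is nothing to read here. */
    ssize_t n = recvmsg(fds[0], &msg, 0);
    int saved_errno = errno;
    int would_block = (n == -1 &&
                       (saved_errno == EAGAIN || saved_errno == EWOULDBLOCK));

    close(fds[0]);
    close(fds[1]);
    return would_block;
}
```

On Linux, `assert(recv_on_empty_socket_would_block() == 1);` should hold. A client that drains its socket after every wakeup will normally end each drain with one such EAGAIN, so the count in the strace roughly tracks how often the processes are being woken.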
Also FWIW, I've been observing high CPU usage from Evolution on XFCE as well, sometimes after simply selecting a message.
Liam mentioned that he saw a correlation with Evo actively doing something or other and showing its spinner in the bottom status bar. I just saw at least one case that was like his experience: I was rapidly moving through messages in a folder with the arrow keys, and somehow it seemed that Evo got stuck "Scanning for new messages in (that folder)". That status message was displayed for a long time (new mail checks should be almost instantaneous - the server is on the local network - and usually are) and for that whole time, the high CPU usage was evident. I waited some 30-40 seconds to see if it'd finish whatever it was doing, but it didn't. I then switched into another folder and switched back, the "Checking for new messages" message went away after a couple of seconds, and the CPU usage immediately dropped back to normal background levels.
I can produce much the same levels of heavy load just while moving rapidly through a folder with the arrow keys. I can't quite see how to reliably reproduce the "stuck in 'Scanning for new messages'" state, but as long as I'm rapidly pressing up or down to page through the folder, usage is pretty high.
Doing the same on an F20 system produces 40% usage for the Evolution process, but no increased usage for Xorg or gnome-shell.
If it is indeed the spinner animation hogging the CPU, this patch might help:
diff --git a/e-util/e-activity-proxy.c b/e-util/e-activity-proxy.c
index f3a452b..ba838d4 100644
@@ -283,7 +283,9 @@ e_activity_proxy_init (EActivityProxy *proxy)
proxy->priv->image = widget;
widget = gtk_spinner_new ();
gtk_spinner_start (GTK_SPINNER (widget));
gtk_box_pack_start (GTK_BOX (container), widget, FALSE, FALSE, 3);
proxy->priv->spinner = widget;
Failing that, I can try patching out the spinner widget entirely.
http://koji.fedoraproject.org/koji/taskinfo?taskID=7073254 is a scratch build for Rawhide which should nerf the spinners, to see if just doing that affects things - Matthew's idea. I'll try it and see.
de-spinnered Evo still shows fairly high usage on the rapid message switch test, something like 40% evo, 35% X, 15% gnome-shell. That may be a bit lower than spinner-ed Evo. I'm waiting for the "stuck in 'Scanning for new messages...'" state to trigger again, and I'll see how usage looks there. I will also attach a sysprof log of the stuck state, from the spinner'ed Evo.
Created attachment 279152 [details]
sysprof profile while stock (without de-spinner patch) Evo is in the 'stuck at Scanning for new messages' state
Profile shows pretty much all gtk/cairo/nouveau calls, although g_signal_emit() is fairly high in the list. Don't see any webkit or sqlite or even evolution functions listed, which suggests whatever this is is purely graphical.
so...with the de-spinnered evo, I'm not sure if I'm actually hitting the same 'stuck' state.
Every so often it does seem to get stuck with "Fetching summary information for new messages in (somefolder)" or "Scanning for new messages in (somefolder)" showing, but it's not the same folder that's currently active in the UI, as it always seems to be in the original case. So it may be a slightly different state.
Whatever - with the de-spinnered Evo, when it's in that state, I don't see excessive CPU usage.
(In reply to comment #12)
> Every so often it does seem to get stuck with "Fetching summary information for
> new messages in (somefolder)" or "Scanning for new messages in (somefolder)"
> showing, but it's not the same folder that's currently active in the UI, as it
> always seems to be in the original case. So it may be a slightly different
> state.
That's probably the IMAP backend unnecessarily re-fetching all the message flags in that folder. If it's a large folder, that can be a bottleneck for the rest of the app. But it's a separate issue.
Mine can get stuck for, say, 8 hours on "Fetching summary" or "refreshing folder", with no I/O being logged for the process. Sometimes in the remote imap log I see evolution being disconnected for being idle.
It occurs to me that, perhaps, Evo gets stuck displaying the status message, but it's not actually in the state the message indicates. i.e. the refresh operation completed long ago, but some kind of spinner rendering bug means the status message keeps being displayed (and causing the heavy resource usage) until some other action, like switching folders, somehow provokes Evo to break the loop or whatever it's stuck in.
So, simply running gtk3-demo and running the "Spinner" demo reproduces the high CPU usage (this time in gtk3-demo, gnome-shell and Xorg.bin). So we can probably consider that as a GTK+ / nouveau issue outside of Evolution.
But the fact that Evo seems to get into this state where it's showing a perpetual status message which just happens to include a spinner would seem to be an Evo bug, so we should probably keep this one open.
Filed https://bugzilla.gnome.org/show_bug.cgi?id=732199 for the 'GtkSpinner runs up the CPU' part of this.
So I'm still trying to figure out exactly what state Evo is really in while this is happening, but in the meantime, here's a video of the basic form of the bug:
As you'll see, the "Scanning for changed messages in 'Triage'" status is 'stuck' at the start of the video (it's hard to catch Evo actually getting *into* the stuck state). I didn't leave it sitting very long to keep the video short, but I could've sat in a single folder doing nothing for an hour and the message wouldn't have gone away.
Then I read through both previously-retrieved and not-yet-retrieved messages in the active folder for a while. You'll see the stuck status remains.
But then I change to a different folder, and the status unsticks.
Throughout, I occasionally flip to a console with a network status monitor showing - you can see that Evo traffic only happens when I load a new message, or switch folders. So it seems like there's no traffic associated with the 'stuck' status.
I *think* the overall 'check for new mail' process really *is* stuck when this happens. I'll send a few more test mails to confirm, but I believe that as long as Evo is 'stuck' in this way, I can retrieve previously un-retrieved mails in the current folder, but it won't retrieve any more new mails from the server until I "unstick" it by switching folders.
Notes: I have "Check for new messages every X minutes" enabled and set to 1 minute, 'Check for new messages in all folders', 'Use Quick Resync if the server supports it', and 'Listen for server change notifications' all enabled.
I like to get my mails. ;)
"Number of concurrent connections to use" is set to 3.
Please open a new bug report for the IMAP thing, so that we don't mix things together. I would try disabling Quick Resync; it regularly does odd things and feels quite unfinished.
Anyway, I also got sick of GtkSpinner, so much so that I reintroduced ESpinner, which is much simpler than GtkSpinner and definitely much lighter on the CPU. For example, 100 ESpinners on a screen are barely noticeable as a CPU user (it's about 8% of CPU, which is barely visible in my CPU meter), while 50 GtkSpinners eat almost whole one core. Go figure.
Created commit 4b213de in evo master (3.13.4+) 
Created commit 2292ac6 in evo evolution-3-12 (3.12.4+)
(In reply to comment #20)
> while 50 GtkSpinners eat almost whole one core.
I'm sorry, it's about 50% of one core, not "almost whole". This is with gtk3-3.10.6; I haven't tried with a more recent gtk3.
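As a rough back-of-the-envelope comparison of those figures (100 ESpinners at about 8% CPU versus 50 GtkSpinners at about 50% of one core), here is a small sketch. It assumes CPU cost scales linearly with the number of spinners and that both percentages refer to the same core, neither of which was measured here; the function names are made up for illustration:

```c
#include <assert.h>

/* Average CPU percentage attributable to a single widget, assuming the
 * total cost is spread evenly across all widgets (an assumption). */
double cpu_percent_per_widget(double total_cpu_percent, int widget_count)
{
    assert(widget_count > 0);
    return total_cpu_percent / widget_count;
}

/* How many times more CPU one GtkSpinner uses than one ESpinner,
 * based only on the approximate figures quoted in the comments. */
double gtkspinner_to_espinner_cost_ratio(void)
{
    double espinner = cpu_percent_per_widget(8.0, 100);    /* about 0.08% each */
    double gtkspinner = cpu_percent_per_widget(50.0, 50);  /* about 1% each */
    return gtkspinner / espinner;                          /* roughly 12.5 */
}
```

Under those assumptions a single GtkSpinner costs on the order of twelve times as much CPU as a single ESpinner.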
I'm happy to see this fixed wrt Evo, don't get me wrong, but it seems like a slightly odd approach - obviously the GtkSpinner bug is something that's gonna affect far more than just Evo, and we really need GTK+ devs to fix it, not to force downstream apps to build their own spinners instead...
the IMAP bug seems to persist with qresync disabled, I'll file a new report.
IMAP issue reported as https://bugzilla.gnome.org/show_bug.cgi?id=732366
(In reply to comment #22)
> obviously the GtkSpinner bug is something that's gonna
> affect far more than just Evo, and we really need GTK+ devs to fix it, not to
> force downstream apps to build their own spinners instead...
Yes, I 100% agree, but this misbehaviour of GtkSpinner has been around for too long, with no interest in fixing it on their side, so I just got sick of waiting for them and did it my own way. It's not ideal, but sometimes it's the easiest way, and it's definitely good for users too.