Bug 790635 - Slow start with 10+ mail accounts enabled
Status: RESOLVED FIXED
Product: evolution-data-server
Classification: Platform
Component: Mailer
Version: 3.26.x (obsolete)
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: evolution-mail-maintainers
QA Contact: Evolution QA team
Depends on:
Blocks:
 
 
Reported: 2017-11-20 21:18 UTC by Eugene Kanter
Modified: 2018-01-05 10:09 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
console (3.89 KB, text/plain)
2017-11-22 23:09 UTC, Eugene Kanter
gdb --batch --ex "t a a bt" -pid=`pidof evolution` &>bt.txt (45.32 KB, text/plain)
2017-11-22 23:12 UTC, Eugene Kanter
gdb --batch --ex "t a a bt" -pid=`pidof evolution` (58.65 KB, text/plain)
2017-11-25 03:12 UTC, Eugene Kanter
gdb --batch --ex "t a a bt" -pid=`pidof evolution` (54.22 KB, text/plain)
2017-11-25 03:13 UTC, Eugene Kanter
gdb --batch --ex "t a a bt" -pid=`pidof evolution` (48.26 KB, text/plain)
2017-11-25 03:14 UTC, Eugene Kanter
gdb --batch --ex "t a a bt" -pid=`pidof evolution` &>bt.txt (14.94 KB, text/plain)
2017-12-07 04:50 UTC, Eugene Kanter
send / receive dialog (31.75 KB, image/png)
2017-12-07 14:51 UTC, Eugene Kanter

Description Eugene Kanter 2017-11-20 21:18:55 UTC
Just upgraded Fedora 26 to Fedora 27. In Fedora 26 the startup time wasn't too bad, but in Fedora 27, with evolution-3.26.2-1.fc27.x86_64, it is just horrible: several minutes with very high CPU utilization.
I do have a dozen active email accounts, though. But Thunderbird starts up almost instantaneously with the same set of accounts.
Comment 1 André Klapper 2017-11-21 14:23:13 UTC
What's shown when starting Evolution from a terminal window?

(Might need a strace if that's not pointing out anything.)
Comment 2 Milan Crha 2017-11-21 16:44:37 UTC
Thanks for the bug report. I'd also appreciate it if you could install the debuginfo packages for evolution-data-server and evolution, like:

   # dnf install evolution-data-server-debuginfo evolution-debuginfo \
     --enablerepo=fedora-debuginfo --enablerepo=updates-debuginfo

Make sure the versions of the installed packages precisely match the versions of the two debuginfo packages. Then, when evolution's CPU usage gets high, grab a backtrace of it with this gdb command:

   $ gdb --batch --ex "t a a bt" -pid=`pidof evolution` &>bt.txt

Please check bt.txt for any private information, like passwords, email addresses, server addresses, and so on. I usually search for "pass" at least (quotes for clarity only).

Maybe it's trying to update or check the consistency of some folders.db files. I know a slower check was added, which is supposed to detect breakage of the database files, but a quicker test is also available.

A response to André's question is also important.
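
The manual search of bt.txt suggested above could be sketched in Python roughly as follows. This is only an illustration under assumptions: the `find_sensitive_lines` helper, its keyword list (apart from "pass") and the simple e-mail pattern are hypothetical and not part of gdb or Evolution. A hit only marks a line for manual review, and a clean result does not prove the file is safe to attach.

```python
import re

# "pass" is the keyword mentioned in the comment; the others are assumptions.
DEFAULT_KEYWORDS = ("pass", "token", "secret")

# Deliberately simple e-mail pattern; expect false positives and misses.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(\.[\w-]+)+")


def find_sensitive_lines(text, keywords=DEFAULT_KEYWORDS):
    """Return (line_number, line) pairs that deserve a manual look."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        lowered = line.lower()
        if any(word in lowered for word in keywords) or EMAIL_RE.search(line):
            hits.append((number, line))
    return hits
```

To use it, read the backtrace file into a string, pass it to `find_sensitive_lines()`, and inspect each reported line by hand before uploading the attachment.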
Comment 3 Eugene Kanter 2017-11-22 23:09:26 UTC
Created attachment 364238 [details]
console
Comment 4 Eugene Kanter 2017-11-22 23:12:59 UTC
Created attachment 364240 [details]
gdb --batch --ex "t a a bt" -pid=`pidof evolution` &>bt.txt
Comment 5 Milan Crha 2017-11-23 08:31:45 UTC
Thanks for the update. I see multiple things involved, both in the backtrace and on the console. One is an issue with the Gmail account, which fails to refresh its token. There's a thread about it here [1], and I also made a test package with a proposed fix for bug #790267 here [2].

The other issue seems to be that many accounts are configured, which starves the GTask queue, but that's only my impression. Maybe once the above is fixed, this will be fixed as well. That's filed as bug #774448.

Finally, the high CPU usage is caused by redraws of the spinner in the left folder tree; that's bug #738423.

Please try the packages from [2] to see whether they help. Download and install all the packages you currently have installed, which you can list with
   $ rpm -qa | grep evolution-data-server
then install them with a command like
   # dnf update ./evolution-data-*.rpm

[1] https://mail.gnome.org/archives/evolution-list/2017-November/msg00075.html
[2] https://koji.fedoraproject.org/koji/taskinfo?taskID=23283566
Comment 6 Eugene Kanter 2017-11-25 03:11:29 UTC
Somewhat better for Gmail, which no longer triggers the timeout errors seen with the current version.
However, the total startup time is almost the same, as is the CPU utilization during startup.
Comment 7 Eugene Kanter 2017-11-25 03:12:39 UTC
Created attachment 364372 [details]
gdb --batch --ex "t a a bt" -pid=`pidof evolution`
Comment 8 Eugene Kanter 2017-11-25 03:13:47 UTC
Created attachment 364373 [details]
gdb --batch --ex "t a a bt" -pid=`pidof evolution`
Comment 9 Eugene Kanter 2017-11-25 03:14:15 UTC
Created attachment 364374 [details]
gdb --batch --ex "t a a bt" -pid=`pidof evolution`
Comment 10 Milan Crha 2017-11-27 12:57:44 UTC
Thanks for the update. I think I see it. You've set an account, or a folder, to synchronize its content locally for offline use. Once the account is switched to online mode, it also downloads messages for offline use. If you have a large folder with many messages not yet downloaded for offline use, this takes a long time. You can work around it by limiting how old the messages downloaded locally may be. The setting is in the account Properties, Receiving Options tab.

Nonetheless, there are 12 g_task_thread_pool_thread threads, most of them waiting for a chance to connect one of the accounts (which means adding one more GTask thread), thus the main issue here is bug #774448.
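
Counting such threads in a saved backtrace can be done by hand or with a short script. The sketch below assumes the usual layout of gdb's "thread apply all bt" output, where each thread starts with a "Thread N (" header line followed by "#"-prefixed frame lines; the `threads_with_frame` helper is hypothetical and not part of gdb.

```python
import re

# Header line that gdb prints before each thread's frames,
# e.g. "Thread 3 (Thread 0x7f... (LWP 1234)):".
THREAD_HEADER = re.compile(r"^Thread (\d+) \(")


def threads_with_frame(backtrace_text, function_name):
    """Return the gdb thread numbers whose stack mentions function_name."""
    wanted = re.compile(r"\b" + re.escape(function_name) + r"\b")
    matching = []
    current = None
    for line in backtrace_text.splitlines():
        header = THREAD_HEADER.match(line)
        if header:
            current = int(header.group(1))
        elif current is not None and line.startswith("#") and wanted.search(line):
            if current not in matching:
                matching.append(current)
    return matching
```

For example, `len(threads_with_frame(text, "g_task_thread_pool_thread"))` gives the number of threads that belong to the GTask pool in that backtrace.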

I made a test build for you, which increases the maximum GTask thread pool size from 10 to 20 threads. It'll verify whether that's the main issue here. The package is built here:
https://koji.fedoraproject.org/koji/taskinfo?taskID=23414509
Comment 11 Milan Crha 2017-11-27 13:10:53 UTC
I also changed the way the synchronization for offline use is done after going online, because it might be better to do it later rather than right after going online. It also means that there is feedback in the UI about the ongoing download for offline use, which users can cancel.

I'll use this bug report for it, also because I believe the test build from comment #10 for glib will fix the main issue for you.

Created commit 7491dd9c0 in eds master (3.27.3+)
Created commit 71c8d6882 in eds gnome-3-26 (3.26.3+)
Comment 12 Eugene Kanter 2017-11-27 18:02:33 UTC
Milan, I just noticed you changed the bug subject to something completely unrelated to the problem. There is no message download issue at all.

Regarding glib2, please provide both architectures next time, because I had to build i686 myself to match the x86_64 version.
Now the startup time is about the same as "Send / Receive" time. Thread congestion is clearly out of the equation.

Question: Why does the 10-thread limit cause severe thread congestion during startup but have no visible effect on "Send / Receive" time with the same number (12) of active email accounts?
Comment 13 Milan Crha 2017-11-27 18:39:20 UTC
(In reply to Eugene Kanter from comment #12)
> Milan, I just noticed you changed the bug subject to something completely
> unrelated to the problem. There is no message download issue at all.

I changed it because the backtrace shows it is related (it also depends on other settings). I will change what I committed tomorrow, because I could make it better, but I only realized that now.

> Regarding glib2, please provide both architectures next time, because I had
> to build i686 myself to match the x86_64 version.

I did think of it, but the backtrace showed you use the x86_64 evolution. I didn't know you also have the 32-bit version installed.

> Now the startup time is about the same as "Send / Receive" time. Thread
> congestion is clearly out of the equation.

I'm sorry, I do not follow. Do you mean that the startup time is quick with the changed glib2, or that Send/Receive is awfully slow with the changed glib2? The "out of the equation" confuses me the most.

> Question: Why does the 10-thread limit cause severe thread congestion during
> startup but have no visible effect on "Send / Receive" time with the same
> number (12) of active email accounts?

Startup does many more things; in particular, it checks whether the mail servers are reachable using GNetworkMonitor. GNetworkMonitor runs in a GTask thread pool and, as such, also runs other GTask threads, which can end up just queued instead of being processed, and that causes the delays.
Bug #774448 is all about it. That's the reason why I stole this bug report for another issue: your main issue is covered there.
Comment 14 Eugene Kanter 2017-11-27 20:20:29 UTC
I would like to emphasize that I don't see any problems related to downloading messages for offline use, simply because all messages are already downloaded. If you happened to see a download thread in my dump, it does not necessarily mean that the download takes excessive time.

Based on the fact that test glib2 packages eliminated all thread lock related delays you may mark this bug as a duplicate of bug #774448 if this is indeed the case.

Back to your question about startup time. With the new glib2, the startup process does not show any thread-lock-related delays. Originally, the bottom status bar showed mail account items sitting in the queue for a long time (minutes) with no changes. Now all items appear and disappear very quickly, and the processing time for all accounts approximately matches the time taken when the "Send / Receive" button is pressed.

It seems to me that evolution should query glib for the number of threads available and never submit more tasks at a time than glib can handle concurrently.
Comment 15 Milan Crha 2017-11-28 07:43:08 UTC
(In reply to Eugene Kanter from comment #14)
> Based on the fact that test glib2 packages eliminated all thread lock
> related delays you may mark this bug as a duplicate of bug #774448 if this
> is indeed the case.

That's why I stole your bug. Marking it as a duplicate may not help anything; the glib bug is old and nobody cares (except for me?).

> It seems to me that evolution should query glib for the number of threads
> available and never submit more tasks at a time than glib can handle
> concurrently.

No. It seems you do not trust me, or believe me. That's sad. Or you do not understand me. That happens. I do not want to repeat what I already wrote, either here or in the glib bug. Note that a) there is no API to find out how many threads GTask can use; b) evolution is not the only software using the GTask API, many other libraries in use rely on it as well; c) glib itself, in this case through GNetworkMonitor, uses the GTask API in a way that requires twice as many threads.

To make it super clear: the problem is the thread limit in GTask, which starves requests. I stole your bug for a semi-related issue, and I'll keep using it that way.
Comment 16 Milan Crha 2017-11-28 07:55:14 UTC
(In reply to Milan Crha from comment #11)
> Created commit 7491dd9c0 in eds master (3.27.3+)
> Created commit 71c8d6882 in eds gnome-3-26 (3.26.3+)

I reverted these with commit 23fb175a802 and commit 17fc196da, because I'm going to use a different approach.
Comment 17 Milan Crha 2017-11-28 13:17:38 UTC
Thinking of it, you are right, it was a silly idea to steal the bug report. I'm sorry about that.

*** This bug has been marked as a duplicate of bug 774448 ***
Comment 18 Milan Crha 2017-11-28 15:25:18 UTC
Could you try to download these test packages:
https://koji.fedoraproject.org/koji/taskinfo?taskID=23437641
https://koji.fedoraproject.org/koji/taskinfo?taskID=23438477
and install them, with the patched glib2 uninstalled, please? I tried to avoid using the GTask thread pool queue in the places that are most often used after start, which may or may not help. The builds can be used independently, which would be an interesting observation too (I guess the changed evolution-data-server will provide a better experience than the changed evolution). I'm trying this because the glib bug seems to be ignored.

I have many more accounts configured, most of them disabled, but I do not see the issue here, thus I'm asking for a test on your machine, where you seem to be able to reproduce it much more easily than I can. Again, please make sure you do not use the patched glib2 with the increased GTask thread pool size.
Comment 19 Eugene Kanter 2017-11-28 21:00:37 UTC
Reverted glib2, verified that startup takes a long time, then updated per comment #18.
Startup time seems to be somewhat less than with glib2 patched to 20 threads.
Comment 20 Milan Crha 2017-11-29 09:28:56 UTC
Thanks for the quick testing. So the workarounds in both packages made it even better? That's interesting, I would not have expected it. I'll commit the changes to the sources tomorrow, just in case you face some issue before then. I'd appreciate it if you could confirm again tomorrow; I'd rather wait for that, because such issues sometimes strike and sometimes do not.
Comment 21 Eugene Kanter 2017-11-29 13:41:35 UTC
I haven't found anything out of the ordinary during normal daily usage.
Comment 22 Milan Crha 2017-11-30 08:51:38 UTC
Okay, thanks. I cannot commit this to the stable branch, because the eds part contains a new translatable string and the evolution part changes some public API (which is probably not used by anything else, but still). The idea behind both is to skip the GTask thread pool and use their own instead.

I'll keep this bug marked as a duplicate of the glib bug, because the changes are workarounds rather than real fixes (I'd prefer to use the GTask API, but that is hard in cases like yours).

Created commit 6e9c80cd6 in eds master (3.27.3+) [1]
Created commit f5ac35891 in evo master (3.27.3+) [2]

[1] https://git.gnome.org/browse/evolution-data-server/commit/?id=6e9c80cd6
[2] https://git.gnome.org/browse/evolution/commit/?id=f5ac35891
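
The general shape of such a workaround can be illustrated with a small Python sketch. It is not the code from the commits above, and it assumes that "its own" means a dedicated set of worker threads. All names (`start_accounts_with_own_pool`, `connect_account`, `reachability_check`) are hypothetical, and `ThreadPoolExecutor` only stands in for the shared, size-limited GTask pool.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def start_accounts_with_own_pool(shared_pool_size, accounts, patience=0.3):
    """Run account tasks on dedicated workers so the shared pool stays free."""
    # Stands in for the limited pool that other code also depends on.
    shared_pool = ThreadPoolExecutor(max_workers=shared_pool_size)
    # Dedicated workers, one per account, used only for the account tasks.
    own_pool = ThreadPoolExecutor(max_workers=accounts)

    def reachability_check():
        # Stands in for the nested check that still uses the shared pool.
        return True

    def connect_account(_index):
        time.sleep(0.05)
        nested = shared_pool.submit(reachability_check)
        # Raises a timeout error if the shared pool could not run the job in time.
        return nested.result(timeout=patience)

    try:
        futures = [own_pool.submit(connect_account, i) for i in range(accounts)]
        return [future.result() for future in futures]
    finally:
        own_pool.shutdown(wait=True)
        shared_pool.shutdown(wait=True)
```

Because the account tasks no longer occupy shared workers, even a very small shared pool can serve the nested jobs. The cost is extra threads that the shared pool cannot account for, which fits the description of this as a workaround rather than a real fix.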
Comment 23 Eugene Kanter 2017-12-05 21:00:04 UTC
I don't know if it is related, but now the Send / Receive dialogue hangs indefinitely.
Three items show as complete at 100%; the others are stuck at "Updating..." at 0%.
Comment 24 Milan Crha 2017-12-06 12:32:00 UTC
Could you install the debuginfo packages for the custom builds of eds and evo and then grab a backtrace of the "stuck" evolution, to see what it tries to do, please? You can get the backtrace with a command like this:
   $ gdb --batch --ex "t a a bt" -pid=`pidof evolution` &>bt.txt
Please check bt.txt for any private information, like passwords, email addresses, server addresses, and so on. I usually search for "pass" at least (quotes for clarity only).
Comment 25 Eugene Kanter 2017-12-07 04:50:13 UTC
Created attachment 365177 [details]
gdb --batch --ex "t a a bt" -pid=`pidof evolution` &>bt.txt
Comment 26 Milan Crha 2017-12-07 08:27:24 UTC
Weird. That backtrace shows one IMAP account being in IDLE (waiting for notifications of changes from the server) in one thread; the other threads are just polling, so they are not doing anything interesting either. I do not see any thread in a pending state, nor many GTask or other threads stuck in anything.
Comment 27 Eugene Kanter 2017-12-07 14:51:07 UTC
Created attachment 365197 [details]
send / receive dialog
Comment 28 Eugene Kanter 2017-12-07 14:52:53 UTC
Clicking "Cancel All" makes all the cancel buttons inactive, but the dialogue stays open.
Clicking the X in the top right corner dismisses the dialogue.
Comment 29 Eugene Kanter 2017-12-11 20:08:07 UTC
I reverted to the stock, slow Fedora version. I can tolerate a slow start, but I can't tolerate a completely broken send/receive.
Comment 30 André Klapper 2017-12-11 21:19:15 UTC
What is a "completely broken send/receive"? You can neither send nor receive emails?
In any case, sounds like a different issue than what this ticket is about?
Comment 31 Eugene Kanter 2017-12-11 21:32:28 UTC
(In reply to André Klapper from comment #30)
> What is a "completely broken send/receive"? 
Please see comment #23 and the several comments after it.
> sounds like a different issue than what this ticket is about?
It was introduced by the patch created to resolve the slow start.
Clearly the patch makes things much worse, so it needs to be reworked or pulled.
Comment 32 Milan Crha 2018-01-03 10:54:28 UTC
I use that code change here with 8 enabled accounts, but I do not see what you claim. The image shows 12 accounts, while 4 of them are already completed. Are they all of the same type, like IMAP, or different types (if so, then which, please)?

I do not know how much it's a coincidence on your side. There might be something different for sure, because otherwise I'd see it too. I will try to enable more accounts here, but again, knowing what account types you use will help.
Comment 33 Milan Crha 2018-01-05 10:09:04 UTC
I tried again: I enabled the check for new messages after start, with 21 accounts in total (one POP, one NNTP, some EWS and many IMAP). I start evolution and let it finish the initial refresh. Then I click Send/Receive and it finishes after some time (a couple of seconds). Then I click Send/Receive again and again and again. None of the attempts exhibits what you see.

The only similar behaviour I saw was when I switched Evolution to offline, then back to online. One of my accounts had been waiting for a response from the server for a longer time (more than 10 seconds), but the backtrace showed it was just waiting, and it finally received the response from the server, so everything went fine again, including the Send/Receive.

It can be that the change I backported to the stable version didn't contain everything required.

I'll keep all the accounts enabled for a week, in case I'm able to reproduce the behaviour you faced, but otherwise I'm closing this bug report in favour of the development version, where the change landed. Feel free to update the bug with any findings you might get, but please do not reopen it. I'll do that if I manage to reproduce it here. Thanks in advance.