GNOME Bugzilla – Bug 732366
[IMAPx] Stuck FETCH command when using IDLE
Last modified: 2014-07-12 01:50:13 UTC
Originally reported as part of https://bugzilla.gnome.org/show_bug.cgi?id=732180 , where it turns out that *this* big causes a long-running GtkSpinner which in turn caused high CPU usage. The spinner thing has now been worked around, but this underlying bug seems to remain. With Evolution 3.12.3 on current Fedora Rawhide, and an IMAP server on the same local subnet running dovecot, configured as IMAP+ with "Check for new messages every 1 minutes", "Check for new messages in all folders", and "Listen for server change notifications" all checked (but Quick Resync disabled), Evo quite often seems to get stuck "Scanning for new messages in (somefolder)". As long as I leave Evo idle, it will remain with that status message displaying - it never seems to clear. A network traffic monitor shows no traffic between Evo and the server in this state. If I then browse through messages in whatever folder is currently selected, the stuck status remains - I can read messages, both previously-downloaded and not-yet-downloaded, just fine, but the "Scanning for new messages" still will never clear. To clear it, I have to switch folders - it seems like the action of switching to another folder triggers a refresh of that folder, and that refresh somehow unsticks the stuck refresh of the other folder. I believe that, as long as Evo is in the stuck state, it actually does stop checking for new mail in any folder - I can read mails in the current folder whose headers have already been retrieved, but new ones won't appear in that or any other folder until I "unstick" Evo by switching folders. A screencast of the issue is available at https://www.happyassassin.net/temp/evo_stuck_checking.webm .
Yes, I also was seeing this, and agree that switching folders seemed to unstick it, so I could switch back and fetch any message not already queued up. Before this, evo was dumping core pretty often - e.g. if I selected half a dozen messages and deleted them all - apparently when trying to cancel an IMAP message fetch, so I don't know if it was related to that fix or not.
I saw somewhere a backtrace which showed basically what you describe - a running job with folder refresh, but the parser thread doing nothing. Due to the activity being eventually removed, on the folder change, I'd guess that some job before this one being executed stopped in a way which didn't start next pending command, keeping the folder refresh job stale in a queue. Finding it will be pretty hard, I'm afraid, also because I do not recall I faced it myself in the past. Let's try two things. The first is to grab CAMEL_DEBUG=imapx log - it'll be awfully chatty with tons of private information, including message content, but I'd be interested only at the end of the log, since the last "completing command buffer is" mentioning probably "UID FETCH" or something above it - there might be some "completing command buffer is" with more than just this "UID FETCH" before the end of the log, thus two or three of those included would help. By the way, if you have more IMAP accounts configured, then disable all than the one, to not have the log full of other IMAPx output from those other accounts, which would make it only harder to read. The second thing is to attach gdb into the running evolution: $ gdb evolution `pidof evolution` get list of running threads (gdb) t a a bt switch to the one where the folder refresh job is waiting - it might be the one camel_imapx_job_run() in its backtrace (say it's thread 20): (gdb) t 20 then switch to a frame which mentions either imapx_refresh_info_sync or imapx_synchronize_sync (say it's frame 5): (gdb) f 5 and here, finally, print the queue status with this command: (gdb) p camel_imapx_store_dump_queue_status(imapx_store) and then just continue (gdb) c or quit gdb (gdb) q I'm interested in the output of the camel_imapx_store_dump_queue_status() call.
Created attachment 279866 [details] CAMEL_DEBUG log requested Here's the first thing you asked for - I ran CAMEL_DEBUG=imapx evolution > evo.log 2>&1 and then pulled the last bit out (I gave what looked like the appropriate context just above the last 'completing command buffer' line). Evo was stuck checking for messages in Redhat.memo-list ; I left it like that for about two minutes, then quit Evo (so there are messages related to the quit at the end of the log). I'll get the gdb bit next.
I just did the gdb bit, with a new Evolution instance (sorry, didn't realize fast enough you might want both bits of output from the same Evo session). Here's a copy/paste of the relevant bits of the gdb session, so you can check that I did it right - I think Thread 2 is the correct one:
+ Trace 233763
$1 = void (gdb) I have saved the CAMEL_DEBUG=imapx output from this Evo session too, and I'll edit and attach it shortly.
Created attachment 279867 [details] camel_debug log for the same session as the gdb output is from Here's the CAMEL_DEBUG=imapx log for the same session the gdb output comes from. I did not leave Evo in the stuck state very long before attaching gdb, so that's likely why it's rather shorter. In this case it was stuck checking for new messages in Junk.
Thanks for the update. It seems to me like there was a fight between IDLE and FETCH command on the connection. The first IMAPX log shows there only one connection, while the second shows two connections towards the server. I will focus the investigation that way.
FWIW, I tried disabling the periodic checks in the hopes that IDLE would just result in me getting all my new mails anyway, but that doesn't seem to work. IDLE one supports one "mailbox" at a time, apparently: I'm not sure if "mailbox" exactly means "folder" in that context, I get lost in the RFCs, but from a 'dumb user experience' PoV anyway, having only "Listen for server change notifications" but no "Check for new messages every X minutes" doesn't seem to be enough for new mail to show up reliably in all folders...
Right, IDLE is for the currently selected mailbox only, which is a folder in Evolution. Doing it the other way, by disabling IDLE, may get better results, I guess. There is also a NOTIFY extension for IMAP, which might be better than IDLE, but there are nto enough server supporting it, I'm afraid (I'm only guessing here).
By the way, I just noticed that mine Dovecot (2.2.4) has the NOTIFY capability, thus I do not get into the IDLE command at all with it here.
When you start looking closely at IDLE, you realize it's nothing more than a hack on IMAP's existing command/response scheme. Whereas NOTIFY is more of a *proper* notification system for IMAP, but it's still too new to be widely deployed. I'm not sure what the use case is for NOT using IDLE when it's available - I would think the backend should just silently do so. The "Listen for server change notifications" option seems to me both misleading and unnecessary. The actual setting name in the .source file is "UseIdle=true/false", which I suppose could be retained as a debugging aid.
I guess I caused this with a change to bug #730788.
It seems like the FETCH command got before the IDLE, but the IDLE was issued anyway. I guess checking also a queue of active commands and avoiding the IDLE issue may do the trick. I will provide rawhide test package for Adam shortly. Created commit a1ef30c in eds master (3.13.4+) [1] Created commit bd6dcf2 in eds evolution-data-server-3-12 (3.12.4+) [1] https://git.gnome.org/browse/evolution-data-server/commit/?id=a1ef30c
I'll check that build (though I'd better hope I can reproduce the bug; after fiddling with my Evo settings a bit and eventually leaving them at "Listen for server change notifications" = "yes" and check for messages every *5* minutes, at least last night it seemed like Evo sort of gave up on pulling down new messages at all. But I'll reboot, go back to the same settings as before, and hope it reproduces. How the hell did you get Dovecot to advertise NOTIFY? I found it too and the dovecot docs indicate NOTIFY should work as long as "mailbox_list_index = yes" , but it doesn't for me, and I can't for the life of me figure out how to do it. Are you using mbox or Maildir for your server? anything special about the config? Thanks!
(In reply to comment #13) > How the hell did you get Dovecot to advertise NOTIFY? I found it too and the > dovecot docs indicate NOTIFY should work as long as "mailbox_list_index = yes" > , but it doesn't for me, and I can't for the life of me figure out how to do > it. As far as I can recall, "mailbox_list_index" is all I needed to change. Note, however, Evolution's NOTIFY support is incomplete. Last fall I got as far as supporting folder events and subscription events but stopped short of handling actual new messages and flag changes. Don't know how much you know about IMAP, but NOTIFY works by delivering the same untagged status information you would normally get in response to commands, but these are UNSOLICITED responses which have to be handled independently of any active commands. A good IMAP client should be written to handle unsolicited responses anyway, but ours isn't. It required quite a lot of rework just to handle unsolicited folder and subscription events gracefully, but then my attention got diverted and I never came back to finish the message events.
The patched build seems to improve things for me. However, I believe it's introduced a new issue. Three times now after installing the patched build, I've seen Evo get stuck after sending an email. What happens is I hit 'Send', the "Sending message" bar/spinner appears. If I pay attention I believe I see it show '100% complete' briefly, but then instead of the message compose window disappearing as it usually would, the bar just goes back to "Sending message" with the spinner and sticks apparently forever (I've left it for over ten minutes). During this time, all operations that involve talking to the server get queued up and stuck - I can read previously retrieved mails, but I can't read any not-previously-retrieved mails, and I see folder refresh operations piling up in the main window's status area. I have to do a 'killev' and restart Evo to get unstuck. The message really does get sent, though, which is why I say it gets stuck after sending the email. (This is why I've sent a couple of duplicate emails out this afternoon...I only just noticed that the stuck attempt actually succeeds, but does not finalize and close out the compose window properly).
https://www.happyassassin.net/temp/evo_stuck_after_send.webm is a short webcast of the stuck state.
Videos are not useful, the backtraces and debug logs are important, because they show what the code does, not how it looks like in the front end. Do you save sent messages into an IMAP folder? If no, then this is about something else than IMAP, otherwise it can be both about SMTP or IMAP, it depends in which stage you've got the "composer window doesn't close" issue. As you mentioned that you saw there, even only quickly flashing, the 100% done, then it can be stuck during the message save. Getting a backtrace of the running evolution should help a bit. I still do not understand how you would get it, beacuse the package itself is 3.12.3 with only added patch from comment #12.
oh, yeah, I do save sent mails into an IMAP folder, good thinking, I didn't think about that. So it likely is what you think. Usually when a single change causes a new problem, if I just describe the problem, the dev often says "oh, duh, OF COURSE it's xxxx" and doesn't need further info to fix it, so that's what I was doing. If it's not obvious and you need the logs, of course I can get 'em for you later; same stuff as for the initial problem? Let me know if just the info that the sent message is being saved to a folder on the same IMAP server is enough for you, or if you need more. thanks!
I have not seen a stuck compose window (but I have had to stop using evolution since IMAP isn't working for me at all; I think too many uses of evolution --force-shutdown left its metadata in a mess; I was doing it several times an hour). But I was getting the status message of Refreshing folder INBOX (I don't know on which IMAP server unfortunately as they all have INBOX). And clicking on messages in the summary wouldn't display them, but switching to a different account and back again would free it up for a while (sometimes only for messages I hadn't already clicked on). Even switching to a different Search Folder would free it up sometimes. I save messages to a local folder, not realizing it was possible to save them to an IMAP one (claws by default saves to Sent on the respective imap server).
liam: What happened is that Milan came up with a patch for the *initial* issue we were both seeing - the one you described again in c#19. That patch is included in a scratch build mcrha provided, at http://koji.fedoraproject.org/koji/taskinfo?taskID=7117386 . The patch does indeed fix the 'stuck at refreshing folder' bug - but it causes a *new* bug, which is the one I described in c#15. You won't see this new bug unless you use the scratch build that fixes the initial bug (or otherwise apply the patch to your Evo).
I see, thanks, Adam. I do play to try and get evo going again but it might not be until mid-august before I can spend any time on it.
Adam, could you give me a backtrace of evolution in that stuck state, please?
agh, shoot, I knew there was something I didn't get around to last night :( I'll do so soon and file it as a new bug.
New bug filed as https://bugzilla.gnome.org/show_bug.cgi?id=733081