GNOME Bugzilla – Bug 704513
Evolution sometimes flushes entire cache of IMAP+ subfolder, resulting in zero messages
Last modified: 2014-02-05 15:54:20 UTC
In Evolution 3.8.3, when I quickly read through a subfolder of my Gmail account (IMAP+) containing 85000 messages and mark messages as read while Evolution runs filtering in parallel, Evolution sometimes flushes the entire folder and reports that it has zero messages. After switching to another folder and back, Evolution starts reindexing, so there is at least no data loss (yet). Note: I now use physical folders for Trash and Junk, but the problem still happens. Advice is welcome on which of the env vars listed at https://wiki.gnome.org/Evolution/Debugging would yield useful debugging information.
Created attachment 251018: Log with CAMEL_DEBUG=imapx:io. Evolution, in a Gmail inbox subfolder, threaded view. I managed again to make the folder display "There are no messages in this folder." So I ran CAMEL_DEBUG=imapx:io evolution >& evo-imapx-bz-rewrite.crap and made Evolution reindex the folder with 92000 messages. I am using the Ctrl+K and . shortcuts to go quickly through my unread bugmail. I have automatic local filters that tag messages with labels based on bugmail headers. Sometimes I press Ctrl+Y on a single message in order to filter it again. After a few minutes I got an orange error bar saying something about "not authenticated", and the folder again displayed "There are no messages in this folder."
Potential dup of bug 693101?
No quick resync enabled. Still need to find out whether this does NOT happen after NOT hibernating/suspending/whateveritscalled.
From the log in comment 1 it looks as though the connection dropped during the connection/configuration phase. Log doesn't show why the connection dropped, but there are some improvements for that scenario in 3.9.x. As far as I know, the quick resync feature is only problematic when talking to Zimbra servers. Fortunately that's now been fixed upstream in Zimbra. http://bugzilla.zimbra.com/show_bug.cgi?id=82493
(In reply to comment #4) > From the log in comment 1 it looks as though the connection dropped during the > connection/configuration phase. Log doesn't show why the connection dropped, > but there are some improvements for that scenario in 3.9.x. Please see bug #693101 comment #59, there is a regression in 3.9.5, which might be pushed to 3.8.3/.4 as well.
A good breakpoint for this is invalidate_local_cache() in camel-imapx-server.c, which is called when IMAPX thinks the server's uidvalidity has changed, or when the local summary's uidvalidity is lost. Tonight, while my VPN connection was flaky, I also saw the fetch_folders_for_namespaces() function in camel-store.c return an empty "folders_from_server" hash table, and the sync_folders() function was dutifully invalidating the summary for each and every folder missing from the hash table. I caught it in time to save most of my cache, but didn't have "imapx:io" logging enabled to see what caused this. At this point I'm just waiting for it to happen again so I can debug it.
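The two conditions described above for invalidating the local cache can be sketched as follows. This is only an illustration in Python, not the code in camel-imapx-server.c: the function name `should_invalidate` and the use of `None` for a lost local value are assumptions made for the example.

```python
from typing import Optional


def should_invalidate(cached_uidvalidity: Optional[int],
                      server_uidvalidity: int) -> bool:
    """Decide whether a folder's local cache must be rebuilt.

    Hypothetical helper: None stands for a local summary whose
    uidvalidity has been lost.
    """
    if cached_uidvalidity is None:
        # Local summary lost its uidvalidity: cached UIDs cannot be trusted.
        return True
    # Server reports a different uidvalidity: cached UIDs are stale.
    return cached_uidvalidity != server_uidvalidity


unchanged = should_invalidate(1377000000, 1377000000)
changed = should_invalidate(1377000000, 1377000001)
lost = should_invalidate(None, 1377000000)
```

Only the second and third calls ask for a rebuild, so a breakpoint on the invalidation path should fire only when one of these two conditions is believed to hold.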
Think I might have a handle on this: https://git.gnome.org/browse/evolution-data-server/commit/?id=dacf16165d22e08c1f758144ec5e9f4d8c6da0d0 Help with testing would be appreciated.
In case anyone finds this useful, I built a test package with Matthew's patches (more patches were required for a backport to 3.8.5) for Fedora 19 (evolution-data-server 3.8.5). It is available at [1] for the next couple of days. [1] http://koji.fedoraproject.org/koji/taskinfo?taskID=6030540
(In reply to comment #8) > In case anyone finds this useful, I built a test package with Matthew's patches > (more patches were required for a backport to 3.8.5) for Fedora 19 > (evolution-data-server 3.8.5). It is available at [1] for the next couple of days. Package is awesome. Thanks sooo much! No reindexing of IMAP folders after "Cannot store folder. You are not authenticated" errors. Evo sometimes hangs when closing, but I don't care, as this does not steal lots of my time. I owe you several drinks now.
Thanks for the feedback. I'll close this now and create an official update for Fedora.
Created attachment 256793: eds patch. For reference, and maybe for other distros with 3.8.5, this is the patch I used for evolution-data-server.
Just had this happen to me on F-19. It was happening intermittently before as well.
Bojan: See comment 8.
(In reply to comment #13) > Bojan: See comment 8. I'm seeing this with: https://admin.fedoraproject.org/updates/FEDORA-2013-18644/evolution-data-server-3.8.5-5.fc19 which should be newer than the build from comment #8.
I'm reopening this in light of Bojan's comment. Matthew, could you work out with Bojan whether any debugging is possible on his side, please?
Sure, let me know if you still encounter this with Evolution 3.10.
Yeah, it also happened once again to me now with the 3.8.5 package from comment 8. Still so much better and more stable than before.
Happens on every suspend/resume. evolution-3.8.5-2.fc19.x86_64
Should be fixed in 3.10, closing until someone reports otherwise.
Milan, Any chance we can get this patched in F-19?
*** Bug 710642 has been marked as a duplicate of this bug. ***
André, thanks for finding the duplicate. Sorry for not finding it myself. Milan, I am still seeing this on evolution-3.8.5-2.fc19 + evolution-data-server-3.8.5-5.fc19. Should I open a new bug at rhbz or do you want to track the backport here?
(In reply to comment #19) > Should be fixed in 3.10, closing until someone reports otherwise. Matthew, I'm afraid you cannot point to an exact change which should address the regression, right? I mean, is there a set of changes which might be backportable to the 3.8 version?
I see there are more users having trouble with evolution-data-server-3.8.5-5 in Fedora 19; maybe André just had good luck when testing the changes. I'll therefore revert the change in the 3.8.5-6 update rather than try to chase all the related changes up to 3.10.0.
(In reply to comment #23) > Matthew, I'm afraid you cannot point to an exact change which should address > the regression, right? I mean, is there a set of changes which might be > backportable to the 3.8 version? It wasn't one exact change; it was a systemic problem with the way the IMAP backend was handling errors, and it required a good amount of refactoring. That's why I didn't backport it to 3.8. From the commit history, it looks like the bulk of the relevant changes occurred between August 12th and 15th, totaling about 25-30 commits.
Also, the cache was getting flushed because a failed IMAP LIST command was returning an empty folder result set but no error, due to the aforementioned error handling issues. The IMAP backend, not seeing an error, then proceeded to synchronize the local folder caches with the latest LIST results. An empty folder result set meant the entire local cache got blown away. A dead TCP socket as a result of a suspend/resume cycle or loss of a VPN connection (if the server was behind a firewall) was enough to trigger a failed LIST command. In 3.10, the error from a failed LIST command is now properly propagated back to the cache synchronization logic, as are errors in many other places where failed IMAP commands were getting lost.
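The failure mode described above can be illustrated with a small Python sketch. It is a simplified model under stated assumptions, not the Evolution code: every name here (`ListFailed`, `list_folders_lossy`, `list_folders_strict`, `sync_cache`, `dead_socket`) is hypothetical, and a plain dict stands in for the local folder cache.

```python
class ListFailed(Exception):
    """Hypothetical error for a LIST command that could not complete."""


def dead_socket():
    # Stand-in for a LIST issued over a dead TCP connection.
    raise ConnectionError("connection reset")


def list_folders_lossy(run_list):
    # Problem pattern: the failure is swallowed and looks like "no folders".
    try:
        return run_list()
    except ConnectionError:
        return set()


def list_folders_strict(run_list):
    # Corrected pattern: the failure is propagated to the caller.
    try:
        return run_list()
    except ConnectionError as exc:
        raise ListFailed("LIST did not complete") from exc


def sync_cache(cache, server_folders):
    # Drop every cached folder that the server did not list.
    for name in list(cache):
        if name not in server_folders:
            del cache[name]


lossy_cache = {"INBOX": ["msg1"], "INBOX/bugmail": ["msg2"]}
sync_cache(lossy_cache, list_folders_lossy(dead_socket))  # wipes everything

strict_cache = {"INBOX": ["msg1"], "INBOX/bugmail": ["msg2"]}
try:
    sync_cache(strict_cache, list_folders_strict(dead_socket))
except ListFailed:
    pass  # synchronization is skipped, so the cache stays intact
```

In the lossy variant the empty set is indistinguishable from a server that really has no folders, so the synchronization step deletes every cached folder. In the strict variant the exception is raised before `sync_cache` runs, and the cache is left untouched.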
*** Bug 711549 has been marked as a duplicate of this bug. ***
*** Bug 711698 has been marked as a duplicate of this bug. ***
Created attachment 262979: proposed 3.8.x eds patch, for evolution-data-server (3.8.x). Yet another try to address this for 3.8.x of eds. Is there anyone willing to test it in Fedora 19? I'll be happy to build a test package, if so.
I'd be thrilled to help test. I'm on F19.
Thanks Benjamin, here is the test build for Fedora 19: http://koji.fedoraproject.org/koji/taskinfo?taskID=6234189
Milan, I upgraded to evolution-data-server-3.8.5-6.1.fc19.x86_64 on Thursday and have not seen any incident since. I'll let you know if it recurs, but for now it looks at least no worse than before. Thank you.
Milan, the new eds packages appear to have completely solved the problem. I've run a number of tests, including frequent network drops, suspend/resume, and suspend/resume/VPN reconnect. No problems so far.
(In reply to comment #31) > Thanks Benjamin, here is the test build for Fedora 19: > http://koji.fedoraproject.org/koji/taskinfo?taskID=6234189 Works for me, too! Survived my daily VPN disconnect. Thanks.
This still happens to me from time to time in 3.10.3 on Fedora 20, with Gmail IMAP on flaky Wi-Fi. Shall I open a new bug? How do I debug it?
Try with 3.10.4; I'm thinking of bug #702709.
(In reply to comment #36) > Try with 3.10.4; I'm thinking of bug #702709. Is there going to be a build of that for F-20?
(In reply to comment #37) > Is there going to be a build of that for F-20? Sure, as soon as it's released, which might be during the next week.