Bug 704513 – Evolution sometimes flushes entire cache of IMAP+ subfolder, resulting in zero messages



Summary:	Evolution sometimes flushes entire cache of IMAP+ subfolder, resulting in zer...


Status:	RESOLVED FIXED

Product:	evolution
Classification:	Applications
Component:	Mailer
Version:	3.8.x (obsolete)
Hardware:	Other Linux

Importance:	Normal major
Target Milestone:	---
Assigned To:	evolution-mail-maintainers
QA Contact:	Evolution QA team

URL:
Whiteboard:

Duplicates:	710642 711549 711698
Depends on:
Blocks:	693101

Reported:	2013-07-19 02:12 UTC by André Klapper
Modified:	2014-02-05 15:54 UTC

See Also:
GNOME target:	---
GNOME version:	3.7/3.8

Attachments
Log with CAMEL_DEBUG=imapx:io (47.28 KB, text/plain) 2013-08-07 07:17 UTC, André Klapper
eds patch (5.72 KB, patch) 2013-10-09 10:38 UTC, Milan Crha	rejected
proposed 3.8.x eds patch (4.70 KB, patch) 2013-11-27 20:50 UTC, Milan Crha	none

Description André Klapper 2013-07-19 02:12:35 UTC

In Evolution 3.8.3, when I do some quick reading through a subfolder with 85000 messages of my GMail account (IMAP+) and mark messages as read, while Evolution in parallel tries to do some filtering, Evolution sometimes flushes the complete folder by saying that it has zero messages. 

After switching to another folder and back, Evolution starts reindexing, so there is at least no data loss (yet).
Note: I do use physical folders now for Trash and Junk but the problem still happens.

Advice is welcome on which of the env vars listed at https://wiki.gnome.org/Evolution/Debugging would produce useful debugging information.

Comment 1 André Klapper 2013-08-07 07:17:25 UTC

Created attachment 251018
Log with CAMEL_DEBUG=imapx:io

Evolution, in GMail inbox subfolder, threaded view.
Managed again to make the folder display "There are no messages in this folder."

So I ran 
    CAMEL_DEBUG=imapx:io evolution >& evo-imapx-bz-rewrite.crap
and made Evolution reindex the folder with 92000 messages.

I am using Ctrl+K and . shortcuts to quickly go through my unread bugmail.
I have automatic local filters to tag messages with labels, based on bugmail headers.
Sometimes I press Ctrl+Y on a single message in order to filter it again.

After a few minutes I got an orange error bar saying something about "not authenticated".

And the folder again displayed "There are no messages in this folder."

Comment 2 André Klapper 2013-08-07 10:06:15 UTC

Potential dup of bug 693101?

Comment 3 André Klapper 2013-08-07 10:21:01 UTC

No quick resync enabled. Still need to find out whether this does NOT happen after NOT hibernating/suspending/whateveritscalled.

Comment 4 Matthew Barnes 2013-08-07 11:21:13 UTC

From the log in comment 1 it looks as though the connection dropped during the connection/configuration phase.  Log doesn't show why the connection dropped, but there are some improvements for that scenario in 3.9.x.

As far as I know, the quick resync feature is only problematic when talking to Zimbra servers.  Fortunately that's now been fixed upstream in Zimbra.

http://bugzilla.zimbra.com/show_bug.cgi?id=82493

Comment 5 Milan Crha 2013-08-07 11:36:22 UTC

(In reply to comment #4)
> From the log in comment 1 it looks as though the connection dropped during the
> connection/configuration phase.  Log doesn't show why the connection dropped,
> but there are some improvements for that scenario in 3.9.x.

Please see bug #693101 comment #59, there is a regression in 3.9.5, which might be pushed to 3.8.3/.4 as well.

Comment 6 Matthew Barnes 2013-08-11 06:51:42 UTC

A good breakpoint for this is invalidate_local_cache() in camel-imapx-server.c, which is called when IMAPX thinks the server's uidvalidity has changed, or when the local summary's uidvalidity is lost.
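
As a rough illustration of the condition described above (not the actual Camel code): in IMAP, cached UIDs are only meaningful while the mailbox's UIDVALIDITY value stays the same. The sketch below uses hypothetical names (`FolderCache`, `reconcile_uidvalidity`) and treats a missing local value the same as a changed one, mirroring the two triggers mentioned.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class FolderCache:
    """Hypothetical stand-in for a folder's local summary; not a Camel type."""
    uidvalidity: Optional[int] = None
    messages: Dict[int, str] = field(default_factory=dict)  # UID -> subject


def reconcile_uidvalidity(cache: FolderCache, server_uidvalidity: int) -> bool:
    """Drop cached messages if the cached UIDs can no longer be trusted.

    Returns True when the cache was invalidated.
    """
    lost = cache.uidvalidity is None
    changed = not lost and cache.uidvalidity != server_uidvalidity
    if lost or changed:
        # Old UIDs may now refer to different messages, so start over.
        cache.messages.clear()
        cache.uidvalidity = server_uidvalidity
        return True
    return False
```

With a large folder, an invalidation like this means a full resync, which would be consistent with the reindexing seen in the description.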

I also witnessed tonight while my VPN connection was flakey, the fetch_folders_for_namespaces() function in camel-store.c returned an empty "folders_from_server" hash table and the sync_folders() function was dutifully invalidating the summary for each and every folder missing from the hash table.  I caught it in time to save most of my cache, but didn't have "imapx:io" logging enabled to see what happened to cause this.
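
To make the failure mode concrete, here is a minimal sketch in Python (not the real C implementation; `sync_folders_naive` and the dictionary-based cache are illustrative assumptions). It shows why an empty "folders from server" set is destructive when the sync step trusts it blindly.

```python
from typing import Dict, List, Set


def sync_folders_naive(local_cache: Dict[str, List[str]],
                       folders_from_server: Set[str]) -> List[str]:
    """Invalidate every cached folder that the server listing does not mention.

    Returns the names of the folders whose cache was dropped.
    """
    dropped = [name for name in local_cache if name not in folders_from_server]
    for name in dropped:
        del local_cache[name]
    return dropped


cache = {"INBOX": ["msg1", "msg2"], "INBOX/bugmail": ["msg3"]}

# A healthy listing leaves the cache alone.
assert sync_folders_naive(cache, {"INBOX", "INBOX/bugmail"}) == []

# An empty listing (for example after a dropped connection) wipes everything,
# because "no folders reported" is indistinguishable from "folders deleted".
sync_folders_naive(cache, set())
print(cache)  # {}
```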

At this point I'm just waiting for it to happen again so I can debug it.

Comment 7 Matthew Barnes 2013-08-12 11:27:47 UTC

Think I might have a handle on this:

https://git.gnome.org/browse/evolution-data-server/commit/?id=dacf16165d22e08c1f758144ec5e9f4d8c6da0d0

Help with testing would be appreciated.

Comment 8 Milan Crha 2013-10-07 10:34:53 UTC

In case anyone finds this useful, I built a test package with Matthew's patches (more patches were required for a backport to 3.8.5) for Fedora 19 (evolution-data-server 3.8.5). It is available at [1] for the next couple of days.

[1] http://koji.fedoraproject.org/koji/taskinfo?taskID=6030540

Comment 9 André Klapper 2013-10-09 08:49:21 UTC

(In reply to comment #8)
> In case anyone finds this useful, I built a test package with Matthew's patches
> (more patches were required for a backport to 3.8.5) for Fedora 19
> (evolution-data-server 3.8.5). It is available at [1] for the next couple of days.

Package is awesome. Thanks sooo much! No reindexing of IMAP folders after "Cannot store folder. You are not authenticated" errors. Evo sometimes hangs when closing but I don't care, as this does not steal lots of my time.
I owe you several drinks now.

Comment 10 Milan Crha 2013-10-09 10:34:32 UTC

Thanks for the feedback. I'll close this now and create an official update for Fedora.

Comment 11 Milan Crha 2013-10-09 10:38:05 UTC

Created attachment 256793
eds patch

For reference, and perhaps for other distros with 3.8.5, this is the patch I used for evolution-data-server.

Comment 12 Bojan Smojver 2013-10-16 08:38:43 UTC

Just had this happen to me on F-19. It was happening intermittently before as well.

Comment 13 André Klapper 2013-10-16 10:19:01 UTC

Bojan: See comment 8.

Comment 14 Bojan Smojver 2013-10-16 10:59:33 UTC

(In reply to comment #13)
> Bojan: See comment 8.

I'm seeing this with:

https://admin.fedoraproject.org/updates/FEDORA-2013-18644/evolution-data-server-3.8.5-5.fc19

Which should be newer than the build from comment #8.

Comment 15 Milan Crha 2013-10-16 16:23:03 UTC

I'm reopening this in light of Bojan's comment.

Matthew, could you work out with Bojan whether there is any chance of debugging this on his side, please?

Comment 16 Matthew Barnes 2013-10-16 16:51:48 UTC

Sure, let me know if you still encounter this with Evolution 3.10.

Comment 17 André Klapper 2013-10-17 14:06:38 UTC

Yeah, it also happened once again to me now with the 3.8.5 package from comment 8. Still so much better and more stable than before.

Comment 18 Bojan Smojver 2013-10-21 20:48:43 UTC

Happens on every suspend/resume. evolution-3.8.5-2.fc19.x86_64

Comment 19 Matthew Barnes 2013-10-21 21:12:01 UTC

Should be fixed in 3.10, closing until someone reports otherwise.

Comment 20 Bojan Smojver 2013-10-22 10:18:11 UTC

Milan,

Any chance we can get this patched in F-19?

Comment 21 André Klapper 2013-10-22 12:51:17 UTC

*** Bug 710642 has been marked as a duplicate of this bug. ***

Comment 22 Robert Buchholz 2013-10-22 13:18:55 UTC

André, thanks for finding the duplicate. Sorry for not finding it myself.

Milan, I am still seeing this on evolution-3.8.5-2.fc19 + evolution-data-server-3.8.5-5.fc19. Should I open a new bug at rhbz or do you want to track the backport here?

Comment 23 Milan Crha 2013-11-05 08:21:56 UTC

(In reply to comment #19)
> Should be fixed in 3.10, closing until someone reports otherwise.

Matthew, I'm afraid you cannot point to an exact change which should address the regression, right? I mean, is there a set of changes which might be backportable to the 3.8 version?

Comment 24 Milan Crha 2013-11-05 11:35:18 UTC

I see there are more users having trouble with evolution-data-server-3.8.5-5 in Fedora 19. Maybe André just had good luck when testing the changes, so I'll revert the change in the 3.8.5-6 update rather than try to chase all the related changes up to 3.10.0.

Comment 25 Matthew Barnes 2013-11-05 12:59:24 UTC

(In reply to comment #23)
> Matthew, I'm afraid you cannot point to an exact change which should address
> the regression, right? I mean, is there a set of changes which might be
> backportable to the 3.8 version?

It wasn't one exact change, it was a systemic problem with the way the IMAP backend was handling errors and required a good amount of refactoring.  That's why I didn't backport it to 3.8.

From the commit history, looks like the bulk of the relevant changes occurred between August 12th and 15th, totaling about 25-30 commits.

Comment 26 Matthew Barnes 2013-11-05 13:12:34 UTC

Also, the reason the cache was getting flushed was because a failed IMAP LIST command was returning an empty folder result set, but also no error because of the aforementioned error handling issues.

The IMAP backend, not seeing an error, then proceeded to synchronize the local folder caches with the latest LIST results.  An empty folder result set meant the entire local cache got blown away.

A dead TCP socket as a result of a suspend/resume cycle or loss of a VPN connection (if the server was behind a firewall) was enough to trigger a failed LIST command.

The error from a failed LIST command is now properly propagated back to the cache synchronization logic in 3.10, as well as many other places where errors from failed IMAP commands were getting lost.
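
The difference can be sketched as follows. This is a hedged Python illustration of the idea only, not the 3.10 code: `ListCommandError`, `list_folders`, `sync_folders` and the `send_list` callable are all hypothetical names. The point is that a failed LIST raises before the cache is touched, instead of being reported as "zero folders".

```python
from typing import Callable, Dict, List, Set


class ListCommandError(Exception):
    """Hypothetical error type for a failed IMAP LIST command."""


def list_folders(send_list: Callable[[], Set[str]]) -> Set[str]:
    """Run LIST and surface failures instead of returning an empty result."""
    try:
        return send_list()
    except OSError as exc:  # e.g. a dead TCP socket after suspend/resume
        raise ListCommandError("LIST failed") from exc


def sync_folders(local_cache: Dict[str, List[str]],
                 send_list: Callable[[], Set[str]]) -> None:
    """Drop cached folders missing on the server, but only if LIST succeeded."""
    folders = list_folders(send_list)  # raises before the cache is touched
    for name in [n for n in local_cache if n not in folders]:
        del local_cache[name]


def dead_socket() -> Set[str]:
    """Simulates a LIST attempt over a connection that has gone away."""
    raise ConnectionResetError("connection reset by peer")


cache = {"INBOX": ["msg1"], "INBOX/bugmail": ["msg2"]}
try:
    sync_folders(cache, dead_socket)
except ListCommandError:
    pass  # the caller can retry later; nothing was invalidated

assert cache == {"INBOX": ["msg1"], "INBOX/bugmail": ["msg2"]}
```

A genuinely empty listing from a working connection would still clear the cache in this sketch; only the failed-command case is treated differently.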

Comment 27 André Klapper 2013-11-07 22:33:25 UTC

*** Bug 711549 has been marked as a duplicate of this bug. ***

Comment 28 Matthew Barnes 2013-11-08 17:12:27 UTC

*** Bug 711698 has been marked as a duplicate of this bug. ***

Comment 29 Milan Crha 2013-11-27 20:50:07 UTC

Created attachment 262979
proposed 3.8.x eds patch

for evolution-data-server(3.8.x);

Yet another try to address this for 3.8.x of eds. Is there anyone willing to test it in Fedora 19? I'll be happy to build a test package, if so.

Comment 30 Benjamin Kahn 2013-11-27 20:55:24 UTC

I'd be thrilled to help test. I'm on F19.

Comment 31 Milan Crha 2013-11-28 07:51:13 UTC

Thanks Benjamin, here is the test build for Fedora 19:
http://koji.fedoraproject.org/koji/taskinfo?taskID=6234189

Comment 32 Robert Buchholz 2013-11-30 14:51:41 UTC

Milan, I upgraded to evolution-data-server-3.8.5-6.1.fc19.x86_64 on Thursday and have not seen any incident since. I'll let you know if one occurs, but for now it looks at least no worse than before. Thank you.

Comment 33 Benjamin Kahn 2013-12-02 22:19:54 UTC

Milan,

The new eds packages appear to have completely solved the problem.

I've run a number of tests, including frequent network drops, suspend/resume, and suspend/resume/VPN reconnect.

No problems so far.

Comment 34 Kai Engert 2013-12-03 21:29:12 UTC

(In reply to comment #31)
> Thanks Benjamin, here is the test build for Fedora 19:
> http://koji.fedoraproject.org/koji/taskinfo?taskID=6234189

Works for me, too! Survived my daily VPN disconnect. Thanks.

Comment 35 André Klapper 2014-02-04 17:05:05 UTC

This still happens to me from time to time in 3.10.3 on Fedora 20 with GMail IMAP on a flaky wifi. Shall I open a new bug? How to debug?

Comment 36 Milan Crha 2014-02-04 19:34:09 UTC

Try with 3.10.4; I am thinking of bug #702709.

Comment 37 Bojan Smojver 2014-02-05 03:38:57 UTC

(In reply to comment #36)
> Try with 3.10.4; I am thinking of bug #702709.

Is there going to be a build of that for F-20?

Comment 38 Milan Crha 2014-02-05 15:54:20 UTC

(In reply to comment #37)
> Is there going to be a build of that for F-20?

Sure, as soon as it's released, which might be during the next week.