GNOME Bugzilla – Bug 322727
Mail retrieval does not complete properly; "Summary and folder mismatch" keeps popping up
Last modified: 2013-09-10 13:42:14 UTC
Version details: 1.4.1.1. After some "unfortunate" run of Evolution, an error message started popping up after each retrieval from the local mailbox: "Summary and folder mismatch, even after a sync". Duplicates of the received messages keep appearing in the inbox and in filter destinations after each mail retrieval.
Strangely, just as I launched Evolution with CAMEL_DEBUG=all, the problem seems to have been resolved by itself.
Created attachment 55384: Output of 'CAMEL_DEBUG=all E2K_DEBUG=1 evolution', featuring the sync error. Now it occurs again. The received and deleted messages triplicate after a restart!
Also reproduced with 1.4.2. I wonder whether it could have something to do with libdb 4.3.29 from the system, which is hacked into e-d-s in place of the internal libdb in my setup.
After I switched to the internal libdb, the error disappeared.
Oddly, after an upgrade to 1.4.2.1 (with internal libdb), the problem reappeared. However, it might be an after-effect of the previous corruption.
Reproduced after a month on a folder structure cleanly imported from backups, with Evolution 2.4.2.1 and e-d-s 1.4.2.1. I'm fed up and am considering switching to Thunderbird.
As per discussion on IRC: the account type is "local mailbox delivery", pulling mail from /var/mail/mhz. Mail gets retrieved over and over. It looks like the spool can't be deleted properly. Maybe related to bug #319246. Please add more details: anything about your setup that might be related. NEEDINFO. Please reopen this bug when you have provided the requested info. Thanks.
FWIW, regarding possible relation to bug 319246: Self built RPM, installed as root. The Evo executable is owned by root.
The important details I forgot to mention: fcntl locking is used, dot locking is turned off, evolution contests the mailbox with procmail.
Strace seems to show that reading of /var/mail/mhz is not guarded with an fcntl lock. More investigation to follow.
This is it: I've enabled debug printouts in camel/camel-lock.c and none shows up in Evolution output. This is a serious oversight that should be corrected in the stable branch.
(In reply to comment #11)
> This is it: I've enabled debug printouts in camel/camel-lock.c and none shows
> up in Evolution output.
Sorry, that was a buffering issue; the fcntl log lines do turn up. More investigation is needed.
The problem is in the function below:

    int
    camel_lock_fcntl(int fd, CamelLockType type, CamelException *ex)
    {
    #ifdef USE_FCNTL
        struct flock lock;

        d(printf("fcntl locking %d\n", fd));

        memset(&lock, 0, sizeof(lock));
        lock.l_type = type==CAMEL_LOCK_READ?F_RDLCK:F_WRLCK;
        if (fcntl(fd, F_SETLK, &lock) == -1) {
            /* If we get a 'locking not available' type error,
               we assume the filesystem doesn't support fcntl() locking */
            /* this is somewhat system-dependent */
            if (errno != EINVAL && errno != ENOLCK) {
                camel_exception_setv (ex, CAMEL_EXCEPTION_SYSTEM,
                                      _("Failed to get lock using fcntl(2): %s"),
                                      g_strerror (errno));
                return -1;
            } else {
                static int failed = 0;

                if (failed == 0)
                    fprintf(stderr, "fcntl(2) locking appears not to work on this filesystem");
                failed++;
            }
        }
    #endif
        return 0;
    }

If the fcntl command were F_SETLKW, this code would work (maybe not as the developers intended, but in any case it wouldn't fall through unlocked). With F_SETLK, fcntl does not wait: it may return an error and set errno to EAGAIN, meaning that the file is currently locked. The code above returns zero in this case, as if the lock had succeeded. I'm unsure which way should be taken to fix this: changing the lock-acquire primitive to wait indefinitely would solve this outright, but the calling code retries on a failed lock, hinting that lock acquisition should fail gracefully if the mailbox is busy for too long.
Setting Target Milestone. See bug 319246 for a related issue.
I've made a mistake again. My previous analysis of camel_lock_fcntl is invalid: EAGAIN is neither EINVAL nor ENOLCK, so the function sets an exception and returns -1 instead of falling through. An exception in case of EAGAIN is unwarranted though.
I've found a way to remedy the situation to some extent. After the bug appears (so far it has always happened after application startup), I close Evolution and remove the spool file in ~/.evolution/mail/spool. After that the Inbox can be used again, though I may have to delete the newly received messages once more despite having marked them for deletion before (is the folder summary rebuilt?). Next time I'll try to save the corrupted spool for examination.
Created attachment 67294: Corrupted spool file. Here is the bzipped file from ~/.evolution/mail/spool/, taken at the time of the corruption.
Changing target milestone from obsolete 1.4.3 to 1.6.x.
Akhil/Prasad: Can you use this mbox to verify our latest, greatest fix for "folder-summary mismatch"?
I assume this one is fixed by the latest changes regarding the folder summary mismatch, so I'm closing it as OBSOLETE. If it's not, please reopen.