GNOME Bugzilla – Bug 639621
Evolution freezes after restore
Last modified: 2013-12-20 22:52:47 UTC
A full backup was made under evolution 2.24.5 and a restore was attempted under 2.32.1. The backup was read successfully by the new version and the data directories were created, but some problem occurs in reading back the information. Attached is the output of running:
CAMEL_DEBUG=all evolution >& evo.log
The evolution window is completely unresponsive and had to be killed. It appears (this is just a guess from examining the end of evo.log) that the sequence of messages being read/inserted into the database has a jump in number just before the freeze.
Created attachment 178407 [details] gdb output
Created attachment 178409 [details] compressed output of CAMEL_DEBUG=all evolution >& evo.log
possible dupe of bug 617038
(In reply to comment #3)
> possible dupe of bug 617038
This appears to be different from 617038, but I am no expert, so perhaps this will help: The restore operation completes normally (in the dialog I specify that evolution should NOT restart automatically after the restore). After the restore I do a telinit 3, then a telinit 5, to make sure that everything starts fresh. I run evolution via gdb and/or with CAMEL_DEBUG=all. The outputs are above. Evolution hangs and there IS an evolution-alarm-notify process present. It looks like some interaction with sqlite is blocked.
Attachments for valgrind and a gdb traceback with debugging symbols follow. These are from running evolution after the restore operation. Valgrind finds an invalid read and a weird calloc call in icaltzutil_fetch_timezone. The gdb traceback shows the explicit operations on the sqlite db at the time things freeze.
Created attachment 178465 [details] valgrind output
Created attachment 178466 [details] gdb with symbols
(In reply to comment #0)
> It appears (this is just a guess from examining the end of evo.log) that the
> sequence of messages being read/inserted into the database has a jump in number
> just before the freeze.
There is no jump in sequence number at the end. My mistake in scanning the output.
Is this a move from 32 bit system to 64 bit system or vice versa? I mean, the backup was made on a 32 bit system and you are about to run evolution after restore on a 64 bit system or vice versa?
(In reply to comment #9)
> Is this a move from 32 bit system to 64 bit system or vice versa? I mean, the
> backup was made on a 32 bit system and you are about to run evolution after
> restore on a 64 bit system or vice versa?
No. This was 64 bit before and after the migration from Fedora 10 to 14. However, the mail files have been rolled over from one machine to another previously. Briefly, I know they were on a 32 bit machine running evolution 2.22.3.1 and were migrated about 2 years ago to a 64 bit machine. I didn't have any particular problem doing so. If it would be helpful, I could try to produce a backup from the files on the original 32 bit machine and feed them into the current 64 bit evolution to see if the problem is present.
(In reply to comment #10)
> If it would be helpful I could try to
> produce a backup from the files on the original 32 bit machine and
> feed them into the current 64 bit evolution to see if the problem is present.
Thanks, but there is no need to waste your time; the issue with a move from a 32bit to a 64bit system is still there with backups from evolution before 2.32.0. From your backtrace with symbols I see it's stuck in sqlite3. Please try to go to ~/.local/share/evolution/mail/local, move the folders.db file away from there, and then run evolution. It may recreate it and possibly fix the issue. Note there are more folders.db files in ~/.local/share/evolution/mail subfolders, which might cause a similar issue (but maybe not). It depends on how many mail accounts you have configured and what type they are, and, if it is still stuck after moving away the mentioned folders.db file, on how the backtrace changes. I see in yours that it's working with the "On This Computer" trash (".#evolution/Trash").
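For anyone who prefers to script the "move folders.db away" step, here is a minimal sketch in Python. The helper name move_aside and the ".bak-<timestamp>" suffix are illustrative assumptions and not part of Evolution. It only renames the file, so the old database stays available for later inspection. It should be run only while Evolution is not running.

```python
import time
from pathlib import Path


def move_aside(db_path):
    """Rename db_path to a timestamped backup in the same directory.

    Hypothetical helper: the name and the backup suffix are assumptions.
    Returns the path of the backup file.
    """
    db_path = Path(db_path)
    if not db_path.is_file():
        raise FileNotFoundError(db_path)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = db_path.with_name(f"{db_path.name}.bak-{stamp}")
    if backup.exists():
        # Never overwrite an earlier backup.
        raise FileExistsError(backup)
    db_path.rename(backup)
    return backup
```

With the path from the comment above, a call would look like move_aside(Path.home() / ".local/share/evolution/mail/local/folders.db"). Whether Evolution recreates the file on the next start is as described in the comment, not something this sketch guarantees.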
+ Trace 225583
Thread 1 (Thread 0x7ffff7fae980 (LWP 2673))
(In reply to comment #11)
> From your backtrace with symbols I see it's stuck in sqlite3. Please try to go
> to ~/.local/share/evolution/mail/local and move away a folders.db file from
> there, and then run evolution. It may recreate it and possibly fix the issue.
I moved the folders.db out and the new one was created; everything seems to work, no freezes, everything intact. Thanks! FYI: the new folders.db is roughly 1/2 the size of the old one. And there was only one folders.db in my setup. Also, the valgrind error remains.
The valgrind output is OK; there's bug #633967 for it. I'm wondering what to do now. You have it fixed, and it works for me as expected (most likely because of sequential updates on my side). I'm not sure whether sharing your broken folders.db file would be of any help: apart from exposing pretty sensitive information, I would also need the same folder structure with messages on my machine, which I really do not consider a good thing to do.
3 possibilities:
1. If the *.db files can be accessed by a tool like mysql, I can characterize the differences between the good and bad db.
2. I rebuilt 2.31.1 from source (same as on my FC 14). I tried to specify no optimization via CFLAGS=-g -O0, but that part didn't seem to do what I hoped; I still see some variables optimized out. I could go in and "debug" if there's something specific to examine. Of course, I know nothing about the insides of evolution, so this is probably pretty inefficient unless you can give me some general instructions.
3. I can just wait.
One last discovery: the address book had disappeared, but that's not too important in my case. In any case, thanks for your help.
(In reply to comment #14)
> 1. If the *.db files can be accessed by a tool like mysql
> I can characterize the differences between the good and bad db.
It's using sqlite3 databases, and the command is named sqlite3 too.
> 2. I rebuilt 2.31.1 from source (same as on my FC 14; I tried to
2.31.1 is not in F14, it's 2.32.1. There is about a half-year difference between these two versions ;)
> specify no-optimization via CFLAGS=-g -O0 but that part didn't
> seem to do what I hoped -- I still see some variables optimized out)
> and I could go in and "debug" if there's something specific to examine.
>
> Of course, I know nothing about the insides of evolution so
> this is probably pretty inefficient unless you can give me some
> general instructions.
There are none I'm aware of; this is too general an issue. We may try to find out what is wrong with the folders.db file, maybe some version update of sqlite3, but doing it either through bugzilla or anyhow "offline" is too time-consuming for all interested.
> 3. I can just wait.
Wait for what? I understood that removing the old folders.db file fixed the issue for you, so you are fine now.
> One last discovery: the address book had disappeared but thats not
> too important in my case.
This was reported and is fixed for 2.32.2.
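Since the files are SQLite databases, one way to start characterizing the difference between a good and a bad folders.db is to compare per-table row counts. The following is only a sketch using Python's standard sqlite3 module; the function names table_row_counts and diff_counts are made up for illustration, and nothing here relies on knowledge of Evolution's actual schema. The files are opened read-only so that the inspection cannot modify them.

```python
import sqlite3
from pathlib import Path


def table_row_counts(db_path):
    """Return {table_name: row_count} for every table in an SQLite file."""
    # Read-only URI so a mistyped path is an error instead of a new empty file.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        names = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'")]
        counts = {}
        for name in names:
            # Table names such as .#evolution/Trash need identifier quoting.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(
                f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
        return counts
    finally:
        conn.close()


def diff_counts(good, bad):
    """Return {table: (good_count, bad_count)} for tables whose counts differ.

    None stands for a table that is missing on that side.
    """
    result = {}
    for name in sorted(set(good) | set(bad)):
        pair = (good.get(name), bad.get(name))
        if pair[0] != pair[1]:
            result[name] = pair
    return result
```

Row counts are a coarse signal; tables that differ would still need a closer look, for example with the sqlite3 command's .dump output.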
(In reply to comment #15)
> 2.31.1 is not in F14, it's 2.32.1. There about half-year difference between
> these two versions ;)
>
> Wait for what? I understood that removing the old folders.db file fixed the
> issue for you so you are fine now.
Yes, I recompiled 2.32.1, and yes, it's working for me. Here are a few more clues: I examined the database integrity with sqlite3 (folders.db) prior to starting the restore and it's OK. After the restore it has an unreferenced page. I traced to the point at which the database gets locked in evolution. It occurs for a particular message, uid = 35296.
I dumped the folders.db and found TWO statements that appear to refer to the message:
INSERT INTO ".#evolution/Trash" VALUES('feFI23dL35296');
INSERT INTO ".#evolution/Trash" VALUES('KhCGGnx35296');
There is only one occurrence of this sort of statement for each message handled correctly before 35296. So the hint is that the folders.db file contains evidence that the same message was deleted more than once. The schematic form of record 35296 looks identical to the one successfully moved to trash in the previous operation. I will attach examples.
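The two checks described in this comment (database integrity and repeated entries in the Trash table) could be scripted roughly as follows. This is a sketch under stated assumptions: the function names are hypothetical, the value is taken from the first column of the table because the schema is not shown in this report, and the message uid is assumed to be the trailing run of digits in each value. That last assumption is a guess; a prefix that itself ends in a digit would be grouped incorrectly, so matches are hints only.

```python
import re
import sqlite3
from collections import defaultdict


def integrity_problems(conn):
    """Run PRAGMA integrity_check and return the reported problems.

    SQLite returns a single row 'ok' when nothing is wrong, in which
    case this returns an empty list.
    """
    rows = [row[0] for row in conn.execute("PRAGMA integrity_check")]
    return [] if rows == ["ok"] else rows


def duplicate_uid_entries(conn, table):
    """Group first-column values of `table` by their trailing digits.

    Assumption (not confirmed by the report): the trailing digits are
    the message uid. Returns {uid: [values]} for uids seen more than once.
    """
    quoted = '"' + table.replace('"', '""') + '"'
    groups = defaultdict(list)
    for row in conn.execute(f"SELECT * FROM {quoted}"):
        value = str(row[0])
        match = re.search(r"(\d+)$", value)
        if match:
            groups[match.group(1)].append(value)
    return {uid: vals for uid, vals in groups.items() if len(vals) > 1}
```

Given a connection to a copy of the database, duplicate_uid_entries(conn, ".#evolution/Trash") would list candidate values like the two quoted above. Working on a copy is advisable, since the original file was seen to block Evolution.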
Created attachment 179194 [details] 2 records from folders.db
I don't expect this report to move anywhere further and I see you managed to get your environment back in a working state, so I'm closing this as obsolete. Feel free to reopen if you think there's actually a need to do anything else regarding the issue.