GNOME Bugzilla – Bug 674184
Trying to create new email causes Evolution to crash on Debian Unstable
Last modified: 2012-04-18 16:27:24 UTC
Whenever I try to create a new email or reply to an email on one of my Debian Unstable machines, Evolution crashes and all open terminals receive:

Message from syslogd@anglides at Apr 16 09:49:05 ...
 kernel:[ 436.682966] Oops: 0000 [#1] SMP

Message from syslogd@anglides at Apr 16 09:49:05 ...
 kernel:[ 436.683113] Stack:

Message from syslogd@anglides at Apr 16 09:49:05 ...
 kernel:[ 436.683133] Call Trace:

Message from syslogd@anglides at Apr 16 09:49:05 ...
 kernel:[ 436.683185] Code: df e8 a7 4c ff ff 48 89 ef 89 44 24 08 e8 d8 fc ff ff 8b 44 24 08 48 83 c4 38 5b 5d 41 5c 41 5d 41 5e 41 5f c3 f0 80 4f 48 04 c3 <48> 8b 7f b8 31 c0 48 85 ff 74 16 8b 57 30 83 e6 03 21 f2 39 f2

Message from syslogd@anglides at Apr 16 09:49:05 ...
 kernel:[ 436.683229] CR2: ffffffffffffffb8

On the other machines things are fine. The difference between the machines is that the one that crashes has an NVIDIA card. I am using all the standard Debian NVIDIA subsystems. The machine is fully up to date with Debian Unstable as at 2012-04-16T10:00+01:00.

If people can guide me as to what data would help investigate this issue, I can provide it.
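A note on reading the oops: CR2 holds the faulting address, and interpreting it as a signed 64-bit number can hint at the kind of fault. A minimal Python sketch of that arithmetic (the helper name `as_signed64` is made up for illustration):

```python
def as_signed64(value: int) -> int:
    """Interpret an unsigned 64-bit value as a two's-complement signed integer."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value & (1 << 63) else value


# CR2 value copied from the kernel message in this report.
cr2 = int("ffffffffffffffb8", 16)
offset = as_signed64(cr2)
print(f"CR2 as signed offset: {offset} ({hex(offset)})")
```

The address works out to -0x48, just below zero. That is consistent with the kernel reading a structure field at a negative offset from a NULL pointer, and the marked bytes `<48> 8b 7f b8` in the Code: line decode to `mov -0x48(%rdi),%rdi`, which fits. This is only a reading of the posted fragment; the full oops from dmesg would be needed to name the function involved.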
Thanks for taking the time to report this bug. Without a stack trace from the crash it's very hard to determine what caused it. Can you get us a stack trace? Please see http://live.gnome.org/GettingTraces for more information on how to do so. Thanks in advance!
I installed the evolution-dbg package and then ran Evolution from within gdb. After waiting for all the folder caches to finish syncing, so that there was no activity, I clicked on the "new message" button and got the kernel messages, but Evolution deadlocked rather than crashed. So I pressed ^C to get the prompt and then asked for the backtrace:

Message from syslogd@anglides at Apr 16 12:51:44 ...
 kernel:[10017.745037] Oops: 0000 [#2] SMP

Message from syslogd@anglides at Apr 16 12:51:44 ...
 kernel:[10017.745170] Stack:

Message from syslogd@anglides at Apr 16 12:51:44 ...
 kernel:[10017.745186] Call Trace:

Message from syslogd@anglides at Apr 16 12:51:44 ...
 kernel:[10017.745237] Code: df e8 a7 4c ff ff 48 89 ef 89 44 24 08 e8 d8 fc ff ff 8b 44 24 08 48 83 c4 38 5b 5d 41 5c 41 5d 41 5e 41 5f c3 f0 80 4f 48 04 c3 <48> 8b 7f b8 31 c0 48 85 ff 74 16 8b 57 30 83 e6 03 21 f2 39 f2

Message from syslogd@anglides at Apr 16 12:51:44 ...
 kernel:[10017.745281] CR2: ffffffffffffffb8

^C
Program received signal SIGINT, Interrupt.
+ Trace 230067
Thread 140736980678400 (LWP 4505)
That doesn't seem entirely helpful, since it is all libc threading code and has little to do with Evolution per se.
What exactly does "deadlocked" mean for you? And after hitting Ctrl+C once, can you enter "thread apply all bt" and post the complete gdb output here, please?
Message from syslogd@anglides at Apr 16 19:29:01 ...
 kernel:[23455.103786] Oops: 0000 [#2] SMP

Message from syslogd@anglides at Apr 16 19:29:01 ...
 kernel:[23455.103927] Stack:

Message from syslogd@anglides at Apr 16 19:29:01 ...
 kernel:[23455.103943] Call Trace:

Message from syslogd@anglides at Apr 16 19:29:01 ...
 kernel:[23455.103994] Code: df e8 a7 4c ff ff 48 89 ef 89 44 24 08 e8 d8 fc ff ff 8b 44 24 08 48 83 c4 38 5b 5d 41 5c 41 5d 41 5e 41 5f c3 f0 80 4f 48 04 c3 <48> 8b 7f b8 31 c0 48 85 ff 74 16 8b 57 30 83 e6 03 21 f2 39 f2

Message from syslogd@anglides at Apr 16 19:29:01 ...
 kernel:[23455.104039] CR2: ffffffffffffffb8

^C
Program received signal SIGINT, Interrupt.
[Switching to Thread 0x7fffe1bd7700 (LWP 5690)]
0x00007fffee6b7cc3 in poll () from /lib/x86_64-linux-gnu/libc.so.6
gdb>thread apply all bt
Cannot find new threads: generic error
gdb>
This looks like a problem that is WAY deeper in the stack than in Evolution. You mention that the only difference is the NVIDIA card. Before I close this as NOTGNOME and ask you to file a report in Debian's bugtracker: Do you use kernel 3.3.1/.2 or 3.4.0? If yes: Does this also happen with 3.3.0? Do you use NFS as your home directory? If yes: Can you provide dmesg output? I have https://bugzilla.redhat.com/show_bug.cgi?id=811138 in mind, but probably I'm totally wrong. :)
Or might be something like bug 670478, which turned out to be an NVIDIA driver bug. (libnvidia-tls.so.295.20 in the backtrace is the telltale sign.) Impossible to tell without a usable backtrace though.
André,

I am using the kernel 3.2.0-2-amd64 series as per Debian Unstable, currently 3.2.15-1. I have no idea whether it fails with other kernels. I am definitely using NFS for my home directory: it's the only sane way of having multiple machines guaranteed to be using the same filestore, though, sadly, it is increasingly the case that GNOME assumes every machine is working with a separate filestore for ~, just like in the Windows 3.1 days :-((

The dmesg result will be attached shortly. It looks suspiciously like NFS4 is involved, which is strange as I thought I was using NFS3.

Matthew,

The NVIDIA driver has been a source of great problems recently, causing irregular and terminal X crashes. Each time there is a new kernel or a new NVIDIA driver I give things a whirl but get the X terminations. I was trying a new driver, which seemed to be holding up, when this Evolution issue hit, so I have been using other machines, as not being able to send email is not an option.
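One way to check which NFS version a home directory is actually mounted with is to look at the file system type and the `vers=` option in /proc/mounts. A minimal Python sketch, assuming the usual /proc/mounts layout (device, mount point, type, options); the function name `nfs_mounts` and the sample line are illustrative and not taken from this machine:

```python
def nfs_mounts(mounts_text: str):
    """Return (mount_point, fstype, version) for each NFS entry in /proc/mounts-style text."""
    results = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, file system type, options, dump, pass.
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        options = fields[3].split(",")
        # The protocol version usually appears as vers=N (or nfsvers=N); None if absent.
        version = next(
            (o.split("=", 1)[1] for o in options if o.startswith(("vers=", "nfsvers="))),
            None,
        )
        results.append((fields[1], fields[2], version))
    return results


# Hypothetical sample line; the server name and export path are invented.
sample = "fileserver:/export/home /home nfs4 rw,relatime,vers=4.0,rsize=1048576 0 0\n"

for mount_point, fstype, version in nfs_mounts(sample):
    print(mount_point, fstype, version)
```

On a real machine the same function could be fed `open("/proc/mounts").read()` instead of the sample text.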
Created attachment 212298
DMesg of machine for which fails happen

This is the dmesg as is, without special preparation. If a more directed dmesg is needed, please let me know.
I agree with Andre in comment #5, this is almost certainly an NFS issue. Gonna go ahead and close this as NOTGNOME until a gdb stack trace of the evolution process is provided.
I have filed a bug report in the Debian bug tracker, #669270: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=669270