Bug 445309 - crash in camel_certdb_save at camel-certdb.c:371
Status: RESOLVED FIXED
Product: evolution-data-server
Classification: Platform
Component: Mailer
Version: 1.12.x (obsolete)
Hardware: Other All
Importance: Immediate blocker
Target Milestone: ---
Assigned To: evolution-mail-maintainers
QA Contact: Evolution QA team
Depends on:
Blocks:
 
 
Reported: 2007-06-07 22:18 UTC by earwickerspam
Modified: 2008-04-13 16:41 UTC
See Also:
GNOME target: 2.22.x
GNOME version: 2.19/2.20


Attachments
Proposed patch (768 bytes, patch)
2008-01-21 03:37 UTC, Matthew Barnes
accepted-commit_now

Description earwickerspam 2007-06-07 22:18:19 UTC
What were you doing when the application crashed?
Closing Evolution using 'Quit'


Distribution: Fedora release 7 (Moonshine)
Gnome Release: 2.18.0 2007-03-23 (Red Hat, Inc)
BugBuddy Version: 2.18.0

System: Linux 2.6.21-1.3194.fc7 #1 SMP Wed May 23 22:47:07 EDT 2007 x86_64
X Vendor: The X.Org Foundation
X Vendor Release: 10300000
Selinux: Enforcing
Accessibility: Disabled
GTK+ Theme: Clearlooks
Icon Theme: Fedora

Memory status: size: 653873152 vsize: 653873152 resident: 77918208 share: 22032384 rss: 77918208 rss_rlim: 18446744073709551615
CPU usage: start_time: 1181254403 rtime: 1426 utime: 1325 stime: 101 cutime:1 cstime: 1 timeout: 0 it_real_value: 0 frequency: 100

Backtrace was generated from '/usr/bin/evolution'

Using host libthread_db library "/lib64/libthread_db.so.1".
[Thread debugging using libthread_db enabled]
[New Thread 46912496389728 (LWP 28963)]
[New Thread 1094719824 (LWP 29143)]
[New Thread 1157925200 (LWP 28996)]
0x0000003f24a0d89f in waitpid () from /lib64/libpthread.so.0

Thread 2 (Thread 1094719824 (LWP 29143))

  • #0 __lll_mutex_lock_wait
    from /lib64/libpthread.so.0
  • #1 _L_mutex_lock_103
    from /lib64/libpthread.so.0
  • #2 pthread_mutex_lock
    from /lib64/libpthread.so.0
  • #3 <signal handler called>
  • #4 pthread_mutex_lock
    from /lib64/libpthread.so.0
  • #5 PR_Lock
    from /usr/lib64/libnspr4.so
  • #6 ??
    from /usr/lib64/libnspr4.so
  • #7 __nptl_deallocate_tsd
    from /lib64/libpthread.so.0
  • #8 start_thread
    from /lib64/libpthread.so.0
  • #9 clone
    from /lib64/libc.so.6


----------- .xsession-errors ---------------------
** Message: volume = 0
Xlib:  extension "SHAPE" missing on display ":0.0".
Xlib:  extension "SHAPE" missing on display ":0.0".
Xlib:  extension "SHAPE" missing on display ":0.0".
** Message: drive = 0
** Message: volume = 0
** Message: drive = 0
** Message: volume = 0
** Message: drive = 0
** Message: volume = 0
** Message: drive = 0
** Message: volume = 0
Xlib:  extension "SHAPE" missing on display ":0.0".
Xlib:  extension "SHAPE" missing on display ":0.0".
Xlib:  extension "SHAPE" missing on display ":0.0".
--------------------------------------------------
Comment 1 André Klapper 2007-06-12 00:34:41 UTC
*** Bug 445386 has been marked as a duplicate of this bug. ***
Comment 2 André Klapper 2007-06-13 11:54:57 UTC
*** Bug 445472 has been marked as a duplicate of this bug. ***
Comment 3 André Klapper 2007-06-13 11:55:02 UTC
*** Bug 446335 has been marked as a duplicate of this bug. ***
Comment 4 André Klapper 2007-06-13 11:55:11 UTC
*** Bug 444560 has been marked as a duplicate of this bug. ***
Comment 5 André Klapper 2007-06-13 11:55:26 UTC
*** Bug 446808 has been marked as a duplicate of this bug. ***
Comment 6 André Klapper 2007-06-15 16:03:59 UTC
*** Bug 447379 has been marked as a duplicate of this bug. ***
Comment 7 André Klapper 2007-06-18 11:51:50 UTC
*** Bug 448059 has been marked as a duplicate of this bug. ***
Comment 8 André Klapper 2007-06-18 11:52:02 UTC
*** Bug 448594 has been marked as a duplicate of this bug. ***
Comment 9 André Klapper 2007-06-18 11:52:08 UTC
*** Bug 448678 has been marked as a duplicate of this bug. ***
Comment 10 André Klapper 2007-06-18 11:52:15 UTC
*** Bug 448681 has been marked as a duplicate of this bug. ***
Comment 11 André Klapper 2007-06-18 11:52:20 UTC
*** Bug 448746 has been marked as a duplicate of this bug. ***
Comment 12 André Klapper 2007-06-18 11:53:42 UTC
only fedora reports so far...
Comment 13 André Klapper 2007-06-18 20:18:21 UTC
*** Bug 448781 has been marked as a duplicate of this bug. ***
Comment 14 André Klapper 2007-06-18 20:18:25 UTC
*** Bug 448811 has been marked as a duplicate of this bug. ***
Comment 15 André Klapper 2007-06-18 20:18:30 UTC
*** Bug 448871 has been marked as a duplicate of this bug. ***
Comment 16 André Klapper 2007-06-19 23:00:24 UTC
*** Bug 449165 has been marked as a duplicate of this bug. ***
Comment 17 André Klapper 2007-06-20 08:36:00 UTC
*** Bug 449352 has been marked as a duplicate of this bug. ***
Comment 18 André Klapper 2007-06-20 10:29:21 UTC
*** Bug 449424 has been marked as a duplicate of this bug. ***
Comment 19 André Klapper 2007-06-20 15:43:13 UTC
*** Bug 449448 has been marked as a duplicate of this bug. ***
Comment 20 André Klapper 2007-06-20 15:43:17 UTC
*** Bug 449452 has been marked as a duplicate of this bug. ***
Comment 21 Karsten Bräckelmann 2007-06-20 20:20:54 UTC
*** Bug 449557 has been marked as a duplicate of this bug. ***
Comment 22 Karsten Bräckelmann 2007-06-20 20:23:51 UTC
*** Bug 442067 has been marked as a duplicate of this bug. ***
Comment 23 André Klapper 2007-06-21 22:25:08 UTC
*** Bug 449846 has been marked as a duplicate of this bug. ***
Comment 24 André Klapper 2007-06-22 09:02:57 UTC
*** Bug 449896 has been marked as a duplicate of this bug. ***
Comment 25 André Klapper 2007-06-22 09:03:02 UTC
*** Bug 449990 has been marked as a duplicate of this bug. ***
Comment 26 palfrey 2007-06-25 00:16:40 UTC
*** Bug 450735 has been marked as a duplicate of this bug. ***
Comment 27 palfrey 2007-06-27 13:30:24 UTC
*** Bug 451318 has been marked as a duplicate of this bug. ***
Comment 28 palfrey 2007-06-27 13:34:19 UTC
*** Bug 451582 has been marked as a duplicate of this bug. ***
Comment 29 palfrey 2007-06-28 13:56:39 UTC
*** Bug 451736 has been marked as a duplicate of this bug. ***
Comment 30 palfrey 2007-06-28 13:56:47 UTC
*** Bug 451757 has been marked as a duplicate of this bug. ***
Comment 31 palfrey 2007-06-30 15:23:09 UTC
*** Bug 452433 has been marked as a duplicate of this bug. ***
Comment 32 Karsten Bräckelmann 2007-07-02 19:16:24 UTC
*** Bug 453140 has been marked as a duplicate of this bug. ***
Comment 33 Karsten Bräckelmann 2007-07-02 19:16:35 UTC
*** Bug 452635 has been marked as a duplicate of this bug. ***
Comment 34 Karsten Bräckelmann 2007-07-02 19:17:26 UTC
*** Bug 445550 has been marked as a duplicate of this bug. ***
Comment 35 palfrey 2007-07-06 14:43:16 UTC
*** Bug 453974 has been marked as a duplicate of this bug. ***
Comment 36 palfrey 2007-07-06 14:44:25 UTC
*** Bug 454077 has been marked as a duplicate of this bug. ***
Comment 37 palfrey 2007-07-11 12:47:55 UTC
*** Bug 455772 has been marked as a duplicate of this bug. ***
Comment 38 palfrey 2007-07-11 12:47:58 UTC
*** Bug 455773 has been marked as a duplicate of this bug. ***
Comment 39 palfrey 2007-07-11 12:48:01 UTC
*** Bug 455780 has been marked as a duplicate of this bug. ***
Comment 40 palfrey 2007-07-19 15:11:05 UTC
*** Bug 458051 has been marked as a duplicate of this bug. ***
Comment 41 Tobias Mueller 2007-07-21 01:07:10 UTC
*** Bug 456765 has been marked as a duplicate of this bug. ***
Comment 42 Tobias Mueller 2007-07-21 01:57:28 UTC
*** Bug 457746 has been marked as a duplicate of this bug. ***
Comment 43 Tobias Mueller 2007-07-21 01:57:43 UTC
*** Bug 458805 has been marked as a duplicate of this bug. ***
Comment 44 Tobias Mueller 2007-07-21 02:00:07 UTC
*** Bug 455213 has been marked as a duplicate of this bug. ***
Comment 45 Tobias Mueller 2007-07-21 02:00:34 UTC
*** Bug 455332 has been marked as a duplicate of this bug. ***
Comment 46 Tobias Mueller 2007-07-21 02:06:27 UTC
Dear reporter, thank you for your bug report.
It would be helpful if you could install a glibc-debug package, reproduce the bug, and attach the stack trace here.

That way we may be able to determine why fsync() crashes.

Thanks in advance!
Comment 47 rwt 2007-07-21 14:38:33 UTC
(In reply to comment #46)
> Dear reporter, thank you for your bug report.
> It would be helpful if you can install a glibc-debug package, reproduce the bug
> and attach the stacktrace here.
> 
> That way we maybe can determine, why fsync() is about to crash.
> 
> Thanks in advance!
> 

Hi, I added glibc-debuginfo-common, glib-debuginfo, glib2-debuginfo, and glibc-debuginfo. I didn't see a plain glibc-debug package for Fedora 7. Of course, now that I have added those packages, it will probably work just fine.

BTW, I'm running into situations while using evolution where it stops working. Sometimes right in the middle of composing a message. A lot of times while it starts up. Is there a way to cause it to dump to see what is going on or do I need to set up a debug environment and fire up gdb? I used to develop code, however I haven't done that in many years. What I do in those situations is I send it a HUP signal, restart it and it asks me if I want to recover the message and everything is fine, for a while. This issue will probably end up being another bug to track. I'm just not sure how to bring this up as an issue.

Thanks.
Comment 48 Tobias Mueller 2007-07-21 23:48:01 UTC
Hi.

(In reply to comment #47)
> Is there a way to cause it to dump to see what is going on or do I
> need to set up a debug environment and fire up gdb?
You could attach a debugger (e.g. gdb) to the evolution process and generate a backtrace. You don't need a special debug environment.

But please file a bug for each issue or write to the evolution-list first :)

Cheers.
Comment 49 André Klapper 2007-07-26 23:03:56 UTC
*** Bug 450163 has been marked as a duplicate of this bug. ***
Comment 50 André Klapper 2007-07-26 23:04:00 UTC
*** Bug 456200 has been marked as a duplicate of this bug. ***
Comment 51 André Klapper 2007-07-26 23:04:15 UTC
*** Bug 456295 has been marked as a duplicate of this bug. ***
Comment 52 André Klapper 2007-07-29 14:18:55 UTC
*** Bug 459892 has been marked as a duplicate of this bug. ***
Comment 53 Susana 2007-07-29 22:50:03 UTC
*** Bug 461575 has been marked as a duplicate of this bug. ***
Comment 54 palfrey 2007-08-01 12:21:04 UTC
*** Bug 462143 has been marked as a duplicate of this bug. ***
Comment 55 palfrey 2007-08-01 12:21:34 UTC
*** Bug 462241 has been marked as a duplicate of this bug. ***
Comment 56 palfrey 2007-08-01 12:25:27 UTC
*** Bug 462374 has been marked as a duplicate of this bug. ***
Comment 57 André Klapper 2007-08-01 16:42:57 UTC
*** Bug 461768 has been marked as a duplicate of this bug. ***
Comment 58 André Klapper 2007-08-01 16:42:59 UTC
*** Bug 461799 has been marked as a duplicate of this bug. ***
Comment 59 Iestyn Pryce 2007-08-04 12:49:25 UTC
*** Bug 463261 has been marked as a duplicate of this bug. ***
Comment 60 Tobias Mueller 2007-08-22 17:14:19 UTC
Please note that there are much better stack traces, e.g. in bug 445386:
  • #2 <signal handler called>
  • #3 fsync
    from /lib64/libpthread.so.0
  • #4 camel_certdb_save
    at camel-certdb.c line 371
  • #5 camel_shutdown
    at camel.c line 68
  • #6 *__GI_exit
    at exit.c line 75


My $0.02 on this: certdb hooks g_atexit(), which the code managing the file descriptor may do as well. certdb would then try to write to an invalid file descriptor and crash. Just wild thoughts, though...

Moving from Evo to e-d-s.
Comment 61 André Klapper 2007-08-22 18:06:51 UTC
*** Bug 467649 has been marked as a duplicate of this bug. ***
Comment 62 André Klapper 2007-08-22 18:06:54 UTC
*** Bug 466708 has been marked as a duplicate of this bug. ***
Comment 63 André Klapper 2007-08-22 18:06:59 UTC
*** Bug 465692 has been marked as a duplicate of this bug. ***
Comment 64 André Klapper 2007-08-22 18:08:55 UTC
This is currently the worst e-d-s crasher; adding gnome-2.20 target.
Comment 65 André Klapper 2007-08-22 18:25:50 UTC
[restore]
Comment 66 Srinivasa Ragavan 2007-08-28 11:07:03 UTC
I see only Fedora crashers/dupes and nothing else. I don't see any way the code can crash with respect to trunk. I would love to see whether anybody can prove me wrong with a code review :)
Comment 67 Matthew Barnes 2007-08-28 19:41:35 UTC
This may or may not be relevant, but the Fedora evolution-data-server package is configured with:

    --enable-file-locking=fcntl
    --enable-dot-locking=no

Might help to reproduce the problem on non-Fedora distros.
Comment 68 Matthew Barnes 2007-08-28 20:27:35 UTC
If someone could post or find a stacktrace for this bug that includes debugging info for glibc, that would be very helpful.

   yum install glibc-debuginfo
Comment 69 Matthew Barnes 2007-08-30 15:17:48 UTC
(In reply to comment #60)
> my 0.02$ to this is, that certdb hooks g_atexit() which the file descriptor
> maybe do as well. So certdb then tries to write to an invalid filedescriptor
> and crashes. Just wild thoughts though...

If the file descriptor is invalid, fsync() should simply return -1 with errno set appropriately, not crash the program.  Still, I suspect it may be related to calling fsync() in an atexit() callback.
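
The claim about invalid descriptors can be checked with a small C sketch. It is only an illustration: the helper name fsync_errno_on_closed_fd and the use of /dev/null as a scratch file are assumptions made for this example and have nothing to do with Camel's code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open a scratch descriptor, close it, then call
 * fsync() on the stale descriptor number.
 * Returns the errno reported by fsync(), 0 if fsync() unexpectedly
 * succeeded, or -1 if the scratch descriptor could not be opened. */
int fsync_errno_on_closed_fd(void)
{
    int fd = open("/dev/null", O_WRONLY);
    if (fd < 0)
        return -1;
    close(fd);

    /* Sanity check: the descriptor number must no longer be valid. */
    assert(fcntl(fd, F_GETFD) == -1);

    errno = 0;
    if (fsync(fd) == -1)
        return errno;   /* POSIX specifies EBADF for an invalid descriptor */
    return 0;
}
```

In a single-threaded program this should report EBADF rather than raise a signal, which is why an invalid descriptor alone is an unlikely explanation for the crash.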

Not only are all the dupes from Fedora, all but the very first dupe (bug #445386) are from Fedora 7.  The first dupe is Fedora 8 Development, filed in early June.  It makes me wonder if perhaps there was a glitch in glibc that got fixed shortly after Fedora 7 was released.  I'll sift through ChangeLogs and look for clues.
Comment 70 Laurent Vaills 2007-08-31 12:23:37 UTC
It's true that I have not hit this bug for quite a long time. Perhaps it was fixed in a glibc update, as Matthew thought.
Comment 71 Tobias Mueller 2007-09-04 23:04:37 UTC
*** Bug 473638 has been marked as a duplicate of this bug. ***
Comment 72 Tobias Mueller 2007-09-04 23:06:10 UTC
*** Bug 472163 has been marked as a duplicate of this bug. ***
Comment 73 Tobias Mueller 2007-09-04 23:06:26 UTC
*** Bug 470893 has been marked as a duplicate of this bug. ***
Comment 74 Tobias Mueller 2007-09-04 23:06:39 UTC
*** Bug 470412 has been marked as a duplicate of this bug. ***
Comment 75 Tobias Mueller 2007-09-04 23:06:51 UTC
*** Bug 469858 has been marked as a duplicate of this bug. ***
Comment 76 Matthew Barnes 2007-09-05 10:36:47 UTC
Given that all the dupes are from Fedora users, and all but one are Fedora 7 (including the most recent dupes that Tobias marked), I'm going to move this downstream.  It should not block the Evolution 2.12 release.

Closing this as NOTGNOME.  Please refer to:
http://bugzilla.redhat.com/show_bug.cgi?id=278171
Comment 77 Tobias Mueller 2007-09-05 21:49:58 UTC
*** Bug 474039 has been marked as a duplicate of this bug. ***
Comment 78 Tobias Mueller 2007-09-06 15:01:46 UTC
*** Bug 474216 has been marked as a duplicate of this bug. ***
Comment 79 Tobias Mueller 2007-09-07 16:00:09 UTC
*** Bug 474582 has been marked as a duplicate of this bug. ***
Comment 80 Suman Manjunath 2007-09-09 15:53:23 UTC
*** Bug 474834 has been marked as a duplicate of this bug. ***
Comment 81 Suman Manjunath 2007-09-09 15:54:01 UTC
*** Bug 475014 has been marked as a duplicate of this bug. ***
Comment 82 Suman Manjunath 2007-09-09 15:54:44 UTC
*** Bug 475032 has been marked as a duplicate of this bug. ***
Comment 83 André Klapper 2007-09-16 00:53:09 UTC
hmm, also see bug 347997 and bug 475277! perhaps not NOTGNOME...
Comment 84 Suman Manjunath 2007-09-17 04:42:11 UTC
*** Bug 475585 has been marked as a duplicate of this bug. ***
Comment 85 Suman Manjunath 2007-09-17 04:43:44 UTC
*** Bug 475845 has been marked as a duplicate of this bug. ***
Comment 86 Suman Manjunath 2007-09-22 20:14:16 UTC
*** Bug 477341 has been marked as a duplicate of this bug. ***
Comment 87 Suman Manjunath 2007-09-22 20:15:15 UTC
*** Bug 475277 has been marked as a duplicate of this bug. ***
Comment 88 Sergio 2007-09-22 21:13:07 UTC
Name        : evolution
Product     : Fedora 7
Version     : 2.10.3
Release     : 4.fc7
This update fixes a couple bugs:

- Evolution fails to close after an IMAP alert has been received.
- Combo boxes under "Automatic Contacts" are malfunctioning.

I think the latest Fedora update of Evolution fixes this issue.
Comment 89 Suman Manjunath 2007-09-23 14:27:06 UTC
*** Bug 478884 has been marked as a duplicate of this bug. ***
Comment 90 Tobias Mueller 2007-09-24 14:27:19 UTC
*** Bug 479832 has been marked as a duplicate of this bug. ***
Comment 91 Pedro Villavicencio 2007-09-24 23:49:58 UTC
*** Bug 480020 has been marked as a duplicate of this bug. ***
Comment 92 Tobias Mueller 2007-09-26 18:07:17 UTC
*** Bug 480346 has been marked as a duplicate of this bug. ***
Comment 93 Tobias Mueller 2007-10-01 14:35:39 UTC
*** Bug 481504 has been marked as a duplicate of this bug. ***
Comment 94 Tobias Mueller 2007-10-01 14:35:46 UTC
*** Bug 481719 has been marked as a duplicate of this bug. ***
Comment 95 Tobias Mueller 2007-10-01 20:33:26 UTC
*** Bug 482303 has been marked as a duplicate of this bug. ***
Comment 96 Tobias Mueller 2007-10-01 20:33:56 UTC
Please note a great stacktrace in bug 482303 as well.
Comment 97 chris 2007-10-01 21:00:33 UTC
bug 482303 has:

glibc-2.6-4
evolution-2.10.3-4.fc7
evolution-data-server-1.10.3.1-2.fc7
Comment 98 Suman Manjunath 2007-10-07 15:19:04 UTC
*** Bug 483124 has been marked as a duplicate of this bug. ***
Comment 99 Akhil Laddha 2007-10-12 04:06:43 UTC
*** Bug 485895 has been marked as a duplicate of this bug. ***
Comment 100 Akhil Laddha 2007-10-12 04:07:03 UTC
*** Bug 485840 has been marked as a duplicate of this bug. ***
Comment 101 Christian Kirbach 2007-10-12 22:33:49 UTC
*** Bug 486160 has been marked as a duplicate of this bug. ***
Comment 102 Akhil Laddha 2007-10-17 04:47:39 UTC
*** Bug 487088 has been marked as a duplicate of this bug. ***
Comment 103 André Klapper 2007-10-18 23:53:07 UTC
*** Bug 486904 has been marked as a duplicate of this bug. ***
Comment 104 Tobias Mueller 2007-10-22 23:12:35 UTC
*** Bug 487654 has been marked as a duplicate of this bug. ***
Comment 105 Akhil Laddha 2007-10-24 07:28:29 UTC
*** Bug 486930 has been marked as a duplicate of this bug. ***
Comment 106 Akhil Laddha 2007-10-24 07:28:43 UTC
*** Bug 488671 has been marked as a duplicate of this bug. ***
Comment 107 Akhil Laddha 2007-10-24 07:28:59 UTC
*** Bug 489640 has been marked as a duplicate of this bug. ***
Comment 108 Tobias Mueller 2007-10-25 23:27:33 UTC
*** Bug 490290 has been marked as a duplicate of this bug. ***
Comment 109 André Klapper 2007-10-31 21:44:47 UTC
*** Bug 492075 has been marked as a duplicate of this bug. ***
Comment 110 Tobias Mueller 2007-11-02 02:42:20 UTC
*** Bug 492432 has been marked as a duplicate of this bug. ***
Comment 111 Tobias Mueller 2007-11-02 20:08:23 UTC
*** Bug 492647 has been marked as a duplicate of this bug. ***
Comment 112 Akhil Laddha 2007-11-05 05:26:24 UTC
*** Bug 490598 has been marked as a duplicate of this bug. ***
Comment 113 Akhil Laddha 2007-11-05 05:26:42 UTC
*** Bug 492839 has been marked as a duplicate of this bug. ***
Comment 114 André Klapper 2007-11-07 11:03:00 UTC
*** Bug 494432 has been marked as a duplicate of this bug. ***
Comment 115 André Klapper 2007-11-08 11:42:51 UTC
*** Bug 494932 has been marked as a duplicate of this bug. ***
Comment 116 Tobias Mueller 2007-11-11 12:05:27 UTC
*** Bug 495713 has been marked as a duplicate of this bug. ***
Comment 117 Tobias Mueller 2007-11-11 12:45:22 UTC
*** Bug 495407 has been marked as a duplicate of this bug. ***
Comment 118 Tobias Mueller 2007-11-14 01:09:32 UTC
*** Bug 496452 has been marked as a duplicate of this bug. ***
Comment 119 André Klapper 2007-11-15 11:46:09 UTC
reopen. bug 491988 is from debian 2.18.
Comment 120 André Klapper 2007-11-15 11:46:30 UTC
*** Bug 491988 has been marked as a duplicate of this bug. ***
Comment 121 Laurent Vaills 2007-11-22 15:01:10 UTC
Just to note that I'm now using Evolution 2.12 from Fedora 8 and I do not have this bug anymore.
Comment 122 Akhil Laddha 2007-11-23 05:07:04 UTC
*** Bug 498913 has been marked as a duplicate of this bug. ***
Comment 123 Akhil Laddha 2007-11-28 04:53:24 UTC
*** Bug 500088 has been marked as a duplicate of this bug. ***
Comment 124 Suman Manjunath 2007-11-30 19:51:48 UTC
*** Bug 500545 has been marked as a duplicate of this bug. ***
Comment 125 palfrey 2007-12-03 16:18:01 UTC
*** Bug 501274 has been marked as a duplicate of this bug. ***
Comment 126 André Klapper 2007-12-03 16:29:33 UTC
no Evolution 2.12/GNOME 2.20 reports yet.
removing gnome-target milestone.
Comment 127 palfrey 2007-12-06 19:11:36 UTC
*** Bug 502068 has been marked as a duplicate of this bug. ***
Comment 128 palfrey 2007-12-07 16:27:21 UTC
*** Bug 502259 has been marked as a duplicate of this bug. ***
Comment 129 Tobias Mueller 2007-12-09 06:47:35 UTC
*** Bug 502547 has been marked as a duplicate of this bug. ***
Comment 130 Johnny Jacob 2007-12-11 05:15:37 UTC
*** Bug 502953 has been marked as a duplicate of this bug. ***
Comment 131 André Klapper 2007-12-13 15:30:43 UTC
*** Bug 503258 has been marked as a duplicate of this bug. ***
Comment 132 André Klapper 2007-12-13 15:30:45 UTC
*** Bug 503429 has been marked as a duplicate of this bug. ***
Comment 133 André Klapper 2007-12-13 15:31:02 UTC
*** Bug 500532 has been marked as a duplicate of this bug. ***
Comment 134 Martin Jürgens 2007-12-13 20:35:07 UTC
This just happened again using Evolution 2.12.2 on GNOME 2.20.2. This is my backtrace (on Fedora 8):


Using host libthread_db library "/lib/libthread_db.so.1".
[Thread debugging using libthread_db enabled]
[New Thread -1208580336 (LWP 7699)]
[New Thread -1251656816 (LWP 7746)]
0x00110402 in __kernel_vsyscall ()

Thread 2 (Thread -1251656816 (LWP 7746))

  • #0 __kernel_vsyscall
  • #1 __lll_lock_wait
    from /lib/libpthread.so.0
  • #2 _L_lock_88
    from /lib/libpthread.so.0
  • #3 pthread_mutex_lock
    from /lib/libpthread.so.0
  • #4 segv_redirect
    at main.c line 422
  • #5 <signal handler called>
  • #6 pthread_mutex_lock
    from /lib/libpthread.so.0
  • #7 PR_Lock
    from /usr/lib/libnspr4.so
  • #8 ??
    from /usr/lib/libnspr4.so
  • #9 __nptl_deallocate_tsd
    from /lib/libpthread.so.0
  • #10 start_thread
    from /lib/libpthread.so.0
  • #11 clone
    from /lib/libc.so.6

Comment 135 Tobias Mueller 2007-12-14 08:55:12 UTC
*** Bug 503523 has been marked as a duplicate of this bug. ***
Comment 136 Susana 2007-12-16 22:30:24 UTC
*** Bug 503744 has been marked as a duplicate of this bug. ***
Comment 137 Tobias Mueller 2007-12-26 00:02:06 UTC
*** Bug 505565 has been marked as a duplicate of this bug. ***
Comment 138 Tobias Mueller 2007-12-26 20:02:27 UTC
*** Bug 505522 has been marked as a duplicate of this bug. ***
Comment 139 Matthew Barnes 2008-01-10 17:38:43 UTC
The only thing I can think to suggest that would likely knock this out is to change the procedure to write certificates to a string buffer and then dump the buffer to a file in one shot, rather than writing certificates directly to an open file stream and then moving that temporary file into place.

It should be fairly straightforward to implement, but it _will_ break Camel's API slightly, though not in a part that I think anything outside of CamelCertDB is actually using.
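
A minimal sketch of the "serialize to a buffer, then dump it in one shot" idea, in plain POSIX C. It does not use Camel or GLib. The struct membuf type and the membuf_append() and membuf_dump() functions are hypothetical names invented for this illustration, and the record format is left to the caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical growable byte buffer standing in for the serialized certdb. */
struct membuf {
    char  *data;
    size_t len;
    size_t cap;
};

/* Append len bytes from src; returns 0 on success, -1 if out of memory. */
int membuf_append(struct membuf *b, const void *src, size_t len)
{
    assert(b != NULL);
    if (len == 0)
        return 0;
    if (b->len + len > b->cap) {
        size_t cap = b->cap ? b->cap : 256;
        while (cap < b->len + len)
            cap *= 2;
        char *grown = realloc(b->data, cap);
        if (grown == NULL)
            return -1;
        b->data = grown;
        b->cap = cap;
    }
    memcpy(b->data + b->len, src, len);
    b->len += len;
    return 0;
}

/* Write the whole buffer to path; returns 0 on success, -1 on error. */
int membuf_dump(const struct membuf *b, const char *path)
{
    assert(b != NULL && path != NULL);
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    size_t off = 0;
    while (off < b->len) {
        ssize_t n = write(fd, b->data + off, b->len - off);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing: retry */
            close(fd);
            return -1;
        }
        off += (size_t)n;
    }
    return close(fd) == 0 ? 0 : -1;
}
```

With this shape, all serialization happens in memory, so the file stays open only for the duration of the final write loop.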
Comment 140 Tobias Mueller 2008-01-12 15:41:19 UTC
*** Bug 508872 has been marked as a duplicate of this bug. ***
Comment 141 Tobias Mueller 2008-01-12 15:41:28 UTC
*** Bug 508880 has been marked as a duplicate of this bug. ***
Comment 142 Philip Van Hoof 2008-01-13 14:54:34 UTC
We have seen invalid file descriptor warnings when testing Modest and Tinymail based E-mail clients with valgrind. Those warnings were about the cert-db handling too.

May I suggest testing this problem with valgrind and putting your findings in comments here?
Comment 143 Tobias Mueller 2008-01-16 12:42:04 UTC
*** Bug 509829 has been marked as a duplicate of this bug. ***
Comment 144 André Klapper 2008-01-18 04:24:33 UTC
the missing piece from bug 508638:

  • #4 <signal handler called>
  • #5 open
    from /lib64/libpthread.so.0
  • #6 camel_certdb_save
    at /usr/include/bits/fcntl2.h line 54
  • #7 camel_shutdown
    at camel.c line 68
  • #8 exit
    at exit.c line 75
  • #9 __libc_start_main
    at libc-start.c line 252
  • #10 _start

Comment 145 André Klapper 2008-01-18 04:25:16 UTC
*** Bug 508638 has been marked as a duplicate of this bug. ***
Comment 146 Matthew Barnes 2008-01-18 05:37:35 UTC
Hmm, strange, most of the other stack traces show the crash in fsync().

Anyway, here's the code for open() from glibc in Fedora 8:

41: __extern_always_inline int
42: open (__const char *__path, int __oflag, ...)
43: {
44:   if (__va_arg_pack_len () > 1)
45:    __open_too_many_args ();
46:
47:   if (__builtin_constant_p (__oflag))
48:     {
49:       if ((__oflag & O_CREAT) != 0 && __va_arg_pack_len () < 1)
50:         {
51:           __open_missing_mode ();
52:           return __open_2 (__path, __oflag);
53:         }
54:       return __open_alias (__path, __oflag, __va_arg_pack ());
55:     }
56:
57:   if (__va_arg_pack_len () < 1)
58:     return __open_2 (__path, __oflag);
59:
60:   return __open_alias (__path, __oflag, __va_arg_pack ());
61: }

Doesn't shed as much light as I'd hoped, but it gives me something else to search for.  Tracing beyond this seems to take us into the kernel.
Comment 147 Srinivasa Ragavan 2008-01-18 09:27:57 UTC


Thread 1 (Thread 0xb66136c0 (LWP 12550))

  • #0 __kernel_vsyscall
  • #1 waitpid
    from /lib/libpthread.so.0
  • #2 g_spawn_sync
    at gspawn.c line 374
  • #3 g_spawn_command_line_sync
    at gspawn.c line 682
  • #4 ??
    from /usr/lib/gtk-2.0/modules/libgnomebreakpad.so
  • #5 ??
    from /usr/lib/gtk-2.0/modules/libgnomebreakpad.so
  • #6 google_breakpad::ExceptionHandler::InternalWriteMinidump
    from /usr/lib/gtk-2.0/modules/libgnomebreakpad.so
  • #7 google_breakpad::ExceptionHandler::HandleException
    from /usr/lib/gtk-2.0/modules/libgnomebreakpad.so
  • #8 <signal handler called>
  • #9 __kernel_vsyscall
  • #10 fsync
    from /lib/libpthread.so.0
  • #11 camel_certdb_save
    at camel-certdb.c line 371
  • #12 camel_shutdown
    at camel.c line 68
  • #13 exit
    from /lib/libc.so.6
  • #14 __libc_start_main
    from /lib/libc.so.6
  • #15 _start
$4 = (FILE *) 0x836aeb0
(gdb) p *out
$5 = {_flags = -72536956, 
  _IO_read_ptr = 0xb5ea8000 "\200\201�O=SERVICES2,OU=Organizational CA�CN=midro.wal.novell.com,OU=IS&T,O=Novell,L=Waltham,ST=Massachussets,C=US\221wal-3.novell.com�**********************************************\203", 
  _IO_read_end = 0xb5ea8000 "\200\201�O=SERVICES2,OU=Organizational CA�CN=midro.wal.novell.com,OU=IS&T,O=Novell,L=Waltham,ST=Massachussets,C=US\221wal-3.novell.com�**********************************************\203", 
  _IO_read_base = 0xb5ea8000 "\200\201�O=SERVICES2,OU=Organizational CA�CN=midro.wal.novell.com,OU=IS&T,O=Novell,L=Waltham,ST=Massachussets,C=US\221wal-3.novell.com�**********************************************\203", 
  _IO_write_base = 0xb5ea8000 "\200\201�O=SERVICES2,OU=Organizational CA�CN=midro.wal.novell.com,OU=IS&T,O=Novell,L=Waltham,ST=Massachussets,C=US\221wal-3.novell.com�**********************************************\203", 
  _IO_write_ptr = 0xb5ea8000 "\200\201�O=SERVICES2,OU=Organizational CA�CN=midro.wal.novell.com,OU=IS&T,O=Novell,L=Waltham,ST=Massachussets,C=US\221wal-3.novell.com�**********************************************\203", _IO_write_end = 0xb5ea9000 "\177ELF\001\001\001", 
  _IO_buf_base = 0xb5ea8000 "\200\201�O=SERVICES2,OU=Organizational CA�CN=midro.wal.novell.com,OU=IS&T,O=Novell,L=Waltham,ST=Massachussets,C=US\221wal-3.novell.com�**********************************************\203", _IO_buf_end = 0xb5ea9000 "\177ELF\001\001\001", _IO_save_base = 0x0, _IO_backup_base = 0x0, _IO_save_end = 0x0, 
  _markers = 0x0, _chain = 0xb24101c8, _fileno = 16, _flags2 = 0, _old_offset = 142054648, _cur_column = 0, _vtable_offset = 0 '\0', _shortbuf = "\b", _lock = 0x836af48, 
  _offset = -1, __pad1 = 0x1, __pad2 = 0x836af54, __pad3 = 0x0, __pad4 = 0x0, __pad5 = 141397416, _mode = -1, 
  _unused2 = "�ȥ\b�<\033\t\000\000\000\000\000\000\000\000pɥ\b", '\0' <repeats 12 times>, "!\000\000\000�\020�\b"}
(gdb) p fileno(out)
$6 = 16


I have masked some of the data.

So, I got this today and debugged it further. *out is valid and fileno is right. I still don't know why fsync() crashes.

Should we change to just sync()?
Comment 148 Matthew Barnes 2008-01-18 13:02:45 UTC
I was comparing camel_certdb_save() against GLib's g_file_set_contents().  GLib doesn't call fflush() or fsync() explicitly.  It just does fwrite() followed by fclose(), and I believe fclose() should flush and sync the file for you.  The Single UNIX Specification v3 says:

  "The fclose() function shall cause the stream pointed to by stream to be
   flushed and the associated file to be closed."

Still, I suspect removing the fflush() and fsync() calls would only move the problem elsewhere.  And it still doesn't explain why open() is crashing.

This brings me back to my note in comment #69:

  "If the file descriptor is invalid, fsync() should simply return -1 with
   errno set appropriately, not crash the program.  Still, I suspect it may
   be related to calling fsync() in an atexit() callback."

camel_certdb_save() gets called from camel_shutdown(), which is an atexit() callback.  Maybe camel_shutdown() should be public and we should require calling it explicitly before exiting the process.

Evolution calls camel_init() from mail_session_init().  Perhaps we need a mail_session_shutdown() that calls camel_shutdown()?
Comment 149 Srinivasa Ragavan 2008-01-18 13:12:55 UTC
Matt, I'm fine with trying it. It definitely won't do any more harm, and we can have a clean shutdown path that way.
Comment 150 Jeffrey Stedfast 2008-01-18 14:18:13 UTC
fclose() calls fflush(), which flushes the FILE* buffers, but they might not get sync'd to disk, which can only be done via fsync().
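
The distinction can be sketched in C as follows. This is an illustration only, not the code in camel-certdb.c, and write_durably() is a hypothetical name. fflush() moves data from the stdio buffer to the kernel, fsync() asks the kernel to push it to the storage device, and fclose() then releases the stream.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Write text to path and request that it reach stable storage before
 * the stream is closed. Returns 0 on success, -1 on any failure. */
int write_durably(const char *path, const char *text)
{
    assert(path != NULL && text != NULL);
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    int ok = fputs(text, fp) != EOF
          && fflush(fp) == 0            /* stdio buffer -> kernel */
          && fsync(fileno(fp)) == 0;    /* kernel cache -> storage device */

    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Dropping the fsync() call would still leave a correct file after a normal exit, but the data could be lost if the machine went down before the kernel wrote it back.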
Comment 151 André Klapper 2008-01-18 19:04:25 UTC
*** Bug 510447 has been marked as a duplicate of this bug. ***
Comment 152 Matthew Barnes 2008-01-21 03:28:39 UTC
I think I made some actual investigative progress on this tonight.

My idea in comment #148 did not fix the problem.  If anything, it made it EASIER to reproduce.  By making camel_shutdown() public and explicitly calling it earlier in the shutdown process (before exit() begins), I was able to reproduce the crash fairly frequently by simply starting Evolution, waiting for it to start refreshing folders on an SSL-enabled IMAP account, then closing Evolution before it finished.

The source of the crash is not fsync().  It's PR_Lock(), called from _pt_thread_death_internal() in the "primordial" thread.  (NSPR defines the "primordial" thread as the thread from which PR_Init() was called.)  fsync() seems to be the catalyst for a race between _pt_thread_death_internal() and PR_Cleanup().

I'll let the code speak for itself.  Note the comment.


  PR_IMPLEMENT(PRStatus) PR_Cleanup(void)
  {
    PRThread *me = PR_GetCurrentThread();

    ...

    if (me->state & PT_THREAD_PRIMORD)
    {
          ...

          /*
           * I am not sure if it's safe to delete the cv and lock here,
           * since there may still be "system" threads around. If this
           * call isn't immediately prior to exiting, then there's a
           * problem.
           */
          if (0 == pt_book.system)
          {
              PR_DestroyCondVar(pt_book.cv); pt_book.cv = NULL;
              PR_DestroyLock(pt_book.ml); pt_book.ml = NULL;
          }

          ...
     }
  }

  static void _pt_thread_death_internal(void *arg, PRBool callDestructors)
  {
      PRThread *thred = (PRThread*)arg;

      if (thred->state & (PT_THREAD_FOREIGN|PT_THREAD_PRIMORD))
      {
          PR_Lock(pt_book.ml);

          ...

          PR_Unlock(pt_book.ml);
      }

      ...
  }

PR_Cleanup() is called BEFORE camel_certdb_save() in camel_shutdown().  I think simply swapping the order of these calls might do the trick.  Indeed, with the calls swapped and after many tries, I've not been able to reproduce the crash.  It might also explain the reported Win32 problems:

  From camel_shutdown():

  #if defined (HAVE_NSS) && !defined (G_OS_WIN32)
          /* For some reason we get into trouble on Win32 if we call these.
           * But they shouldn't be necessary as the process is exiting anywy?
           */
          NSS_Shutdown ();

          PR_Cleanup ();
  #endif /* HAVE_NSS */
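
A minimal sketch of the reordering described above, assuming the fix is simply to finish the certdb save before NSS and NSPR are torn down. This is not the attached patch: the stub_* and sketch_* names are hypothetical stand-ins that only record the order in which they are called.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins: each stub only records that it ran. */
static const char *call_log[8];
static int n_calls = 0;

static void record(const char *name)
{
    assert(n_calls < 8);
    call_log[n_calls++] = name;
}

static void stub_certdb_save(void)  { record("certdb_save"); }
static void stub_nss_shutdown(void) { record("nss_shutdown"); }
static void stub_pr_cleanup(void)   { record("pr_cleanup"); }

/* Assumed order: save first, while the NSPR bookkeeping lock still
 * exists, then shut down NSS, then clean up NSPR last. */
void sketch_shutdown(void)
{
    stub_certdb_save();
    stub_nss_shutdown();
    stub_pr_cleanup();
}

/* Returns the name of the i-th recorded call, or NULL if out of range. */
const char *sketch_call(int i)
{
    return (i >= 0 && i < n_calls) ? call_log[i] : NULL;
}
```

The point of the ordering is that nothing which might still touch pt_book.ml runs after the step that destroys it.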


Footnote: Milan was right about MailComponent never being finalized.  In fact
          none of the EvolutionComponents are being finalized.  EShellView is
          also leaking a reference.  I've yet to pin down the source of the
          leak but I suspect it's somewhere in the shell.
Comment 153 Matthew Barnes 2008-01-21 03:37:57 UTC
Created attachment 103303
Proposed patch

If I'm right about everything above, I think this should do it.

Note that I've removed the Win32 workaround since I have a theory about what the problem may have been. It will be interesting to see if it crops up again in the future, assuming we ever get Evolution working on Win32 again.
Comment 154 Srinivasa Ragavan 2008-01-21 06:05:17 UTC
Matt, your theory seems fine to me. I think the only way we can verify it completely is to commit and test it. I would say commit it ASAP, give it a week or two, and watch for it in Bugzilla.
Comment 155 Matthew Barnes 2008-01-21 12:57:40 UTC
Committed to trunk (revision #8399).

Everyone please keep an eye out for this in 2.21.90 or later!
Comment 156 Tobias Mueller 2008-01-24 15:15:08 UTC
*** Bug 511475 has been marked as a duplicate of this bug. ***
Comment 157 Tobias Mueller 2008-02-01 14:38:56 UTC
*** Bug 513469 has been marked as a duplicate of this bug. ***
Comment 158 Tobias Mueller 2008-02-02 13:27:10 UTC
*** Bug 513874 has been marked as a duplicate of this bug. ***
Comment 159 Tobias Mueller 2008-02-03 19:58:10 UTC
*** Bug 514110 has been marked as a duplicate of this bug. ***
Comment 160 Akhil Laddha 2008-02-05 08:19:08 UTC
*** Bug 514423 has been marked as a duplicate of this bug. ***
Comment 161 Akhil Laddha 2008-02-08 04:24:52 UTC
*** Bug 514770 has been marked as a duplicate of this bug. ***
Comment 162 Tobias Mueller 2008-02-10 19:26:23 UTC
*** Bug 515479 has been marked as a duplicate of this bug. ***
Comment 163 Tobias Mueller 2008-02-14 18:20:10 UTC
*** Bug 516310 has been marked as a duplicate of this bug. ***
Comment 164 Tobias Mueller 2008-02-14 23:18:55 UTC
*** Bug 516551 has been marked as a duplicate of this bug. ***
Comment 165 Tobias Mueller 2008-02-16 04:41:37 UTC
*** Bug 516783 has been marked as a duplicate of this bug. ***
Comment 166 Akhil Laddha 2008-02-20 06:53:19 UTC
*** Bug 517528 has been marked as a duplicate of this bug. ***
Comment 167 Akhil Laddha 2008-02-20 06:53:38 UTC
*** Bug 513477 has been marked as a duplicate of this bug. ***
Comment 168 Tobias Mueller 2008-02-25 20:54:57 UTC
*** Bug 518658 has been marked as a duplicate of this bug. ***
Comment 169 André Klapper 2008-03-04 23:08:22 UTC
*** Bug 510655 has been marked as a duplicate of this bug. ***
Comment 170 André Klapper 2008-03-04 23:08:51 UTC
*** Bug 515189 has been marked as a duplicate of this bug. ***
Comment 171 André Klapper 2008-03-04 23:09:02 UTC
*** Bug 519601 has been marked as a duplicate of this bug. ***
Comment 172 André Klapper 2008-03-04 23:09:02 UTC
*** Bug 520396 has been marked as a duplicate of this bug. ***
Comment 173 Kandepu Prasad 2008-03-06 12:22:05 UTC
*** Bug 520727 has been marked as a duplicate of this bug. ***
Comment 174 Tobias Mueller 2008-03-09 22:46:34 UTC
*** Bug 521447 has been marked as a duplicate of this bug. ***
Comment 175 Tobias Mueller 2008-03-10 08:17:30 UTC
*** Bug 521485 has been marked as a duplicate of this bug. ***
Comment 176 Matthew Barnes 2008-04-13 16:41:06 UTC
*** Bug 347997 has been marked as a duplicate of this bug. ***