Bug 601525 – e-calendar-factory crashes the second time I start evo




Summary:	e-calendar-factory crashes the second time I start evo


Status:	RESOLVED FIXED

Product:	evolution-data-server
Classification:	Platform
Component:	Calendar
Version:	2.30.x (obsolete)
Hardware:	Other Linux

Importance:	Normal blocker
Target Milestone:	---
Assigned To:	Travis Reitter
QA Contact:	Evolution QA team

URL:
Whiteboard:	evolution[dbus]

Depends on:
Blocks:

Reported:	2009-11-11 12:17 UTC by Milan Crha
Modified:	2013-09-14 16:53 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
Don't double-free the list of connections (after it's already an invalid pointer) (1.07 KB, patch) 2009-12-11 16:33 UTC, Travis Reitter	accepted-commit_now
Fix the same crasher in the calendar (?) (1.15 KB, patch) 2009-12-11 16:39 UTC, Travis Reitter	accepted-commit_now

Description Milan Crha 2009-11-11 12:17:14 UTC

I start Evolution in the mailer with a mail containing a meeting invitation selected. I let it run until it is idle (all the fetching from the server, calendars and such has finished) and then close Evolution. Then I run Evolution again, pretty quickly, and e-calendar-factory crashes in Thread 1 with the backtrace shown below. I also noticed a runtime warning on the console:
exchange-mapi-connection.c:1335: Leaving exchange_mapi_connection_fetch_items: folder-id A9758C5D00000001 
(process:11579): libedata-cal-CRITICAL **: e_data_cal_notify_mode: assertion `E_IS_DATA_CAL (cal)' failed
but that might be unrelated to this issue; it may be caused by MAPI alone.
Note that the patch from bug #597648 does not help here. Since only the dbus code is involved in the main thread of the process, that is probably no surprise.

Trace 219007

Thread 1 (Thread 0x7ffff7732790 (LWP 11579))

#0 g_slice_alloc
from /lib64/libglib-2.0.so.0
#1 g_string_sized_new
from /lib64/libglib-2.0.so.0
#2 method_dir_signature_from_object_info
at dbus-gobject.c line 241
#3 method_input_signature_from_object_info
at dbus-gobject.c line 262
#4 invoke_object_method
at dbus-gobject.c line 1167
#5 gobject_message_function
at dbus-gobject.c line 1497
#6 _dbus_object_tree_dispatch_and_unlock
at dbus-object-tree.c line 856
#7 dbus_connection_dispatch
at dbus-connection.c line 4447
#8 message_queue_dispatch
at dbus-gmain.c line 101
#9 g_main_context_dispatch
from /lib64/libglib-2.0.so.0
#10 ??
from /lib64/libglib-2.0.so.0
#11 g_main_loop_run
from /lib64/libglib-2.0.so.0
#12 main
at e-data-cal-factory.c line 728

Comment 1 Travis Reitter 2009-12-11 05:35:31 UTC

At least in e-addressbook-factory, this seems to be a GSlice misuse by dbus-glib:

==10065== Invalid read of size 4
==10065==    at 0x4372B86: g_slice_alloc (gslice.c:474)
==10065==    by 0x4378DE2: g_string_sized_new (gstring.c:380)
==10065==    by 0x4173659: method_dir_signature_from_object_info (dbus-gobject.c:241)
==10065==    by 0x4176315: object_registration_message (dbus-gobject.c:262)
==10065==    by 0x41A6F12: ??? (in /lib/libdbus-1.so.3.4.0)
==10065==    by 0x4199CEB: dbus_connection_dispatch (in /lib/libdbus-1.so.3.4.0)
==10065==    by 0x4172BFC: message_queue_dispatch (dbus-gmain.c:101)
==10065==    by 0x43549B7: g_main_context_dispatch (gmain.c:1960)
==10065==    by 0x435825F: g_main_context_iterate (gmain.c:2591)
==10065==    by 0x43586CE: g_main_loop_run (gmain.c:2799)
==10065==    by 0x804A8B8: main (e-data-book-factory.c:484)
==10065==  Address 0x2 is not stack'd, malloc'd or (recently) free'd
==10065== 
==10065== 
==10065== Process terminating with default action of signal 11 (SIGSEGV)
==10065==  Access not within mapped region at address 0x2
==10065==    at 0x4372B86: g_slice_alloc (gslice.c:474)
==10065==    by 0x4378DE2: g_string_sized_new (gstring.c:380)
==10065==    by 0x4173659: method_dir_signature_from_object_info (dbus-gobject.c:241)
==10065==    by 0x4176315: object_registration_message (dbus-gobject.c:262)
==10065==    by 0x41A6F12: ??? (in /lib/libdbus-1.so.3.4.0)
==10065==    by 0x4199CEB: dbus_connection_dispatch (in /lib/libdbus-1.so.3.4.0)
==10065==    by 0x4172BFC: message_queue_dispatch (dbus-gmain.c:101)
==10065==    by 0x43549B7: g_main_context_dispatch (gmain.c:1960)
==10065==    by 0x435825F: g_main_context_iterate (gmain.c:2591)
==10065==    by 0x43586CE: g_main_loop_run (gmain.c:2799)
==10065==    by 0x804A8B8: main (e-data-book-factory.c:484)
==10065==  If you believe this happened as a result of a stack
==10065==  overflow in your program's main thread (unlikely but
==10065==  possible), you can try to increase the size of the
==10065==  main thread stack using the --main-stacksize= flag.
==10065==  The main thread stack size used in this run was 8388608.

-------------------------------

The above stack only happens when you run the factory with regular slice allocation -- if you run it with G_SLICE=always-malloc, it doesn't end up crashing.

Comment 2 Milan Crha 2009-12-11 11:08:05 UTC

Does it mean that the fix should go to dbus-glib?

Comment 3 Travis Reitter 2009-12-11 16:33:11 UTC

Created attachment 149592
Don't double-free the list of connections (after it's already an invalid pointer)

When the name owner of the addressbook factory on the D-Bus message bus changed, we were unreffing each of the books for a given connection, then freeing the GList that contained them. However, each unref caused the book's weak ref notify function to run, and that function itself modified the list: it removed the link that contained that book, which could also change the start of the list.

So in the end, we were holding onto a pointer which may or may not even point to a valid link in the list and then trying to free it.
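
To illustrate the pattern described above, here is a minimal, self-contained sketch in plain C. It is not the code from the attached patch and does not use GLib: Link, Book, book_gone, book_unref, list_add and release_all are all hypothetical stand-ins for the GList, the book objects and the weak ref notify. It shows one possible way to tear the list down safely, by snapshotting the book pointers first and letting the notify callback be the only code that frees links.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for GList and a ref-counted book (not the EDS types). */
typedef struct Book Book;

typedef struct Link {
    Book *book;
    struct Link *next;
} Link;

struct Book {
    int ref_count;
    Link **owner;   /* head pointer of the list this book is registered in */
};

/* Plays the role of the weak ref notify: unlink and free the link for 'book'.
 * Note that it may change *head, just like the callback in the bug. */
static void book_gone(Link **head, Book *book)
{
    for (Link **p = head; *p != NULL; p = &(*p)->next) {
        if ((*p)->book == book) {
            Link *dead = *p;
            *p = dead->next;
            free(dead);
            return;
        }
    }
}

/* Drop one reference; the last unref runs the notify and frees the book. */
static void book_unref(Book *book)
{
    assert(book->ref_count > 0);
    if (--book->ref_count == 0) {
        book_gone(book->owner, book);
        free(book);
    }
}

/* Create a book with a single reference and prepend it to the list. */
static Book *list_add(Link **head)
{
    Book *book = malloc(sizeof *book);
    Link *link = malloc(sizeof *link);
    if (book == NULL || link == NULL) {
        free(book);
        free(link);
        return NULL;
    }
    book->ref_count = 1;
    book->owner = head;
    link->book = book;
    link->next = *head;
    *head = link;
    return book;
}

/* Safe teardown: copy the book pointers into an array, then unref each one.
 * The links are never freed here, so nothing is freed twice, and no pointer
 * into the list is kept across an unref. Returns the number of unrefs done. */
static size_t release_all(Link **head)
{
    size_t n = 0;
    for (Link *l = *head; l != NULL; l = l->next)
        n++;
    if (n == 0)
        return 0;

    Book **snapshot = malloc(n * sizeof *snapshot);
    if (snapshot == NULL)
        return 0;

    size_t i = 0;
    for (Link *l = *head; l != NULL; l = l->next)
        snapshot[i++] = l->book;

    for (i = 0; i < n; i++)
        book_unref(snapshot[i]);

    free(snapshot);
    return n;
}
```

The buggy variant would instead walk the list while calling book_unref on each element and then free the list through the head pointer it saved beforehand. By that point the callback has already freed those links, so the saved pointer is stale.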

Comment 4 Travis Reitter 2009-12-11 16:39:51 UTC

Created attachment 149593
Fix the same crasher in the calendar (?)

See the comment for attachment #149592

Comment 5 Travis Reitter 2009-12-11 16:47:21 UTC

Milan, I just realized that this bug is for the calendar factory. The code is essentially identical, but I can't get the calendar factory to crash right now.

Anyway, the patch for the addressbook factory definitely fixes an easily-repeatable crasher, and the second patch fixes the same problem for the calendar factory.

Could you please review and try them out? Just let me know if anything needs to be changed and whether you'd like to apply them or have me do it.

Comment 6 Chenthill P 2009-12-13 05:45:45 UTC

I was thinking that it was one of the threading issues in dbus-glib and was waiting for the GDBus port. Nice fix!!

Comment 7 Milan Crha 2009-12-14 17:17:39 UTC

Good. I tested this and both processes survive closing and reopening Evolution. (Without the patches they both crashed.) Please commit both to master.

Comment 8 Travis Reitter 2009-12-15 19:42:18 UTC

Pushed to master.