After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 597648 - Crash due to DBus used from more threads in one time
Crash due to DBus used from more threads in one time
Status: RESOLVED FIXED
Product: evolution
Classification: Applications
Component: Calendar
2.30.x (obsolete)
Other Linux
: Normal blocker
: ---
Assigned To: evolution-calendar-maintainers
Evolution QA team
evolution[dbus]
: 596293 597517 601434 601436 601888 (view as bug list)
Depends on:
Blocks:
 
 
Reported: 2009-10-07 06:32 UTC by Akhil Laddha
Modified: 2013-09-13 01:06 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
test eds patch (41.27 KB, patch)
2009-10-14 12:44 UTC, Milan Crha
committed Details | Review

Description Akhil Laddha 2009-10-07 06:32:39 UTC
Evolution 2.29.1 

** ERROR **: Out of memory
aborting...

Program received signal SIGTRAP, Trace/breakpoint trap.
0xb67e2836 in IA__g_logv (log_domain=0x0, log_level=G_LOG_LEVEL_ERROR, format=0xb6c34548 "Out of memory", args1=0xbfffda1c "����@") at gmessages.c:502
502			G_BREAKPOINT ();
(gdb) t a a bt

Thread 1 (Thread 0xb63aa720 (LWP 29561))

  • #0 IA__g_logv
    at gmessages.c line 502
  • #1 IA__g_log
    at gmessages.c line 526
  • #2 dbus_g_proxy_begin_call_internal
    from /usr/lib/libdbus-glib-1.so.2
  • #3 dbus_g_proxy_call
    from /usr/lib/libdbus-glib-1.so.2
  • #4 org_gnome_evolution_dataserver_calendar_Cal_get_object
    at e-data-cal-bindings.h line 445
  • #5 e_cal_get_objects_for_uid
    at e-cal.c line 1884
  • #6 generate_instances
    at e-cal.c line 2480
  • #7 e_cal_generate_instances_for_object
    at e-cal.c line 2750
  • #8 tag_calendar_by_comp
    at tag-calendar.c line 228
  • #9 dn_e_cal_view_objects_added_cb
    at gnome-cal.c line 758
  • #10 IA__g_cclosure_marshal_VOID__POINTER
  • #11 IA__g_closure_invoke
    at gclosure.c line 767
  • #12 signal_emit_unlocked_R
    at gsignal.c line 3247
  • #13 IA__g_signal_emit_valist
    at gsignal.c line 2980
  • #14 IA__g_signal_emit
    at gsignal.c line 3037
  • #15 objects_added_cb
    at e-cal-view.c line 114
  • #16 IA__g_cclosure_marshal_VOID__BOXED
    at gmarshal.c line 566
  • #17 marshal_dbus_message_to_g_marshaller
    from /usr/lib/libdbus-glib-1.so.2
  • #18 IA__g_closure_invoke
    at gclosure.c line 767
  • #19 signal_emit_unlocked_R
    at gsignal.c line 3247
  • #20 IA__g_signal_emit_valist
    at gsignal.c line 2980
  • #21 IA__g_signal_emit
    at gsignal.c line 3037
  • #22 dbus_g_proxy_manager_filter
    from /usr/lib/libdbus-glib-1.so.2
  • #23 dbus_connection_dispatch
    from /lib/libdbus-1.so.3
  • #24 message_queue_dispatch
    from /usr/lib/libdbus-glib-1.so.2
  • #25 g_main_dispatch
    at gmain.c line 1960
  • #26 IA__g_main_context_dispatch
    at gmain.c line 2513
  • #27 g_main_context_iterate
    at gmain.c line 2591
  • #28 IA__g_main_loop_run
    at gmain.c line 2799
  • #29 IA__gtk_main
    at gtkmain.c line 1205
  • #30 main
    at main.c line 623
  • #0 IA__g_logv
    at gmessages.c line 502
  • #1 IA__g_log
    at gmessages.c line 526
  • #2 dbus_g_proxy_begin_call_internal
    from /usr/lib/libdbus-glib-1.so.2
  • #3 dbus_g_proxy_call
    from /usr/lib/libdbus-glib-1.so.2
  • #4 org_gnome_evolution_dataserver_calendar_Cal_get_object
    at e-data-cal-bindings.h line 445
  • #5 e_cal_get_objects_for_uid
    at e-cal.c line 1884
  • #6 generate_instances
    at e-cal.c line 2480
  • #7 e_cal_generate_instances_for_object
    at e-cal.c line 2750
  • #8 tag_calendar_by_comp
    at tag-calendar.c line 228
  • #9 dn_e_cal_view_objects_added_cb
    at gnome-cal.c line 758
  • #10 IA__g_cclosure_marshal_VOID__POINTER
    at gmarshal.c line 601
  • #11 IA__g_closure_invoke
    at gclosure.c line 767
  • #12 signal_emit_unlocked_R
    at gsignal.c line 3247
  • #13 IA__g_signal_emit_valist
    at gsignal.c line 2980
  • #14 IA__g_signal_emit
    at gsignal.c line 3037
  • #15 objects_added_cb
    at e-cal-view.c line 114
  • #16 IA__g_cclosure_marshal_VOID__BOXED
    at gmarshal.c line 566
  • #17 marshal_dbus_message_to_g_marshaller
    from /usr/lib/libdbus-glib-1.so.2
  • #18 IA__g_closure_invoke
    at gclosure.c line 767
  • #19 signal_emit_unlocked_R
    at gsignal.c line 3247
  • #20 IA__g_signal_emit_valist
    at gsignal.c line 2980
  • #21 IA__g_signal_emit
    at gsignal.c line 3037
  • #22 dbus_g_proxy_manager_filter
    from /usr/lib/libdbus-glib-1.so.2
  • #23 dbus_connection_dispatch
    from /lib/libdbus-1.so.3
  • #24 message_queue_dispatch
    from /usr/lib/libdbus-glib-1.so.2
  • #25 g_main_dispatch
    at gmain.c line 1960
  • #26 IA__g_main_context_dispatch
    at gmain.c line 2513
  • #27 g_main_context_iterate
    at gmain.c line 2591
  • #28 IA__g_main_loop_run
    at gmain.c line 2799
  • #29 IA__gtk_main
    at gtkmain.c line 1205
  • #30 main
    at main.c line 623

Comment 1 Colin Walters 2009-10-07 16:01:29 UTC
So after a bit of discussion it seems there are a number of possible duplicates of this which are ultimately caused by threaded usage of dbus-glib/dbus.  libdbus only has incomplete support for threading.  dbus-glib is not really known to be safe here.
Comment 2 Colin Walters 2009-10-07 16:48:09 UTC
Options here:

- make update_query_async use asynchronous dbus calls instead of sync
  This would entail some rewriting of e-cal.[ch]; I'm not sure of the ultimate scope here
- make update_query_async synchronous
  Obviously lessens interactivity; but how long do we expect to be blocking here?  Is this just reading local files, or will we do e.g. http requests?
- Further investigate supporting threading in dbus-glib/libdbus
  Some people are using private connections for this, which is sort of suboptimal but relatively straightforward.
Comment 3 Colin Walters 2009-10-07 16:49:10 UTC
Ross, any opinions on this?
Comment 4 Milan Crha 2009-10-08 09:47:04 UTC
Just for a record, why we are talking about update_query_async:
At the top are prints from DBus, to see which function failed before the "Out of memory" abort. Then is shown part of the valgrind output of the abort.

protected_change_timeout: no timeouts
_dbus_connection_attach_pending_call_unlocked: _dbus_connection_add_timeout_unlocked failed
dbus_connection_send_with_reply: _dbus_connection_attach_pending_call_unlocked failed
dbus_g_proxy_begin_call_internal: send with reply failed

** ERROR **: Out of memory
aborting...
127	m_dispatch/dispatch-amd64-linux.S: No such file or directory.
==2767== 
==2767== Invalid read of size 8
==2767==    at 0x8437D60: _dbus_pending_call_get_connection_and_lock (dbus-pending-call.c:309)
==2767==    by 0x8424188: reply_handler_timeout (dbus-connection.c:3125)
==2767==    by 0x843D592: dbus_timeout_handle (dbus-timeout.c:476)
==2767==    by 0x81F48E4: timeout_handler_dispatch (dbus-gmain.c:343)
==2767==    by 0x30A82381CA: (within /lib64/libglib-2.0.so.0.2000.4)
==2767==    by 0x30A8237ABD: g_main_context_dispatch (in /lib64/libglib-2.0.so.0.2000.4)
==2767==    by 0x30A823B277: (within /lib64/libglib-2.0.so.0.2000.4)
==2767==    by 0x30A823B6D4: g_main_loop_run (in /lib64/libglib-2.0.so.0.2000.4)
==2767==    by 0x344E3422B6: gtk_main (in /usr/lib64/libgtk-x11-2.0.so.0.1600.6)
==2767==    by 0x403AFF: main (main.c:623)
==2767==  Address 0x1d8a1498 is 32 bytes inside a block of size 72 free'd
==2767==    at 0x4A0633D: free (vg_replace_malloc.c:323)
==2767==    by 0x8446132: dbus_free (dbus-memory.c:644)
==2767==    by 0x8437EE8: _dbus_pending_call_last_unref (dbus-pending-call.c:413)
==2767==    by 0x8438118: dbus_pending_call_unref (dbus-pending-call.c:579)
==2767==    by 0x84246FF: dbus_connection_send_with_reply_and_block (dbus-connection.c:3334)
==2767==    by 0x81FB879: get_name_owner (dbus-gproxy.c:882)
==2767==    by 0x81FE13B: dbus_g_proxy_new_for_name_owner (dbus-gproxy.c:1995)
==2767==    by 0x71806F0: e_cal_get_query (e-cal.c:3591)
==2767==    by 0xF6361F9: update_query_async (gnome-cal.c:1046)
==2767==    by 0xF633CA0: message_proxy (gnome-cal.c:185)
==2767==    by 0x30A8261F31: (within /lib64/libglib-2.0.so.0.2000.4)
==2767==    by 0x30A8260933: (within /lib64/libglib-2.0.so.0.2000.4)
Comment 5 Milan Crha 2009-10-08 09:49:27 UTC
(In reply to comment #2)
> - make update_query_async synchronous
>   Obviously lessens interactivity; but how long do we expect to be blocking
> here?  Is this just reading local files, or will we do e.g. http requests?

The backend can make remote calls too, in order the events are not stored locally yet. Or it can wait for a local cache filling before returning. Thus yes, it can take some time.
Comment 6 Ross Burton 2009-10-08 09:52:53 UTC
Damn.

We probably do need to solve the entire threaded apps using libecal/libebook sync APIs problem. I'd done some basic testing and it had been working for me, but obviously it's not working entirely correctly.

I'd like to see a move to less threads, so evolution should make async calls instead of sync calls in a thread.

Does anyone know where there are threading problems in dbus/dbus-glib?  I thought that libdbus itself was threadsafe these days, so any problems are limited to dbus-glib.
Comment 7 Akhil Laddha 2009-10-08 10:17:35 UTC
*** Bug 597517 has been marked as a duplicate of this bug. ***
Comment 8 Akhil Laddha 2009-10-08 10:19:12 UTC
*** Bug 596293 has been marked as a duplicate of this bug. ***
Comment 9 Matthew Barnes 2009-10-08 13:45:48 UTC
(In reply to comment #6)
> I'd like to see a move to less threads, so evolution should make async calls
> instead of sync calls in a thread.

+100.  I'm pushing for the same thing on the Camel/mailer side.
Comment 10 Colin Walters 2009-10-08 13:59:35 UTC
(In reply to comment #6)
> Damn.
> 
> We probably do need to solve the entire threaded apps using libecal/libebook
> sync APIs problem. I'd done some basic testing and it had been working for me,
> but obviously it's not working entirely correctly.
> 
> I'd like to see a move to less threads, so evolution should make async calls
> instead of sync calls in a thread.
> 
> Does anyone know where there are threading problems in dbus/dbus-glib?  I
> thought that libdbus itself was threadsafe these days, so any problems are
> limited to dbus-glib.

libdbus has a simple lock around the connection and won't obviously explode if used from multiple threads, and in fact could be termed "threadsafe".  But that doesn't mean it has the semantics that one might want.  This is

http://bugs.freedesktop.org/show_bug.cgi?id=857

I haven't done much auditing of dbus-glib in this respect.
Comment 11 Colin Walters 2009-10-08 14:56:21 UTC
(In reply to comment #4)
> Just for a record, why we are talking about update_query_async:

I know for sure there are problems with the dbus-glib/libdbus stack and threads here; it's not going to be immediately useful to trace down exactly what the cause of the double free is here because it will likely come down to some sort of race condition, and to fix that kind of thing requires overall design work.
Comment 12 Milan Crha 2009-10-14 12:44:22 UTC
Created attachment 145414 [details] [review]
test eds patch

for evolution-data-server;

Akhil, could you try to run with this test patch, please? It does nothing special, only adds locking on the connection/proxy on evo side, something similar should be done properly in a lower level, in dbus/dbus-glib, I believe, in a very less invasive manner.

Anyway, what you can see is also a freeze of UI sometimes, from which it recovers after a while, so I believe it'll be easier to fix evo for "more threading" (I'm sorry Matt), than waiting for thread-safety on dbus side or rewriting all of the EBook/ECal stuff to be async. The async-ness is too complicated, anyway.

Evo backtrace of one such freeze:

Thread 2 (Thread 0x7fd0a3fff910 (LWP 780)):
#0  in poll ()
#1  in socket_do_iteration (...) at dbus-transport-socket.c:1046
#2  in _dbus_transport_do_iteration (...) at dbus-transport.c:956
#3  in _dbus_connection_do_iteration_unlocked (...) at dbus-connection.c:1150
#4  in _dbus_connection_block_pending_call (...) at dbus-connection.c:2381
#5  in dbus_g_proxy_end_call_internal (...) at dbus-gproxy.c:2264
#6  in dbus_g_proxy_call (...) at dbus-gproxy.c:2593
#7  in org_gnome_evo...ar_CalView_start (...) at e-data-cal-view-bindings.h:32
#8  in e_cal_view_start (...) at e-cal-view.c:418
#9  in update_query_async (...) at gnome-cal.c:1074
#10 in message_proxy (...) at gnome-cal.c:185
#11 in ?? ()
#12 in ?? ()
#13 in start_thread ()
#14 in clone ()
#15 in ?? ()

Thread 1 (Thread 0x7fd0fa767800 (LWP 663))

  • #20 dbus_g_proxy_manager_filter
    at dbus-gproxy.c line 1293

Comment 13 Ross Burton 2009-10-14 16:24:11 UTC
This patch puts locks around the DBus connection, so any threads making concurrent calls via DBus will be serialised.  This is very invasive. :/
Comment 14 Milan Crha 2009-10-14 18:12:05 UTC
Yup, I know, but I have nothing better than this :( But I'm still opened to suggestions, even not much willing to rewrite all the evo from a ground.
Comment 15 Akhil Laddha 2009-10-27 09:16:40 UTC
Patch attached in comment#12 reduces crash frequency a lot.
Comment 16 Chenthill P 2009-11-02 09:45:15 UTC
Ross, what do u suggest,

Go with the patch at comment #12 until we get a fix from libdbus/dbus-glib ?
Comment 17 Akhil Laddha 2009-11-11 04:00:55 UTC
*** Bug 601436 has been marked as a duplicate of this bug. ***
Comment 18 Akhil Laddha 2009-11-11 04:01:05 UTC
*** Bug 601434 has been marked as a duplicate of this bug. ***
Comment 19 David Ronis 2009-11-11 16:52:35 UTC
For what it's worth, the patch doesn't apply cleanly to the current git master.
Comment 20 Chenthill P 2009-11-13 09:34:06 UTC
milan, please commit the patch.
Comment 21 Milan Crha 2009-11-13 10:36:34 UTC
Created commit 2f75310 in eds master (2.29.2+)
Comment 22 Ritesh Khadgaray ( irc:ritz) 2009-11-17 05:22:53 UTC
*** Bug 601888 has been marked as a duplicate of this bug. ***