After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 747349 - Conversion of gdbus to use GTask causes deadlocks
Conversion of gdbus to use GTask causes deadlocks
Status: RESOLVED FIXED
Product: glib
Classification: Platform
Component: gio
2.44.x
Other Linux
: Normal major
: ---
Assigned To: gtkdev
gtkdev
Depends on:
Blocks:
 
 
Reported: 2015-04-04 21:36 UTC by Ross Lagerwall
Modified: 2015-04-06 16:22 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
gdbus: fix deadlock on message cancel/timeout (5.92 KB, patch)
2015-04-06 14:44 UTC, Dan Winship
committed Details | Review

Description Ross Lagerwall 2015-04-04 21:36:45 UTC
The conversion of gdbus to use GTask seems to frequently cause deadlocks when refreshing a remote share in Nautilus (i.e. using gvfs). I haven't yet had the time to dig into the cause.

Example backtrace:

(gdb) thread apply all bt

Thread 2 (Thread 0x7fab4f499700 (LWP 4045))

  • #0 syscall
    from /usr/lib/libc.so.6
  • #1 g_mutex_lock_slowpath
    at gthread-posix.c line 1313
  • #2 g_mutex_lock
    at gthread-posix.c line 1337
  • #3 on_worker_message_about_to_be_sent
    at gdbusconnection.c line 2319
  • #4 _g_dbus_worker_emit_message_about_to_be_sent
    at gdbusprivate.c line 465
  • #5 continue_writing
    at gdbusprivate.c line 1455
  • #6 continue_writing_in_idle_cb
    at gdbusprivate.c line 1516
  • #7 g_idle_dispatch
    at gmain.c line 5397
  • #8 g_main_dispatch
    at gmain.c line 3122
  • #9 g_main_context_dispatch
    at gmain.c line 3737
  • #10 g_main_context_iterate
    at gmain.c line 3808
  • #11 g_main_loop_run
    at gmain.c line 4002
  • #12 gdbus_shared_thread_func
    at gdbusprivate.c line 246
  • #13 g_thread_proxy
    at gthread.c line 764
  • #14 start_thread
    from /usr/lib/libpthread.so.0
  • #15 clone
    from /usr/lib/libc.so.6

Thread 1 (Thread 0x7fab5e05b940 (LWP 4043))

  • #0 syscall
    from /usr/lib/libc.so.6
  • #1 g_mutex_lock_slowpath
    at gthread-posix.c line 1313
  • #2 g_mutex_lock
    at gthread-posix.c line 1337
  • #3 g_dbus_connection_unregister_object
    at gdbusconnection.c line 5163
  • #4 remove_connection_locked
    at gdbusinterfaceskeleton.c line 734
  • #5 g_dbus_interface_skeleton_unexport
    at gdbusinterfaceskeleton.c line 964
  • #6 g_daemon_file_enumerator_finalize
    at gdaemonfileenumerator.c line 133
  • #7 g_object_unref
    at gobject.c line 3174
  • #8 async_call_enumerate_free
    at gdaemonfile.c line 3363
  • #9 async_proxy_create_free
    at gdaemonfile.c line 549
  • #10 weak_refs_notify
    at gobject.c line 2629
  • #11 g_data_set_internal
    at gdataset.c line 407
  • #12 g_datalist_id_set_data_full
    at gdataset.c line 670
  • #13 g_object_real_dispose
    at gobject.c line 1021
  • #14 g_object_unref
    at gobject.c line 3137
  • #15 enumerate_children_async_cb
    at gdaemonfile.c line 3393
  • #16 g_task_return_now
    at gtask.c line 1106
  • #17 g_task_return
    at gtask.c line 1164
  • #18 g_task_return_error
    at gtask.c line 1703
  • #19 reply_cb
    at gdbusproxy.c line 2568
  • #20 g_task_return_now
    at gtask.c line 1106
  • #21 g_task_return
    at gtask.c line 1164
  • #22 g_task_return_error
    at gtask.c line 1703
  • #23 g_dbus_connection_call_done
    at gdbusconnection.c line 5401
  • #24 g_task_return_now
    at gtask.c line 1106
  • #25 g_task_return
    at gtask.c line 1164
  • #26 g_task_return_error
    at gtask.c line 1703
  • #27 g_task_return_new_error
    at gtask.c line 1737
  • #28 send_message_with_reply_cancelled_idle_cb
    at gdbusconnection.c line 1838
  • #29 g_idle_dispatch
    at gmain.c line 5397
  • #30 g_main_dispatch
    at gmain.c line 3122
  • #31 g_main_context_dispatch
    at gmain.c line 3737
  • #32 g_main_context_iterate
    at gmain.c line 3808
  • #33 g_main_context_iteration
    at gmain.c line 3869
  • #34 g_application_run
    at gapplication.c line 2308
  • #35 main
    at nautilus-main.c line 103

Comment 1 Ross Lagerwall 2015-04-04 21:44:01 UTC
I suppose the problem is that send_message_with_reply_cancelled_idle_cb takes the connection lock and ends up calling the application callback with the lock still held.
Comment 2 Dan Winship 2015-04-06 14:44:21 UTC
Created attachment 301022 [details] [review]
gdbus: fix deadlock on message cancel/timeout

The gdbus GTask port introduced a deadlock because some code had been
using g_simple_async_result_complete_in_idle() to ensure that the
callback didn't run until after a mutex was unlocked, but in the gtask
version, the callback was being run immediately. Fix it to drop the
mutex before calling g_task_return*(). Also, tweak
tests/gdbus-connection to test this.
Comment 3 Ross Lagerwall 2015-04-06 15:16:17 UTC
(In reply to Dan Winship from comment #2)
> Created attachment 301022 [details] [review] [review]
> gdbus: fix deadlock on message cancel/timeout
> 
> The gdbus GTask port introduced a deadlock because some code had been
> using g_simple_async_result_complete_in_idle() to ensure that the
> callback didn't run until after a mutex was unlocked, but in the gtask
> version, the callback was being run immediately. Fix it to drop the
> mutex before calling g_task_return*(). Also, tweak
> tests/gdbus-connection to test this.

Well that seems to fix the hangs. Thanks
Comment 4 Matthias Clasen 2015-04-06 16:18:46 UTC
Review of attachment 301022 [details] [review]:

Looks good. Thanks for updating the tests to cover this.

::: gio/gdbusconnection.c
@@ +1858,3 @@
+                                   G_IO_ERROR,
+                                   G_IO_ERROR_CANCELLED,
+                                   _("Operation was cancelled"));

I slightly prefer these on one line, but no big deal either way
Comment 5 Dan Winship 2015-04-06 16:22:25 UTC
Attachment 301022 [details] pushed as 7e8d414 - gdbus: fix deadlock on message cancel/timeout