After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 708816 - gvfs-daemon deadlock in peer_connection_closed
gvfs-daemon deadlock in peer_connection_closed
Status: RESOLVED FIXED
Product: gvfs
Classification: Core
Component: daemon
git master
Other Linux
: Normal normal
: ---
Assigned To: gvfs-maint
gvfs-maint
Depends on:
Blocks:
 
 
Reported: 2013-09-26 09:41 UTC by Bastien Nocera
Modified: 2013-09-26 11:34 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
GVfsDaemon: Don't deadlock on cliend disconnect (2.29 KB, patch)
2013-09-26 11:34 UTC, Alexander Larsson
committed Details | Review

Description Bastien Nocera 2013-09-26 09:41:56 UTC
gvfs-1.16.3-2.fc19.x86_64
libsoup-2.42.2-2.fc19.x86_64

gvfs-open with an http URL will hang for about 50 seconds as the gvfsd-http daemon fails to respond to requests. This is due to a deadlock when cancelling a task. Backtrace of gvfsd-http below.

  • #0 __lll_lock_wait
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S line 135
  • #1 _L_lock_974
    from /lib64/libpthread.so.0
  • #2 __GI___pthread_mutex_lock
    at pthread_mutex_lock.c line 104
  • #3 g_mutex_lock
    at gthread-posix.c line 210
  • #4 job_finished_callback
    at gvfsdaemon.c line 552
  • #5 g_closure_invoke
    at gclosure.c line 777
  • #6 signal_emit_unlocked_R
    at gsignal.c line 3584
  • #7 g_signal_emit_valist
    at gsignal.c line 3328
  • #8 g_signal_emit
    at gsignal.c line 3384
  • #9 _g_closure_invoke_va
    at gclosure.c line 840
  • #10 g_signal_emit_valist
    at gsignal.c line 3234
  • #11 g_signal_emit
    at gsignal.c line 3384
  • #12 g_vfs_job_send_reply
    at gvfsjob.c line 236
  • #13 g_vfs_job_failed_literal
    at gvfsjob.c line 270
  • #14 open_for_read_ready
    at gvfsbackendhttp.c line 361
  • #15 g_task_return_now
    at gtask.c line 1105
  • #16 g_task_return
    at gtask.c line 1158
  • #17 send_callback
    at gvfshttpinputstream.c line 222
  • #18 g_task_return_now
    at gtask.c line 1105
  • #19 g_task_return
    at gtask.c line 1158
  • #20 http_input_stream_ready_cb
    at soup-request-http.c line 126
  • #21 g_task_return_now
    at gtask.c line 1105
  • #22 g_task_return
    at gtask.c line 1158
  • #23 g_task_return_new_error
    at gtask.c line 1684
  • #24 async_send_request_return_result
    at soup-session.c line 3625
  • #25 _g_closure_invoke_va
    at gclosure.c line 840
  • #26 g_signal_emit_valist
    at gsignal.c line 3234
  • #27 g_signal_emit
    at gsignal.c line 3384
  • #28 soup_message_finished
    at soup-message.c line 1054
  • #29 soup_session_process_queue_item
    at soup-session.c line 1849
  • #30 async_run_queue
    at soup-session.c line 1896
  • #31 got_connection
    at soup-session.c line 1615
  • #32 socket_connect_finished
    at soup-connection.c line 504
  • #33 async_connected
    at soup-socket.c line 757
  • #34 g_task_return_now
    at gtask.c line 1105
  • #35 g_task_return
    at gtask.c line 1158
  • #36 g_task_return_error_if_cancelled
    at gtask.c line 1715
  • #37 g_socket_client_enumerator_callback
    at gsocketclient.c line 1618
  • #38 g_task_return_now
    at gtask.c line 1105
  • #39 g_task_return
    at gtask.c line 1158
  • #40 complete_async
    at gproxyaddressenumerator.c line 319
  • #41 g_task_return_now
    at gtask.c line 1105
  • #42 g_task_return
    at gtask.c line 1158
  • #43 got_addresses
    at soup-address.c line 1126
  • #44 complete_resolve_async
    at soup-address.c line 657
  • #45 lookup_resolved
    at soup-address.c line 699
  • #46 g_task_return_now
    at gtask.c line 1105
  • #47 g_task_return
    at gtask.c line 1158
  • #48 _g_closure_invoke_va
    at gclosure.c line 840
  • #49 g_signal_emit_valist
    at gsignal.c line 3234
  • #50 g_signal_emit
    at gsignal.c line 3384
  • #51 g_cancellable_cancel
    at gcancellable.c line 507
  • #52 g_vfs_job_cancel
    at gvfsjob.c line 229
  • #53 peer_connection_closed
    at gvfsdaemon.c line 620
  • #54 ffi_call_unix64
    from /lib64/libffi.so.6
  • #55 ffi_call
    from /lib64/libffi.so.6
  • #56 g_cclosure_marshal_generic
    at gclosure.c line 1454
  • #57 g_closure_invoke
    at gclosure.c line 777
  • #58 signal_emit_unlocked_R
    at gsignal.c line 3584
  • #59 g_signal_emit_valist
    at gsignal.c line 3328
  • #60 g_signal_emit
    at gsignal.c line 3384
  • #61 emit_closed_in_idle
    at gdbusconnection.c line 1377
  • #62 g_main_dispatch
    at gmain.c line 3054
  • #63 g_main_context_dispatch
    at gmain.c line 3630
  • #64 g_main_context_iterate
    at gmain.c line 3701
  • #65 g_main_loop_run
    at gmain.c line 3895
  • #66 daemon_main
    at daemon-main.c line 395
  • #67 main
    at daemon-main-generic.c line 42

Comment 1 Alexander Larsson 2013-09-26 09:46:48 UTC
So, what happens is that a gvfs mount daemon gets a dbus-connection-died signal from one of the clients, and there was an outstanding operation.

This then does in peer_connection_closed():

  g_mutex_lock (&daemon->lock);
  for (l = daemon->jobs; l != NULL; l = l->next)
    {
      GVfsJob *job = G_VFS_JOB (l->data);
      
      if (G_VFS_IS_JOB_DBUS (job) &&
          G_VFS_JOB_DBUS (job)->invocation &&
          g_dbus_method_invocation_get_connection (G_VFS_JOB_DBUS (job)->invocation) == connection)
        g_vfs_job_cancel (job);
    }
  g_mutex_unlock (&daemon->lock);

The g_vfs_job_cancel() emits a cancelled signal that eventually ends back in gvfsdaemon.c in job_finished_callback() to remove the job from the jobs list. However, this takes the daemon->lock again and deadlocks.

Obvious fix is to cancel outside the lock. The question is whether that is threadsafe.
Comment 2 Alexander Larsson 2013-09-26 09:49:05 UTC
handle_cancel() does cancel outside the lock, so it better be safe :)
Comment 3 Alexander Larsson 2013-09-26 11:34:03 UTC
Created attachment 255821 [details] [review]
GVfsDaemon: Don't deadlock on cliend disconnect

There was a deadlock that could happen on client disconnect in
peer_connection_closed(). If the cancelled job immediately called back into
job_finished_callback() we would deadlock on daemon->mutex.

So, make sure we cancel jobs without holding the lock, just like
handle_cancel() does.
Comment 4 Alexander Larsson 2013-09-26 11:34:30 UTC
Attachment 255821 [details] pushed as 3448fa0 - GVfsDaemon: Don't deadlock on cliend disconnect