After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 702520 - queue: deadlock when reconfigure event
queue: deadlock when reconfigure event
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gstreamer (core)
1.1.1
Other Linux
: Normal blocker
: 1.1.2
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2013-06-18 00:50 UTC by Aleix Conchillo Flaqué
Modified: 2013-06-19 18:09 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Aleix Conchillo Flaqué 2013-06-18 00:50:13 UTC
I just switched from 1.0.7 to 1.1.1 and a working program got stuck. It seems something (mutex lock) introduced in this commit:

http://cgit.freedesktop.org/gstreamer/gstreamer/commit/plugins/elements/gstqueue.c?id=c955ddc712f3b4de9ef5d822b95a6f4bd9985eb3

I still don't understand why this happens, I'll keep on checking.

This is the backtrace with the affected threads:

Thread 4 (Thread 0x7f3a73004700 (LWP 26202))

  • #0 __lll_lock_wait
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S line 132
  • #1 _L_lock_1006
    from /lib/x86_64-linux-gnu/libpthread.so.0
  • #2 __pthread_mutex_lock
    at pthread_mutex_lock.c line 101
  • #3 g_mutex_lock
    from /opt/oblong/deps/lib/libglib-2.0.so.0
  • #4 gst_queue_handle_src_event
    at gstqueue.c line 1288
  • #5 gst_pad_send_event_unchecked
    at gstpad.c line 4996
  • #6 gst_pad_push_event_unchecked
    at gstpad.c line 4685
  • #7 gst_pad_push_event
    at gstpad.c line 4808
  • #8 event_forward_func
    at gstpad.c line 2741
  • #9 gst_pad_forward
    at gstpad.c line 2695
  • #10 gst_pad_event_default
    at gstpad.c line 2792
  • #11 gst_pad_send_event_unchecked
    at gstpad.c line 4996
  • #12 gst_pad_push_event_unchecked
    at gstpad.c line 4685
  • #13 gst_pad_push_event
    at gstpad.c line 4808
  • #14 event_forward_func
    at gstpad.c line 2741
  • #15 gst_pad_forward
    at gstpad.c line 2695
  • #16 gst_pad_event_default
    at gstpad.c line 2792
  • #17 gst_pad_send_event_unchecked
    at gstpad.c line 4996
  • #18 gst_pad_push_event_unchecked
    at gstpad.c line 4685
  • #19 gst_pad_push_event
    at gstpad.c line 4808
  • #20 gst_type_find_element_src_event
    at gsttypefindelement.c line 519
  • #21 gst_pad_send_event_unchecked
    at gstpad.c line 4996
  • #22 gst_pad_push_event_unchecked
    at gstpad.c line 4685
  • #23 gst_pad_push_event
    at gstpad.c line 4808
  • #24 event_forward_func
    at gstpad.c line 2741
  • #25 gst_pad_forward
    at gstpad.c line 2695
  • #26 gst_pad_event_default
    at gstpad.c line 2792
  • #27 gst_pad_send_event_unchecked
    at gstpad.c line 4996
  • #28 gst_pad_send_event
    at gstpad.c line 5154
  • #29 gst_pad_link_full
    at gstpad.c line 2289

Comment 1 Aleix Conchillo Flaqué 2013-06-18 01:37:58 UTC
(In reply to comment #0)
> 
> This is the backtrace with the affected threads:
> 

Sorry, only thread 4 and 2 from the bt apply to the same queue.
Comment 2 Sebastian Dröge (slomo) 2013-06-18 11:25:12 UTC
Can you provide backtraces of all GStreamer related threads? From these threads there should not be a deadlock as none of them hold that mutex while waiting for the GCond. There must be one thread that is currently blocking this mutex and not releasing it because it waits for something else to finish, namely the allocation query or the pad linking that caused the reconfigure event.
Comment 3 Aleix Conchillo Flaqué 2013-06-18 14:16:11 UTC
(In reply to comment #2)
> Can you provide backtraces of all GStreamer related threads? 

I'll do as soon as I get to the office.

> From these threads
> there should not be a deadlock as none of them hold that mutex while waiting
> for the GCond. There must be one thread that is currently blocking this mutex
> and not releasing it because it waits for something else to finish, namely the
> allocation query or the pad linking that caused the reconfigure event.

It might not be this, but isn't gst_queue_handle_sink_query holding the mutex and waiting on query_handled (thread 2)? The task (queue_loop) seems to start when the reconfigure event is received, but this can't happen because gst_queue_handle_sink_query has the mutex (thread 4).
Comment 4 Aleix Conchillo Flaqué 2013-06-18 16:55:27 UTC
(In reply to comment #2)
> Can you provide backtraces of all GStreamer related threads?

Just rechecked and the backtrace I originally sent contained all the GStreamer related threads.
Comment 5 Aleix Conchillo Flaqué 2013-06-18 17:22:39 UTC
I cut some parts before, here is the full bt.

(gdb) thread apply all bt

Thread 4 (Thread 0x7f3a73004700 (LWP 26202))

  • #0 __lll_lock_wait
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S line 132
  • #1 _L_lock_1006
    from /lib/x86_64-linux-gnu/libpthread.so.0
  • #2 __pthread_mutex_lock
    at pthread_mutex_lock.c line 101
  • #3 g_mutex_lock
    from /opt/oblong/deps/lib/libglib-2.0.so.0
  • #4 gst_queue_handle_src_event
    at gstqueue.c line 1288
  • #5 gst_pad_send_event_unchecked
    at gstpad.c line 4996
  • #6 gst_pad_push_event_unchecked
    at gstpad.c line 4685
  • #7 gst_pad_push_event
    at gstpad.c line 4808
  • #8 event_forward_func
    at gstpad.c line 2741
  • #9 gst_pad_forward
    at gstpad.c line 2695
  • #10 gst_pad_event_default
    at gstpad.c line 2792
  • #11 gst_pad_send_event_unchecked
    at gstpad.c line 4996
  • #12 gst_pad_push_event_unchecked
    at gstpad.c line 4685
  • #13 gst_pad_push_event
    at gstpad.c line 4808
  • #14 event_forward_func
    at gstpad.c line 2741
  • #15 gst_pad_forward
    at gstpad.c line 2695
  • #16 gst_pad_event_default
    at gstpad.c line 2792
  • #17 gst_pad_send_event_unchecked
    at gstpad.c line 4996
  • #18 gst_pad_push_event_unchecked
    at gstpad.c line 4685
  • #19 gst_pad_push_event
    at gstpad.c line 4808
  • #20 gst_type_find_element_src_event
    at gsttypefindelement.c line 519
  • #21 gst_pad_send_event_unchecked
    at gstpad.c line 4996
  • #22 gst_pad_push_event_unchecked
    at gstpad.c line 4685
  • #23 gst_pad_push_event
    at gstpad.c line 4808
  • #24 event_forward_func
    at gstpad.c line 2741
  • #25 gst_pad_forward
    at gstpad.c line 2695
  • #26 gst_pad_event_default
    at gstpad.c line 2792
  • #27 gst_pad_send_event_unchecked
    at gstpad.c line 4996
  • #28 gst_pad_send_event
    at gstpad.c line 5154
  • #29 gst_pad_link_full
    at gstpad.c line 2289
  • #30 XXXXXXXX
  • #31 new_decoded_pad_trambampoline
  • #32 g_cclosure_marshal_VOID__OBJECTv
    from /opt/oblong/deps/lib/libgobject-2.0.so.0
  • #33 ??
    from /opt/oblong/deps/lib/libgobject-2.0.so.0
  • #34 g_signal_emit_valist
    from /opt/oblong/deps/lib/libgobject-2.0.so.0
  • #35 g_signal_emit
    from /opt/oblong/deps/lib/libgobject-2.0.so.0
  • #36 gst_element_add_pad
    at gstelement.c line 697
  • #37 gst_decode_bin_expose
    at gstdecodebin2.c line 3957
  • #38 source_pad_blocked_cb
    at gstdecodebin2.c line 4129
  • #39 probe_hook_marshal
    at gstpad.c line 3061
  • #40 g_hook_list_marshal
    from /opt/oblong/deps/lib/libglib-2.0.so.0
  • #41 do_probe_callbacks
    at gstpad.c line 3155
  • #42 gst_pad_push_event_unchecked
    at gstpad.c line 4652
  • #43 push_sticky
    at gstpad.c line 3324
  • #44 events_foreach
    at gstpad.c line 530
  • #45 check_sticky
    at gstpad.c line 3380
  • #46 gst_pad_peer_query
    at gstpad.c line 3581
  • #47 query_forward_func
    at gstpad.c line 2931
  • #48 gst_pad_forward
    at gstpad.c line 2695
  • #49 gst_pad_query_default
    at gstpad.c line 2995
  • #50 gst_pad_query
    at gstpad.c line 3465
  • #51 gst_pad_peer_query
    at gstpad.c line 3596
  • #52 query_forward_func
    at gstpad.c line 2931
  • #53 gst_pad_forward
    at gstpad.c line 2695
  • #54 gst_pad_query_default
    at gstpad.c line 2995
  • #55 gst_pad_query
    at gstpad.c line 3465
  • #56 gst_pad_peer_query
    at gstpad.c line 3596
  • #57 query_forward_func
    at gstpad.c line 2931
  • #58 gst_pad_forward
    at gstpad.c line 2695
  • #59 gst_pad_query_default
    at gstpad.c line 2995
  • #60 gst_pad_query
    at gstpad.c line 3465
  • #61 gst_pad_peer_query
    at gstpad.c line 3596
  • #62 gst_queue_push_one
    at gstqueue.c line 1184
  • #63 gst_queue_loop
    at gstqueue.c line 1238
  • #64 gst_task_func
    at gsttask.c line 316
  • #65 ??
    from /opt/oblong/deps/lib/libglib-2.0.so.0
  • #66 ??
    from /opt/oblong/deps/lib/libglib-2.0.so.0
  • #67 ??
    from /usr/lib/libGL.so.1
  • #68 start_thread
    at pthread_create.c line 308
  • #69 clone
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S line 112
  • #70 ??

Comment 6 Aleix Conchillo Flaqué 2013-06-18 17:34:24 UTC
If I am not wrong the problem is that we are in gst_queue_push_one (#62) that is called from gst_queue_loop (#63) which is holding a mutex. And we end up in gst_queue_handle_src_event (#4) which takes the same mutex.
Comment 7 Aleix Conchillo Flaqué 2013-06-18 19:18:17 UTC
A g_mutex_trylock in gst_queue_handle_src_event seems to solve the issue. But I don't really know this code, so it might have side effects.

Everything works, though.

----------------

diff --git a/plugins/elements/gstqueue.c b/plugins/elements/gstqueue.c
index 765005d..f9bf545 100644
--- a/plugins/elements/gstqueue.c
+++ b/plugins/elements/gstqueue.c
@@ -138,6 +138,8 @@ enum
     goto label;                                                         \
 } G_STMT_END
 
+#define GST_QUEUE_MUTEX_TRYLOCK(q) g_mutex_trylock (&q->qlock)
+
 #define GST_QUEUE_MUTEX_UNLOCK(q) G_STMT_START {                        \
   g_mutex_unlock (&q->qlock);                                            \
 } G_STMT_END
@@ -1283,18 +1285,20 @@ gst_queue_handle_src_event (GstPad * pad, GstObject * parent, GstEvent * event)
 #endif
 
   switch (GST_EVENT_TYPE (event)) {
-    case GST_EVENT_RECONFIGURE:
-      GST_QUEUE_MUTEX_LOCK (queue);
+    case GST_EVENT_RECONFIGURE: {
+      gboolean locked = GST_QUEUE_MUTEX_TRYLOCK (queue);
       if (queue->srcresult == GST_FLOW_NOT_LINKED) {
         /* when we got not linked, assume downstream is linked again now and we
          * can try to start pushing again */
         queue->srcresult = GST_FLOW_OK;
         gst_pad_start_task (pad, (GstTaskFunction) gst_queue_loop, pad, NULL);
       }
-      GST_QUEUE_MUTEX_UNLOCK (queue);
+      if (locked)
+        GST_QUEUE_MUTEX_UNLOCK (queue);
 
       res = gst_pad_push_event (queue->sinkpad, event);
       break;
+    }
     default:
       res = gst_pad_event_default (pad, parent, event);
       break;
Comment 8 Sebastian Dröge (slomo) 2013-06-19 08:54:33 UTC
This should fix it, please test :) Thanks for your analysis

commit 2f8e572887219759cedd3d1cf9a044a6e083c8c7
Author: Sebastian Dröge <slomo@circular-chaos.org>
Date:   Wed Jun 19 10:53:21 2013 +0200

    queue: Don't hold the queue mutex while doing serialized queries downstream
    
    https://bugzilla.gnome.org/show_bug.cgi?id=702520
Comment 9 Aleix Conchillo Flaqué 2013-06-19 18:09:43 UTC
Yes, that fixed it. Thanks!