After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 745319 - queue: can lock up the pipeline on serialized queries when downstream returns errors
queue: can lock up the pipeline on serialized queries when downstream returns...
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gstreamer (core)
1.4.5
Other Linux
: Normal normal
: 1.5.1
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2015-02-28 04:38 UTC by Mohammed Sameer
Modified: 2015-03-09 07:28 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Mohammed Sameer 2015-02-28 04:38:20 UTC
I have a simple pipeline like this:
gst-launch-1.0 playbin video-sink=fakesink uri=file:///path/to/file

I am using a custom decoder element which gets picked up by playbin.

The decoder has a task running on its src pad to push decoded buffers downstream which fails due to caps negotiation failing.

handle_frame then starts to return GST_FLOW_NOT_NEGOTIATED as well.

The problem is vqueue locks up and the pipeline hangs.

I traced it farther and the problem seems to be in gstqueue

decoder tries to finish a decoded frame
gst_queue_chain gets called with a buffer
gst_queue_loop wakes up because a buffer got pushed to the queue
gst_queue_push_one picks up the buffer and then returns a flow error
_loop() goes to out_flushing which will pause the task but it will 

the custom decoder is still not aware of this error tries to finish a new frame
but since caps negotiation has not succeeded then gstvideodecoder will try to negotiate and send an allocation query which will freeze waiting on gstqueue query_handled which will never get signaled because the task has been paused.

The stack trace from gdb indicates that:
  • #0 __libc_do_syscall
    at ../ports/sysdeps/unix/sysv/linux/arm/libc-do-syscall.S line 43
  • #1 __pthread_cond_wait
    at pthread_cond_wait.c line 187
  • #2 g_cond_wait
    at gthread-posix.c line 753
  • #3 gst_queue_handle_sink_query
    at gstqueue.c line 922
  • #4 gst_pad_query
    at gstpad.c line 3584
  • #5 gst_pad_peer_query
    at gstpad.c line 3712
  • #6 query_forward_func
    at gstpad.c line 3040
  • #7 gst_pad_forward
    at gstpad.c line 2796
  • #8 gst_pad_query_default
    at gstpad.c line 3104
  • #9 gst_pad_query
    at gstpad.c line 3584
  • #10 gst_pad_peer_query
    at gstpad.c line 3712
  • #11 query_forward_func
    at gstpad.c line 3040
  • #12 gst_pad_forward
    at gstpad.c line 2796
  • #13 gst_pad_query_default
    at gstpad.c line 3104
  • #14 gst_pad_query
    at gstpad.c line 3584
  • #15 gst_pad_peer_query
    at gstpad.c line 3712
  • #16 ??
    from /usr/lib/gstreamer-1.0/libgstdeinterlace.so
  • #17 gst_pad_query
    at gstpad.c line 3584
  • #18 gst_pad_peer_query
    at gstpad.c line 3712
  • #19 gst_base_transform_default_propose_allocation
  • #20 gst_video_filter_propose_allocation
    at gstvideofilter.c line 64
  • #21 gst_base_transform_default_query
    at gstbasetransform.c line 1482
  • #22 gst_pad_query
    at gstpad.c line 3584
  • #23 gst_pad_peer_query
    at gstpad.c line 3712
  • #24 query_forward_func
    at gstpad.c line 3040
  • #25 gst_pad_forward
    at gstpad.c line 2796
  • #26 gst_pad_query_default
    at gstpad.c line 3104
  • #27 gst_pad_query
    at gstpad.c line 3584
  • #28 gst_pad_peer_query
    at gstpad.c line 3712
  • #29 query_forward_func
    at gstpad.c line 3040
  • #30 gst_pad_forward
    at gstpad.c line 2796
  • #31 gst_pad_query_default
    at gstpad.c line 3104
  • #32 gst_pad_query
    at gstpad.c line 3584
  • #33 gst_pad_peer_query
    at gstpad.c line 3712
  • #34 query_forward_func
    at gstpad.c line 3040
  • #35 gst_pad_forward
    at gstpad.c line 2796
  • #36 gst_pad_query_default
    at gstpad.c line 3104
  • #37 gst_pad_query
    at gstpad.c line 3584
  • #38 gst_pad_peer_query
    at gstpad.c line 3712
  • #39 query_forward_func
    at gstpad.c line 3040
  • #40 gst_pad_forward
    at gstpad.c line 2796
  • #41 gst_pad_query_default
    at gstpad.c line 3104
  • #42 gst_pad_query
    at gstpad.c line 3584
  • #43 gst_pad_peer_query
    at gstpad.c line 3712
  • #44 query_forward_func
    at gstpad.c line 3040
  • #45 gst_pad_forward
    at gstpad.c line 2796
  • #46 gst_pad_query_default
    at gstpad.c line 3104
  • #47 gst_pad_query
    at gstpad.c line 3584
  • #48 gst_pad_peer_query
    at gstpad.c line 3712
  • #49 query_forward_func
    at gstpad.c line 3040
  • #50 gst_pad_forward
    at gstpad.c line 2796
  • #51 gst_pad_query_default
    at gstpad.c line 3104
  • #52 gst_decode_pad_query
    at gstdecodebin2.c line 4365
  • #53 gst_pad_query
    at gstpad.c line 3584
  • #54 gst_pad_peer_query
    at gstpad.c line 3712
  • #55 gst_video_decoder_negotiate_pool
    at gstvideodecoder.c line 3293
  • #56 gst_video_decoder_negotiate_unlocked
    at gstvideodecoder.c line 3443
  • #57 gst_video_decoder_finish_frame
    at gstvideodecoder.c line 2649
  • #4 gst_pad_query
    at gstpad.c line 3584
$1 = {mini_object = {type = 21345144, refcount = 1, lockstate = 0, flags = 0, 
    copy = 0xb6ed7c15 <_gst_query_copy>, dispose = 0x0, free = 0xb6ed7161 <_gst_query_free>, 
    n_qdata = 0, qdata = 0x0}, type = GST_QUERY_ALLOCATION}

I have been reading the code and the only thing I can think of is a race condition:

gst_queue_handle_sink_query
queue->srcresult is set to GST_FLOW_OK before g_cond_wait gets invoked (gstqueue.c:909 and 917)

meanwhile gst_queue_loop updates queue->srcresult (line 1299) and the cond never gets signaled and we wait forever.

I tried with a few other decoders and I am not sure why my custom decoder is causing this ;-)
Comment 1 Sebastian Dröge (slomo) 2015-03-01 17:35:49 UTC
Can you write a testcase for this? And maybe in queue (and probably queue2 and multiqueue) we should first check for the srcresult before waiting for the GCond... might this already fix your problem?
Comment 2 Sebastian Dröge (slomo) 2015-03-03 11:06:03 UTC
Looking at the code again, it already checks the srcresult before waiting in the macro. So in theory what you described should never happen.
Comment 3 Sebastian Dröge (slomo) 2015-03-03 11:50:05 UTC
Actually... please test this commit :)

commit a941b4651ce769c87243c84050120c8d244588f6
Author: Sebastian Dröge <sebastian@centricular.com>
Date:   Tue Mar 3 12:48:34 2015 +0100

    queue: Wake up the query function on errors from the loop function
    
    Otherwise we might wait forever for serialized queries to be handled as the
    loop function is stopped and as such we will never ever dequeue the query and
    handle it.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=745319
Comment 4 Sebastian Dröge (slomo) 2015-03-03 11:55:10 UTC
This one is also relevant, same problem really:

commit bc77a3fa0a610187d468c38ce83ceec2e9a79688
Author: Sebastian Dröge <sebastian@centricular.com>
Date:   Tue Mar 3 12:53:13 2015 +0100

    queue2: Signal the sinkpad thread if a flow error happened
    
    It might still be waiting for a query to be handled, or the queue to become
    empty again for the next item. Also if downstream returns FLUSHING, flush the
    queue like we do in queue and multiqueue.
Comment 5 Mohammed Sameer 2015-03-08 17:50:36 UTC
I tried creating a test case but miserably failed :(

I also cannot reproduce it anymore. Apparently the fixes I put in the custom decoder fixed that issue somehow. I will try harder to reproduce it but it will need some more time.
Comment 6 Sebastian Dröge (slomo) 2015-03-09 07:28:58 UTC
Let's assume this is fixed then for now. Please reopen otherwise.