GNOME Bugzilla – Bug 692358
appsrc deadlock setting the pipeline to NULL state
Last modified: 2013-03-12 10:21:17 UTC
If the block property is set to TRUE on an appsrc element and an application pushes a buffer while the internal appsrc queue is full, the appsrc will block on this line: g_cond_wait (priv->cond, priv->mutex); If, in this condition, an error message is posted on the bus and the app sets the pipeline state to NULL, appsrc will crash. Any hint on how to handle this? Maybe send an EOS inside dispose?
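(For context, a simplified paraphrase of the code path where the push blocks, based on gst_app_src_push_buffer_full() in gstappsrc.c on the 0.10 branch; error handling and signal emission elided, so treat this as a sketch rather than the literal source:)

g_mutex_lock (priv->mutex);
while (TRUE) {
  if (priv->flushing)
    goto flushing;               /* push returns GST_FLOW_WRONG_STATE */
  if (priv->is_eos)
    goto eos;
  if (priv->max_bytes != 0 && priv->queued_bytes >= priv->max_bytes
      && priv->block) {
    /* queue full and block=true: sleep until a buffer is dequeued or
     * someone broadcasts priv->cond -- nothing does so on a straight
     * NULL-state teardown, hence the hang described above */
    g_cond_wait (priv->cond, priv->mutex);
    continue;
  }
  break;
}
/* ... enqueue the buffer and signal the streaming task ... */
g_mutex_unlock (priv->mutex);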
Do you have a small test case?
Not for now, sorry. I did some more debugging and found a workaround. Let me explain my use case better:

- pipeline1: src ! ... ! appsink sync=false
- pipeline2: appsrc block=true ! queue ! fdsink sync=false

fdsink writes to a socket, and the receiver's read speed can vary. I'm using gst_app_sink_set_callbacks, and in the appsink callback, i.e. in the streaming thread, I do:

gst_app_src_push_buffer (GST_APP_SRC (my_appsrc), buf);

This blocks, as expected, when the client is slow to read the data. I added some g_print calls before and after this line, just to be sure: http://cgit.freedesktop.org/gstreamer/gst-plugins-base/tree/gst-libs/gst/app/gstappsrc.c?h=0.10#n1476

Now if the client disconnects I get an error on the bus (broken pipe or similar). I cannot set the "first" pipeline to NULL, since gst_app_src_push_buffer is blocked and setting the state to NULL never returns. Instead I can:

- send EOS to appsrc; this unblocks gst_app_src_push_buffer
- not send any new buffers to appsrc
- set the first pipeline to NULL
- wait for an error on the appsrc pipeline; I get this one from appsrc: streaming task paused, reason error (-5)
- set the appsrc pipeline to NULL

If I set the appsrc pipeline to NULL as soon as I get the first bus error it often works, but any attempt to set the appsink pipeline to NULL will block/segfault: gst_app_src_push_buffer stays blocked even after the appsrc element has been disposed and finalized. Probably appsrc should unblock the push_buffer method and unref the buffer when the pipeline is disposed/finalized. I'll try to provide a test case in the future.
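(A minimal sketch of the appsink-to-appsrc bridge described above, using the 0.10 appsink callback API; my_appsrc and the wiring are illustrative, not taken from the attached app:)

#include <gst/gst.h>
#include <gst/app/gstappsink.h>
#include <gst/app/gstappsrc.h>

/* Illustrative global: the appsrc of the second pipeline. */
static GstElement *my_appsrc;

/* Runs in the streaming thread of the first pipeline. With block=true,
 * gst_app_src_push_buffer() blocks here when the appsrc queue is full. */
static GstFlowReturn
on_new_buffer (GstAppSink * sink, gpointer user_data)
{
  GstBuffer *buf = gst_app_sink_pull_buffer (sink);

  if (buf == NULL)
    return GST_FLOW_UNEXPECTED;

  /* push_buffer takes ownership of buf */
  return gst_app_src_push_buffer (GST_APP_SRC (my_appsrc), buf);
}

static void
attach_bridge (GstElement * appsink)
{
  GstAppSinkCallbacks callbacks = { NULL, NULL, on_new_buffer, NULL };

  gst_app_sink_set_callbacks (GST_APP_SINK (appsink), &callbacks, NULL, NULL);
}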
Could you provide a full stack trace of all threads (thread apply all bt) from when it's blocked?
(gdb) thread apply all bt

[full backtrace attached as Trace 231423; truncated here — 18 threads, from Thread 18 (0x7fffe5d0d700, LWP 31039) down to Thread 1 (0x7ffff7fc9740, LWP 30953)]
What is the pipeline exactly? Is there an indirect loop in the pipeline somehow? Do I read the stack trace correctly that you have your own sink (either derived, or going via appsink) which pushes buffers to an appsrc?

According to the stack trace, the basesink's change_state function (called from your app when setting the pipeline state to NULL) is waiting for the streaming thread to finish / give up the preroll lock. This should have happened in response to the ->unlock vfunc. Since the basesink's streaming thread is blocked in the appsrc push, the unlock vfunc needs to set the appsrc's state to NULL, so that push_buffer unblocks, the basesink's streaming thread can shut down, and the state change function can continue shutting down the other elements.
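(One reading of the above for the application-driven case, as a minimal sketch: tear down the appsrc side first so the blocked push returns. pipeline1/pipeline2 are the names used in this bug, and reliably unblocking on the NULL state change depends on the fix discussed further down:)

/* Shut down the consumer pipeline first so the blocked
 * gst_app_src_push_buffer() in pipeline1's streaming thread returns,
 * then the producer pipeline's state change can complete. */
gst_element_set_state (pipeline2, GST_STATE_NULL);   /* flushes appsrc, unblocks push_buffer */
gst_element_set_state (pipeline1, GST_STATE_NULL);   /* no longer deadlocks */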
I have two pipelines.

First pipeline: filesrc location=... ! matroskademux ! queue ! h264parse ! appsink sync=false emit-signals=false async=false preroll-queue-len=1

From the appsink streaming thread of the above pipeline I push buffers into the second one with gst_app_src_push_buffer.

Second pipeline: appsrc block=true ! queue ! matroskamux ! fdsink sync=false fd=..
I get the error message on the bus of the second pipeline if the client (fdsink) disconnects before the end of the stream, and the state change to NULL deadlocks on the first pipeline if I don't use the workaround described above.
Created attachment 234291 [details] small test app that reproduces the issue

To reproduce, close the xvimagesink window so that an error is posted on the bus. Please note that the deadlock does not always happen: the test app runs the same pipeline 100 times, and I was able to reproduce it after at most 40 tries. Here is the backtrace from the test app:
[backtrace attached as Trace 231429; truncated here — Thread 1 (0x7ffff7fcc700, LWP 9584)]
The issue is fixed by adding g_cond_broadcast (priv->cond); inside gst_app_src_stop, after setting priv->flushing to TRUE. If you think this is the correct solution I can send a patch.
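(For reference, roughly what the proposed change looks like in context — a paraphrase of gst_app_src_stop() in gstappsrc.c, not the literal patch:)

/* in gst_app_src_stop(), paraphrased */
g_mutex_lock (priv->mutex);
priv->flushing = TRUE;
gst_app_src_flush_queued (appsrc);
g_cond_broadcast (priv->cond);   /* proposed addition: wake any thread
                                  * blocked in gst_app_src_push_buffer()
                                  * so it can observe flushing and return */
g_mutex_unlock (priv->mutex);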
Created attachment 234295 [details] [review] fix deadlock

Here is the patch, tested on 0.10; it should apply and work on 1.0 too.
Right, so I can reproduce the deadlock with your test case in 0.10, and it's probably still present in 1.0.

I am not sure if your solution is really right. I think the problem is ultimately that we allow buffers to be pushed into the appsrc even in NULL state, both before start and after shutdown.

As part of the downward state change, the _unlock() and _unlock_stop() functions will be called, which will signal on the GCond. After that, no more buffers should be accepted IMHO, thus preventing any blocking after that point.

However, it's a slight change in behaviour. Will need to see if that's acceptable. Should be ok to change after stopping at least, hopefully.
(In reply to comment #11)
> Right, so I can reproduce the deadlock with your test case in 0.10, and it's
> probably still present in 1.0.
>
> I am not sure if your solution is really right.
>
> I think the problem is ultimately that we allow buffers to be pushed into the
> appsrc even in NULL state, both before start and after shutdown.
>
> As part of the downward state change, the _unlock() and _unlock_stop()
> functions will be called, which will signal on the GCond. After that, no more
> buffers should be accepted IMHO, thus preventing any blocking after that point.

I agree, that's the behaviour that is consistent with the GStreamer concepts in general.
Comment on attachment 234295 [details] [review] fix deadlock I don't think this patch is the correct solution.
I can provide an updated patch, but I need some more info please. Do you think it would be acceptable to add a check for priv->started here: http://cgit.freedesktop.org/gstreamer/gst-plugins-base/tree/gst-libs/gst/app/gstappsrc.c?h=0.10#n1443 similar to the one for priv->flushing, and to set priv->started to FALSE in _unlock()? This way buffers would be flushed when the appsrc is not started, and appsrc would accept buffers only after gst_app_src_start has been called.
Or should I add another boolean to check whether buffers can be accepted, and set this boolean to TRUE in start, to FALSE in unlock, and to TRUE in unlock_stop?
Between the call to start() and the call to stop(), buffers would be accepted; otherwise not. So just looking at priv->started should be enough, and the check could be done around line 1443, similar to flushing and eos.
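(Roughly, the discussed check would sit next to the existing ones in the push loop — a sketch, where the not_started label and its exact behaviour are illustrative assumptions:)

/* in gst_app_src_push_buffer_full(), paraphrased */
while (TRUE) {
  if (priv->flushing)
    goto flushing;
  if (priv->is_eos)
    goto eos;
  if (!priv->started)
    goto not_started;   /* illustrative: reject buffers outside the
                         * start()..stop() window instead of queueing
                         * or blocking on them */
  /* ... existing full-queue / blocking logic ... */
  break;
}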
Comment on attachment 234295 [details] [review] fix deadlock

After some more discussion this patch was considered correct... I'll push it once the test application has been converted to a unit test (see tests/check/elements/appsrc.c for examples).
Created attachment 238624 [details] [review] first ugly attempt at a test case

I left some g_print calls in for now. You can see the deadlock without the patch. Please suggest how to improve the test case.
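(For what it's worth, a rough skeleton of how such a test could look in the check framework, in the style of tests/check/elements/appsrc.c — the pipeline, timings, and test body here are illustrative assumptions, not the test that was eventually committed:)

#include <gst/check/gstcheck.h>
#include <gst/app/gstappsrc.h>

static gpointer
push_buffers (gpointer data)
{
  GstAppSrc *appsrc = GST_APP_SRC (data);

  /* with block=true and a tiny max-bytes, a push is expected to block
   * on the full queue sooner or later */
  gst_app_src_push_buffer (appsrc, gst_buffer_new_and_alloc (4096));
  gst_app_src_push_buffer (appsrc, gst_buffer_new_and_alloc (4096));
  return NULL;
}

GST_START_TEST (test_appsrc_block_deadlock)
{
  GstElement *pipeline, *appsrc, *fakesink;
  GThread *thread;

  pipeline = gst_pipeline_new (NULL);
  appsrc = gst_element_factory_make ("appsrc", NULL);
  fakesink = gst_element_factory_make ("fakesink", NULL);
  g_object_set (appsrc, "block", TRUE, "max-bytes", (guint64) 1, NULL);

  gst_bin_add_many (GST_BIN (pipeline), appsrc, fakesink, NULL);
  fail_unless (gst_element_link (appsrc, fakesink));
  gst_element_set_state (pipeline, GST_STATE_PLAYING);

  thread = g_thread_create (push_buffers, appsrc, TRUE, NULL);
  g_usleep (G_USEC_PER_SEC / 10);       /* let a push block on the queue */

  /* without the fix, this state change can deadlock waiting for the
   * blocked gst_app_src_push_buffer() to return */
  gst_element_set_state (pipeline, GST_STATE_NULL);

  g_thread_join (thread);
  gst_object_unref (pipeline);
}
GST_END_TEST;

static Suite *
appsrc_suite (void)
{
  Suite *s = suite_create ("appsrc");
  TCase *tc = tcase_create ("general");

  suite_add_tcase (s, tc);
  tcase_add_test (tc, test_appsrc_block_deadlock);
  return s;
}

GST_CHECK_MAIN (appsrc);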
Created attachment 238675 [details] [review] appsrc: Fix deadlock setting the pipeline in NULL state when block=true

Patch rebased to apply cleanly against 1.0.

The test case always deadlocks for me, with or without your patch. Didn't look closer.
Created attachment 238677 [details] [review] patch for 1.0
Created attachment 238678 [details] [review] patch for 0.10
commit 2a1dc7ca5608fecf01061ee3414f3f777e724ae3
Author: Nicola Murino <nicola.murino@gmail.com>
Date:   Tue Mar 12 10:32:44 2013 +0100

    appsrc: fix deadlock setting pipeline in NULL state with block=true