GNOME Bugzilla – Bug 692358
appsrc deadlock setting the pipeline to NULL state
Last modified: 2013-03-12 10:21:17 UTC
If the block property is set to TRUE on an appsrc element and an application pushes a buffer while the internal appsrc queue is full, the appsrc will block on this line: g_cond_wait (priv->cond, priv->mutex); If, in this condition, an error message is posted on the bus and the app sets the pipeline state to NULL, appsrc will crash. Any hint on how to handle this? Maybe send an EOS inside dispose?
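(For context, a simplified paraphrase of the code path where the push blocks, based on gst_app_src_push_buffer_full() in gstappsrc.c on the 0.10 branch; error handling and signal emission elided, so treat this as a sketch rather than the literal source:)

g_mutex_lock (priv->mutex);
while (TRUE) {
  if (priv->flushing)
    goto flushing;               /* push returns GST_FLOW_WRONG_STATE */
  if (priv->is_eos)
    goto eos;
  if (priv->max_bytes != 0 && priv->queued_bytes >= priv->max_bytes
      && priv->block) {
    /* queue full and block=true: sleep until a buffer is dequeued or
     * someone broadcasts priv->cond -- nothing does so on a straight
     * NULL-state teardown, hence the hang described above */
    g_cond_wait (priv->cond, priv->mutex);
    continue;
  }
  break;
}
/* ... enqueue the buffer and signal the streaming task ... */
g_mutex_unlock (priv->mutex);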
Do you have a small test case?
Not for now, sorry. I did some more debugging and found a workaround. Let me explain my use case better:

- pipeline1: src ! ... ! appsink sync=false
- pipeline2: appsrc block=true ! queue ! fdsink sync=false

fdsink writes to a socket, and the receiver's read speed can vary. I'm using gst_app_sink_set_callbacks, and in the appsink callback, i.e. in the streaming thread, I do:

gst_app_src_push_buffer (GST_APP_SRC (my_appsrc), buf);

This blocks, as expected, when the client is slow to read the data. I added some g_print calls before and after this line, just to be sure: http://cgit.freedesktop.org/gstreamer/gst-plugins-base/tree/gst-libs/gst/app/gstappsrc.c?h=0.10#n1476

Now if the client disconnects I get an error on the bus (broken pipe or similar). I cannot set the "first" pipeline to NULL, since gst_app_src_push_buffer is blocked and setting the state to NULL never returns. Instead I can:

- send EOS to appsrc; this unblocks gst_app_src_push_buffer
- not send any new buffers to appsrc
- set the first pipeline to NULL
- wait for an error on the appsrc pipeline; I get this one from appsrc: streaming task paused, reason error (-5)
- set the appsrc pipeline to NULL

If I set the appsrc pipeline to NULL as soon as I get the first bus error it often works, but any attempt to set the appsink pipeline to NULL will block/segfault: gst_app_src_push_buffer stays blocked even after the appsrc element has been disposed and finalized. Probably appsrc should unblock the push_buffer method and unref the buffer when the pipeline is disposed/finalized. I'll try to provide a test case in the future.
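(A minimal sketch of the appsink-to-appsrc bridge described above, using the 0.10 appsink callback API; my_appsrc and the wiring are illustrative, not taken from the attached app:)

#include <gst/gst.h>
#include <gst/app/gstappsink.h>
#include <gst/app/gstappsrc.h>

/* Illustrative global: the appsrc of the second pipeline. */
static GstElement *my_appsrc;

/* Runs in the streaming thread of the first pipeline. With block=true,
 * gst_app_src_push_buffer() blocks here when the appsrc queue is full. */
static GstFlowReturn
on_new_buffer (GstAppSink * sink, gpointer user_data)
{
  GstBuffer *buf = gst_app_sink_pull_buffer (sink);

  if (buf == NULL)
    return GST_FLOW_UNEXPECTED;

  /* push_buffer takes ownership of buf */
  return gst_app_src_push_buffer (GST_APP_SRC (my_appsrc), buf);
}

static void
attach_bridge (GstElement * appsink)
{
  GstAppSinkCallbacks callbacks = { NULL, NULL, on_new_buffer, NULL };

  gst_app_sink_set_callbacks (GST_APP_SINK (appsink), &callbacks, NULL, NULL);
}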
Could you provide a full stack trace of all threads (thread apply all bt) from when it's blocked?
(gdb) thread apply all bt

[full backtrace attached as Trace 231423; truncated here — 18 threads, from Thread 18 (0x7fffe5d0d700, LWP 31039) down to Thread 1 (0x7ffff7fc9740, LWP 30953)]
What is the pipeline exactly? Is there an indirect loop in the pipeline somehow? Do I read the stack trace correctly that you have your own sink (either derived, or going via appsink) which pushes buffers to an appsrc?

According to the stack trace, the basesink's change_state function (called from your app when setting the pipeline state to NULL) is waiting for the streaming thread to finish / give up the preroll lock. This should have happened in response to the ->unlock vfunc. Since the basesink's streaming thread is blocked in the appsrc push, the unlock vfunc needs to set the appsrc's state to NULL, so that push_buffer unblocks, the basesink's streaming thread can shut down, and the state change function can continue shutting down the other elements.
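(One reading of the above for the application-driven case, as a minimal sketch: tear down the appsrc side first so the blocked push returns. pipeline1/pipeline2 are the names used in this bug, and reliably unblocking on the NULL state change depends on the fix discussed further down:)

/* Shut down the consumer pipeline first so the blocked
 * gst_app_src_push_buffer() in pipeline1's streaming thread returns,
 * then the producer pipeline's state change can complete. */
gst_element_set_state (pipeline2, GST_STATE_NULL);   /* flushes appsrc, unblocks push_buffer */
gst_element_set_state (pipeline1, GST_STATE_NULL);   /* no longer deadlocks */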
I have two pipelines.

First pipeline: filesrc location=... ! matroskademux ! queue ! h264parse ! appsink sync=false emit-signals=false async=false preroll-queue-len=1

From the appsink streaming thread of the above pipeline I push buffers into the second one with gst_app_src_push_buffer.

Second pipeline: appsrc block=true ! queue ! matroskamux ! fdsink sync=false fd=..
I get the error message on the bus of the second pipeline if the client (fdsink) disconnects before the end of the stream, and the state change to NULL deadlocks on the first pipeline if I don't use the workaround described above.
Created attachment 234291 [details] small test app that reproduces the issue

To reproduce, close the xvimagesink window so that an error is posted on the bus. Please note that the deadlock does not always happen: the test app runs the same pipeline 100 times, and I was able to reproduce it after at most 40 tries. Here is the backtrace from the test app:
[backtrace attached as Trace 231429; truncated here — Thread 1 (0x7ffff7fcc700, LWP 9584)]
The issue is fixed by adding g_cond_broadcast (priv->cond); inside gst_app_src_stop, after setting priv->flushing to TRUE. If you think this is the correct solution I can send a patch.
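(For reference, roughly what the proposed change looks like in context — a paraphrase of gst_app_src_stop() in gstappsrc.c, not the literal patch:)

/* in gst_app_src_stop(), paraphrased */
g_mutex_lock (priv->mutex);
priv->flushing = TRUE;
gst_app_src_flush_queued (appsrc);
g_cond_broadcast (priv->cond);   /* proposed addition: wake any thread
                                  * blocked in gst_app_src_push_buffer()
                                  * so it can observe flushing and return */
g_mutex_unlock (priv->mutex);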
Created attachment 234295 [details] [review] fix deadlock

Here is the patch, tested on 0.10; it should apply and work on 1.0 too.
Right, so I can reproduce the deadlock with your test case in 0.10, and it's probably still present in 1.0.

I am not sure if your solution is really right. I think the problem is ultimately that we allow buffers to be pushed into the appsrc even in NULL state, both before start and after shutdown.

As part of the downward state change, the _unlock() and _unlock_stop() functions will be called, which will signal on the GCond. After that, no more buffers should be accepted IMHO, thus preventing any blocking after that point.

However, it's a slight change in behaviour. Will need to see if that's acceptable. Should be ok to change after stopping at least, hopefully.
(In reply to comment #11)
> Right, so I can reproduce the deadlock with your test case in 0.10, and it's
> probably still present in 1.0.
>
> I am not sure if your solution is really right.
>
> I think the problem is ultimately that we allow buffers to be pushed into the
> appsrc even in NULL state, both before start and after shutdown.
>
> As part of the downward state change, the _unlock() and _unlock_stop()
> functions will be called, which will signal on the GCond. After that, no more
> buffers should be accepted IMHO, thus preventing any blocking after that point.

I agree, that's the behaviour that is consistent with the GStreamer concepts in general.
Comment on attachment 234295 [details] [review] fix deadlock I don't think this patch is the correct solution.
I can provide an updated patch, but I need some more info please. Do you think it would be acceptable to add a check for priv->started here: http://cgit.freedesktop.org/gstreamer/gst-plugins-base/tree/gst-libs/gst/app/gstappsrc.c?h=0.10#n1443 similar to the one for priv->flushing, and to set priv->started to FALSE in _unlock()? This way buffers would be flushed when the appsrc is not started, and appsrc would accept buffers only after gst_app_src_start has been called.
Or should I add another boolean to check whether buffers can be accepted, and set this boolean to TRUE in start, to FALSE in unlock, and to TRUE in unlock_stop?
Between the call to start() and the call to stop(), buffers would be accepted; otherwise not. So just looking at priv->started should be enough, and the check could be done around line 1443, similar to flushing and eos.
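(Roughly, the discussed check would sit next to the existing ones in the push loop — a sketch, where the not_started label and its exact behaviour are illustrative assumptions:)

/* in gst_app_src_push_buffer_full(), paraphrased */
while (TRUE) {
  if (priv->flushing)
    goto flushing;
  if (priv->is_eos)
    goto eos;
  if (!priv->started)
    goto not_started;   /* illustrative: reject buffers outside the
                         * start()..stop() window instead of queueing
                         * or blocking on them */
  /* ... existing full-queue / blocking logic ... */
  break;
}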
Comment on attachment 234295 [details] [review] fix deadlock

After some more discussion this patch was considered correct... I'll push it once the test application has been converted to a unit test (see tests/check/elements/appsrc.c for examples).
Created attachment 238624 [details] [review] first ugly attempt at a test case

I left some g_print calls in for now. You can see the deadlock without the patch. Please suggest how to improve the test case.
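(For what it's worth, a rough skeleton of how such a test could look in the check framework, in the style of tests/check/elements/appsrc.c — the pipeline, timings, and test body here are illustrative assumptions, not the test that was eventually committed:)

#include <gst/check/gstcheck.h>
#include <gst/app/gstappsrc.h>

static gpointer
push_buffers (gpointer data)
{
  GstAppSrc *appsrc = GST_APP_SRC (data);

  /* with block=true and a tiny max-bytes, a push is expected to block
   * on the full queue sooner or later */
  gst_app_src_push_buffer (appsrc, gst_buffer_new_and_alloc (4096));
  gst_app_src_push_buffer (appsrc, gst_buffer_new_and_alloc (4096));
  return NULL;
}

GST_START_TEST (test_appsrc_block_deadlock)
{
  GstElement *pipeline, *appsrc, *fakesink;
  GThread *thread;

  pipeline = gst_pipeline_new (NULL);
  appsrc = gst_element_factory_make ("appsrc", NULL);
  fakesink = gst_element_factory_make ("fakesink", NULL);
  g_object_set (appsrc, "block", TRUE, "max-bytes", (guint64) 1, NULL);

  gst_bin_add_many (GST_BIN (pipeline), appsrc, fakesink, NULL);
  fail_unless (gst_element_link (appsrc, fakesink));
  gst_element_set_state (pipeline, GST_STATE_PLAYING);

  thread = g_thread_create (push_buffers, appsrc, TRUE, NULL);
  g_usleep (G_USEC_PER_SEC / 10);       /* let a push block on the queue */

  /* without the fix, this state change can deadlock waiting for the
   * blocked gst_app_src_push_buffer() to return */
  gst_element_set_state (pipeline, GST_STATE_NULL);

  g_thread_join (thread);
  gst_object_unref (pipeline);
}
GST_END_TEST;

static Suite *
appsrc_suite (void)
{
  Suite *s = suite_create ("appsrc");
  TCase *tc = tcase_create ("general");

  suite_add_tcase (s, tc);
  tcase_add_test (tc, test_appsrc_block_deadlock);
  return s;
}

GST_CHECK_MAIN (appsrc);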
Created attachment 238675 [details] [review] appsrc: Fix deadlock setting the pipeline in NULL state when block=true

Patch rebased to apply cleanly against 1.0.

The test case always deadlocks for me, with or without your patch. Didn't look closer.
Created attachment 238677 [details] [review] patch for 1.0
Created attachment 238678 [details] [review] patch for 0.10
commit 2a1dc7ca5608fecf01061ee3414f3f777e724ae3
Author: Nicola Murino <nicola.murino@gmail.com>
Date:   Tue Mar 12 10:32:44 2013 +0100

    appsrc: fix deadlock setting pipeline in NULL state with block=true