Bug 755400 – splitmuxsink: deadlocks when inserted into running pipeline




Summary:	splitmuxsink: deadlocks when inserted into running pipeline


Status:	RESOLVED OBSOLETE

Product:	GStreamer
Classification:	Platform
Component:	gst-plugins-good
Version:	git master
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	git master
Assigned To:	GStreamer Maintainers
QA Contact:	GStreamer Maintainers

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2015-09-22 09:48 UTC by Jesper Larsen
Modified:	2018-11-03 15:04 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
test case (6.37 KB, text/x-csrc) 2015-09-22 09:48 UTC, Jesper Larsen
debug-log-1 (16.42 KB, text/plain) 2015-09-22 09:49 UTC, Jesper Larsen
debug-log-2 (62.65 KB, text/plain) 2015-09-22 09:50 UTC, Jesper Larsen
deadlock fix (2.29 KB, patch) 2016-04-12 15:36 UTC, Vincent Penquerc'h

Description Jesper Larsen 2015-09-22 09:48:45 UTC

Created attachment 311839
test case

Inserting splitmuxsink into a running pipeline causes a deadlock for the configuration described here.
The use case is to have a live source of video+audio running continuously, with the possibility of starting a recording using splitmuxsink. 

A test case is provided as an attachment.

The initial pipeline is as follows:

audiotestsrc is-live=true ! faac ! aacparse ! audio/mpeg,stream-format=raw ! queue ! fakesink
videotestsrc is-live=true ! x264enc key-int-max=10 ! h264parse ! video/x-h264,alignment=au,stream-format=avc ! queue ! fakesink

After 10 s, blocking pad probes are installed on the src pads of the two queues. The fakesinks are removed from the pipeline, and a new splitmuxsink is created and inserted.
The state of the splitmuxsink is set to PLAYING, and the block probes are removed.

A few buffers are received on the sinkpad of the multiqueue inside the splitmuxsink, but then dataflow stops.

Using GDB, I can see that the streaming thread of the audio queue srcpad is waiting in gstsplitmuxsink.c:1071

          GST_LOG_OBJECT (pad, "Sleeping for GOP start");
------>   GST_SPLITMUX_WAIT (splitmux);
          GST_LOG_OBJECT (pad, "Done sleeping for GOP start state now %d",
              splitmux->state);

The thread for the video src pad is waiting in gstmultiqueue.c:1938 with an allocation type query.

1925           GST_DEBUG_OBJECT (mq,
1926               "SingleQueue %d : Enqueuing query %p of type %s with id %d",
1927               sq->id, query, GST_QUERY_TYPE_NAME (query), curid);
1928           GST_MULTI_QUEUE_MUTEX_UNLOCK (mq);
1929           res = gst_data_queue_push (sq->queue, (GstDataQueueItem *) item);
1930           GST_MULTI_QUEUE_MUTEX_LOCK (mq);
1931           /* it might be that the query has been taken out of the queue
1932            * while we were unlocked. So, we need to check if the last
1933            * handled query is the same one than the one we just
1934            * pushed. If it is, we don't need to wait for the condition
1935            * variable, otherwise we wait for the condition variable to
1936            * be signaled. */
1937           if (sq->last_handled_query != query)
1938             g_cond_wait (&sq->query_handled, &mq->qlock);
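The handshake in this excerpt can be modelled outside GStreamer. The following is a minimal sketch using plain pthreads, not the multiqueue implementation: `QuerySlot`, `push_query_and_wait` and `handle_one_query` are hypothetical names, and the one-slot "queue" is an assumption made to keep the example short. The pushing side blocks until the other side records the query as handled; if the handling thread never runs (for example because it is itself asleep somewhere downstream), the pusher waits forever, which matches the symptom seen on the video thread.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for one single queue inside a multiqueue. */
typedef struct {
  pthread_mutex_t lock;
  pthread_cond_t query_available;  /* signalled when a query is enqueued */
  pthread_cond_t query_handled;    /* signalled after a query is processed */
  const void *pending_query;       /* one-slot "queue" (assumption) */
  const void *last_handled_query;
} QuerySlot;

static void
query_slot_init (QuerySlot *slot)
{
  pthread_mutex_init (&slot->lock, NULL);
  pthread_cond_init (&slot->query_available, NULL);
  pthread_cond_init (&slot->query_handled, NULL);
  slot->pending_query = NULL;
  slot->last_handled_query = NULL;
}

/* Pushing side: enqueue the query, then block until it has been handled.
 * The sketch re-checks the condition in a loop so that a spurious wake-up
 * cannot end the wait early. */
static void
push_query_and_wait (QuerySlot *slot, const void *query)
{
  pthread_mutex_lock (&slot->lock);
  slot->pending_query = query;
  pthread_cond_signal (&slot->query_available);
  while (slot->last_handled_query != query)
    pthread_cond_wait (&slot->query_handled, &slot->lock);
  pthread_mutex_unlock (&slot->lock);
}

/* Handling side (thread function): take one query, mark it as handled and
 * wake the pusher. If this never runs, the pusher stays blocked. */
static void *
handle_one_query (void *data)
{
  QuerySlot *slot = data;

  pthread_mutex_lock (&slot->lock);
  while (slot->pending_query == NULL)
    pthread_cond_wait (&slot->query_available, &slot->lock);
  slot->last_handled_query = slot->pending_query;
  slot->pending_query = NULL;
  pthread_cond_signal (&slot->query_handled);
  pthread_mutex_unlock (&slot->lock);
  return NULL;
}
```

Running `handle_one_query` in a second thread lets `push_query_and_wait` return; leaving that thread out reproduces the indefinite wait in miniature.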

For debugging purposes, I tried dropping any allocation queries on the srcpads of the queues. This changes the behaviour a bit. A few more buffers reach the multiqueue, but dataflow still stalls.

The audio streaming thread still stalls waiting for GOP complete. The video thread is waiting in gstdataqueue.c:520, which seems to indicate that the queue is full. Several GOPs are queued, but the output file is still empty.

The tests were done using 1.5.91. The pipeline works fine if the splitmuxsink is used from the beginning instead of the fakesinks.

Comment 1 Jesper Larsen 2015-09-22 09:49:41 UTC

Created attachment 311840
debug-log-1

GST_DEBUG=*sink*:6 log with allocation queries

Comment 2 Jesper Larsen 2015-09-22 09:50:25 UTC

Created attachment 311841
debug-log-2

GST_DEBUG=*sink*:6 log without allocation queries

Comment 3 Vincent Penquerc'h 2016-04-07 11:22:08 UTC

It took me a while to realize, but the element you are setting to PLAYING is the containing pipeline rather than the splitmuxsink. The pipeline is already in PLAYING, which is why the splitmuxsink bin never changes state (it blocks in preroll).

Once this is found and changed, the pipeline works with your probe dropping the allocation queries. If they are not dropped, it still blocks, and it is not clear to me why.

Comment 4 Vincent Penquerc'h 2016-04-11 11:52:47 UTC

What splitmuxsink does is rather obscure. The hang is due to splitmuxsink waiting for events such as EOS, flush, and a few others, while in the streaming thread (in response to the push of a buffer). However, this is the thread that will be pushing the allocation query that the upstream multiqueue is sending.
I feel like this should be resolved by the wait ending in a "normal" way (i.e., neither an EOS nor a flush), leading to the buffer push returning, and the (serialized) allocation query going next. However, since taking out the allocation query makes it work, this is apparently not what happens.

BTW, the pad block code in the test case needs locking, as it is racy.

Comment 5 Vincent Penquerc'h 2016-04-12 15:36:08 UTC

Created attachment 325811
deadlock fix

I could not find a good fix. This is not due to non-keyframes being seen before the first keyframe (the most obvious difference when inserting the element in a running pipeline), as discarding those does not fix anything. In the end, discarding the allocation query in the multiqueue probe does help, even though it is not a great fix.

Comment 6 GStreamer system administrator 2018-11-03 15:04:32 UTC

-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further in the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/gstreamer/gst-plugins-good/issues/224.