GNOME Bugzilla – Bug 652562
basesink: stuck in GST_PAD_PREROLL_WAIT
Last modified: 2011-06-26 09:52:12 UTC
This is a bug that I'm seeing with Opera's GStreamer integration on certain test cases, and it looks like it might be a GStreamer bug. The test case loads two <audio> elements with a very short (2 samples of silence) WAVE file, which I'll attach. The pipeline constructed is like this:

$ gst-launch-0.10 filesrc location=2samples.wav ! wavparse ! queue ! audioconvert ! audioresample ! volume ! autoaudiosink

At random, one such pipeline will fail to preroll, getting stuck in gst_base_sink_wait_preroll. The full backtrace at that point is:

(gdb) thread apply all bt
+ Trace 227475
As you can see there are two pipelines here, of which one is waiting for preroll. GstMediaPlayer::ThreadFunc (part of our integration) is not doing anything interesting at this point, only waiting for messages from GStreamer. In thread 880, the event being processed is a GST_EVENT_EOS.

I have a few main suspects:

1. The WAVE file is exceptionally short, so it seems likely that there is a race condition somewhere, triggered by the EOS event being fired much earlier than it usually is. I haven't looked in wavparse, but that might be a good place to start.

2. We normally post a seek to position 0 immediately after reaching GST_STATE_PAUSED. This is because we sometimes want to seek to a non-zero offset and just haven't special-cased the typical case to avoid the seek. Doing this seek triggers GST_PAD_PREROLL_SIGNAL in gst_base_sink_preroll_queue_flush. I presume this is rather atypical use of the API, and that there may be race conditions hidden here. (A sketch of this seek follows after this list.)

3. pulsesink itself is failing randomly due to something in PulseAudio. I have no particular reason to suspect this, but pulsesink has been less than reliable in the past.
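For reference, the seek from suspect 2 is essentially the following. This is a minimal sketch; pipeline stands in for our actual pipeline handle, and error handling is elided:

  /* Issued immediately after the pipeline reaches GST_STATE_PAUSED: a
   * flushing seek to offset 0, which flushes the preroll queue and makes
   * the sink preroll again. */
  if (!gst_element_seek_simple (pipeline, GST_FORMAT_TIME,
          GST_SEEK_FLAG_FLUSH, 0))
    g_warning ("seek to 0 failed");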
Created attachment 189903 [details] 2 samples of silence
This is the state of basesink while waiting in GST_PAD_PREROLL_WAIT:

(gdb) print *sink
$29 = {element = {object = {object = {g_type_instance = {g_class = 0x50959a0}, ref_count = 2, qdata = 0x0}, refcount = 0, lock = 0x3e0d7d0, name = 0x45ca1a0 "opaudiosink-actual-sink-pulse", name_prefix = 0x0, parent = 0x7fffe4cde4d0, flags = 32, _gst_reserved = 0x0}, state_lock = 0x3e0ef40, state_cond = 0x3b411c0, state_cookie = 2, current_state = GST_STATE_PAUSED, next_state = GST_STATE_VOID_PENDING, pending_state = GST_STATE_VOID_PENDING, last_return = GST_STATE_CHANGE_SUCCESS, bus = 0x3dc60f0, clock = 0x0, base_time = 0, numpads = 1, pads = 0x7fffcc8dcae0, numsrcpads = 0, srcpads = 0x0, numsinkpads = 1, sinkpads = 0x4531600, pads_cookie = 1, abidata = {ABI = {target_state = GST_STATE_PAUSED, start_time = 0}, _gst_reserved = {0x3, 0x0, 0x0, 0x0}}}, sinkpad = 0x3e03da0, pad_mode = GST_ACTIVATE_PUSH, offset = 0, can_activate_pull = 0, can_activate_push = 1, preroll_queue = 0x7fffcc990940, preroll_queue_max_len = 0, preroll_queued = 1, buffers_queued = 0, events_queued = 0, eos = 0, eos_queued = 0, need_preroll = 1, have_preroll = 1, playing_async = 0, have_newsegment = 0, segment = {rate = 1, abs_rate = 1, format = GST_FORMAT_UNDEFINED, flags = GST_SEEK_FLAG_NONE, start = 0, stop = -1, time = 0, accum = 0, last_stop = 0, duration = -1, applied_rate = 1, _gst_reserved = '\000' <repeats 23 times>}, clock_id = 0x0, end_time = 0, sync = 1, flushing = 0, abidata = {ABI = {clip_segment = 0x4b0b570, max_lateness = -1, running = 0}, _gst_reserved = {0x4b0b570, 0xffffffffffffffff, 0x0 <repeats 17 times>}}, priv = 0x44938b0}

I can't tell from this whether data has flowed through the sink before the EOS event, but perhaps someone else can.
It seems that even when our tests involving this setup pass, we still have a thread waiting for preroll. I've stepped through a typical such scenario and this is what happens:

1. GST_PAD_PREROLL_WAIT - from gst_pad_push (4 byte buffer, should be the full raw PCM data)
+ Trace 227476
4. Nothing! The thread in step 3 is still waiting. Will investigate why we're not getting a second preroll, or if basesink should be waiting for it at all.
That got a bit mangled by the stack trace being pulled out; don't miss steps 2 and 3, which are in the trace.
It looks like my understanding here may be a bit flawed. Even after removing our "seek to 0" code and using a normally sized WAVE file, I still have a thread waiting for preroll when the element is paused. Is preroll not supposed to happen while the element is paused?
In such a case, it's gst_base_sink_change_state that calls GST_PAD_PREROLL_SIGNAL for GST_STATE_CHANGE_PAUSED_TO_PLAYING, like so:

  if (basesink->eos) {
    GstMessage *message;

    /* need to post EOS message here */
    GST_DEBUG_OBJECT (basesink, "Now posting EOS");
    message = gst_message_new_eos (GST_OBJECT_CAST (basesink));
    gst_message_set_seqnum (message, basesink->priv->seqnum);
    gst_element_post_message (GST_ELEMENT_CAST (basesink), message);
  } else {
    GST_DEBUG_OBJECT (basesink, "signal preroll");
    GST_PAD_PREROLL_SIGNAL (basesink->sinkpad);
  }

Note especially that GST_PAD_PREROLL_SIGNAL is called only if basesink->eos is not set. The bug reproduces when the EOS event has been posted, so it seems quite likely that there's a relation: in that case GST_PAD_PREROLL_SIGNAL is never called here, and the other thread is left hanging in GST_PAD_PREROLL_WAIT.

Looking at the state of things in a normal (non-hanging) case, it appears as though there may be some missing locking/synchronization around sink->need_preroll. Threads 9 and 14 are the interesting ones; see the backtrace at the bottom. Thread 9 checked if (G_LIKELY (sink->need_preroll)) before going into gst_base_sink_wait_preroll, but thread 14 sets basesink->need_preroll = FALSE without any strong guarantee that threads waiting for preroll will be notified in the if (basesink->eos) case. Is the assumption that if basesink->eos is set, then no thread can possibly be waiting for preroll? That does not seem to be a correct assumption, and is possibly the root cause here.

(gdb) thread apply all bt
+ Trace 227479
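To make the suspected lost wakeup concrete, here is the general shape of the race reduced to plain GLib primitives. This is only an illustration (the names and structure are mine, not the actual basesink code): clearing the flag without signalling leaves a thread that is already inside the condition wait blocked forever, even with correct locking.

  #include <glib.h>

  static GMutex lock;
  static GCond preroll_cond;
  static gboolean need_preroll = TRUE;
  static gboolean eos = TRUE;   /* pretend the EOS path is taken */

  /* Streaming thread: waits until preroll is signalled. */
  static gpointer
  waiter (gpointer data)
  {
    g_mutex_lock (&lock);
    while (need_preroll)
      g_cond_wait (&preroll_cond, &lock);   /* blocks here... */
    g_mutex_unlock (&lock);
    return NULL;
  }

  /* State-change thread: on the EOS path it clears the flag but never
   * signals, so a thread already waiting never wakes up to re-check
   * the predicate. */
  static void
  state_change (void)
  {
    g_mutex_lock (&lock);
    need_preroll = FALSE;
    if (eos) {
      /* post EOS message; note: no g_cond_signal() -- the lost wakeup */
    } else {
      g_cond_signal (&preroll_cond);
    }
    g_mutex_unlock (&lock);
  }

  int
  main (void)
  {
    GThread *t = g_thread_new ("waiter", waiter, NULL);

    g_usleep (G_USEC_PER_SEC / 10);   /* let the waiter block first */
    state_change ();
    g_thread_join (t);                /* never returns: waiter is stuck */
    return 0;
  }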
Trying out the hypothesis that this was due to the logic in GST_STATE_CHANGE_PAUSED_TO_PLAYING, I added some printf-ing to see which path was taken, but with this added I couldn't reproduce. However, I never saw either branch being taken, so this particular piece of code is probably not the culprit.
After chatting with wtay on IRC it seems I've been chasing after the wrong thing. Apparently it's expected that there will be a thread waiting for preroll while the pipeline is PAUSED. I should try to instead figure out why the pipeline isn't reaching the PAUSED state at all when this bug reproduces...
First, sorry for all the spamming. I was indeed heading in the wrong direction. The pipeline is in fact reaching PAUSED; I deduced that it wasn't from the fact that a side effect of it reaching PAUSED never happened:

  case GST_MESSAGE_STATE_CHANGED:
    if (GST_MESSAGE_SRC (msg) == GST_OBJECT_CAST (pipeline)) {
      GstState oldstate, newstate, pending;

      gst_message_parse_state_changed (msg, &oldstate, &newstate, &pending);
      if (oldstate == GST_STATE_READY && newstate == GST_STATE_PAUSED) {
        GstFormat format = GST_FORMAT_TIME;
        gint64 duration;

        if (gst_element_query_duration (pipeline, &format, &duration) &&
            format == GST_FORMAT_TIME) {
          /* post a gst_message_new_duration to the application; we do this
           * because GST_MESSAGE_DURATION usually isn't posted by GStreamer,
           * it seems */
        }
      }
    }
    break;

The side effect that never happened was posting that duration message. However, it wasn't because the pipeline wasn't reaching PAUSED; it was because gst_element_query_duration failed. When this happens, seeking to offset 0 also fails. I'm not sure why either fails, but that is the problem to go after, not what I've been chasing in this bug. Closing as invalid.
On a side note, it's much easier to check for the ASYNC_DONE message; that saves you both the src == pipeline check and the oldstate/newstate check. For example:
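A sketch of that simplification, slotting into the same bus-message switch as the snippet in the previous comment (0.10 API, duration forwarding elided as before):

  case GST_MESSAGE_ASYNC_DONE:{
    GstFormat format = GST_FORMAT_TIME;
    gint64 duration;

    /* ASYNC_DONE is only posted by the pipeline itself once preroll has
     * completed, so no source or state matching is needed. */
    if (gst_element_query_duration (pipeline, &format, &duration) &&
        format == GST_FORMAT_TIME) {
      /* forward the duration to the application */
    }
    break;
  }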
What was actually wrong: In theory, it is OK for a pull source to return smaller or larger buffers than was requested from _create, and most code handles this. However, few pull sources other than Opera's seem to do this, so in practice it isn't very well tested, and this is the second bug that turned out to be caused by it. In this case, it is gst_riff_read_chunk that assumes that a short read implies EOS, so decoding simply failed. I'm not sure if I should file a bug on gst_riff_read_chunk or not, I have instead changed Opera so that _create never returns short buffers except at EOS.
> In theory, it is OK for a pull source to return smaller or larger buffers
> than was requested from _create

Citation needed? :)

> In this case, it is gst_riff_read_chunk that assumes that a short read
> implies EOS, so decoding simply failed.
>
> I'm not sure if I should file a bug on gst_riff_read_chunk or not, I have
> instead changed Opera so that _create never returns short buffers except
> at EOS.

I believe gst_riff_read_chunk() is correct in assuming this, and I don't think it's OK for a pull source to return a buffer smaller than what was requested in other circumstances.
(In reply to comment #12)
> > In theory, it is OK for a pull source to return smaller or larger buffers
> > than was requested from _create
>
> Citation needed? :)

Wow, digging into the history here proved embarrassing and amusing. The documentation was updated to "When the subclass returns GST_FLOW_OK, it MUST return a buffer of the requested size unless fewer bytes are available because an EOS condition is near." in https://bugzilla.gnome.org/show_bug.cgi?id=617733, which was filed by none other than me! In that bug I even claimed that I would fix Opera to respect this, but obviously I didn't, and I even managed to completely suppress it from memory. Now I've actually fixed it. Yay me.
FWIW, the only case where you are allowed to return *less* than requested in _create() is at the end of stream, and then an EOS event should be sent immediately afterwards.
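To close the loop for anyone landing here: a sketch of what that contract implies for a GstBaseSrc subclass's create vfunc (0.10 API; the read_bytes() helper is hypothetical, standing in for whatever actually produces the data):

  #include <gst/gst.h>
  #include <gst/base/gstbasesrc.h>

  /* Fill the buffer completely unless EOS is near: a short buffer may
   * only be returned when no more data will follow. */
  static GstFlowReturn
  my_src_create (GstBaseSrc * src, guint64 offset, guint size,
      GstBuffer ** buf)
  {
    GstBuffer *buffer = gst_buffer_new_and_alloc (size);
    guint filled = 0;

    while (filled < size) {
      /* read_bytes() is hypothetical: a blocking read that returns the
       * number of bytes read, or <= 0 at end of stream */
      gint ret = read_bytes (offset + filled,
          GST_BUFFER_DATA (buffer) + filled, size - filled);
      if (ret <= 0)
        break;
      filled += ret;
    }

    if (filled == 0) {
      /* nothing left at all: basesrc turns this into an EOS event */
      gst_buffer_unref (buffer);
      return GST_FLOW_UNEXPECTED;
    }

    /* short only when EOS is near, per the documented contract */
    GST_BUFFER_SIZE (buffer) = filled;
    GST_BUFFER_OFFSET (buffer) = offset;
    *buf = buffer;
    return GST_FLOW_OK;
  }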