GNOME Bugzilla – Bug 635701
rtspsrc: seeking is broken
Last modified: 2015-05-05 13:37:39 UTC
Try this test stream with GStreamer from git in totem:

rtsp://195.134.224.14/data/ByContainer/Video/MP4/h263/CIF/AAC/MP4_h263_CIF_30FPS_445Kbps_AAC_44.1KHz_127Kbps_60sec(4.1Mb)_BBB(hinted).mp4

It's indicated as seekable, but most attempts at changing the position lead to audio and video being stuck. The server is supposed to be Helix 14. Not sure if there is a server-side issue as well; with most other media players I don't even get that far :)

As pointed out by Stefan, gst-discoverer chokes on it as well:

Analyzing rtsp://195.134.224.14/data/ByContainer/Video/MP4/h263/CIF/AAC/MP4_h263_CIF_30FPS_445Kbps_AAC_44.1KHz_127Kbps_60sec(4.1Mb)_BBB(hinted).mp4

(gst-discoverer:31836): GLib-GObject-WARNING **: instance with invalid (NULL) class pointer
(gst-discoverer:31836): GLib-GObject-CRITICAL **: g_signal_handler_disconnect: assertion `G_TYPE_CHECK_INSTANCE (instance)' failed
(gst-discoverer:31836): GLib-GObject-WARNING **: instance with invalid (NULL) class pointer
(gst-discoverer:31836): GLib-GObject-CRITICAL **: g_signal_handler_disconnect: assertion `G_TYPE_CHECK_INSTANCE (instance)' failed

Done discovering rtsp://195.134.224.14/data/ByContainer/Video/MP4/h263/CIF/AAC/MP4_h263_CIF_30FPS_445Kbps_AAC_44.1KHz_127Kbps_60sec(4.1Mb)_BBB(hinted).mp4
Apparently Stefan already has the fix for the discoverer problem at least :)
Moving to rtspsrc for now.
*** Bug 673490 has been marked as a duplicate of this bug. ***
*** Bug 740596 has been marked as a duplicate of this bug. ***
*** Bug 740509 has been marked as a duplicate of this bug. ***
Created attachment 300315 [details] [review]
basedepayload: Fix generated segment

This fixes playback position in RTSP.
Created attachment 300317 [details] [review]
rtspsrc: Fix segment in TCP mode

It is expected that buffers are timestamped in running time. Set a segment accordingly. In this case we pick 0,-1, as this is what udpsrc would do. Depayloaders will update the segment to reflect the playback position.
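For illustration, the idea of the fix amounts to something like the following sketch (srcpad is a placeholder; this is not the actual rtspsrc code):

  /* Push a 0,-1 time segment so buffer timestamps are interpreted
   * as running time, matching what udpsrc-based delivery produces. */
  GstSegment segment;
  GstEvent *event;

  gst_segment_init (&segment, GST_FORMAT_TIME);
  segment.start = 0;                   /* the "0" ... */
  segment.stop = GST_CLOCK_TIME_NONE;  /* ... and the "-1" */
  event = gst_event_new_segment (&segment);
  gst_pad_push_event (srcpad, event);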
One blocker issue remains: if you pause, seek, then play, it does not work. I will have a look tomorrow. It's also broken in totem, which I suspect always seeks while paused. Hopefully it's not waiting for an async-done; since this source is live, that will never happen.
Also test with gst-plugins-base/tests/examples/playback-test :) Going back to PAUSED before seeking is the only way to guarantee a deterministic change of states and arrival of messages; that's why totem (and other code) do that. If they wait for async-done, that's a problem here indeed; instead they should wait for the state-change messages, I guess.
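For what it's worth, a minimal sketch of the difference (assuming an application with a pipeline handle; this is illustrative, not totem's actual code):

  /* Instead of blocking on ASYNC_DONE (which may never arrive here),
   * wait for the STATE_CHANGED message coming from the pipeline itself. */
  GstBus *bus = gst_element_get_bus (pipeline);
  GstMessage *msg;

  while ((msg = gst_bus_timed_pop_filtered (bus, GST_CLOCK_TIME_NONE,
              GST_MESSAGE_STATE_CHANGED | GST_MESSAGE_ERROR))) {
    gboolean done = (GST_MESSAGE_TYPE (msg) == GST_MESSAGE_ERROR ||
        GST_MESSAGE_SRC (msg) == GST_OBJECT (pipeline));
    gst_message_unref (msg);
    if (done)
      break;
  }
  gst_object_unref (bus);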
Comment on attachment 300317 [details] [review]
rtspsrc: Fix segment in TCP mode

Seems reasonable; now at least UDP and TCP work exactly the same.
Review of attachment 300315 [details] [review]:

Seems correct.

For the PAUSED case this probably also behaves differently depending on whether the clock continues running in PAUSED (system clock) or stays at the same position (most audio clocks). What exactly are the symptoms with PAUSED? What running times do we output, and what would be expected at this point (clock_time - base_time)?

::: gst-libs/gst/rtp/gstrtpbasedepayload.c
@@ +588,2 @@
     segment.time = priv->npt_start;
+    segment.position = 0;

Shouldn't position be pts? Not that this value matters much anyway ;)
There is still an async-done, even if it's a live pipeline. And if it's a flushing seek, I would expect there to be an async-done after the seek as well, no?
Good point. The sinks will still flush, then lose state... then the pipeline should go to PAUSED (pending=PAUSED). Then later it prerolls again and you get async-done. Except that the source and other things stay in PLAYING while all this happens.
(In reply to Sebastian Dröge (slomo) from comment #11)
> Review of attachment 300315 [details] [review] [review]:
>
> Seems correct.
>
> For the PAUSED case this probably also behaves differently depending on
> whether the clock continues running in PAUSED (system clock) or stays at
> the same position (most audio clocks). What exactly are the symptoms with
> PAUSED? What running times do we output, and what would be expected at
> this point (clock_time - base_time)?

When we go to PAUSED, seek, and then go back to PLAYING, the jitterbuffer does not output anything anymore. I suspect the bug comes from the fact that seeking in this case is handled using a slightly different code path in rtspsrc. My intention today was to check that, and maybe make sure we have a single path. When going to PAUSED before seeking, part of the perform_seek() code is skipped and is then executed within the streaming thread loop. I was wondering if in fact everything should be done in the streaming thread. I believe doing blocking network operations in the seeker thread isn't ideal.

> ::: gst-libs/gst/rtp/gstrtpbasedepayload.c
> @@ +588,2 @@
>      segment.time = priv->npt_start;
> +    segment.position = 0;
>
> Shouldn't position be pts? Not that this value matters much anyway ;)

I was not sure, so I left it as it was before. Is it the position from the start, in which case it should be segment.start, hence the PTS, or the position as an offset from the start?
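For context, the segment under discussion looks roughly like this (a sketch of the patch, not the verbatim gstrtpbasedepayload.c code; pts stands for the timestamp of the first buffer, and npt_start comes from the RTSP NPT range):

  GstSegment segment;

  gst_segment_init (&segment, GST_FORMAT_TIME);
  segment.start = pts;                 /* running time of the first buffer */
  segment.stop = GST_CLOCK_TIME_NONE;
  segment.time = npt_start;            /* stream time / playback position */
  segment.position = 0;                /* the field debated above */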
(In reply to Tim-Philipp Müller from comment #12)
> There is still an async-done, even if it's a live pipeline. And if it's a
> flushing seek, I would expect there to be an async-done after the seek as
> well, no?

That needs to be checked. I'll have a look today.
(In reply to Sebastian Dröge (slomo) from comment #13)
> Good point. The sinks will still flush, then lose state... then the
> pipeline should go to PAUSED (pending=PAUSED). Then later it prerolls
> again and you get async-done. Except that the source and other things
> stay in PLAYING while all this happens.

Except there is no preroll, of course.
(In reply to Nicolas Dufresne (stormer) from comment #16)
> (In reply to Sebastian Dröge (slomo) from comment #13)
> > Good point. The sinks will still flush, then lose state... then the
> > pipeline should go to PAUSED (pending=PAUSED). Then later it prerolls
> > again and you get async-done. Except that the source and other things
> > stay in PLAYING while all this happens.
>
> Except there is no preroll, of course.

The sinks still preroll in live pipelines.
Created attachment 300399 [details] [review]
basedepayload: Fix generated segment

This fixes playback position in RTSP.
This update addresses the comment about position. It also fixes the code so that the unit test passes; the unit test was right that stop = -1 should result in segment.stop = -1 (oops).
Created attachment 300400 [details] [review]
basedepayload: Fix generated segment

This fixes playback position in RTSP.
Attachment 300317 [details] pushed as 84725d6 - rtspsrc: Fix segment in TCP mode
Attachment 300400 [details] pushed as 802ad73 - basedepayload: Fix generated segment
What I found yesterday is that in the pause/seek/play case, the internal UDP sources get set to PLAYING before we have pushed the segment. I think we should lock our internal elements' state and manage it ourselves, to avoid this kind of unwanted behaviour. We need the PLAY response to produce the proper caps, and those need to be set on the sources before they are set to PLAYING.
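Locking the state of an internal element so that the bin's state changes no longer propagate to it would look something like this (a sketch; udpsrc here stands for rtspsrc's internal source element):

  /* Keep the internal source out of the bin's state propagation and
   * drive it manually once the PLAY response has provided the caps. */
  gst_element_set_locked_state (udpsrc, TRUE);

  /* ... receive the PLAY response, set the resulting caps ... */

  gst_element_set_state (udpsrc, GST_STATE_PLAYING);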
Created attachment 300487 [details] [review]
basedepay: Handle initial gaps and no clock-base

When generating the segment, we can't assume the first buffer is actually the first expected one. If it's not, we need to adjust the segment to start a bit earlier. Additionally, if we don't know when the stream is supposed to have started (no clock-base in caps), it means we need to keep everything in running time and only rely on the jitterbuffer to synchronize.
Comment on attachment 300487 [details] [review]
basedepay: Handle initial gaps and no clock-base

Wrong bug.
Comment on attachment 300487 [details] [review]
basedepay: Handle initial gaps and no clock-base

Oops, actually, it's the right bug.
Created attachment 300493 [details] [review]
basedepay: Handle initial gaps and no clock-base

When generating the segment, we can't assume the first buffer is actually the first expected one. If it's not, we need to adjust the segment to start a bit earlier. Additionally, if we don't know when the stream is supposed to have started (no clock-base in caps), it means we need to keep everything in running time and only rely on the jitterbuffer to synchronize.
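The adjustment described above amounts to something like this (illustrative arithmetic only; the variable names are made up and this is not the verbatim patch):

  /* If the first received packet is not the first expected one,
   * back-date the segment start by the estimated duration of the
   * missing packets. */
  if (first_seqnum != expected_seqnum) {
    guint gap = first_seqnum - expected_seqnum;
    GstClockTime missing = gap * estimated_packet_duration;

    if (segment.start > missing)
      segment.start -= missing;
    else
      segment.start = 0;
  }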
Created attachment 300500 [details] [review]
basesrc: Flush-stop should not start task in paused

The flush-stop event should not restart the task unless the element is in PLAYING state. This was breaking seek in pause with rtspsrc.
Created attachment 300501 [details] [review]
basesrc: Flush-stop starts live task in paused

The flush-stop event should not restart the task for live sources unless the element is playing. This was breaking seeks in pause with rtspsrc.
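The failing sequence can be reproduced from application code roughly like this (a sketch; the pipeline handle and the seek target are placeholders):

  /* pause/seek/play: the sequence that used to break with live RTSP */
  gst_element_set_state (pipeline, GST_STATE_PAUSED);
  gst_element_get_state (pipeline, NULL, NULL, GST_CLOCK_TIME_NONE);

  gst_element_seek_simple (pipeline, GST_FORMAT_TIME,
      GST_SEEK_FLAG_FLUSH | GST_SEEK_FLAG_KEY_UNIT, 10 * GST_SECOND);

  gst_element_set_state (pipeline, GST_STATE_PLAYING);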
Attachment 300501 [details] pushed as 208696a - basesrc: Flush-stop starts live task in paused
Works now. Attachment 300493 [details] pushed as b7facba - basedepay: Handle initial gaps and no clock-base
802ad7310337a7436fea20486ee54d94b2142816 breaks live playback, apparently. Run this from gst-rtsp-server/examples:

./test-launch "( videotestsrc is-live=true ! video/x-raw,width=640,height=480 ! warptv ! videoconvert ! queue ! vp8enc ! rtpvp8pay name=pay0 pt=96 )"

and then just run playbin on the stream. That commit breaks it; changing to is-live=false makes it work again.
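(For the client side, something like gst-play-1.0 rtsp://127.0.0.1:8554/test or playbin in any application should do; if I remember correctly, test-launch serves the pipeline at /test on port 8554 by default.)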
Works fine with H264 and Theora, so this is specific to VP8.
Actually, I get the exact same playback behaviour (a burst followed by a pause) with VP8 regardless of the value of is-live.
Reverting back to the commit before 4b2142816 does not fix anything. How did you conclude that this patch was causing your issues? Would it be possible to describe the issues you are having?
After more testing, I observed that the problem I see goes away when the pattern is less complex (like pattern=18). In that case, the problem was caused by hidden latency in vp8enc, which can be fixed by setting a deadline:

test-launch "( videotestsrc is-live=0 ! video/x-raw,width=640,height=480 ! queue ! vp8enc deadline=1 ! rtpvp8pay name=pay0 )"

When deadline is 0 (the default), there is no way to guess what vp8enc's latency will be. With deadline > 0, the latency comes under control. I don't think it makes any sense to reopen this bug for this, as it's not the same kind of issue (it's not actually an issue, just a misconfigured pipeline for live).
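From application code, the same workaround is just setting the property (a sketch; vp8enc here is the encoder element instance):

  /* vp8enc's "deadline" is the per-frame encoding deadline in
   * microseconds; 0 lets the encoder take as long as it wants,
   * which makes its latency unpredictable. */
  g_object_set (vp8enc, "deadline", G_GINT64_CONSTANT (1), NULL);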
Ok, then the H264 problem is maybe something different :) It happens even more often here when you use 25fps instead of 30fps (or even 10fps!), and 802ad7310337a7436fea20486ee54d94b2142816 is the difference between it working fine or not.

./test-launch "videotestsrc is-live=true pattern=18 ! timeoverlay ! video/x-raw,width=1280,height=720,framerate=25/1 ! x264enc tune=zerolatency ! h264parse ! rtph264pay name=pay0 pt=96"
And remove the tune=zerolatency to actually make it fail :) This seems to be some kind of race condition.
Can you explain in which way it fails? Just saying it fails isn't useful; as proof, I could not reproduce it last time, and reverting the patch you pointed to didn't make any difference.
I have been testing a bit with slomo today. Failing in this context means that simply playing the stream with gst-play-1.0 will constantly drop a large number of frames (i.e. it stutters heavily). A few other remarks:

- Using a machine with more CPU cores makes the problem worse (e.g. we tested on a 12-core Xeon; limiting the number of CPU cores via taskset for the test-launch process makes the problem go away).
- It only happens with gst-play-1.0 from GStreamer HEAD; running test-launch with slomo's test arguments above and a gst-play-1.0 from 1.4.5 stable will not drop any frames.
- When running test-launch and gst-play-1.0 on separate machines, different timing through network latency might hide the race condition, although I haven't tried using lower framerates on separate machines (which makes the problem worse in general). So for reproducing this issue, I'd recommend running server and client on the same machine.

Reverting gst-plugins-base back to d54d51d (just before this bug's commit) makes the problem go away for me as well.
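(To pin the server to a single core as described above, something like taskset -c 0 ./test-launch "..." should do; the exact core list is of course machine-dependent.)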
@stormer: You can probably reproduce it using the test-launch pipeline slomo talked about and different framerates. Here, 25fps runs fine and 20fps stutters, giving exactly 1fps.
It looks like things are coming 200-250ms too late out of the jitterbuffer.
No, actually the problem is that the segment event is wrong, because of the adjustment in rtpbasedepayload. Looking into why now.
And it's because of a gap of 2 packets between seqnum-base and the actual first received packet, which, due to interesting calculations, gives a start position of 2.43s in the segment instead of 0s.
commit a73631a29dccdc123a0072c2b7f9a594d1c6222a
Author: Sebastian Dröge <sebastian@centricular.com>
Date:   Tue May 5 15:35:46 2015 +0200

    streamsynchronizer: Don't override segment.base from upstream with 0

    Upstream might want to use it to properly map timestamps to
    running/stream times; if we just override it with 0, synchronization
    will be just wrong. For this we remove some old 0.10 code related to
    segment accumulation, remove some more code that is useless now, and
    accumulate the group start time (aka the segment.base offset)
    manually now.

    https://bugzilla.gnome.org/show_bug.cgi?id=635701