GNOME Bugzilla – Bug 635701
rtspsrc: seeking is broken
Last modified: 2015-05-05 13:37:39 UTC
Try this test stream with GStreamer from git in totem:

rtsp://195.134.224.14/data/ByContainer/Video/MP4/h263/CIF/AAC/MP4_h263_CIF_30FPS_445Kbps_AAC_44.1KHz_127Kbps_60sec(4.1Mb)_BBB(hinted).mp4

It's indicated as seekable, but most attempts at changing the position lead to audio and video being stuck. The server is supposed to be Helix 14. Not sure if there is a server-side issue as well; with most other media players I don't even get that far :)

As pointed out by Stefan, gst-discoverer chokes on it as well:

Analyzing rtsp://195.134.224.14/data/ByContainer/Video/MP4/h263/CIF/AAC/MP4_h263_CIF_30FPS_445Kbps_AAC_44.1KHz_127Kbps_60sec(4.1Mb)_BBB(hinted).mp4

(gst-discoverer:31836): GLib-GObject-WARNING **: instance with invalid (NULL) class pointer
(gst-discoverer:31836): GLib-GObject-CRITICAL **: g_signal_handler_disconnect: assertion `G_TYPE_CHECK_INSTANCE (instance)' failed
(gst-discoverer:31836): GLib-GObject-WARNING **: instance with invalid (NULL) class pointer
(gst-discoverer:31836): GLib-GObject-CRITICAL **: g_signal_handler_disconnect: assertion `G_TYPE_CHECK_INSTANCE (instance)' failed

Done discovering rtsp://195.134.224.14/data/ByContainer/Video/MP4/h263/CIF/AAC/MP4_h263_CIF_30FPS_445Kbps_AAC_44.1KHz_127Kbps_60sec(4.1Mb)_BBB(hinted).mp4
Apparently Stefan already has the fix for the discoverer problem at least :)
Moving to rtspsrc for now.
*** Bug 673490 has been marked as a duplicate of this bug. ***
*** Bug 740596 has been marked as a duplicate of this bug. ***
*** Bug 740509 has been marked as a duplicate of this bug. ***
Created attachment 300315 [details] [review]
basedepayload: Fix generated segment

This fixes playback position in RTSP.
Created attachment 300317 [details] [review]
rtspsrc: Fix segment in TCP mode

It is expected that buffers are timestamped in running time. Set a segment accordingly. In this case we pick 0,-1, as this is what udpsrc would do. Depayloaders will update the segment to reflect the playback position.
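For illustration, the idea of the fix amounts to something like the following sketch (srcpad is a placeholder; this is not the actual rtspsrc code):

  /* Push a 0,-1 time segment so buffer timestamps are interpreted
   * as running time, matching what udpsrc-based delivery produces. */
  GstSegment segment;
  GstEvent *event;

  gst_segment_init (&segment, GST_FORMAT_TIME);
  segment.start = 0;                   /* the "0" ... */
  segment.stop = GST_CLOCK_TIME_NONE;  /* ... and the "-1" */
  event = gst_event_new_segment (&segment);
  gst_pad_push_event (srcpad, event);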
One blocker issue remains: if you pause, seek, then play, it does not work. I will have a look tomorrow. It's also broken in totem, which I suspect always seeks while paused. Hopefully it's not waiting for an async-done; since this source is live, that will never happen.
Also test with gst-plugins-base/tests/examples/playback-test :) Going back to PAUSED before seeking is the only way to guarantee a deterministic change of states and arrival of messages; that's why totem (and other code) do that. If they wait for async-done, that's a problem here indeed; instead they should wait for the state-change messages, I guess.
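For what it's worth, a minimal sketch of the difference (assuming an application with a pipeline handle; this is illustrative, not totem's actual code):

  /* Instead of blocking on ASYNC_DONE (which may never arrive here),
   * wait for the STATE_CHANGED message coming from the pipeline itself. */
  GstBus *bus = gst_element_get_bus (pipeline);
  GstMessage *msg;

  while ((msg = gst_bus_timed_pop_filtered (bus, GST_CLOCK_TIME_NONE,
              GST_MESSAGE_STATE_CHANGED | GST_MESSAGE_ERROR))) {
    gboolean done = (GST_MESSAGE_TYPE (msg) == GST_MESSAGE_ERROR ||
        GST_MESSAGE_SRC (msg) == GST_OBJECT (pipeline));
    gst_message_unref (msg);
    if (done)
      break;
  }
  gst_object_unref (bus);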
Comment on attachment 300317 [details] [review]
rtspsrc: Fix segment in TCP mode

Seems reasonable; now at least UDP and TCP work exactly the same.
Review of attachment 300315 [details] [review]:

Seems correct.

For the PAUSED case this probably also behaves differently depending on whether the clock continues running in PAUSED (system clock) or stays at the same position (most audio clocks). What exactly are the symptoms with PAUSED? What running times do we output, and what would be expected at this point (clock_time - base_time)?

::: gst-libs/gst/rtp/gstrtpbasedepayload.c
@@ +588,2 @@
     segment.time = priv->npt_start;
+    segment.position = 0;

Shouldn't position be pts? Not that this value matters much anyway ;)
There is still an async-done, even if it's a live pipeline. And if it's a flushing seek, I would expect there to be an async-done after the seek as well, no?
Good point. The sinks will still flush, then lose state... then the pipeline should go to PAUSED (pending=PAUSED). Then later it prerolls again and you get async-done. Except that the source and other things stay in PLAYING while all this happens.
(In reply to Sebastian Dröge (slomo) from comment #11)
> Review of attachment 300315 [details] [review] [review]:
>
> Seems correct.
>
> For the PAUSED case this probably also behaves differently depending on
> whether the clock continues running in PAUSED (system clock) or stays at
> the same position (most audio clocks). What exactly are the symptoms with
> PAUSED? What running times do we output, and what would be expected at
> this point (clock_time - base_time)?

When we go to PAUSED, seek, and then go back to PLAYING, the jitterbuffer does not output anything anymore. I suspect the bug comes from the fact that seeking in this case is handled using a slightly different code path in rtspsrc. My intention today was to check that, and maybe make sure we have a single path. When going to PAUSED before seeking, part of the perform_seek() code is skipped and is then executed within the streaming thread loop. I was wondering if in fact everything should be done in the streaming thread. I believe doing blocking network operations in the seeker thread isn't ideal.

> ::: gst-libs/gst/rtp/gstrtpbasedepayload.c
> @@ +588,2 @@
>      segment.time = priv->npt_start;
> +    segment.position = 0;
>
> Shouldn't position be pts? Not that this value matters much anyway ;)

I was not sure, so I left it as it was before. Is it the position from the start, in which case it should be segment.start, hence the PTS, or the position as an offset from the start?
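For context, the segment under discussion looks roughly like this (a sketch of the patch, not the verbatim gstrtpbasedepayload.c code; pts stands for the timestamp of the first buffer, and npt_start comes from the RTSP NPT range):

  GstSegment segment;

  gst_segment_init (&segment, GST_FORMAT_TIME);
  segment.start = pts;                 /* running time of the first buffer */
  segment.stop = GST_CLOCK_TIME_NONE;
  segment.time = npt_start;            /* stream time / playback position */
  segment.position = 0;                /* the field debated above */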
(In reply to Tim-Philipp Müller from comment #12)
> There is still an async-done, even if it's a live pipeline. And if it's a
> flushing seek, I would expect there to be an async-done after the seek as
> well, no?

That needs to be checked. I'll have a look today.
(In reply to Sebastian Dröge (slomo) from comment #13)
> Good point. The sinks will still flush, then lose state... then the
> pipeline should go to PAUSED (pending=PAUSED). Then later it prerolls
> again and you get async-done. Except that the source and other things
> stay in PLAYING while all this happens.

Except there is no preroll, of course.
(In reply to Nicolas Dufresne (stormer) from comment #16)
> (In reply to Sebastian Dröge (slomo) from comment #13)
> > Good point. The sinks will still flush, then lose state... then the
> > pipeline should go to PAUSED (pending=PAUSED). Then later it prerolls
> > again and you get async-done. Except that the source and other things
> > stay in PLAYING while all this happens.
>
> Except there is no preroll, of course.

The sinks still preroll in live pipelines.
Created attachment 300399 [details] [review]
basedepayload: Fix generated segment

This fixes playback position in RTSP.
This update addresses the comment about position. It also fixes the code so that the unit test passes; the unit test was right that stop = -1 should result in segment.stop = -1 (oops).
Created attachment 300400 [details] [review]
basedepayload: Fix generated segment

This fixes playback position in RTSP.
Attachment 300317 [details] pushed as 84725d6 - rtspsrc: Fix segment in TCP mode
Attachment 300400 [details] pushed as 802ad73 - basedepayload: Fix generated segment
What I found yesterday is that in the pause/seek/play case, the internal UDP sources get set to PLAYING before we have pushed the segment. I think we should lock our internal elements' state and manage it ourselves, to avoid this kind of unwanted behaviour. We need the PLAY response to produce the proper caps, and those need to be set on the sources before they are set to PLAYING.
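Locking the state of an internal element so that the bin's state changes no longer propagate to it would look something like this (a sketch; udpsrc here stands for rtspsrc's internal source element):

  /* Keep the internal source out of the bin's state propagation and
   * drive it manually once the PLAY response has provided the caps. */
  gst_element_set_locked_state (udpsrc, TRUE);

  /* ... receive the PLAY response, set the resulting caps ... */

  gst_element_set_state (udpsrc, GST_STATE_PLAYING);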
Created attachment 300487 [details] [review]
basedepay: Handle initial gaps and no clock-base

When generating the segment, we can't assume the first buffer is actually the first expected one. If it's not, we need to adjust the segment to start a bit earlier. Additionally, if we don't know when the stream is supposed to have started (no clock-base in caps), it means we need to keep everything in running time and only rely on the jitterbuffer to synchronize.
Comment on attachment 300487 [details] [review]
basedepay: Handle initial gaps and no clock-base

Wrong bug.
Comment on attachment 300487 [details] [review]
basedepay: Handle initial gaps and no clock-base

Oops, actually, it's the right bug.
Created attachment 300493 [details] [review]
basedepay: Handle initial gaps and no clock-base

When generating the segment, we can't assume the first buffer is actually the first expected one. If it's not, we need to adjust the segment to start a bit earlier. Additionally, if we don't know when the stream is supposed to have started (no clock-base in caps), it means we need to keep everything in running time and only rely on the jitterbuffer to synchronize.
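The adjustment described above amounts to something like this (illustrative arithmetic only; the variable names are made up and this is not the verbatim patch):

  /* If the first received packet is not the first expected one,
   * back-date the segment start by the estimated duration of the
   * missing packets. */
  if (first_seqnum != expected_seqnum) {
    guint gap = first_seqnum - expected_seqnum;
    GstClockTime missing = gap * estimated_packet_duration;

    if (segment.start > missing)
      segment.start -= missing;
    else
      segment.start = 0;
  }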
Created attachment 300500 [details] [review]
basesrc: Flush-stop should not start task in paused

The flush-stop event should not restart the task unless the element is in PLAYING state. This was breaking seek in pause with rtspsrc.
Created attachment 300501 [details] [review]
basesrc: Flush-stop starts live task in paused

The flush-stop event should not restart the task for live sources unless the element is playing. This was breaking seeks in pause with rtspsrc.
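The failing sequence can be reproduced from application code roughly like this (a sketch; the pipeline handle and the seek target are placeholders):

  /* pause/seek/play: the sequence that used to break with live RTSP */
  gst_element_set_state (pipeline, GST_STATE_PAUSED);
  gst_element_get_state (pipeline, NULL, NULL, GST_CLOCK_TIME_NONE);

  gst_element_seek_simple (pipeline, GST_FORMAT_TIME,
      GST_SEEK_FLAG_FLUSH | GST_SEEK_FLAG_KEY_UNIT, 10 * GST_SECOND);

  gst_element_set_state (pipeline, GST_STATE_PLAYING);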
Attachment 300501 [details] pushed as 208696a - basesrc: Flush-stop starts live task in paused
Works now. Attachment 300493 [details] pushed as b7facba - basedepay: Handle initial gaps and no clock-base
802ad7310337a7436fea20486ee54d94b2142816 breaks live playback, apparently. Run this from gst-rtsp-server/examples:

./test-launch "( videotestsrc is-live=true ! video/x-raw,width=640,height=480 ! warptv ! videoconvert ! queue ! vp8enc ! rtpvp8pay name=pay0 pt=96 )"

and then just run playbin on the stream. That commit breaks it; changing to is-live=false makes it work again.
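(For the client side, something like gst-play-1.0 rtsp://127.0.0.1:8554/test or playbin in any application should do; if I remember correctly, test-launch serves the pipeline at /test on port 8554 by default.)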
Works fine with H264 and Theora, so this is specific to VP8.
Actually, I get the exact same playback behaviour (a burst followed by a pause) with VP8 regardless of the value of is-live.
Reverting back to the commit before 4b2142816 does not fix anything. How did you conclude that this patch was causing your issues? Would it be possible to describe the issues you are having?
After more testing, I observed that the problem I see goes away when the pattern is less complex (like pattern=18). In that case, the problem was caused by hidden latency in vp8enc, which can be fixed by setting a deadline:

test-launch "( videotestsrc is-live=0 ! video/x-raw,width=640,height=480 ! queue ! vp8enc deadline=1 ! rtpvp8pay name=pay0 )"

When deadline is 0 (the default), there is no way to guess what vp8enc's latency will be. With deadline > 0, the latency comes under control. I don't think it makes any sense to reopen this bug for this, as it's not the same kind of issue (it's not actually an issue, just a misconfigured pipeline for live).
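From application code, the same workaround is just setting the property (a sketch; vp8enc here is the encoder element instance):

  /* vp8enc's "deadline" is the per-frame encoding deadline in
   * microseconds; 0 lets the encoder take as long as it wants,
   * which makes its latency unpredictable. */
  g_object_set (vp8enc, "deadline", G_GINT64_CONSTANT (1), NULL);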
Ok, then the H264 problem is maybe something different :) It happens even more often here when you use 25fps instead of 30fps (or even 10fps!), and 802ad7310337a7436fea20486ee54d94b2142816 is the difference between it working fine or not.

./test-launch "videotestsrc is-live=true pattern=18 ! timeoverlay ! video/x-raw,width=1280,height=720,framerate=25/1 ! x264enc tune=zerolatency ! h264parse ! rtph264pay name=pay0 pt=96"
And remove the tune=zerolatency to actually make it fail :) This seems to be some kind of race condition.
Can you explain in which way it fails? Just saying it fails isn't useful; as proof, I could not reproduce it last time, and reverting the patch you pointed to didn't make any difference.
I have been testing a bit with slomo today. Failing in this context means that simply playing the stream with gst-play-1.0 will constantly drop a large number of frames (i.e. it stutters heavily). A few other remarks:

- Using a machine with more CPU cores makes the problem worse (e.g. we tested on a 12-core Xeon; limiting the number of CPU cores via taskset for the test-launch process makes the problem go away).
- It only happens with gst-play-1.0 from GStreamer HEAD; running test-launch with slomo's test arguments above and a gst-play-1.0 from 1.4.5 stable will not drop any frames.
- When running test-launch and gst-play-1.0 on separate machines, different timing through network latency might hide the race condition, although I haven't tried using lower framerates on separate machines (which makes the problem worse in general). So for reproducing this issue, I'd recommend running server and client on the same machine.

Reverting gst-plugins-base back to d54d51d (just before this bug's commit) makes the problem go away for me as well.
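(To pin the server to a single core as described above, something like taskset -c 0 ./test-launch "..." should do; the exact core list is of course machine-dependent.)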
@stormer: You can probably reproduce it using the test-launch pipeline slomo talked about and different framerates. Here, 25fps runs fine and 20fps stutters, giving exactly 1fps.
It looks like things are coming 200-250ms too late out of the jitterbuffer.
No, actually the problem is that the segment event is wrong, because of the adjustment in rtpbasedepayload. Looking into why now.
And it's because of a gap of 2 packets between seqnum-base and the actual first received packet, which, due to interesting calculations, gives a start position of 2.43s in the segment instead of 0s.
commit a73631a29dccdc123a0072c2b7f9a594d1c6222a
Author: Sebastian Dröge <sebastian@centricular.com>
Date:   Tue May 5 15:35:46 2015 +0200

    streamsynchronizer: Don't override segment.base from upstream with 0

    Upstream might want to use it to properly map timestamps to
    running/stream times; if we just override it with 0, synchronization
    will be just wrong. For this we remove some old 0.10 code related to
    segment accumulation, remove some more code that is useless now, and
    accumulate the group start time (aka the segment.base offset)
    manually now.

    https://bugzilla.gnome.org/show_bug.cgi?id=635701