GNOME Bugzilla – Bug 640610
basesink: QoS events are wrong in live pipelines
Last modified: 2019-01-06 03:01:27 UTC
We have two problem cases:

1. "videotestsrc is-live=1 ! xvimagesink"
If one looks at the QoS events this generates, "diff" is always > 0, so the sink thinks all frames are late. Also, "proportion" is always just slightly above 1, but in fact my computer is very fast and could probably do much more processing, so "proportion" should be close to 0.

2. "videotestsrc is-live=1 ! element-that-drops-framerate-and-duration ! xvimagesink"
That magical element could be a network using RTP. Since the sink doesn't know the framerate or duration, the "stop" time is always NONE, so the only thing that prevents frames from being dropped is "max-lateness", which is often not enough if there is enough processing happening. Also, in the QoS event, the "proportion" computation is completely off, especially if it knows the duration of some frames but not others.

I think the cause of these two problems is that we ignore the compute latency. There should be a global variable somewhere that is added to the latency of every live pipeline, instead of ignoring the problem like we currently do.
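For reference, here is a minimal sketch (using the GStreamer 1.x API, so an assumption for the 0.10 pipelines quoted above) of how one can observe the "diff" and "proportion" values being discussed, via a pad probe on an upstream element's src pad; the function name is illustrative:

#include <gst/gst.h>

/* Pad probe that prints the QoS event fields discussed above. */
static GstPadProbeReturn
print_qos_event (GstPad * pad, GstPadProbeInfo * info, gpointer user_data)
{
  GstEvent *event = GST_PAD_PROBE_INFO_EVENT (info);

  if (GST_EVENT_TYPE (event) == GST_EVENT_QOS) {
    GstQOSType type;
    gdouble proportion;
    GstClockTimeDiff diff;
    GstClockTime timestamp;

    gst_event_parse_qos (event, &type, &proportion, &diff, &timestamp);
    /* diff > 0 means the buffer arrived late at the sink; proportion is
     * the rate adjustment the sink suggests to upstream elements. */
    g_print ("QoS: proportion=%f diff=%" G_GINT64_FORMAT "\n",
        proportion, diff);
  }

  return GST_PAD_PROBE_OK;
}

Attach it with gst_pad_add_probe (srcpad, GST_PAD_PROBE_TYPE_EVENT_UPSTREAM, print_qos_event, NULL, NULL), since QoS events travel upstream from the sink.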
Comments regarding the first pipeline:
* It's just above 1.0 because videotestsrc doesn't report the image creation time (compute latency).
* For a live pipeline with both consumers and producers (once you take latency into account), the ideal target is indeed 1.0. 1.0 means the buffers are always arriving "on time" at the sinks (not too early, not too late).
* If videotestsrc were to take into account the average compute latency (after some buffers) and report that, you'd end up with a perfect 1.0 ... But you don't want to report that compute latency! If you have a CPU usage spike all of a sudden, you *want* to drop to catch up. If you start reporting the CPU usage as latency... you'll never drop :)
Created attachment 179362 [details] [review] basesink: calculate rstop when get_times() can't The solution to the second problem IMO is to calculate the stop time by using the previous start time as reference.
For the record, I dislike any solution that estimates the current framerate. Maybe if it's unknown, basesink should just never drop buffers and let upstream elements do it based on QoS events?
The problem with having proportion tend towards 1.0 in a live pipeline is that upstream elements can't know if they're able to do extra processing or not..
Also, I didn't mean report the compute latency; I meant allow for some compute latency before we start dropping frames... Currently it does that in secret (by allowing 1/fps of compute latency), but without a known fps this fails. We should have an acceptable compute latency instead.
And how would this "compute latency" be calculated? Should each encoder/decoder be aware of how much maximum latency the algorithm would take for the specified settings? The only way to find that out is to actually measure different configurations, make some guesses and write some heuristics. Do we expect everyone writing codecs to do that?

People would either put random values, or calculate this latency on-the-fly. And if this latency is calculated on-the-fly, as Edward mentioned, there might be spikes, at which point you would be dropping buffers unnecessarily.

Supposing an extreme case where the "compute latency" is calculated by perfect heuristics based on profile, level, framesize, etc., and there's a very low framerate: chances are this "compute latency" would be way below the real limit (second / fps). If something else in the system slows things down we again might end up dropping buffers unnecessarily.

My proposed patch doesn't have any of those issues.
No, my idea is that the "acceptable compute latency" is an arbitrary number decided by the application, which knows how much is acceptable and how much is not. If the CPU can't keep up, frames will be dropped (and as early as possible in the pipeline if encoders/decoders use the QoS events).

Edward also came up with another simple case which always drops frames:
videotestsrc is-live=True ! jpegenc ! jpegdec ! xvimagesink
Comment on attachment 179362 [details] [review] basesink: calculate rstop when get_times() can't This is unrelated to the issue at hand. Even with valid framerate and buffer durations we would still drop frames.
(In reply to comment #8) > (From update of attachment 179362 [details] [review]) > This is unrelated to the issue at hand. Even with valid framerate and buffer > durations we would still drop frames. Ok, created bug #640665 then, to make it the issue at hand there.
Been pondering this some more. From part-latency.txt: "The latency is the time it takes for a sample captured at timestamp 0 to reach the sink."

The fact is that elements that report latencies can only report latencies for which they are certain (in the case of audio/video sources, how big the capture buffers are; in the case of decoders, how many buffers they will need to buffer up before they can decode them; in the case of rtpjitterbuffer, how long it will delay the packets, ...). Those are *known* latencies (which can be updated, but there are determined/explicit formulas for calculating them based on the input). They do *not* depend on cpu speed/availability.

If you had an element that was doing some processing backed with a dsp with guaranteed/constant processing time based on input, you *could* report that latency (audio processing algorithms come to mind here).

The reason we don't add the processing time is that we can't know for sure whether that processing time is too much or not (you might be sharing the cpu with other processes doing cpu-intensive tasks, the codec might be going through a high processing phase, ...).

What we *should* end up with is:

  sync_time = running_timestamp + reported_delay_latency (+ expected_processing_latency)

where reported_delay_latency is what the latency GstQuery currently returns.

The problem is: how do we calculate that expected_processing_latency over which we need to start dropping? Elements can't figure that out on their own (except in rare circumstances), since it would require figuring out the processing complexity based on input data and available cpu speed. Therefore we do need to offer a way for pipeline users to provide an expected_processing_latency.

This will be somewhat tricky, since we'll need to ensure not to *exceed* the minimum of reported maximum latencies, else we'll end up with live sources under-running. Or maybe we shouldn't care and have application developers insert queue elements to compensate for that.

Something like gst_bin_set_additional_min_latency() could do it. And it would increase the min_latency in bin_query_latency_done().
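To make the proposal concrete, here is a rough sketch of how an application might use the suggested call; gst_bin_set_additional_min_latency() was never added to GStreamer, so this is purely hypothetical and only illustrates the idea:

/* Hypothetical API from the proposal above; it does not exist in GStreamer. */
gst_bin_set_additional_min_latency (GST_BIN (pipeline), 20 * GST_MSECOND);

/* The sink would then synchronise against:
 *   sync_time = running_timestamp + reported_delay_latency
 *               + expected_processing_latency
 * and only start dropping buffers once that deadline is exceeded. */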
(In reply to comment #10)
> The Fact is that elements that report latencies can only report latencies for
> which they are certain (in the case of an audio/video sources how big the
> capture buffers are, in the case of decoders how many buffers they will need to
> buffer up before they can decode them, in the case of rtpjitterbuffer how long
> it will delay the packets,...).
> Those are *known* latencies (which can be updated, but there are
> determined/explicit formulas for calculating them based on the input). They do
> *not* depend on cpu speed/availability.

But of course, in order to calculate those latencies the framerate is needed, which in the rtp example it is not.

> If you had an element that was doing some processing backed with a dsp with
> guaranteed/constant processing time based on input, you *could* report that
> latency (audio processing algorithms come to mind here).

Just to bring this a bit closer to reality: the DSP is not a magic box; just like any other CPU, it cannot know beforehand exactly how much time a certain input needs to be processed. The best you can say (if the DSP is running a real-time OS) is that _after_ the processing you would receive the response after exactly a certain amount of time. However, by _you_ I mean the GPP, which of course runs linux, which is not real-time, and there are tons of things that could delay such a response.

IOW: there's no way to report that latency.
(In reply to comment #11)
> (In reply to comment #10)
> > The Fact is that elements that report latencies can only report latencies for
> > which they are certain (in the case of an audio/video sources how big the
> > capture buffers are, in the case of decoders how many buffers they will need to
> > buffer up before they can decode them, in the case of rtpjitterbuffer how long
> > it will delay the packets,...).
> > Those are *known* latencies (which can be updated, but there are
> > determined/explicit formulas for calculating them based on the input). They do
> > *not* depend on cpu speed/availability.
>
> But of course in order to calculate those latencies, the framerate is needed,
> which in the rtp example, it's not.

If you have latencies that can be calculated from the framerate, then I guess you have to know the framerate. Or if it's something like RTP, I guess you have to set an arbitrary deadline and drop frames if they are late according to that deadline.

> > If you had an element that was doing some processing backed with a dsp with
> > guaranteed/constant processing time based on input, you *could* report that
> > latency (audio processing algorithms come to mind here).
>
> Just to bring this a bit closer to reality. The DSP is not a magic box, just
> like any other CPU, it cannot know beforehand exactly how much time a certain
> input need to be processed. The best you can say (if the DSP is running a
> real-time OS), is that _after_ the processing you would receive the response
> after exactly a certain amount of time. However, by _you_ I mean the GPP, which
> of course runs linux, which is not real-time and there's tons of things that
> could delay such response.
>
> IOW; there's no way to report that latency.

For the processing/transfer latency related to the DSP, I think this is just like any other processing latency, and the application (maybe the platform?) should just use a higher value for the minimum_latency.

Otherwise I think Edward's approach is sound. I would call it processing_deadline or max_processing_time though; otherwise it's quite confusing if you don't understand the internals of GStreamer's latency processing.
Yes, deadline is a better name.
And how do you calculate this "processing_deadline" if you have no duration? This deadline is essentially the rate at which the buffers are coming from the network; this buffer must finish processing before the next one comes. Of course, you never know when that is going to be. The best you can do is say "the amount of time it will take for the next buffer to arrive is probably around the amount of time this buffer took to arrive". That is the most accurate "processing_deadline"; anything else would be more unreliable.
No no, the deadline is not computed. It's arbitrarily decided by the application. And I would say it can be more than the "duration" of one frame (assuming that some frames are easier to decode than others).
(In reply to comment #15)
> No no, the deadline is not computed. It's arbitrarily decided by the
> application. And I would say it can be more than the "duration" of one frame
> (assuming that some frames are easier to decode than others).

And what if the application doesn't set it (e.g. gst-launch)? It would be 0, right? Or what if the application gets it wrong (2 times the duration)? Everything would go to hell if the processing is too heavy, right?
I would suggest having a default that's not zero, maybe 50ms or something, and maybe we could have an environment variable to change the default for the whole system/session. If the application sets it too high, it will just queue up more frames before playing them, so instead of dropping 1 frame out of 2 if you have half the necessary speed, it would drop 2 out of 4 (for example).
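Purely as an illustration of that idea (no such environment variable or helper exists in GStreamer; both names are made up), the default could be resolved like this:

#include <gst/gst.h>

/* Hypothetical helper: resolve the default processing deadline, allowing
 * an environment variable to override it for the whole system/session. */
static GstClockTime
get_default_processing_deadline (void)
{
  const gchar *env = g_getenv ("GST_PROCESSING_DEADLINE_MS");

  if (env != NULL)
    return g_ascii_strtoull (env, NULL, 10) * GST_MSECOND;

  /* Fall back to the 50 ms suggested above. */
  return 50 * GST_MSECOND;
}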
Created attachment 302899 [details] [review] basesink: Account for max-lateness in QoS event

The QoS event contains the diff between the target time on the clock and the actual time the buffer was received in the sink. When the diff is negative, an element like fpsdisplaysink assumes that the buffer was dropped. The problem is that it is only dropped by the sink if the diff exceeds the configured max-lateness. Subtracting this amount from the diff fixed fpsdisplaysink for the live case where buffers arrived on target or slightly after, while all being rendered.
This is a tentative fix; I'm looking for input on it.
I don't think that's correct. The buffers *are* too late, and the diff shows how much they are too late.
Created attachment 302900 [details] [review] basesink: Add processing deadline

Proposing a different approach from Nicolas's, which matches what I suggested a couple years ago. In this approach, we just add an arbitrary number to the latency. My original idea was to do that in GstBin, but it may make more sense to do it in each sink, so they can change the default value.

The processing deadline is the acceptable amount of time to process the media in a live pipeline before it reaches the sink. This is on top of the algorithmic latency that is normally reported by the latency query.

This should make pipelines such as "v4lsrc ! xvimagesink" not claim that all frames are late in the QoS events. Ideally, this should replace max_lateness for most applications.
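For illustration, here is a minimal, self-contained sketch of how an application could use what this patch adds, assuming the "processing-deadline" property on basesink; the pipeline string and the 30 ms value are only examples:

#include <gst/gst.h>

int
main (int argc, char **argv)
{
  GstElement *pipeline, *sink;
  GstBus *bus;
  GstMessage *msg;

  gst_init (&argc, &argv);

  /* Any live source and sync'ing sink would do here. */
  pipeline = gst_parse_launch (
      "v4l2src ! videoconvert ! xvimagesink name=sink", NULL);
  sink = gst_bin_get_by_name (GST_BIN (pipeline), "sink");

  /* Allow up to 30 ms of processing time between the live source and the
   * sink; this is added to the pipeline latency instead of making every
   * frame look late in the QoS events. */
  g_object_set (sink, "processing-deadline", (guint64) (30 * GST_MSECOND),
      NULL);

  gst_element_set_state (pipeline, GST_STATE_PLAYING);

  bus = gst_element_get_bus (pipeline);
  msg = gst_bus_timed_pop_filtered (bus, GST_CLOCK_TIME_NONE,
      GST_MESSAGE_ERROR | GST_MESSAGE_EOS);
  if (msg != NULL)
    gst_message_unref (msg);

  gst_element_set_state (pipeline, GST_STATE_NULL);
  gst_object_unref (bus);
  gst_object_unref (sink);
  gst_object_unref (pipeline);
  return 0;
}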
We could expose dropped/rendered counts in the QoS event then, since diff will never tell if the buffer was rendered or not. Actually, even with all these proposals, the dropped/rendered count will be wrong in fpsdisplaysink, since we might be running with sync=false.
Created attachment 314690 [details] [review] basesink: Add processing deadline Updated version of the patch which posts a WARNING message if the processing deadline exceeds what the pipeline can buffer. Do you think we should merge this? If not, why?
Review of attachment 314690 [details] [review]:

::: libs/gst/base/gstbasesink.c

@@ +545,3 @@
+ * for processing the buffer. This is added to the latency of live
+ * pipelines.
+ */

Since: 1.8
Maybe explain when you would want to use that, especially compared to max-lateness.

@@ +1404,3 @@
+ * Maximum amount of time (in nanoseconds) that the pipeline can take
+ * for processing the buffer. This is added to the latency of live
+ * pipelines.

How would a subclass decide on a good value for this? Based on how much time it thinks it will take to get buffers from render() to the display? Or prepare() to render()?

@@ +1406,3 @@
+ * pipelines.
+ *
+ * This function is usually called by subclasses.

Since: 1.8

@@ +1437,3 @@
+ * the processing deadline.
+ *
+ * Returns: the processing deadline

Since: 1.8
(In reply to Sebastian Dröge (slomo) from comment #24)
> @@ +1404,3 @@
> + * Maximum amount of time (in nanoseconds) that the pipeline can take
> + * for processing the buffer. This is added to the latency of live
> + * pipelines.
>
> How would a subclass decide on a good value for this? Based on how much time
> it thinks it will take to get buffers from render() to the display? Or
> prepare() to render()?

Maybe we shouldn't expose it; it's mostly meant for subclasses that want to override the "processing-deadline" property. The value is really meant to be selected by the application as an acceptable deadline. My original idea was to set this and default max-lateness to 0, but I'm worried it would have unintended side-effects.
Coming back to this. After giving it more thought, I think we should default this to 0 and have GstVideoSink set:
- max-lateness = 5ms
- processing-deadline = 15ms

This gives it some time to do decoding, but also gives it a bit of leeway. And it shouldn't modify the behaviour of existing pipelines, except that they will report a bit more latency and give a warning if they have zero queuing. This also makes the frame submission times a lot smoother in the cases where there is no latency, so it's much likelier that it will block on the clock.
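Roughly, the defaults proposed here would look something like this in a video sink's instance init; this is only a sketch, MyVideoSink is a placeholder type, and the setters assume the API added by the basesink patch:

/* Sketch of the proposed GstVideoSink defaults; MyVideoSink is a
 * hypothetical subclass used only for illustration. */
static void
my_video_sink_init (MyVideoSink * self)
{
  GstBaseSink *basesink = GST_BASE_SINK (self);

  /* Allow up to 15 ms of processing time upstream of the sink... */
  gst_base_sink_set_processing_deadline (basesink, 15 * GST_MSECOND);
  /* ...and keep 5 ms of leeway before a frame is actually dropped. */
  gst_base_sink_set_max_lateness (basesink, 5 * GST_MSECOND);
}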
Created attachment 361967 [details] [review] basesink: Add processing deadline Same as previous version, except that it sets the default to 0.
Created attachment 361968 [details] [review] videosink: Set processing deadline to 15ms

This roughly corresponds to one frame at 60fps, and leaves 5ms of max_lateness so as not to change the existing behaviour.
Created attachment 372972 [details] [review] basesink: Add processing deadline

The processing deadline is the acceptable amount of time to process the media in a live pipeline before it reaches the sink. This is on top of the algorithmic latency that is normally reported by the latency query.

This should make pipelines such as "v4lsrc ! xvimagesink" not claim that all frames are late in the QoS events. Ideally, this should replace max_lateness for most applications.
Comment on attachment 372972 [details] [review] basesink: Add processing deadline Attachment 372972 [details] pushed as a7f9c80 - basesink: Add processing deadline
Attachment 361968 [details] pushed as 28e5f81 - videosink: Set processing deadline to 15ms
The test_pipeline_reset_start_time test in core has been failing since the basesink commit went in.

gst/gstpipeline.c:622:F:pipeline tests:test_pipeline_reset_start_time:0: Pipeline position is not at least 50millisecond (reported 35339525 nanoseconds)
Ok, let's mark as blocker to make sure we don't forget.
So it fails because the pipeline used now has 20ms of latency, which means that over 55ms the pipeline position will only be 35ms. Though, I think that the default for processing-deadline is wrong here: fakesink should not introduce 20ms of latency, especially since this value originally comes from the VideoSink. We should leave the default at 0, and then VideoSink can set it to 20ms or something in line with the framerate, and AudioSink would set it to something in line with latency-time.
I agree we should make the default 0 to keep existing behaviour, but it's still weird that it's affecting the test pipeline that isn't live.
I went back to 1.14 and ran:

GST_DEBUG="basesink:5" gst-launch-1.0 fakesrc do-timestamp=1 sizetype=fixed sizemax=4096 datarate=$((4096*100)) ! fakesink silent=0 sync=1 2>&1 | grep latency

0:00:00.022531958 28899 0x557046588520 DEBUG basesink gstbasesink.c:1168:gst_base_sink_query_latency:<fakesink0> we are ready for LATENCY query
0:00:00.022548676 28899 0x557046588520 DEBUG basesink gstbasesink.c:1210:gst_base_sink_query_latency:<fakesink0> latency query: live: 1, have_latency 1, upstream: 0, min 0:00:00.000000000, max 99:99:99.999999999

Notice "live: 1". Olivier's patch didn't change that behaviour. It's just that now we have 20ms instead of 0ms.
I didn't realize exactly how the latency code worked in basesink; I've fixed it now... The reason I pushed the change to basesink for everyone is for the sake of pipelines like udpsrc ! videodecoder ! udpsink: before this fix, those pipelines end up dropping all packets unless QoS is involved. I've also added some unit tests for the processing-deadline:

commit 4f3aea316acd38ce498dd7130db76e91a76490d7
Author: Olivier Crête <olivier.crete@collabora.com>
Date: Mon Jul 16 11:51:05 2018 -0400

    sink: Only add processing latency if upstream is live

    Only add it if upstream is live, otherwise leave the latency at 0.

    https://bugzilla.gnome.org/show_bug.cgi?id=640610

commit f79b871a6a400c411313eb82a71b3a3475c0ac99
Author: Olivier Crête <olivier.crete@collabora.com>
Date: Mon Jul 16 11:50:36 2018 -0400

    pipeline tests: Add test for processing latency
Just for my own understanding: this now delays the output of every sink that syncs against the clock by an extra 20ms if upstream is live, right? E.g.

gst-launch-1.0 v4l2src num-buffers=10 ! fakesink silent=false sync=true processing-deadline=3000000000 -v

takes 3 seconds to get started. Is this a good idea in terms of backwards compatibility? Or did I misunderstand something?
Only if you have a live source upstream. In that case it does add 20ms, but previously the real latency was actually controlled by max-lateness, so you would think there was no latency, but that was not actually true. And if you enabled QoS, it would drop every packet, making it unusable. Yes, it's a small change in behaviour, but I'd argue that the previous behaviour just didn't work at all.