Bug 743905 - multiqueue: handle the BUFFERING query

Status: RESOLVED WONTFIX
Product: GStreamer
Classification: Platform
Component: gstreamer (core)
Version: git master
Hardware/OS: Other / Linux
Priority: Normal
Severity: normal
Target Milestone: git master
Assigned To: GStreamer Maintainers
QA Contact: GStreamer Maintainers
Depends on:
Blocks:
Reported: 2015-02-03 04:12 UTC by Duncan Palmer
Modified: 2018-07-20 06:29 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
This patch adds support for the BUFFERING query. (1.52 KB, patch)
2015-02-03 04:12 UTC, Duncan Palmer

Description Duncan Palmer 2015-02-03 04:12:46 UTC
Created attachment 295990 [details] [review]
This patch adds support for the BUFFERING query. 

Currently, the multiqueue element doesn't handle the BUFFERING query, whereas its counterpart queue2 does.

The attached patch adds support for the BUFFERING query. 

I call update_buffering() when processing the query. This is different from how queue2 operates - queue2 just determines whether the element should be buffering, but doesn't update the internal data structures which keep track of whether the element is buffering. This seems inconsistent to me - code which is listening for BUFFERING messages may get a different impression of things than code which is using the query.
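
For reference, the change amounts to handling GST_QUERY_BUFFERING in the src pad query function, along these lines (simplified from the attached patch; the locking macro, the use_buffering flag and the res variable are the existing multiqueue internals as I understand them):

    case GST_QUERY_BUFFERING:
    {
      /* Only answer if this multiqueue is configured for buffering. */
      GST_MULTI_QUEUE_MUTEX_LOCK (mq);
      if (mq->use_buffering) {
        /* Refresh the buffering state so the query answer matches what
         * BUFFERING messages would report. */
        update_buffering (mq, sq);
        gst_query_set_buffering_percent (query, mq->buffering, mq->percent);
        res = TRUE;
      }
      GST_MULTI_QUEUE_MUTEX_UNLOCK (mq);
      break;
    }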
Comment 1 Duncan Palmer 2015-03-17 22:44:45 UTC
Can anyone comment on this?
Comment 2 Duncan Palmer 2015-04-15 05:01:03 UTC
Thiago?
Comment 3 Thiago Sousa Santos 2015-08-03 11:44:08 UTC
Hi, sorry for the late reply.

So, with a playbin pipeline and your patch, we stop getting replies from queue2 and start getting them from multiqueue instead. What exactly is your goal with this patch?

There is another thing to consider here: when playing from http, for example, the queue2 that sits upstream is running in bytes while the multiqueue is running in time format. Upstream should know better about the latest data that was downloaded/buffered. It seems to make sense to let upstream reply and only act if there was no reply. Also, would it make sense to sum the upstream result with the multiqueue's buffered totals? This can get complicated if upstream is dealing with muxed content and the multiqueue is dealing with decoded...

Perhaps if you give us a better overview of the use case you are trying to cover we can design some solution for upstream.

Recommended reading: http://cgit.freedesktop.org/gstreamer/gstreamer/tree/docs/design/part-buffering.txt
Comment 4 Duncan Palmer 2015-08-07 06:42:14 UTC
Hi Thiago, no worries on the late reply.

The use case I'm trying to hit:

- I have a pipeline for playing back HLS, created by uridecodebin. This has a multiqueue element which decodebin configures for buffering (I set the uridecodebin buffer-duration property). This element emits BUFFERING messages during playback.
- Our software has a module which creates the pipeline, and once it's reached the paused state, it hands it over to another module. This second module subscribes for messages from the pipeline, and then uses a buffering query to determine what the current buffering state is.
- Seeing as queue2 supports the buffering query, I thought it would also make sense to have the multiqueue support it; my thinking was that both elements support buffering, so why not support the buffering query in both.

It's worth noting that the multiqueue will only respond to a buffering query if it's configured to buffer.
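
For context, the second module issues the query with the standard GstQuery API, roughly like this (a sketch; 'pipeline' stands for whatever element the query is sent to):

    GstQuery *query = gst_query_new_buffering (GST_FORMAT_PERCENT);

    if (gst_element_query (pipeline, query)) {
      gboolean busy;
      gint percent;

      gst_query_parse_buffering_percent (query, &busy, &percent);
      /* busy == TRUE means the element considers itself buffering */
      g_print ("buffering: %s (%d%%)\n", busy ? "yes" : "no", percent);
    }
    gst_query_unref (query);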

In the case of ABR content, I think it makes sense to buffer using the multiqueue, with the queue size specified in units of time; seeing as we're adaptive, there's no point buffering in a queue2 element by units of time (since it has no useful notion of time with ABR content), or by units of bytes, since we don't know in advance what our likely bitrate will be. Would you agree?

We also support streaming over http (progressive download). In this case, decodebin adds a queue2, and configures it, instead of the multiqueue, for buffering. This is fine, and everything works as it should. The multiqueue doesn't reply to buffering queries since it's not configured to buffer.
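
For completeness, the buffering setup I mentioned is just a property on uridecodebin (the 10 second figure here is only an example, not what we ship):

    /* buffer-duration is in nanoseconds; 10 seconds is only an example.
     * (uridecodebin also exposes buffer-size and use-buffering, not shown.) */
    g_object_set (uridecodebin, "buffer-duration", (gint64) (10 * GST_SECOND), NULL);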
Comment 5 Duncan Palmer 2015-12-09 01:11:15 UTC
Hi Thiago, have you had any time to consider this further?
Comment 6 Thiago Sousa Santos 2015-12-10 14:21:38 UTC
(In reply to Duncan Palmer from comment #4)
> Hi Thiago, no worries on the late reply.
> 
> The use case I'm trying to hit;
> 
> - Our software has a module which creates the pipeline, and once it's
> reached the paused state, it hands it over to another module. This second
> module subscribes for messages from the pipeline, and then uses a buffering
> query to determine what the current buffering state is.

Why isn't just getting the buffering level from the messages enough here? What do you gain by using queries?
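
For reference, the message-based handling I have in mind is the usual bus callback, something roughly like:

    /* inside a bus callback's switch (GST_MESSAGE_TYPE (msg)) */
    case GST_MESSAGE_BUFFERING:
    {
      gint percent;

      gst_message_parse_buffering (msg, &percent);
      /* typical policy: pause while buffering, resume when full */
      if (percent < 100)
        gst_element_set_state (pipeline, GST_STATE_PAUSED);
      else
        gst_element_set_state (pipeline, GST_STATE_PLAYING);
      break;
    }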

> - Seeing as queue2 supports the buffering query, I thought it would also
> make sense to have the multiqueue support it; my thinking was that both
> elements support buffering, so why not support the buffering query in both.

Yes, it makes sense to me. I went to revisit playbin's buffering and it seems like this is a good idea. I'd like to confirm with Edward whether this is aligned with the decodebin3/playbin3 plans.

> 
> It's worth noting that the multiqueue will only respond to a buffering query
> if it's configured to buffer.

ACK

> 
> In the case of ABR content, I think it makes sense to buffer using the
> multiqueue, with the queue size specified in units of time; see as we're
> adaptive, there's no point buffering in a queue2 element by units of time
> (since it has no useful notion of time with ABR content), or by units of
> bytes, since we don't know in advance what our likely bitrate will be. Would
> you agree?

Yes, your idea makes sense to me. Would like a second opinion here to make sure I didn't miss something.
Comment 7 Thiago Sousa Santos 2015-12-10 14:23:05 UTC
Review of attachment 295990 [details] [review]:

::: plugins/elements/gstmultiqueue.c
@@ +1985,3 @@
+
+        update_buffering (mq, sq);
+        gst_query_set_buffering_percent (query, mq->buffering, mq->percent);

You're returning the overall buffering percent and not the specific pad buffering level. Is this intentional?

If you query on the video sink you can get the video level; if you query on the audio sink you would want the audio level.

playbin could be responsible for aggregating the results from multiple sinks if the query was sent to the pipeline. It would be tricky to aggregate input/output rates and buffering ranges, though.
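
To make the per-pad idea concrete, an application that wants the video-specific level could send the query upstream from the video sink's sink pad, roughly like this (element names are just examples):

    GstPad *sinkpad = gst_element_get_static_pad (video_sink, "sink");
    GstQuery *query = gst_query_new_buffering (GST_FORMAT_PERCENT);

    if (gst_pad_peer_query (sinkpad, query)) {
      gboolean busy;
      gint percent;

      gst_query_parse_buffering_percent (query, &busy, &percent);
      /* with per-pad reporting, this would be the video queue's level only */
    }
    gst_query_unref (query);
    gst_object_unref (sinkpad);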
Comment 8 Jan Schmidt 2015-12-11 19:59:18 UTC
Just a note - with decodebin3 / playbin3, adaptive demux stream buffering is again handled by queue2. queue2 uses the bitrate reported by the adaptive demuxer (via tag event) to do buffering in TIME. The adaptive demuxer either knows the bitrate directly (from the manifest) or knows the nominal duration and size in bytes of each fragment - which is a good enough bitrate for buffering purposes.
Comment 9 Duncan Palmer 2015-12-17 12:20:20 UTC
(In reply to Thiago Sousa Santos from comment #6)

> Why isn't just getting the buffering level from the messages enough here?
> What do you gain by using queries?

The reason we do this is because we have a module which creates our pipeline, and hands it off to another module to manage (one of those historical things we'd like to change now...). So, the simplest way for the module managing the pipeline to know whether it's buffering is to query it.

So, in answer to your question, I think the only thing to be gained is flexibility in making use of the pipeline.
Comment 10 Duncan Palmer 2015-12-17 12:27:03 UTC
(In reply to Thiago Sousa Santos from comment #7)

> You're returning the overall buffering percent and not the specific pad
> buffering level. Is this intentional?

It is, but I'm not sure whether it's the right thing to do or not, and I'm happy to change. I set the overall buffering percent, as this is the number used to decide whether the multiqueue is in the buffering state.

> If you query on the video sink you can get the video level, if you query on
> the audio sink you could want the audio level.

A potential problem with this approach is that if someone sends a buffering query to the pipeline, they aren't going to know what the returned percentage figure means (I believe the pipeline will fire the query into the first sink on its list?); they would have to specifically send the query to the sink they're interested in if they want to make use of the percentage figure.

> playbin could be responsible for aggregating the results from multiple sinks
> if the query was sent to the pipeline. It would be tricky to aggregate
> input/output rates and buffering ranges, though.

Not all of us can use playbin tho..
Comment 11 Thiago Sousa Santos 2015-12-22 12:17:24 UTC
(In reply to Duncan Palmer from comment #10)
> (In reply to Thiago Sousa Santos from comment #7)
> 
> > You're returning the overall buffering percent and not the specific pad
> > buffering level. Is this intentional?
> 
> It is, but I'm not sure whether it's the right thing to do or not, and I'm
> happy to change. I set the overall buffering percent, as this is the number
> used to decide whether the multiqueue is in the buffering state.
> 
> > If you query on the video sink you can get the video level, if you query on
> > the audio sink you could want the audio level.
> 
> A potential problem with this approach is that if someone sends a buffering
> query to the pipeline, they aren't going to know what the returned
> percentage figure means (I believe the pipeline will fire the query into the
> first sink on it's list?); they would have to specifically send the query to
> the sink they're interested in if they want to make use of the percentage
> figure.
> 
> > playbin could be responsible for aggregating the results from multiple sinks
> > if the query was sent to the pipeline. It would be tricky to aggregate
> > input/output rates and buffering ranges, though.
> 
> Not all of us can use playbin tho..

I think the queries arriving on pads should only account for the buffering on that pad/queue. Imagine that instead of multiqueue you were using separate queues; we'd end up with the same aggregation problem again. Besides, if you return an aggregated value there is no way to get the individual ones.

I'd propose returning the individual queue's buffering level if the query arrives on a pad; multiqueue could aggregate if the query is sent directly to it as an element query. GstBin could also aggregate the results if it gets a query; it already does some of that for other queries.

The only remaining question is how to properly aggregate the returns from multiple queues?
Comment 12 Duncan Palmer 2015-12-23 12:24:19 UTC
(In reply to Thiago Sousa Santos from comment #11)

> I'd propose returning the individual queues buffering if the query arrives
> on the pad, multiqueue could aggregate if the query is sent directly to it
> as an element query. GstBin could also aggregate returns if it gets a query,
> it already does some of that for other queries.

No problem.

> The only remaining question is how to properly aggregate the returns from
> multiple queues?

We could implement something like this:
- buffering, percent
  - Nothing, unless all queries return buffering, percent.
  - If any queue is buffering, then buffering is True, and we take percent
    from the same queue (note that this may not be the lowest percent value).
  - If no queue is buffering, percent is the lowest value returned.
- Stats
  - Nothing, unless all queries return stats.
  - avg_in, avg_out are averages of each result.
  - buffering_left is the lowest value returned.
- single range
  - Nothing, unless all queries return a single range.
  - start is the highest value returned, and stop the lowest. start must be <= stop.
  - estimated_total is the lowest value returned.
- multiple ranges
  - Nothing, unless all queries return multiple ranges.
  - The intersections of all ranges.

Problems I see here:
- I'm not really sure what makes sense with the single range. It depends on
  what you're using it for. Use cases which spring to mind;
   - Seeking to a buffered location. But, in this case you're likely to be
     using a single queue before a demux anyway.
   - Displaying a range of buffered data. You could make a case here for what
     I've suggested, or using the lowest value of start and highest value of
     stop. I think what I've suggested makes more sense tho.
- Multiple ranges support as I've suggested is complex, and I'm not really
  sure what the point of having that support would be; as with my example in
  the previous point, I think if you wanted to seek to a buffered range, you'd
  likely be using a single queue before a demux.

I think I'd be inclined to discard data for multiple ranges, but implement
everything else as I've suggested.
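
For the buffering/percent rule above, a hypothetical helper (not part of the attached patch) could look like:

    /* Hypothetical: combine per-queue (busy, percent) results following the
     * rule above. The caller only invokes this when every queue answered. */
    static void
    aggregate_buffering (const gboolean * busy, const gint * percent, guint n,
        gboolean * agg_busy, gint * agg_percent)
    {
      guint i;

      *agg_busy = FALSE;
      *agg_percent = 100;
      for (i = 0; i < n; i++) {
        if (busy[i]) {
          /* any buffering queue wins; take its percent (may not be lowest) */
          *agg_busy = TRUE;
          *agg_percent = percent[i];
          return;
        }
        if (percent[i] < *agg_percent)
          *agg_percent = percent[i];
      }
    }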
Comment 13 Edward Hervey 2018-05-12 06:57:25 UTC
I would really like to avoid adding even more stuff like this in multiqueue. It shouldn't be used for buffering, really.

Going back a few steps: you seem to be creating custom pipelines, but you can still build your own pipelines and reap the benefits of playbin3 by:
* using urisourcebin for your sources (will take care of buffering)
* using decodebin3 (should act as a simple replacement for decodebin)

Have you tested with that setup?
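
Roughly, that setup is wired like this (a sketch; the "sink_%u" request-pad name I use for decodebin3 here is an assumption on my part, and 'pipeline' is your existing pipeline):

    /* Sketch: custom pipeline using urisourcebin (buffering) + decodebin3.
     * Assumption: decodebin3 exposes "sink_%u" request pads. */
    static void
    on_source_pad_added (GstElement * src, GstPad * srcpad, gpointer user_data)
    {
      GstElement *dbin = GST_ELEMENT (user_data);
      GstPad *sinkpad = gst_element_get_request_pad (dbin, "sink_%u");

      if (gst_pad_link (srcpad, sinkpad) != GST_PAD_LINK_OK)
        g_warning ("failed to link urisourcebin to decodebin3");
      gst_object_unref (sinkpad);
    }

    /* in the pipeline setup code */
    GstElement *src = gst_element_factory_make ("urisourcebin", NULL);
    GstElement *dbin = gst_element_factory_make ("decodebin3", NULL);

    g_object_set (src, "uri", "http://example.com/stream.m3u8", NULL);
    gst_bin_add_many (GST_BIN (pipeline), src, dbin, NULL);
    g_signal_connect (src, "pad-added", G_CALLBACK (on_source_pad_added), dbin);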
Comment 14 Edward Hervey 2018-07-18 09:25:42 UTC
ping ?
Comment 15 Duncan Palmer 2018-07-18 12:07:35 UTC
ack

Sorry Edward, I've not actively been using gstreamer over the last year or two, and missed this...

I believe this issue was raised when decodebin3 was in its early stages of development. I haven't used decodebin3, but it certainly did look to have the potential to replace the somewhat custom pipeline creation we were doing. The custom solution was required as there were issues with dynamic bitrate demuxers which were not well addressed by decodebin2.

I'd say that since this issue hasn't scratched an itch with anyone else, and because decodebin3 now exists to make life easier, it'd probably be a good idea to close it.
Comment 16 Edward Hervey 2018-07-20 06:29:40 UTC
Thanks for the reply. This should indeed be properly handled by db3/pb3.

Note that even if you are using custom pipelines, you can use urisourcebin (which has smart buffering logic in it) if you don't want to use decodebin3 afterwards.

 Closing.