GNOME Bugzilla – Bug 698851
playbin: ability to mix or play multiple audio and text streams simultaneously
Last modified: 2013-05-29 18:14:59 UTC
See: https://bugs.webkit.org/show_bug.cgi?id=103771 QtWebKit, WebKitGTK and EFLWebKit all use GStreamer for audio and video elements. The HTML5 spec requires that we be able to play multiple audio and text streams simultaneously, but playbin only allows us to select one. Sebastian Dröge suggested: > Some ideas for the implementation would be 1) allowing the application to set a N->1 "stream combiner" element on playbin for the A/V/T streams, or 2) some API to select between selection/mixing and allow to set different mixing options (volume per audio stream, video stream positions/scaling, ...) I'm not sure if this would be enough for text tracks, since we can't just mix them, we need to be able to tell them apart. Maybe a stream combiner could do that though.
The idea would be to have three properties: {audio,video,text}-stream-combiner. If unset, playbin would use input-selector as before. The application could set any element or bin here that implements the same interface as input-selector, namely having multiple request sinkpads and a single always srcpad. For the text case it is a bit complicated, I agree, as you want the N input streams but don't produce any output to GStreamer and instead just handle the text yourself. I think for this we would want to allow to have no srcpad at all on the stream combiner.
I'm not sure the existing "adder" element is up to the task for the audio version, since it's relatively difficult to turn tracks on and off (it looks like the only way is to block or unlink them), plus it doesn't seem to do any synchronization (but live-adder does?).
Well, that's why I propose to make properties where you can set whatever you want what does what you need. So you could just put liveadder there if that's what you need, or a bin that includes behaviour for enabling/disabling streams, or .... :) Apart from that, yes, adder does not do any sensible synchronization. That's another bug though, it should do things similar to audioringbuffer for synchronization instead of just mixing samples together as they arrive.
Here's a potential problem: Playbin creates its input-selectors on-demand, when the first stream of that type is added, then removes the input-selector when the last stream goes away. It does that because having an input-selector with no inputs locks up the pipeline. So, setting a stream combiner as a property on playbin before it starts is dangerous. We could set the stream-combiner after the pipeline starts and finds its sources, using the "*-changed" callbacks. It would be weird, and fairly complex because of the need to block the pads and switch them, but maybe it's acceptable. It seems like the best option might be to make the property a string, which defaults to "input-selector", and then use it as the first parameter to gst_element_factory_make.
If we make a string to be passed to gst_element_factory_make, we would still need a property to read the stream combiner, and you'd have to check it and set stuff up every time it changes, so maybe that's a good reason to just make it required to set the stream combiner with the "*-changed" callbacks.
Can't we simply replace the input-selector with a stream-combiner. Then if you set current-audio, you end up with a single linked to the combiner and with a new property you would to enable multiple audio streams (or none), which would make the current-audio to be be -1 ? Would that keep things simple and usable ? Because a stream-combiner is a superset of a input-selector.
I think the idea was that a "stream combiner" is any element that has dynamic sinks that accept the correct caps, like input-selector or adder/liveadder. Then if someone wanted to mix all of the audio with playbin, they could do: GstElement* adder = gst_element_factory_make("adder", NULL); g_object_set(playbin, "audio-stream-combiner", adder, NULL); For what I want to do (implement the HTML5 spec), we'll actually need a better adder, which synchronized streams and allows them to be turned on and off (probably the same element you're thinking of), plus an appsink with dynamic pads (presumably easy to build out of normal appsinks), which doesn't need an output at all. The reason for the *-stream-combiner property is that we don't necessarily want to do the same thing with all of the streams.
Oh, and the problem with just replacing input-selector with a generic stream combiner is that: * It's not really clear what it would do with videos. * In my use case (HTML5 spec), I need to know which stream each text cue is associated with, so just combining them doesn't work.
.(In reply to comment #4) > Here's a potential problem: Playbin creates its input-selectors on-demand, when > the first stream of that type is added, then removes the input-selector when > the last stream goes away. It does that because having an input-selector with > no inputs locks up the pipeline. So, setting a stream combiner as a property on > playbin before it starts is dangerous. I don't see the problem :) The properties would hold an element instance, that is removed/added to the bin as need (just as is done now with input-selector, just that it's not destroyed and recreated like input-selector). It shouldn't be a factory name because that would limit its usefulness (you need to register your private stream combiner element). (In reply to comment #6) > Can't we simply replace the input-selector with a stream-combiner. Then if you > set current-audio, you end up with a single linked to the combiner and with a > new property you would to enable multiple audio streams (or none), which would > make the current-audio to be be -1 ? Would that keep things simple and usable ? > Because a stream-combiner is a superset of a input-selector. stream-combiner of encodebin? That's broken in different ways, and Olivier already proposed to just drop it and replace it by funnel. However this is not about stream-combiner or input-selector, but allowing custom combination elements, e.g. adder or videomixer or your fancy bin that does stuff :) (In reply to comment #7) > For what I want to do (implement the HTML5 spec), we'll actually need a better > adder, which synchronized streams and allows them to be turned on and off > (probably the same element you're thinking of), plus an appsink with dynamic > pads (presumably easy to build out of normal appsinks), which doesn't need an > output at all. That's unrelated from this bug and should definitely be done. Current adder is only useful in very rare scenarios because it just does not properly synchronize streams to their running times. The addition of properties to mute streams or change their volume can easily be added to adder, and should be added there. Just make them properties on the sinkpads of adder, similar to what is done in videomixer to specify e.g. the position of a stream. > The reason for the *-stream-combiner property is that we don't necessarily want > to do the same thing with all of the streams. Yes, there should be one property per stream type. (In reply to comment #8) > Oh, and the problem with just replacing input-selector with a generic stream > combiner is that: > > * It's not really clear what it would do with videos. videomixer, you could do some picture-in-picture thing for example. > * In my use case (HTML5 spec), I need to know which stream each text cue is > associated with, so just combining them doesn't work. One way of doing that would be to have a custom stream combiner for the text, that muxes all text streams together (so you can split them apart later again) and then have a custom text-sink that does whatever you want to do with the different text streams.
In case anyone is interested, I'm working on this here: https://github.com/brendanlong/gst-plugins-base (branch: playbin-stream-combiners) I still haven't had much time to work on this, but the current version basically does the same thing the audio-sink, video-sink and text-sink properties do, and it works with my trivial example (setting audio-stream-combiner to an adder). The main things I need to do now are: * Make sure the stream combiners are being added and removed properly (no leaks, or deleting custom elements). * Disallow input-selector specific things like current-audio/video/text when the stream combiner isn't an input-selector. * Rename existing `selector` variables to `combiner`. I plan to do this as the last commit to make this easier to review. I believe this version will ignore changes to the stream combiners once a video starts playing, since it sets then up in the activate_group function, but that's what the -sink properties do, and I assume it would be fairly complicated to switch these out during playback.
Created attachment 245000 [details] Example of setting audio-stream-combiner to an 'adder' Run like: python3 playbin-multi.py https://dl.dropboxusercontent.com/u/61100892/test.mkv The test.mkv file was generated with two videotestsrc and two audiotestsrc. I don't remember the exact command, but this will give you something similar: gst-git gst-launch-1.0 videotestsrc num-buffers=100 ! theoraenc ! matroskamux name=mux ! filesink location=example.mkv videotestsrc num-buffers=100 pattern=checkers-8 ! theoraenc ! mux. audiotestsrc num-buffers=300 ! vorbisenc ! mux. audiotestsrc wave=white-noise num-buffers=300 ! vorbisenc ! mux.
Already looks like a really good start, going exactly in the right direction :) Thanks for working on that! stream-combiners should indeed be handled like sinks, so not allowed to be switched during playback.
I think I have the "not removing custom combiners" thing working. The best test I could think of was to set the stream to NULL, then PLAYING. Let me know if you have any better ideas. I also disallow setting current-{audio,video,text}. Another option would be to check if the selector has the "active-pad" property. There's no need for anything special with getting those properties, since they'll just stay -1. I'm still putting off renaming because it'll make the patch impossible to read.
Created attachment 245091 [details] [review] Working patch, without renaming selector to combiner yet I think this patch does everything necessary except renaming various "selectors" to "combiners". I can do that and re-submit this if you'd prefer, but I assume it would be harder to read. If you'd prefer to see the original commits, they're in the `playbin-stream-combiners` branch here: https://github.com/brendanlong/gst-plugins-base
Oh, and I haven't done anything to make it handle stream combiners with no srcpad yet.
(In reply to comment #15) > Oh, and I haven't done anything to make it handle stream combiners with no > srcpad yet. I wouldn't allow that, and instead let the streamcombiner generate a custom format that is then used by the sink. E.g. if you need all subtitle streams together downstream, create a new subtitle format that allows you to relate each subtitle buffer to the separate streams in the end, for example by just adding a GstMeta to the buffers. Allowing no-srcpad stream combiners will make the playbin code quite awful, ↑ would fit the design much better.
Review of attachment 245091 [details] [review]: ::: gst/playback/gstplaybin2.c @@ +1821,3 @@ GstPad *old_sinkpad; + if (!g_object_class_find_property (G_OBJECT_GET_CLASS (sinkpad), + "always-ok")) Instead of this I would add checks for all of the (relevant) input-selector properties... and if the custom selectors allows one of them, allow to use it through playbin. Only only do these property checks after requesting a pad from the stream combiner (for the pad properties) and when selecting the stream combiner (for the element properties) and cache this in GstSourceSelect @@ +2166,3 @@ + if (combiner) { + gst_bin_add (GST_BIN (playbin), combiner); + gst_object_ref_sink (combiner); Handle this exactly like the sinks, i.e. basically copy the set_sink() function and also include the behaviour from playsink for the sinks. Especially don't add the combiner to the bin/remove it from the bin here. Store them in the GstSourceGroup and GstSourceSelect at the same place where otherwise the input-selector would be created, and add/remove them from the bin at the same time as the input-selector would currently be added/removed. And store an additional reference to the input-selector and the stream combiner in the GstSourceGroup, which is then decreased again after it is removed from the bin to make sure you don't destroy custom stream combiners when they're removed from the bin but that you destroy input-selectors.
(In reply to comment #17) > Instead of this I would add checks for all of the (relevant) input-selector > properties... and if the custom selectors allows one of them, allow to use it > through playbin. Isn't that what I'm doing? Do you just want all of the checks up-front, instead of as-needed? > Handle this exactly like the sinks, i.e. basically copy the set_sink() function > and also include the behaviour from playsink for the sinks. That's fixed. > Store them in the GstSourceGroup and GstSourceSelect at the same place where > otherwise the input-selector would be created, Already done, except I also store them in playbin->*_stream_combiner, similar to playbin->*_sink. They're reffed and added to the group in activate_group (just like the sinks). > and add/remove them from the bin > at the same time as the input-selector would currently be added/removed. Should playbin be doing this? The sinks are just expected to be in the bin (gst_bin_add and gst_bin_remove are never called on them). It does seem more user-friendly to do that automatically, but it will also be inconsistent.
(In reply to comment #18) > (In reply to comment #17) > > Instead of this I would add checks for all of the (relevant) input-selector > > properties... and if the custom selectors allows one of them, allow to use it > > through playbin. > > Isn't that what I'm doing? Do you just want all of the checks up-front, instead > of as-needed? Yes, up-front. And also do this for the current-audio, etc properties. > > Store them in the GstSourceGroup and GstSourceSelect at the same place where > > otherwise the input-selector would be created, > > Already done, except I also store them in playbin->*_stream_combiner, similar > to playbin->*_sink. They're reffed and added to the group in activate_group > (just like the sinks). > > > and add/remove them from the bin > > at the same time as the input-selector would currently be added/removed. > > Should playbin be doing this? The sinks are just expected to be in the bin > (gst_bin_add and gst_bin_remove are never called on them). It does seem more > user-friendly to do that automatically, but it will also be inconsistent. Yes, playbin should do that at the same place where it currently does this with the input-selector. For the sinks this is all handled inside playsink, so look at the playsink code to get some ideas :)
And it seems it is currently still creating the input-selectors, just not using them and leaking them.
But it creates the selectors like this: if (select->selector == NULL && playbin->have_selector) { /* no selector, create one */ GST_DEBUG_OBJECT (playbin, "creating new input selector"); select->selector = gst_element_factory_make ("input-selector", NULL); select->selector != NULL because it uses the group->selector (the for loop before this section). So, it should just skip this entire section.
Indeed, I missed that. Sorry Maybe change that code for clarity to something like this: if (!select->selectr && playbin->*_stream_combiner) select->selector = gst_object_ref (playbin->*_stream_combiner); } else if (!select->selector) { select->select = gst_element_factory_make ("input-selector", NULL); }
I think the problem is that I setup the custom stream combiner in activate_group(), but it probably makes more sense to do it here, since that's where normal stream combiners are set up.
Created attachment 245279 [details] [review] Handle custom combiners more like normal ones, cache property checks I updated the handling of custom stream combiners so they're added in pad_added_cb (like the default input-selectors) instead of activate_group, with the only difference being this: if (custom_combiner) select->selector = gst_object_ref (custom_combiner); else select->selector = gst_element_factory_make ("input-selector", NULL); I also cache all of the property checks when I create the selector, and drop out of functions we know we can't handle as soon as possible. And I changed "*-stream-combiner" to expose the automatically created input-selectors. Individual commits are on GitHub of course.
Review of attachment 245279 [details] [review]: Looks good, just one minor comment below :) Are you aware of any remaining issues? I can't see any. Could you create a unit test for this new API, see tests/check/elements/playbin.c. Shouldn't be too difficult to add something there that just uses adder for multiple audio streams ::: gst/playback/gstplaybin2.c @@ +3002,3 @@ + g_object_class_find_property (G_OBJECT_GET_CLASS (select->selector), + "sync-mode") != NULL; + select->has_sync_streams = cache-buffers, sync-mode and sync-streams are very input-selector specific. I just only use them if input-selector is used
(In reply to comment #25) > cache-buffers, sync-mode and sync-streams are very input-selector specific. I > just only use them if input-selector is used I was hoping we would eventually have something that's a combination of adder and input-selector for audio streams (mix all selected streams). Would an element like that need the same properties set?
> I was hoping we would eventually have something that's a combination of adder > and input-selector for audio streams (mix all selected streams). Yes, I'd love to have such an element (same for video).
(In reply to comment #26) > (In reply to comment #25) > > cache-buffers, sync-mode and sync-streams are very input-selector specific. I > > just only use them if input-selector is used > > I was hoping we would eventually have something that's a combination of adder > and input-selector for audio streams (mix all selected streams). Would an > element like that need the same properties set? Well, you would unconditionally have these properties set on this input-selector and could do that when you create this bin. I don't think setting them on custom bins would be the job of playbin
> Well, you would unconditionally have these properties set on this > input-selector and could do that when you create this bin. I don't think > setting them on custom bins would be the job of playbin Makes sense to me.
Playbin's tests currently don't have any audio source. They just use redvideo:// (which only creates a video stream).
I assume all we need is a bin with two audiotestsrc's, but I'm not sure how I would register that as a URI handler.
Created attachment 245418 [details] [review] Everything except tests This patch doesn't set sync-streams, sync-mode and cache-buffers on custom selectors. I'll look into tests more tomorrow.
(In reply to comment #31) > I assume all we need is a bin with two audiotestsrc's, but I'm not sure how I > would register that as a URI handler. Erm, I meant tests/check/elements/playbin-compressed.c :) That one has custom sources/demuxers and decoders that can be used to build your scenario. Should probably be renamed to playbin-complex or something like that then :)
Comment on attachment 245418 [details] [review] Everything except tests Looks good, I'll push this with the tests :) On top of that, renaming the selector variables to stream_combiner would be nice.
Created attachment 245480 [details] [review] Add (failing) tests and rename selector to combiner I added a test, but when I run it I get: > elements/playbin-compressed.c:1660:F:general:test_raw_raw_audio_stream_adder_manual_sink:0: 'csink->n_raw' (0) is not equal to 'NBUFFERS' (100) Running with `GST_DEBUG=adder:5`, there's a bunch of lines like: > 0:00:00.968915322 3259 0x2ae310002e30 DEBUG adder gstadder.c:1355:gst_adder_collected:<adder0> no data available, must be EOS The problem seems to be here in gstadder.c /* get available bytes for reading, this can be 0 which could mean empty * buffers or EOS, which we will catch when we loop over the pads. */ outsize = gst_collect_pads_available (pads); /* can only happen when no pads to collect or all EOS */ if (outsize == 0) goto eos; I tried making the buffers actually have data in `gst_caps_src_create` by changing `buf = gst_buffer_new ()` to `buf = gst_buffer_new_wrapped (g_malloc0(1024), 1024);`, but that didn't change anything.
Using GST_DEBUG=adder:5,collectpads:5, and using `buf = gst_buffer_new_wrapped(malloc0(2), 2)`, I get weird output like this: > 0:00:00.035961553 6348 0x2ae864002e30 DEBUG collectpads gstcollectpads.c:1995:gst_collect_pads_chain:<collectpads0> Queuing buffer 0x2ae864020690 for pad adder0:sink_1 > 0:00:00.047100120 6348 0x2ae864002e30 DEBUG collectpads gstcollectpads.c:1295:gst_collect_pads_check_collected:<collectpads0> All active pads (2 + 0 >= 2) have data, calling gst_adder_collected > 0:00:00.047208794 6348 0x2ae864002e30 DEBUG collectpads gstcollectpads.c:1017:gst_collect_pads_available:<collectpads0> pad 0x2ae864033c00 has 0 bytes left > 0:00:00.047279473 6348 0x2ae864002e30 DEBUG collectpads gstcollectpads.c:1017:gst_collect_pads_available:<collectpads0> pad 0x2ae864036b10 has 0 bytes left > 0:00:00.036270811 6348 0x2ae874003720 DEBUG adder gstadder.c:521:gst_adder_query_latency:<adder0> Calculated total latency: live no, min 0:00:00.000000000, max 99:99:99.999999999 > 0:00:00.047325568 6348 0x2ae864002e30 DEBUG adder gstadder.c:1356:gst_adder_collected:<adder0> no data available, must be EOS collectpads has data so it calls gst_adder_collected, then in gst_adder_collected, there's no data?
Created attachment 245496 [details] [review] Patch with working test The problem was that `gst_codec_demuxer_chain` was creating a new buffer instead of copying the one it was given. I had trouble thinking of how to test this so this: * Makes sure setting audio-stream-combiner works * Makes sure unset stream-combiners are NULL * Makes sure the adder's state changes to paused (to prove that it's actually being used)
commit f763a2364fb9ad0c812adfa2a0ccb06d90a37dcd Author: Sebastian Dröge <sebastian.droege@collabora.co.uk> Date: Wed May 29 10:37:55 2013 +0200 playbin: Rename compressed unit test to complex It's not really about compressed streams anymore, but also about stream switching and stream combiners. commit 0dee7777ff93a09e9e7c74027be8dfdc0dd6ac0c Author: Sebastian Dröge <sebastian.droege@collabora.co.uk> Date: Wed May 29 10:35:11 2013 +0200 playbin: Set custom stream-combiners to NULL and unref before finalizing commit f45a102c2c9e3bb993d571dda73284f33b8fe39e Author: Brendan Long <b.long@cablelabs.com> Date: Tue May 28 10:59:22 2013 -0600 playbin: Add playbin audio-stream-combiner test using adder commit 53caa99a59eadf80c9725ed2e8c994d130d79ad6 Author: Brendan Long <b.long@cablelabs.com> Date: Tue May 28 11:23:56 2013 -0600 playbin: Rename select to combine and selector to combiner in playbin commit ba5f6cfe72ec4a46d4bba7cc00e1b9e221284e96 Author: Brendan Long <b.long@cablelabs.com> Date: Fri May 17 17:23:46 2013 -0600 playbin: Add support for custom stream-combiners This allows to chose something else than input-selector for multiple audio/video/text streams, e.g. an adder could be used for audio. It is needed for example to implement some of the more advanced HTML5 video features. https://bugzilla.gnome.org/show_bug.cgi?id=698851
What is missing here now is that playbin is currently leaking the custom stream combiners: ==31302== 3,034 (800 direct, 2,234 indirect) bytes in 1 blocks are definitely lost in loss record 3,092 of 3,134 ==31302== at 0x4C2BDEB: malloc (vg_replace_malloc.c:270) ==31302== by 0x53CED30: g_malloc (gmem.c:159) ==31302== by 0x53E3F92: g_slice_alloc (gslice.c:1003) ==31302== by 0x53E44E5: g_slice_alloc0 (gslice.c:1029) ==31302== by 0x51618A4: g_type_create_instance (gtype.c:1897) ==31302== by 0x5146757: g_object_constructor (gobject.c:1855) ==31302== by 0x5147D20: g_object_newv (gobject.c:1719) ==31302== by 0x412DB19: gst_element_factory_create (gstelementfactory.c:377) ==31302== by 0x412DC90: gst_element_factory_make (gstelementfactory.c:446) ==31302== by 0x408ED3: test_raw_raw_audio_stream_adder_manual_sink (playbin-compressed.c:1618) ==31302== by 0x40D6D60: srunner_run_all (check_run.c:372) ==31302== by 0x40D0ECD: gst_check_run_suite (gstcheck.c:689) ==31302== by 0x409897: main (playbin-compressed.c:2542) This does not happen if it choses to create its own input-selector. So something wrong with the storage of them probably, didn't find the problem yet unfortunately.
Created attachment 245564 [details] [review] Don't add an extra ref to custom combiners in pad_added_cb Short version: `gst_bin_add` refs custom_combiner for us, so we don't need to ref it ourselves. The problem is here in `pad_added_cb`: // custom_combiner has one ref if (custom_combiner) combine->combiner = gst_object_ref (custom_combiner); else combine->combiner = gst_element_factory_make ("input-selector", NULL); // now custom_combiner has two refs // default combiner has one ref // ... gst_bin_add (GST_BIN_CAST (playbin), combine->combiner); // custom_combiner has three refs // default combiner has one ref Then in `pad_removed_cb`: // custom combiner starts with three refs // default combiner starts with one ref gst_element_set_state (combine->combiner, GST_STATE_NULL); gst_bin_remove (GST_BIN_CAST (playbin), combine->combiner); // custom combiner has 2 refs // default combiner has 0 refs combine->combiner = NULL; The problem is that we add two ref's for custom combiners and only one for normal combiners. I'm actually confused about where the elements are being ref'd in `gst_bin_add` (I looked at the source and don't see it), but I can confirm that it's happening with logging: g_print("refs before add %d\n", GST_OBJECT_REFCOUNT_VALUE(combine->combiner)); // 2 gst_bin_add (GST_BIN_CAST (playbin), combine->combiner); g_print("refs after add %d\n", GST_OBJECT_REFCOUNT_VALUE(combine->combiner)); // 3 Is the `transfer full` annotation on `gst_bin_add` correct? I thought transfer full meant that the function doesn't add another reference.
(In reply to comment #40) > I'm actually confused about where the elements are being ref'd in `gst_bin_add` > (I looked at the source and don't see it), but I can confirm that it's > happening with logging: gst_bin_add() is actually doing a ref_sink. Which mean these a plausible: ref_sink(obj ref 1) = obj ref 1 ref_sink(obj ref 1) = obj ref 2 ref_sink(obj ref n) = obj ref n + 1 if n > 1 If you want to keep a reference on an object, other then the one owned by the bin, it's better to use ref_sink too.
Yes, sorry I think I wrote that wrong. If you look at my ref counts, you'll see that I took that into account -- the default combiner starts with one ref, and stays at one when added to the bin (because it does a `ref_sink`). The custom combiner starts with one ref (because we `ref_sink` it in `gst_play_bin_set_stream_combiner`), then we add one with our `ref_sink` and `gst_bin_add` adds another. We want one additional ref, but we end up with two.
(In reply to comment #42) > `gst_play_bin_set_stream_combiner`), then we add one with our `ref_sink` and I mean we add one with this (normal) `ref`.
Yes, that patch is correct. Forgot about the floating refs :)
commit 96aab6d8a54b547f99eaedb587bb9e538b7121ba Author: Brendan Long <b.long@cablelabs.com> Date: Wed May 29 10:15:36 2013 -0600 playbin: Don't take an extra reference to the custom stream combiners They are automatically reffed when added to the bin because they're already not floating anymore.