Bug 634407 – decodebin should expose pads in a deterministic order

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 634407 - decodebin should expose pads in a deterministic order


Summary:	decodebin should expose pads in a deterministic order


Status:	RESOLVED FIXED

Product:	GStreamer
Classification:	Platform
Component:	gst-plugins-base
Version:	unspecified
Hardware:	Other Linux

Importance:	Normal enhancement
Target Milestone:	1.1.1
Assigned To:	GStreamer Maintainers
QA Contact:	GStreamer Maintainers

URL:
Whiteboard:

Duplicates:	607466 (view as bug list)
Depends on:
Blocks:	601700

Reported:	2010-11-09 11:53 UTC by Stefan Sauer (gstreamer, gtkdoc dev)
Modified:	2013-01-19 13:06 UTC

See Also:	654830
GNOME target:	---
GNOME version:	---

Attachments
WIP to make pad order deterministic (998 bytes, patch) 2010-11-22 12:29 UTC, Stefan Sauer (gstreamer, gtkdoc dev)	needs-work	Details \| Review
WIP gst_util_make_stream_id function (4.24 KB, patch) 2012-03-13 22:21 UTC, René Stadler	needs-work	Details \| Review
event: Add new stream-id field to the stream-start event (6.42 KB, patch) 2012-07-29 12:26 UTC, Sebastian Dröge (slomo)	none	Details \| Review
event: Update for stream-start event API changes (6.86 KB, patch) 2012-07-29 12:27 UTC, Sebastian Dröge (slomo)	reviewed	Details \| Review
event: Add new stream-id field to the stream-start event (8.55 KB, patch) 2012-07-31 15:17 UTC, Sebastian Dröge (slomo)	none	Details \| Review
event: Add new stream-id field to the stream-start event (8.57 KB, patch) 2012-07-31 15:18 UTC, Sebastian Dröge (slomo)	reviewed	Details \| Review
event: Update for stream-start event API changes (5.85 KB, patch) 2012-07-31 15:18 UTC, Sebastian Dröge (slomo)	none	Details \| Review
event: Add new stream-id field to the stream-start event (9.47 KB, patch) 2012-08-01 14:43 UTC, Sebastian Dröge (slomo)	committed	Details \| Review
event: Update for stream-start event API changes (5.85 KB, patch) 2012-08-01 14:43 UTC, Sebastian Dröge (slomo)	committed	Details \| Review
oggdemux: Add stream-id to the stream-start event (3.80 KB, patch) 2012-08-05 11:52 UTC, Sebastian Dröge (slomo)	committed	Details \| Review

Description Stefan Sauer (gstreamer, gtkdoc dev) 2010-11-09 11:53:12 UTC

Right now the order of pads for multi-stream files is pretty random. It depends on which thread manages to push first.

One potential solution could be to add the sometimes pads on decodebin2 at the same time we see the sometimes pads on the demuxer.

Right now I use gst-discoverer and playbin2 and there is no way to map the discoverer streams to the "current-audio"/"current-video" tracks.

Comment 1 Stefan Sauer (gstreamer, gtkdoc dev) 2010-11-09 12:51:17 UTC

just for the record
13:40 < slomo> ensonic: you can't add pads to decodebin2 as soon as a demuxer adds them... you don't know if you can a) decode that stream and b) if it contains another container format (think of DV inside MXF)
13:40 < slomo> ensonic: but pads are added in a function called something with "expose" and there all pads are added at the same time. you could sort them like in playbin2 
               before really adding them
13:42 < wtay> you could keep track of the order in which they were added and then sort them when you finally expose the pads
13:42 < wtay> it's a bit tricky with demuxers in demuxers
13:44 < slomo> yes... but it's all possible ;)
13:44 < slomo> you could propagate the order downstream, i.e. demuxer 1 adds stream 5, this gets to demuxer 2 which then adds stream 5.1 and 5.2
13:44 < slomo> and in the end sort this

Comment 2 Stefan Sauer (gstreamer, gtkdoc dev) 2010-11-22 12:29:35 UTC

Created attachment 175025 [details] [review]
WIP to make pad order deterministic

My current attempt to fix the problem. Seems to not help though. This is the same issue as in #bug 607466.

Comment 3 Sebastian Dröge (slomo) 2011-05-09 09:34:31 UTC

Comment on attachment 175025 [details] [review]
WIP to make pad order deterministic

We need a better stream selection system in playbin2 for this IMHO, e.g. something to let the demuxer specify a default stream.

Comment 4 Christian Fredrik Kalager Schaller 2011-07-18 15:41:25 UTC

*** Bug 654830 has been marked as a duplicate of this bug. ***

Comment 5 Christian Fredrik Kalager Schaller 2011-07-18 15:41:53 UTC

Text from 654830:
It is currently quite hard to work with files that have multiple streams of the
same type using discoverer and uridecodebin and encodebin, as apart from the
technical characteristics there seems to be no way of identified each stream.
So especially when two streams are technically identical, ie. mp3 audio. It is
a pain to deal with in a 100% safe way.

It would be nice if we could come up with a system where
discoverer could create some kind of unique identifier for each stream
it finds. That way when we am connecting uridecodebin to encodebin or
other elements I can look for the pad with that unique identifier and
connect that, would probably simplify writing applications to handle
this case a lot. If discoverer and uridecodebin where also able to
automatically attach any language information to each stream, based on
tag data etc. that would also be great.

Comment 6 Christian Fredrik Kalager Schaller 2011-07-20 13:36:59 UTC

Had a quick chat with Edward today about this issue. It seems to be a bug that will be trivial to fix in the 0.11/1.0 branch. My guess is with 1.0 looming quite close on the horizon and that fixing this in the 0.10 branch is quite hard, this bug is not likely to be ever fixed for the 0.10 branch.

Comment 7 Stefan Sauer (gstreamer, gtkdoc dev) 2011-07-20 16:50:40 UTC

It think its fixable as we already sort the pads. I'll try to have another look at it.

Comment 8 Tim-Philipp Müller 2011-07-20 22:19:20 UTC

> Had a quick chat with Edward today about this issue. It seems to be a bug that
> will be trivial to fix in the 0.11/1.0 branch.

How so? Care to share some more details?

Comment 9 Vincent Penquerc'h 2011-10-07 12:31:37 UTC

Christian ? I'm curious too :)

Comment 10 Christian Fredrik Kalager Schaller 2011-12-12 16:28:31 UTC

<cschalle> bilboed, see bug tpm posted, but in general it refers to when I discovered it wasn't easy to 'track' a stream through discoverer/decodebin etc., you said it was hard to fix in 0.10, but that it would be a lot easier in 0.11/1.0
<bilboed> since in 0.11 we have sticky events
 and tags are events
 and you can query what events are stuck on pads ...
<cschalle> cymacs,  ^^
<bilboed> if we had demuxers/decoders pushing out identifier tags you could then identify streams
 then you'd be able to identify a pad corresponding to a certain stream
 even if that pad is several elements down the road

Comment 11 René Stadler 2011-12-13 00:29:32 UTC

I'm playing around with the following solution now:

Defining a new tag GST_TAG_STREAM_ID, of string type and "DECODED" flag (i.e. not meta data, like the DURATION tag for example). The semantics shall be such that the tag's value is modified by any element that splits up a stream into different ones. So demuxers have to care, but not (the usual) decoders. tee doesn't have to care (splits into identical copies), but deinterleave probably should.

When a demuxer (or demuxer-like) element forwards the tag to a new branch, it bases the new value for the tag on the upstream value by appending its own specific stream id followed by a colon (:). This rule produces values as such:

  "1:", "1:2:", "184917:4374391004:1:9:"

The trailing colon has the benefit of reminding people that it's a string :) but also allows you to quickly test that "1:2:" is a sub branch of "1:" by comparing the prefix (otherwise, "1" could be confused with "11", so field splitting would be required). Not sure if that could be useful to somebody though.

The local id that an element appends per pad must be unique among all the pads of course. It also has to be stable across multiple invocations in the same process (so that you can match ids between runs of discoverer and decodebin). Stability across the process boundary should not be required; the exact values are subject to change between different versions of an element for example.

Elements can derive the id in a format specific way. For example in Ogg and Matroska, there are serial numbers/track UIDs. In general, this means the ids' lexical sort order does not necessarily match any kind of stream/track order in the file. The first track could be "5:". Or "28904729847239842398472:".

Implementation wise, most of this can be encapsulated in some small helper function(s) if sticky events are used. Some kind of tag_list_add_stream_id_uint (list, sinkpad, my_local_id) perhaps (though should most likely not clutter up the tag list namespace?). That function can query the upstream value through gst_pad_get_sticky_event and do the magic to have the given uint value represented. Other variants for uint64 and string should make sense, or maybe just a single generic string variant where elements do the formatting on their own.

This could also be done in 0.10 as well. However this means each demuxer has to catch tag events to extract and store the upstream id. Also, I guess it is harder for applications to extract per-stream tag events without sticky events, so this is a lot less useful (right?).

Any opinions on the colon separator notation? One alternative would be to set the tag multiple times, and then gst_tag_merge_strings_with_comma would produce a single value to the application in the end.

Comment 12 Sebastian Dröge (slomo) 2011-12-13 09:52:56 UTC

Sounds like a good plan but IMHO this could be done in 0.10 too. playbin2 at least is catching tag events for applications, which makes this much easier from the application side.

Comment 13 Tim-Philipp Müller 2011-12-13 13:01:56 UTC

Is there a reason to mandate numbers? Might be nicer to be able to use other descriptors too (e.g. audio_00 or so).

Comment 14 René Stadler 2011-12-13 13:04:29 UTC

Ah yeah sure, can be anything (like hex-digit formatted hashes or whatever). It could also contain colons itself (which doesn't violate the prefix rule), for example I'm wondering if matroskademux should actually append two ids, one derived from the file (FILEUID) and one per stream.

Comment 15 Sebastian Dröge (slomo) 2011-12-13 13:26:36 UTC

And filesrc for example could have the URI of the file in the beginning, sounds great.

One thing I don't like though is that you want to allow the separate character inside the strings too. IMHO it shouldn't be allowed there to allow creating a tree of the file structure from the tags. The separator, if used inside the strings, should probably be escaped somehow.

Comment 16 René Stadler 2011-12-13 16:28:07 UTC

Yes, but you never know for sure which elements support it anyways, so the tree structure could always end up less detailed than it should be.

Comment 17 René Stadler 2011-12-17 22:41:47 UTC

Hmm, a pad-unique string identifier that is stable across invocations... The pad name should already satisfy that in 99% of all cases, right? :)

Now I want a different approach; how about we discover the pipeline's topology, restricted to a subset of elements we are interested in? Probably just looking at all elements with a 1-to-N topology, or looking at the factory classification for "Demuxer".

The benefit is that no changes to elements are required at all. Well, except for weird (and rare?) cases where e.g. audio_00 and audio_01 can be confused randomly, which just needs fixing.

Comment 18 Christian Fredrik Kalager Schaller 2011-12-17 22:49:17 UTC

The reason I filed this bug was that it ended up being a bit random what audio was on the pads I got in multistream audio files, so the english track was the first pad in some runs and the dutch for instance in the second.

Comment 19 René Stadler 2011-12-17 23:09:29 UTC

Yes, the order in which the final pads end up being exposed is random. I think Stefan's approach was to make this deterministic, but it seems to be tricky.

Edward's idea is to add an identifier in the form of a tag, so that the randomness doesn't matter. I like that idea, but using a tag is probably not the most practical solution.

I think a utility function is needed that iterates the pipeline starting at a given src pad, fetching the relevant names along the way to build the id.

Comment 20 Stefan Sauer (gstreamer, gtkdoc dev) 2011-12-18 13:15:08 UTC

I was only looking at the final ghost-pads that are exposed. The pad order inside the pipeline does not really matter (in this regard).

Comment 21 René Stadler 2012-03-13 22:21:01 UTC

Created attachment 209657 [details] [review]
WIP gst_util_make_stream_id function

This implements a function like I mentioned in comment 19. It takes a srcpad and an optional element klass string (a single one for now, so no slashes in there). It walks the pipeline graph upstream, crossing bin boundaries and traversing into them. Calling this on an audio pad of decodebin2, this can return for example:

"filesrc:src,typefind:src,matroskademux:audio_01,multiqueue:src2,ac3parse:src,capsfilter:src,a52dec:src"

So it describes the whole pipeline upstream from that pad (sink pad names could be added too, but are less useful). When you pass "demuxer" as the klass, this is limited to just:

"matroskademux:audio_01"

Comment 22 Sebastian Dröge (slomo) 2012-03-14 09:27:58 UTC

Why is this not implemented as a query?

Comment 23 René Stadler 2012-03-14 09:43:25 UTC

If this can live in gst_pad_query_default, I would  prefer it (simpler code). What I don't want is any code in specific elements. So this isn't a "typical" query in any way, just using the query system to conveniently peek at all the interesting pads.

I'll implement it as a default query handler to see how it looks.

Comment 24 Sebastian Dröge (slomo) 2012-03-14 09:51:15 UTC

IMHO this should be a query and could probably be handled in gst_pad_query_default() too. But I'm sure we also need to change the default handler for some special elements.

Comment 25 Tim-Philipp Müller 2012-03-16 00:14:20 UTC

I think we need three things in the end:

    1) allow demuxers to signal an *order* for
        their source pads, so that "the first"
        audio stream will always be first in the
        list (whichever list that is, playbin's list
        for example, or the order in which
        decodebin2/uridecodebin exposes the
        pads), and "the second" will always be
        second, etc.

    2) *stable unique identifiers* for different
        streams to match the same stream
        across multiple runs. This could tie in
        with the order thing from (1), but
        doesn't have to.

    3) a way for a demuxer to signal upper
         layers which stream is the *default*
         one; e.g. a container might have info
         that audio stream #3 or language XY
         is supposed to be the default, and
         we should be able to communicate
         that somehow.

These don't have to be all done at once, but there's some overlap it seems to me, so might not hurt to think if we can come up with something that takes care of all of these things, or at least 1) and 2).

Comment 26 Christian Fredrik Kalager Schaller 2012-04-03 13:53:34 UTC

I tried applying these patches against current git master (0.11). They do apply, but compilation of GStreamer core fails with:

gstutils.c: In function 'make_stream_id':
gstutils.c:3712:3: error: passing argument 2 of 'gst_iterator_next' from incompatible pointer type [-Werror]
../gst/gstiterator.h:254:19: note: expected 'struct GValue *' but argument is of type 'void **'
gstutils.c:3729:5: error: implicit declaration of function 'gst_element_factory_get_klass' [-Werror=implicit-function-declaration]
gstutils.c:3729:5: error: nested extern declaration of 'gst_element_factory_get_klass' [-Werror=nested-externs]
gstutils.c:3729:11: error: assignment makes pointer from integer without a cast [-Werror]
cc1: all warnings being treated as errors

Comment 27 Christian Fredrik Kalager Schaller 2012-07-17 09:01:17 UTC

Any chance of this issue being moved forward? Would be a great feature for GStreamer 1.0 if the stream selection stuff worked in discoverer and decodebin

Comment 28 René Stadler 2012-07-17 12:50:44 UTC

I need to try out the query idea still. I'm thinking it should be a generic "GST_QUERY_TOPOLOGY" thing which returns you the pipeline graph that leads up to that point (i.e. where does the data flow come from that leads here). The query might return a GList of pads (where entries alternate between src and sink pads).

The stream-id I explained before could be created from that by iterating the list and looking for "Demuxer" classified element factories (pad's parent), then including those element and pad names in the id to make it unique to that path, just as before.

Does this make sense to anyone? :)

Comment 29 Sebastian Dröge (slomo) 2012-07-17 12:55:37 UTC

Yes, and it could use the iterate-internal-pads query to get to the corresponding pad on the other side in the generic handler for this query.

A GList of pads is not enough though, it can be a tree that has the current pad at the root (just think of things like adder or interleave).

Comment 30 Tim-Philipp Müller 2012-07-17 13:04:49 UTC

Err, what? GLists? This sounds a bit complicated to me.

Also see my comments on bug #679315 btw.

Setting something like "/pipeline0/filesrc0/47184114/matroskademux0/audio_00" as name/nick with the tag event would do the job just fine IMHO (or as ID inside the tag list, or a stream id somewhere else). The "/pipeline0/filesrc0/47184114/" part would be queried from upstream and could reflect the topology (I wouldn't expect anything to parse that ID for any purpose).

Comment 31 René Stadler 2012-07-17 13:13:24 UTC

Sebastian: Right, the generic query needs to support the N-to-1 case. Can that be represented in a gobject-introspectible way without introducing new API? If handling the query return data is just as complicated as iterating the pipeline by yourself, the thing becomes moot.

Tim: Wrong bug number? Sounds like my original approach.

Comment 32 Tim-Philipp Müller 2012-07-17 13:28:42 UTC

Yes, sorry, that should've been bug #677619.

It's very similar to your approach from comment #21, just a bit more concise (IMHO, anyway) - I don't think we need to mention the pads when there's only one source pad, for example. (The number in the example was meant to be a hash of the filesrc URI btw)

To go back to my list in comment #25, this scheme would accommodate

 1) allow demuxer to signal an order, e.g. via:
      .../foodemux/001/audio_02
      .../foodemux/002/audio_00

 2) *stable unique identifiers* - taken care of
      via topology and URI hashes (if desired)

 3) a way for a demuxer to signal upper
      layers which stream is the *default*:
      Could let the demuxer do that via (1)
      as above - just put default first, or add
      :default suffix or so to ID, or add a
      boolean tag to the stream's taglist.


(I haven't read back the entire comment thread, so apologies if this doesn't address some other comment).

Comment 33 Stefan Sauer (gstreamer, gtkdoc dev) 2012-07-17 16:19:15 UTC

Would it work to use gst_object_get_path_string() to build the id (I know that this is the containment hierarchy and not the dataflow path)?

Comment 34 Sebastian Dröge (slomo) 2012-07-18 07:37:40 UTC

(In reply to comment #30)
> Err, what? GLists? This sounds a bit complicated to me.

Instead of GLists maybe use nested GstStructures as is done for the stream-topology message in decodebin2? This way it's easy to handle by bindings and extendable.

This is only about the topology query proposed by René, which is a good idea on its own IMHO, and could be used to solve this bug here and also the related ones about marking the default stream and other stream markers.

Comment 35 Sebastian Dröge (slomo) 2012-07-29 12:26:58 UTC

Created attachment 219835 [details] [review]
event: Add new stream-id field to the stream-start event

This is supposed to allow uniquely identifying a single stream.

Comment 36 Sebastian Dröge (slomo) 2012-07-29 12:27:06 UTC

Created attachment 219836 [details] [review]
event: Update for stream-start event API changes

Comment 37 Stefan Sauer (gstreamer, gtkdoc dev) 2012-07-30 18:35:59 UTC

Review of attachment 219836 [details] [review]:

::: libs/gst/base/gstbaseparse.c
@@ +2852,3 @@
   if (G_UNLIKELY (parse->priv->push_stream_start)) {
+    GstQuery *query;
+    gchar *uri;

maybe call this stream_id as in the random-number case it is not a uri (and we don't really require one anyway).

@@ +2863,3 @@
+    }
+
+    if (gst_pad_peer_query (parse->sinkpad, query)) {

The comment says MD5 and the code SHA256

::: libs/gst/base/gstbasesrc.c
@@ +840,3 @@
+    g_checksum_update (cs, (const guchar *) uri, strlen (uri));
+
+    } else {

urgs, copy'n'paste. But probably still too specific to move to a utility function?

Comment 38 Sebastian Dröge (slomo) 2012-07-31 15:17:07 UTC

Created attachment 219990 [details] [review]
event: Add new stream-id field to the stream-start event

This is supposed to allow uniquely identifying a single stream.

Comment 39 Sebastian Dröge (slomo) 2012-07-31 15:18:05 UTC

Created attachment 219991 [details] [review]
event: Add new stream-id field to the stream-start event

This is supposed to allow uniquely identifying a single stream.

Comment 40 Sebastian Dröge (slomo) 2012-07-31 15:18:13 UTC

Created attachment 219992 [details] [review]
event: Update for stream-start event API changes

Comment 41 Tim-Philipp Müller 2012-07-31 15:25:09 UTC

Looks much better than the first set IMHO.

Comment 42 Stefan Sauer (gstreamer, gtkdoc dev) 2012-07-31 20:37:35 UTC

Review of attachment 219991 [details] [review]:

::: gst/gstutils.c
@@ +3543,3 @@
+ * Creates a stream-id for the source #GstPad @pad by combining the
+ * upstream information with the optional @stream_id of the stream
+ * of @pad. @pad must have a parent #GstElement and must have zero

"parent #GstElement and must have" -> "parent #GstElement which must have"

@@ +3592,3 @@
+
+      /* And then generate an MD5 sum of the URI */
+  g_return_val_if_fail (parent->numsinkpads <= 1, NULL);

still MD5 <-> SHA256 :/

@@ +3617,3 @@
+  g_free (upstream_stream_id);
+
+      gst_event_unref (upstream_event);

perhaps:

if (stream_id) {
  gchar *id = g_strconcat (upstream_stream_id, "/", stream_id, NULL);
  g_free (upstream_stream_id);
  return id;
} else {
  return upstream_stream_id;
}

Comment 43 Sebastian Dröge (slomo) 2012-08-01 14:43:32 UTC

Created attachment 220064 [details] [review]
event: Add new stream-id field to the stream-start event

This is supposed to allow uniquely identifying a single stream.

Comment 44 Sebastian Dröge (slomo) 2012-08-01 14:43:47 UTC

Created attachment 220065 [details] [review]
event: Update for stream-start event API changes

Comment 45 Sebastian Dröge (slomo) 2012-08-03 11:21:45 UTC

Review of attachment 220065 [details] [review]:

I was wondering if it would make sense to add API to the base classes that allows the subclasses to set a stream-id, which is then used instead of the URI/random stream-id... something like gst_base_parse_set_stream_id(parse, const gchar *stream_id)

Any opinions?

Comment 46 Tim-Philipp Müller 2012-08-03 11:44:31 UTC

> I was wondering if it would make sense to add API to the base classes that
> allows the subclasses to set a stream-id, which is then used instead of the
> URI/random stream-id... something like gst_base_parse_set_stream_id(parse,
> const gchar *stream_id)
> 
> Any opinions?

No opinion either way really? Does it matter? Is it needed for a first implement of things? How/where/when would you use it?

Comment 47 Christian Fredrik Kalager Schaller 2012-08-03 12:28:53 UTC

My hope for this functionality is that I can get a name/object for each stream from discoverer and then when I want to plug things to decodebin I can use those names/objects to identify the stream I want, so in the sense that I hope to just use objects and not string recognition I am fine either way.

Comment 48 Sebastian Dröge (slomo) 2012-08-03 16:08:55 UTC

(In reply to comment #46)
> > I was wondering if it would make sense to add API to the base classes that
> > allows the subclasses to set a stream-id, which is then used instead of the
> > URI/random stream-id... something like gst_base_parse_set_stream_id(parse,
> > const gchar *stream_id)
> > 
> > Any opinions?
> 
> No opinion either way really? Does it matter? Is it needed for a first
> implement of things? How/where/when would you use it?

For the case when there's no URI of the source or when there's another identifier. I guess one could simply check if the srcpad already has a stream-start event in the utility function for that... but can be added later.

(In reply to comment #47)
> My hope for this functionality is that I can get a name/object for each stream
> from discoverer and then when I want to plug things to decodebin I can use
> those names/objects to identify the stream I want, so in the sense that I hope
> to just use objects and not string recognition I am fine either way.

This will only work for the case when the source implements the URI handler interface and has an URI... or the source somehow (see above) provides another deterministic stream-id. The default case would provide a random stream ID.

In any case, after these changes and after they are implemented wherever needed, you will be able to match streams by doing a string comparison of the stream-ids.

Comment 49 Stefan Sauer (gstreamer, gtkdoc dev) 2012-08-03 19:32:33 UTC

> (In reply to comment #47)
> > My hope for this functionality is that I can get a name/object for each stream
> > from discoverer and then when I want to plug things to decodebin I can use
> > those names/objects to identify the stream I want, so in the sense that I hope
> > to just use objects and not string recognition I am fine either way.
> 
> This will only work for the case when the source implements the URI handler
> interface and has an URI... or the source somehow (see above) provides another
> deterministic stream-id. The default case would provide a random stream ID.
> 
> In any case, after these changes and after they are implemented wherever
> needed, you will be able to match streams by doing a string comparison of the
> stream-ids.

How would the matching help if the stream-ids are generated with a random-number generator? Wouldn't it make more sense to hash the caps?

My use case is pretty much the same as Christians. I am using gst-discoverer to figure out the number and properties of streams and then want to configure playbin to play a certain stream.

Comment 50 Tim-Philipp Müller 2012-08-03 21:26:10 UTC

These use cases should be covered fine, since they're all URI-based, no?

Comment 51 Christian Fredrik Kalager Schaller 2012-08-04 05:54:19 UTC

I am getting confused, but isn't the source in this case uridecodebin? And yes it operates on URIs, but if a given file gives me a random audio stream as stream 1, and provides no other means of me to keep track of the audio stream across multiple sessions it all becomes very random. The other item is that I would ideally want a clear link between stream metadata and the streams, so if the metadata says stream 2 is French language and stream 3 is German, I need to know that I got the correct stream without having to listen to it. So maybe grabbing that information could be part of Sebastians suggestion for manually setting a deterministinc stream id? That the demuxer adds DE or FR to the front of the stream id if such information is available?

Comment 52 Sebastian Dröge (slomo) 2012-08-04 08:33:35 UTC

(In reply to comment #49)
> > (In reply to comment #47)
> > > My hope for this functionality is that I can get a name/object for each stream
> > > from discoverer and then when I want to plug things to decodebin I can use
> > > those names/objects to identify the stream I want, so in the sense that I hope
> > > to just use objects and not string recognition I am fine either way.
> > 
> > This will only work for the case when the source implements the URI handler
> > interface and has an URI... or the source somehow (see above) provides another
> > deterministic stream-id. The default case would provide a random stream ID.
> > 
> > In any case, after these changes and after they are implemented wherever
> > needed, you will be able to match streams by doing a string comparison of the
> > stream-ids.
> 
> How would the matching help if the stream-ids are generated with a
> random-number generator? Wouldn't it make more sense to hash the caps?

The PRNG is only used for the case when the URI query returns nothing and there's no upstream stream-id yet. Problem with hashing the caps is that they're not unique, not even in the same pipeline.
For the cases when there's no URI for the stream we would need another solution like the one I mentioned above, but that can be added later.

Also, with these stream-ids it would now be possible to sort the streams inside decodebin/playbin to have a deterministic order of pads/streams.

Comment 53 Stefan Sauer (gstreamer, gtkdoc dev) 2012-08-04 08:39:42 UTC

(In reply to comment #52)
> (In reply to comment #49)
> > 
> > How would the matching help if the stream-ids are generated with a
> > random-number generator? Wouldn't it make more sense to hash the caps?
> 
> The PRNG is only used for the case when the URI query returns nothing and
> there's no upstream stream-id yet. Problem with hashing the caps is that
> they're not unique, not even in the same pipeline.
> For the cases when there's no URI for the stream we would need another solution
> like the one I mentioned above, but that can be added later.
> 
> Also, with these stream-ids it would now be possible to sort the streams inside
> decodebin/playbin to have a deterministic order of pads/streams.

Okay, so it is uri + stream_id and who ever sets the stream id, needs to ensure that the stream id is picked in a deterministic manner? I probably have to see it end-to-end, as now I still wonder how this will work. We still need to pick stream_ids in a deterministic way, right?

Comment 54 Sebastian Dröge (slomo) 2012-08-05 11:52:16 UTC

Created attachment 220358 [details] [review]
oggdemux: Add stream-id to the stream-start event

Comment 55 Sebastian Dröge (slomo) 2012-08-05 11:53:59 UTC

(In reply to comment #54)
> Created an attachment (id=220358) [details] [review]
> oggdemux: Add stream-id to the stream-start event

Take this as an example, the stream-ids of the streams coming out of oggdemux will be "hash_of_uri/serialno_of_stream" for example.

I guess it would make sense to add a GST_WARNING() to the random-number-case of gst_pad_create_stream_id() as this is something that ideally should never happen.

Comment 56 Tim-Philipp Müller 2012-08-05 12:41:47 UTC

Looks like a gst_pad_create_stream_id_printf() might come in handy as well :)

I don't think we should worry too much about non-filesrc based cases for now, and even less about non-URI-based cases. Those IDs are not set in stone, and no one should extract information from them directly IMHO.

Comment 57 Stefan Sauer (gstreamer, gtkdoc dev) 2012-08-05 18:58:25 UTC

(In reply to comment #56)
> I don't think we should worry too much about non-filesrc based cases for now,
> and even less about non-URI-based cases. Those IDs are not set in stone, and no
> one should extract information from them directly IMHO.
+1

Comment 58 Sebastian Dröge (slomo) 2012-08-06 12:18:01 UTC

commit 41954ff8cd5ecb258e079078d93d1489eaccdcfd
Author: Sebastian Dröge <sebastian.droege@collabora.co.uk>
Date:   Sat Jul 28 08:37:00 2012 +0200

    event: Add new stream-id field to the stream-start event
    
    This is supposed to allow uniquely identifying a single stream.

Comment 59 Christian Fredrik Kalager Schaller 2012-10-01 08:19:42 UTC

Ok, now that 1.0 is out it would be good to return to this and https://bugzilla.gnome.org/show_bug.cgi?id=654830

to see if we can get all elements using Sebastians new feature. Supporting multitrack files in Transmageddon has been on my todo list for over a year now :)

Comment 60 Tim-Philipp Müller 2012-10-01 09:00:40 UTC

Christian, I believe everything you need is there now?

Comment 61 Tim-Philipp Müller 2012-11-26 17:42:12 UTC

*** Bug 607466 has been marked as a duplicate of this bug. ***

Comment 62 Tim-Philipp Müller 2013-01-19 13:06:23 UTC

commit c0926dc7cc9a0e180621ad454ee6eb54745f5de1
Author: Tim-Philipp Müller <tim@centricular.net>
Date:   Sat Jan 19 12:51:56 2013 +0000

    pad: add gst_pad_get_stream_id() utility function
    
    API: gst_pad_get_stream_id()


commit 500b86489923b8add387598f367eb8368522c9de
Author: Tim-Philipp Müller <tim@centricular.net>
Date:   Sat Jan 19 13:03:03 2013 +0000

    decodebin: try harder to always expose pads in the same order
    
    Use stream-id as sort criterion in addition to the media type.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=634407