Bug 667352 – [0.11] misc core API/bug comments/nitpicks

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 667352 - [0.11] misc core API/bug comments/nitpicks


Summary:	[0.11] misc core API/bug comments/nitpicks


Status:	RESOLVED FIXED

Product:	GStreamer
Classification:	Platform
Component:	gstreamer (core)
Version:	0.11.x
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	0.11.x
Assigned To:	GStreamer Maintainers
QA Contact:	GStreamer Maintainers

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2012-01-05 12:35 UTC by Mark Nauwelaerts
Modified:	2012-09-25 12:08 UTC

See Also:
GNOME target:	---
GNOME version:	---

Description Mark Nauwelaerts 2012-01-05 12:35:49 UTC

Collecting a number of odd things/behaviour that pop up when browsing 0.11 core; maybe they are bugs, maybe as it is supposed to be.

* memory/buffer:
- no checking on modifying size in _unmap
- a number of operations on _buffer can be tricky with an outstanding _map
(notable removing a memory range, or performing another _map); no checking though
- what is to happen upon release/free with an outstanding _map (allowed or not, automagic stuff ?); no doc or sanity checking
- some adapter operations with outstanding _map can be likewise tricky (most notably another _map, _flush or _clear); see also bug #664133.

* (sticky) event:
- order of serialized events not preserved;
a sticky event could be received, not sent downstream and stored, and then a subsequent event is sent directly downstream without sending the stored first
* a FLUSH_STOP does not clear any sticky event, so the latter can turn up after a flush, which seems to be very (too?) sticky
* when a sticky event is stored, it replaces the currently stored of same type, and as such it will also be sent in order of the "old event", rather than current/new order

Comment 1 Tim-Philipp Müller 2012-02-06 10:22:35 UTC

I think some of these have been fixed now, esp. the mapping stuff, no?

Comment 2 Mark Nauwelaerts 2012-02-06 10:55:38 UTC

Some of these, yes, mainly on the "lower" memory/buffer level, though e.g. still a bit unclear what happens with outstanding _map upon free (doc or API wise).  Also, most adapter problems seem plugged for now, but not entirely sure there on the whole.

Comment 3 Wim Taymans 2012-02-14 09:29:06 UTC

(In reply to comment #0)
> Collecting a number of odd things/behaviour that pop up when browsing 0.11
> core; maybe they are bugs, maybe as it is supposed to be.
> 
> * memory/buffer:
> - no checking on modifying size in _unmap

The resizing feature was removed from unmap.

> - a number of operations on _buffer can be tricky with an outstanding _map
> (notable removing a memory range, or performing another _map); no checking
> though

The outstanding _map now keeps a ref to the memory object so it can't go away when buffer operations remove the memory for some reason. Nested mappings are now allowed and refcounted. You can only do a nested mapping with the same mode.

> - what is to happen upon release/free with an outstanding _map (allowed or not,
> automagic stuff ?); no doc or sanity checking

Unreffing the memory with an outstanding _map will cause an assertion and is not allowed. Since the buffer map keeps a ref to the memory objects, you can't unref the memory while mapped.

> - some adapter operations with outstanding _map can be likewise tricky (most
> notably another _map, _flush or _clear); see also bug #664133.

the adapter now keeps the state of the last map operation and automatically unmaps when the state would become invalid (_clear, _flush etc). It's not as nice but it should work.

> 
> * (sticky) event:
> - order of serialized events not preserved;
> a sticky event could be received, not sent downstream and stored, and then a
> subsequent event is sent directly downstream without sending the stored first

I will check how this is suppposed to work (looking at the code it seems like sticky events are not pushed before the new event)

> * a FLUSH_STOP does not clear any sticky event, so the latter can turn up after
> a flush, which seems to be very (too?) sticky

FLUSH_STOP clears the EOS event. Removing segment events is not really needed, the next segment event after the flush will replace the old one. All other events can probably stay on the pad.. 

> * when a sticky event is stored, it replaces the currently stored of same type,
> and as such it will also be sent in order of the "old event", rather than
> current/new order

While this is true, it never causes events to go into a 'wrong' order, you'll always have STREAM_START, CAPS, STREAM_CONFIG, SEGMENT, ..., EOS in that order (unless the element sends out the events in the wrong order when starting, we don't enforce that)

Comment 4 Wim Taymans 2012-03-09 11:10:31 UTC

the event concern should be fixed with this:

commit 35241f35c036beec0237653db388ba7fd8b93385
Author: Wim Taymans <wim.taymans@collabora.co.uk>
Date:   Fri Mar 9 11:53:54 2012 +0100

    pad: also push sticky events on new event
    
    Make a helper function check_sticky to check and push pending sticky events.
    Move the handling of the result of pushing the sticky event inside the
    push_event function, we need to mark the event as received when it was pushed
    correctly.
    Move the sticky events code outside of gst_pad_push_event_unchecked and
    make it purely handle sending the event to the peer.
    when pushing a sticky event, first store it on the pad. Then check and push any
    pending sticky events when we get a serialized or sticky event on a srcpad. This
    fixes the issue where sticky events are not pushed when an event is pushed.

Comment 5 Mark Nauwelaerts 2012-03-15 09:53:55 UTC

Indeed, previously stated are resolved now, but there are some more where they came from, so reopening (rather than a new bug with a same title/purpose, though that is of course also possible if preferable).

Again, some may already be known and/or mentioned somewhere more or less, but anyway ...:

* tag event is sticky event, presumably with the intent of persisting these and delivering to downstream whenever possible. It also allows for multiple ones, which are distinguished by name. However, afaik, each tag event has the same name, which still only allows one event. So things will not turn out as persistent as expected/hoped, unless each element really sends all it has to in one tag event (and also with subsequent updates). Many (if not all) do not do so, and might be somewhat cumbersome to arrange it so (and undocumented as well).

* a GstVideoFrame holds a pointer to the buffer it is mapping which is not refcounted. That may be so intended and may work out well, but seems potentially tricky.

* afaics gst_buffer_pool_config_get returns (a.o.) caps with an extra ref, which seems not to be taken into account presently (documented or in code anywhere). It looks like this is not intended (and then needs some adjustment). But even if intended, a construction using gst_structure_id_get with multiple fields is tricky since if it fails it is not known whether or not it has succeeded enough for an extra ref to have already been taken.

* audio base classes in 0.10 use pbutils for auto-adding some codec tag info, which is presently disabled in 0.11 (due to circular linkage maybe?), but needs settling

* similarly, plc (packet loss concealment) used newsegment updates and is presently not handled (maybe the new gap event is supposed to handle this?)

* w.r.t. some new events (stream-start, gap, stream-config), these appear largely not used yet at present, and the semantics/documentation are not always clear (e.g. how do caps/stream-config compete or compare, or stream-headers vs codec-data for that matter)

Comment 6 Mark Nauwelaerts 2012-03-27 10:57:05 UTC

And some more ...

* gst_buffer_span has disabled code that will copy over _OFFSET etc if the offset == 0.  Not doing so impacts gst_buffer_join in particular and affects all sorts of elements in subtle ways (e.g. makes the icydemux test fail etc).
Don't know what's the deal/idea here; re-enable ? or is there an alternative.

* seems like elements should be very careful these days to return FALSE when receiving some event (e.g. a newsegment event in an unexpected format), since upstream will keep trying to send it (if sticky, as many are) and fail hard when "real data" is then being pushed (and the sticky event still was not received properly).  Maybe self-evident, but probably a problem here and there for elements using an event return value for some hacks that might then need (a query?) alternative (e.g. flacenc springs to mind).

* a device's position (e.g. alsa) can typically contain something like
  {
      GST_AUDIO_CHANNEL_POSITION_MONO},
  {
        GST_AUDIO_CHANNEL_POSITION_FRONT_LEFT,
      GST_AUDIO_CHANNEL_POSITION_FRONT_RIGHT},

However, this will lead to gst_audio_ring_buffer_set_channel_positions failing (assertion-wise or maybe more seriously) when the incoming data's channel mask describes e.g. a single _FRONT_CENTER mono channel, or _REAR_LEFT and _REAR_RIGHT etc).

* crop metadata vs caps:
it looks like the caps width/height should reflect the image's total width/height, and not the cropped width/height (presumably so an element not having such crop meta support can carry on as usual).  If indeed so, maybe having this specified/documented somewhere?

Comment 7 Wim Taymans 2012-03-27 11:14:09 UTC

(In reply to comment #5)
> * tag event is sticky event, presumably with the intent of persisting these and
> delivering to downstream whenever possible.  It also allows for multiple ones,
> which are distinguished by name.  However, afaik, each tag event has the same
> name, which still only allows one event.  So things will not turn out as
> persistent as expected/hoped, unless each element really sends all it has to in
> one tag event (and also with subsequent updates).  Many (if not all) do not do
> so, and might be somewhat cumbersome to arrange it so (and undocumented as
> well).

It is true, we don't have an api to name the taglist. We can use the structure API to change the name, though. I will have a look.

> 
> * a GstVideoFrame holds a pointer to the buffer it is mapping which is not
> refcounted.  That may be so intended and may work out well, but seems
> potentially tricky.

OK, I think it can be refcounted without problems.

> 
> * afaics gst_buffer_pool_config_get returns (a.o.) caps with an extra ref,
> which seems not to be taken into account presently (documented or in code
> anywhere).  It looks like this is not intended (and then needs some
> adjustment).  But even if intended, a construction using gst_structure_id_get
> with multiple fields is tricky since if it fails it is not known whether or not
> it has succeeded enough for an extra ref to have already been taken.

I removed the extra ref.

Comment 8 Wim Taymans 2012-03-27 13:27:37 UTC

(In reply to comment #6)
> * gst_buffer_span has disabled code that will copy over _OFFSET etc if the
> offset == 0.  Not doing so impacts gst_buffer_join in particular and affects
> all sorts of elements in subtle ways (e.g. makes the icydemux test fail etc).
> Don't know what's the deal/idea here; re-enable ? or is there an alternative.

I fixed it now, no idea why it was disabled.

> * seems like elements should be very careful these days to return FALSE when
> receiving some event (e.g. a newsegment event in an unexpected format), since
> upstream will keep trying to send it (if sticky, as many are) and fail hard
> when "real data" is then being pushed (and the sticky event still was not
> received properly).  Maybe self-evident, but probably a problem here and there
> for elements using an event return value for some hacks that might then need (a
> query?) alternative (e.g. flacenc springs to mind).

Refusing an event will now also generate an error when trying to push a buffer.
I think this makes sense in the case of a segment event with a wrong format.
Will check what hack flacenc does.

> * crop metadata vs caps:
> it looks like the caps width/height should reflect the image's total
> width/height, and not the cropped width/height (presumably so an element not
> having such crop meta support can carry on as usual).  If indeed so, maybe
> having this specified/documented somewhere?

The caps should reflect what the user visible region is, so _after_ cropping. The idea is that a capsfilter still affects the visual result. An element can only send a crop metadata if downstream understands, that means all the downstream elements should understand (and explicitly enable) the crop metadata. If downstream does not accept the crop metadata, the upstream element should do
software cropping (which might not be very accurate).

Comment 9 Mark Nauwelaerts 2012-03-27 17:09:46 UTC

(In reply to comment #8)
> (In reply to comment #6)
> > * crop metadata vs caps:
> > it looks like the caps width/height should reflect the image's total
> > width/height, and not the cropped width/height (presumably so an element not
> > having such crop meta support can carry on as usual).  If indeed so, maybe
> > having this specified/documented somewhere?
> 
> The caps should reflect what the user visible region is, so _after_ cropping.
> The idea is that a capsfilter still affects the visual result. An element can
> only send a crop metadata if downstream understands, that means all the
> downstream elements should understand (and explicitly enable) the crop
> metadata. If downstream does not accept the crop metadata, the upstream element
> should do
> software cropping (which might not be very accurate).

If by "accepting crop metadata", this means that downstream should be peeking into allocation query flying by, examining it for allocation meta apis and then chucking out the ones it does not like/know about, then that seems a bit orthogonal to the supposed extensibility of such apis (as in; future ones currently unknown).

That being as it may, it seems in practice all a typical element/code would care about is that it can still/always rely upon the outcome of gst_video_frame_map and all the GST_VIDEO_FRAME_WIDTH etc stuff.  In particular, one would expect this to correspond to caps specs (or not?) (so specified/documented currently?).  And it does not look like e.g. crop metadata would have this working out well unless there is then at least also video meta to arrange for that.

So if the ground rule for all the meta apis is "video frame mapping must work out properly", then fine by me, though (as said) this requires quite some implementation caution.  And some ambiguity arises, e.g. if upstream decoder decides to use crop api but "forgets" to enable the video meta buffer pool option, how will that work out ...  Or should it also know never to use crop api if video meta is not supported etc

--

On an unrelated topic, it looks like e.g. gst_buffer_join does not "simply" provide a buffer that has all the GstMemory's of the original buffers but has performed (unless in rare cases) a memcpy of all involved.  This is not what one would expect/hope from something so named (with all the GstMemory these days).  Naming aside, a (helper) function doing such seems pretty useful/needed, particularly since "regular buffers" (and the memory blocks) should be able to function as old-style buffer_list (group), as indicated in porting docs.

Comment 10 Tim-Philipp Müller 2012-03-27 22:04:35 UTC

> > * seems like elements should be very careful these days to return FALSE when
> > receiving some event (e.g. a newsegment event in an unexpected format), since
> > upstream will keep trying to send it (if sticky, as many are) and fail hard
> > when "real data" is then being pushed (and the sticky event still was not
> > received properly).  Maybe self-evident, but probably a problem here and there
> > for elements using an event return value for some hacks that might then need (a
> > query?) alternative (e.g. flacenc springs to mind).
> 
> Refusing an event will now also generate an error when trying to push a buffer.
> I think this makes sense in the case of a segment event with a wrong format.
> Will check what hack flacenc does.

It probably tries to push a BYTE newsegment with a non-0 start to see if downstream is seekable. There may be one or two more elements that do this. It used to be the only way.

This should be replaced with s downstream SEEKABLE query. If the query is not handled (FALSE return value), we should also assume downstream is not seekable in master/0.11.

Comment 11 Wim Taymans 2012-03-28 08:37:38 UTC

(In reply to comment #9)
> 
> If by "accepting crop metadata", this means that downstream should be peeking
> into allocation query flying by, examining it for allocation meta apis and then
> chucking out the ones it does not like/know about, then that seems a bit
> orthogonal to the supposed extensibility of such apis (as in; future ones
> currently unknown).

That is how it should be done if you proxy the allocation query but need to deal with the data in the buffers somehow. You remove all the metadata that you don't know or that becomes invalid based on the transform you do (using the tags).

> 
> That being as it may, it seems in practice all a typical element/code would
> care about is that it can still/always rely upon the outcome of
> gst_video_frame_map and all the GST_VIDEO_FRAME_WIDTH etc stuff.  In
> particular, one would expect this to correspond to caps specs (or not?) (so
> specified/documented currently?).  And it does not look like e.g. crop metadata
> would have this working out well unless there is then at least also video meta
> to arrange for that.

The crop metadata needs to be handled separately, the video frame API will not automatically crop (and it can't accurately).  

> 
> So if the ground rule for all the meta apis is "video frame mapping must work
> out properly", then fine by me, though (as said) this requires quite some
> implementation caution.  And some ambiguity arises, e.g. if upstream decoder
> decides to use crop api but "forgets" to enable the video meta buffer pool
> option, how will that work out ...  Or should it also know never to use crop
> api if video meta is not supported etc

If you use the crop metadata on raw video, you also need to have the video metadata enabled because else you can't specify the dimensions of the bigger uncropped image. 

The reason that the crop metadata is there is because we might want to add it to non-raw formats.

> 
> --
> 
> On an unrelated topic, it looks like e.g. gst_buffer_join does not "simply"
> provide a buffer that has all the GstMemory's of the original buffers but has
> performed (unless in rare cases) a memcpy of all involved.  This is not what
> one would expect/hope from something so named (with all the GstMemory these
> days).  Naming aside, a (helper) function doing such seems pretty
> useful/needed, particularly since "regular buffers" (and the memory blocks)
> should be able to function as old-style buffer_list (group), as indicated in
> porting docs.

Indeed, I'll look at this.

Comment 12 Mark Nauwelaerts 2012-03-28 11:30:06 UTC

(In reply to comment #10)
> It probably tries to push a BYTE newsegment with a non-0 start to see if
> downstream is seekable. There may be one or two more elements that do this. It
> used to be the only way.
> 
> This should be replaced with s downstream SEEKABLE query. If the query is not
> handled (FALSE return value), we should also assume downstream is not seekable
> in master/0.11.

Could probably be done somehow on element level, the tricky part here is what should be the default behaviour for a passing downstream SEEKABLE query.  An option might be to have it passed by default, and to have GstCollectPads2 return FALSE to it which would prevent it from being answered positively by a filesink typically following a muxer.

----

Btw, this also shows that (a.o. ?) gst_pad_push_event currently has a flaw; consider some sticky event sent downstream (which it does not like; i.e. returns FALSE), then that FALSE is never returned to the sender (i.e. eaten by core due to a pretty hard res = TRUE in there).

Even if that one is handled, a related problem might remain.  That is, if one event does not make it properly downstream, push_sticky currently makes evens_foreach give up on the following events, so a subsequent EOS might never make it through anymore (which is typically part of some error handling loop cleanup/shutdown).  One might say "so be it" but an EOS passing by does help to avoid "hanging cases" and allows for "graceful degradation" (e.g. muxer metadata etc).

Comment 13 Wim Taymans 2012-03-28 15:03:27 UTC

(In reply to comment #12)
> (In reply to comment #10)
> > It probably tries to push a BYTE newsegment with a non-0 start to see if
> > downstream is seekable. There may be one or two more elements that do this. It
> > used to be the only way.
> > 
> > This should be replaced with s downstream SEEKABLE query. If the query is not
> > handled (FALSE return value), we should also assume downstream is not seekable
> > in master/0.11.
> 
> Could probably be done somehow on element level, the tricky part here is what
> should be the default behaviour for a passing downstream SEEKABLE query.  An
> option might be to have it passed by default, and to have GstCollectPads2
> return FALSE to it which would prevent it from being answered positively by a
> filesink typically following a muxer.
> 
> ----
> 
> Btw, this also shows that (a.o. ?) gst_pad_push_event currently has a flaw;
> consider some sticky event sent downstream (which it does not like; i.e.
> returns FALSE), then that FALSE is never returned to the sender (i.e. eaten by
> core due to a pretty hard res = TRUE in there).

This is intentional, the sticky events all return TRUE when they could be stored on a srcpad. The reason is that a boolean value is not enough information to decide what goes on. It eventually works out because the dataflow will error out after a while (If there is dataflow, see below). 

We're now in a situation where the result of pushing an event is not useful anymore. If you get FALSE, you know something is definitely wrong, if you get TRUE, you don't know, the event could be stored on a non-linked pad and will be pushed later.

> 
> Even if that one is handled, a related problem might remain.  That is, if one
> event does not make it properly downstream, push_sticky currently makes
> evens_foreach give up on the following events, so a subsequent EOS might never
> make it through anymore (which is typically part of some error handling loop
> cleanup/shutdown).  One might say "so be it" but an EOS passing by does help to
> avoid "hanging cases" and allows for "graceful degradation" (e.g. muxer
> metadata etc).

Maybe it could continue pushing the EOS (and only the EOS because the other events might not make sense without the previous events).

There is a bigger problem, though. If you have a pad with only a sticky EOS event and you link that to a sink. The sticky EOS will never be pushed because there is no streaming thread anymore to push it.

Comment 14 Mark Nauwelaerts 2012-03-28 16:54:59 UTC

(In reply to comment #13)
> (In reply to comment #12) 
> > 
> > Even if that one is handled, a related problem might remain.  That is, if one
> > event does not make it properly downstream, push_sticky currently makes
> > evens_foreach give up on the following events, so a subsequent EOS might never
> > make it through anymore (which is typically part of some error handling loop
> > cleanup/shutdown).  One might say "so be it" but an EOS passing by does help to
> > avoid "hanging cases" and allows for "graceful degradation" (e.g. muxer
> > metadata etc).
> 
> Maybe it could continue pushing the EOS (and only the EOS because the other
> events might not make sense without the previous events).

It probably should, since otherwise there can be lots of pipeline hanging near the end.  Also, dataflow (eventually) failing because of some event "failing" (some return FALSE somewhere/time earlier) tends to happen rather quietly/crypticly, since it generates a GST_FLOW_ERROR without a corresponding GST_ELEMENT_ERROR (which normally should be, not sure whether here also is a good thing), and a message less is one more chance to hang.

Comment 15 Mark Nauwelaerts 2012-04-16 16:26:36 UTC

(In reply to comment #13)
> This is intentional, the sticky events all return TRUE when they could be
> stored on a srcpad. The reason is that a boolean value is not enough
> information to decide what goes on. It eventually works out because the
> dataflow will error out after a while (If there is dataflow, see below). 
>
> We're now in a situation where the result of pushing an event is not useful
> anymore. If you get FALSE, you know something is definitely wrong, if you get
> TRUE, you don't know, the event could be stored on a non-linked pad and will be
> pushed later.

FWIW, this actually sort-of sounds pretty ominous; "push an event, and you have no definite idea what happened or what will happen or fail some time later"
[it also implies that any sticky event handling ---other than CAPS--- that dares to return FALSE will likely incur quite some wrath].

IIRC at some point it was planned to have event handling also use a GstFlowReturn so as to know more about what goes on.  In particular, it could be used here to only discard a _NOT_LINKED but still report "real errors upstream", but it seems that somehow got dropped along the way.

Not having such feedback means that performing a SEEKING query at one time is all there is to know (hope?) that sending a BYTE segment downstream some other time will probably work out ok (which all feels a bit wobbly/contrived/unrelated).

Comment 16 Tim-Philipp Müller 2012-09-12 00:15:10 UTC

> Not having such feedback means that performing a SEEKING query at one time is
> all there is to know (hope?) that sending a BYTE segment downstream some other
> time will probably work out ok (which all feels a bit
> wobbly/contrived/unrelated).

Dunno, that bit actually makes much more sense to me than checking the return value of the newsegment event.


Any particular bits from the previous discussion that we should have a look at at this point in the cycle ? (Don't really see us changing event push return type now and some such now...).

Comment 17 Tim-Philipp Müller 2012-09-25 12:08:17 UTC

All resolved? Good!