Bug 671909 – Port base video encoding and decoding classes from gst-plugins-bad

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 671909 - Port base video encoding and decoding classes from gst-plugins-bad


Summary:	Port base video encoding and decoding classes from gst-plugins-bad


Status:	RESOLVED FIXED

Product:	GStreamer
Classification:	Platform
Component:	gst-plugins-base
Version:	git master
Hardware:	Other Linux

Importance:	Normal enhancement
Target Milestone:	0.10.37
Assigned To:	GStreamer Maintainers
QA Contact:	GStreamer Maintainers

URL:
Whiteboard:

Depends on:	654294 660770 664127 671977
Blocks:	670296

Reported:	2012-03-12 14:57 UTC by Edward Hervey
Modified:	2012-04-24 16:08 UTC

See Also:
GNOME target:	---
GNOME version:	---

Description Edward Hervey 2012-03-12 14:57:46 UTC

I have started moving/porting the base video decoding/encoding classes from gst-plugins-bad to gst-plugins-base and made them more in sync with how the audio encoder/decoder classes are written.

The work in progress is here (includes theora porting):
http://cgit.freedesktop.org/~bilboed/gst-plugins-base/log/?h=basevideo

And I adapter the already ported -bad plugins here:
http://cgit.freedesktop.org/~bilboed/gst-plugins-bad/log/?h=basevideo

Comment 1 Sebastian Dröge (slomo) 2012-03-13 09:24:07 UTC

Other than everything mentioned in the bugs linked to this one now:
- gst_video_decoder_get_byte_time() is a really weird name :)
- GstVideoState/GstVideoFrame needs padding and probably still contains some unused things. But I know that you planned to work on this next
- The adapters in the decoder instance struct should be hidden
- Check the ownership transfer of the frame parameter of _finish_frame(), _drop_frame() and ::handle_frame(). Should be consistent with the ones in the audio base classes and IIRC they are not
- gst_video_decoder_set_sync_point() and other frame-related things should probably operate on the frames instead of the decoder. Feels weird to do it on the decoder instead and also could be problematic if there's internal queueing of more than a single frame inside the decoder
- The segment in the decoder should be public in the instance struct too, there are always reasons to use it. Or be part of the GstVideoState
- ::event() vfuncs should probably be ::sink_event() and ::src_event(). Or ::event() and ::src_event()
- ::get_caps() vfuncs were really useful for the audio base classes and are missing here
- The coder_hook/destroy_notify is probably hard to use for bindings

Most of this applies to decoder and encoder

Comment 2 Edward Hervey 2012-03-13 14:32:32 UTC

(In reply to comment #1)
> Other than everything mentioned in the bugs linked to this one now:
> - gst_video_decoder_get_byte_time() is a really weird name :)

  Indeed :) Any propositions ? I just re-used what was already there.

> - GstVideoState/GstVideoFrame needs padding and probably still contains some
> unused things. But I know that you planned to work on this next

  I need to figure out a proper way to split out the various concepts:
  * A frame (the 'concept' which applies to both encoded and decoded)
  * The raw information (I'm going to re-use the backported GstVideoInfo)
  * The encoded information

> - The adapters in the decoder instance struct should be hidden

  And for the encoders also ?

> - Check the ownership transfer of the frame parameter of _finish_frame(),
> _drop_frame() and ::handle_frame(). Should be consistent with the ones in the
> audio base classes and IIRC they are not
> - gst_video_decoder_set_sync_point() and other frame-related things should
> probably operate on the frames instead of the decoder. Feels weird to do it on
> the decoder instead and also could be problematic if there's internal queueing
> of more than a single frame inside the decoder

  I'll check how to do that.

> - The segment in the decoder should be public in the instance struct too, there
> are always reasons to use it. Or be part of the GstVideoState

  I'd say we need two. One for the input stream/context and one for the output stream/context. Especially important due to the delay/latency betwen input and output.

> - ::event() vfuncs should probably be ::sink_event() and ::src_event(). Or
> ::event() and ::src_event()

  Agreed

> - ::get_caps() vfuncs were really useful for the audio base classes and are
> missing here

  *cough* You removed them when they were in -bad :) I'll put them back

> - The coder_hook/destroy_notify is probably hard to use for bindings
> 
> Most of this applies to decoder and encoder

  Thanks !

Comment 3 Sebastian Dröge (slomo) 2012-03-13 14:46:47 UTC

(In reply to comment #2)
> (In reply to comment #1)
> > Other than everything mentioned in the bugs linked to this one now:
> > - gst_video_decoder_get_byte_time() is a really weird name :)
> 
>   Indeed :) Any propositions ? I just re-used what was already there.

Not really :(

> > - The adapters in the decoder instance struct should be hidden
> 
>   And for the encoders also ?

Yes

> > - The segment in the decoder should be public in the instance struct too, there
> > are always reasons to use it. Or be part of the GstVideoState
> 
>   I'd say we need two. One for the input stream/context and one for the output
> stream/context. Especially important due to the delay/latency betwen input and
> output.

Yes, same for the "caps" btw (input could already have a different resolution as the output for example)

> > - ::event() vfuncs should probably be ::sink_event() and ::src_event(). Or
> > ::event() and ::src_event()
> 
>   Agreed

Best to check what other base classes do here

> > - ::get_caps() vfuncs were really useful for the audio base classes and are
> > missing here
> 
>   *cough* You removed them when they were in -bad :) I'll put them back

The old ::get_caps() that was there was meant to provide the fixated srcpad caps that were put on the pad and on the buffer. What I mean here now is a real get_caps() like everywhere else :)

Comment 4 Edward Hervey 2012-03-13 15:05:33 UTC

Got it, thx !

Comment 5 Sebastian Dröge (slomo) 2012-03-23 11:34:45 UTC

Some more comments here:

* http://cgit.freedesktop.org/~bilboed/gst-plugins-base/commit/?h=basevideo&id=bb8a8c3b0491f0718c98d925c17dc47958cca4a9:
** leaking previously set input state
* http://cgit.freedesktop.org/~bilboed/gst-plugins-base/commit/?h=basevideo&id=54cb2a569b4ed629bfa60f8d978699d5c30cacd1:
** the comment at the top talks about channels/rate
* http://cgit.freedesktop.org/~bilboed/gst-plugins-base/commit/?h=basevideo&id=10fdd482ddfb0094e49ddd97477e30ff802e8274:
** why are there two functions for the caps? can't this be one and detection if this is raw (video/x-raw) or not is inside the function?
** and the parameters to the compare function can be const
* http://cgit.freedesktop.org/~bilboed/gst-plugins-base/commit/?h=basevideo&id=a3aa0b63b4f3aa8a8c66c3d4c9fb4b2d537f3419:
** can we get rid of the clean_* fields too?
* Two states are missing in the encoder currently

And from previous review, just so it doesn't get lost:
* sink_event/src_event vfuncs missing
* two public segments (in and output segment) or make part of the states
* gst_video_decoder_get_byte_time() is a really weird name :)
* GstVideoState/GstVideoFrame needs padding and probably still contains some unused things. But I know that you planned to work on this next
* Check the ownership transfer of the frame parameter of _finish_frame(), _drop_frame() and ::handle_frame(). Should be consistent with the ones in the audio base classes and IIRC they are not
* gst_video_decoder_set_sync_point() and other frame-related things should probably operate on the frames  nstead of the decoder. Feels weird to do it on the decoder instead and also could be problematic if there's internal queueing of more than a single frame inside the decoder
* The coder_hook/destroy_notify is probably hard to use for bindings

Comment 6 Edward Hervey 2012-03-29 13:55:32 UTC

(In reply to comment #5)
> Some more comments here:
> 
> *
> http://cgit.freedesktop.org/~bilboed/gst-plugins-base/commit/?h=basevideo&id=bb8a8c3b0491f0718c98d925c17dc47958cca4a9:
> ** leaking previously set input state

Fixed

> *
> http://cgit.freedesktop.org/~bilboed/gst-plugins-base/commit/?h=basevideo&id=54cb2a569b4ed629bfa60f8d978699d5c30cacd1:
> ** the comment at the top talks about channels/rate

Fixed 

> *
> http://cgit.freedesktop.org/~bilboed/gst-plugins-base/commit/?h=basevideo&id=10fdd482ddfb0094e49ddd97477e30ff802e8274:
> ** why are there two functions for the caps? can't this be one and detection if
> this is raw (video/x-raw) or not is inside the function?
> ** and the parameters to the compare function can be const
> *

Fixed, added a new GstVideoFormat for codecs instead and only have one method

> http://cgit.freedesktop.org/~bilboed/gst-plugins-base/commit/?h=basevideo&id=a3aa0b63b4f3aa8a8c66c3d4c9fb4b2d537f3419:
> ** can we get rid of the clean_* fields too?
> * Two states are missing in the encoder currently

Fixed

> 
> And from previous review, just so it doesn't get lost:
> * sink_event/src_event vfuncs missing

Fixed

> * two public segments (in and output segment) or make part of the states

Fixed

> * gst_video_decoder_get_byte_time() is a really weird name :)

renamed it to _estimate_rate()

> * GstVideoState/GstVideoFrame needs padding and probably still contains some
> unused things. But I know that you planned to work on this next

Added padding.

Was wondering if we couldn't have some private entry for cleaner usage ...

> * Check the ownership transfer of the frame parameter of _finish_frame(),
> _drop_frame() and ::handle_frame(). Should be consistent with the ones in the
> audio base classes and IIRC they are not

Done

> * gst_video_decoder_set_sync_point() and other frame-related things should
> probably operate on the frames  nstead of the decoder. Feels weird to do it on
> the decoder instead and also could be problematic if there's internal queueing
> of more than a single frame inside the decoder

Removed (callers can just set frame->is_sync_point=TRUE instead)

> * The coder_hook/destroy_notify is probably hard to use for bindings

What would a better option be ?


Pushed all changed to my repos (base and bad)

Comment 7 Sebastian Dröge (slomo) 2012-03-30 09:04:42 UTC

(In reply to comment #6)

> > * GstVideoState/GstVideoFrame needs padding and probably still contains some
> > unused things. But I know that you planned to work on this next
> 
> Added padding.
> 
> Was wondering if we couldn't have some private entry for cleaner usage ...

Sure, why not?

> > * The coder_hook/destroy_notify is probably hard to use for bindings
> 
> What would a better option be ?

gst_video_frame_set_coder_hook(frame, hook, destroy_notify). The problem here is with having the two fields in the struct and no obvious connection between them.

Some new reviews :)

* http://cgit.freedesktop.org/~bilboed/gst-plugins-base/commit/?h=basevideo&id=5f76571bef4745340333f33f2985ea3f923fd383 : Maybe rename GstVideoState to GstVideoCodecState too
* In general, use gst_object_ref/unref everywhere, otherwise twi will be unhappy. And myself too. When debugging refcount issues
* gst_video_decoder_set_output_state(), gst_video_encoder_set_output_state() should probably take the video-decoder streamlock, same probably elsewhere where the states are used. It's a recursive mutex btw. This might be necessary for things like gst-omx, where the sink and srcpad have separate streaming threads. Which is the only reason why these lock exists
* http://cgit.freedesktop.org/~bilboed/gst-plugins-base/commit/?h=basevideo&id=26a4ad1a95cc4c464d797cdf9a9adf2e4f0b1ddf : Remove the timestamp/offset stuff for headers completely. It's broken and useless
* I think GST_FLOW_DROPPED is useless now, it was only used for EOS events and these can now be handled and not pushed by the base class correctly, right?
* The _event() vfuncs should require to chain up to the base class and the base class should do all handling there instead of abusing the gboolean return value. We changed that for all base classes in 0.11
* audio encoder/decoder base class has flush vfunc, video encoder/decoder has reset and finish vfuncs. Unify if possible
* audio encoder base class has set_hard_resync(), video encoder has set_discont()  (is this the same at all?)
* getters for setters in video encoder missing
* Something like gst_audio_{de,en}coder_merge_tags() will be very useful to have
* No set_latency() in decoder
* No timestamp tolerance in encoder/decoder (see audio base classes)

Comment 8 Sebastian Dröge (slomo) 2012-03-30 09:06:23 UTC

(In reply to comment #7)
>  [...]
> * The _event() vfuncs should require to chain up to the base class and the base
> class should do all handling there instead of abusing the gboolean return
> value. We changed that for all base classes in 0.11

For this keep the same behaviour as the audio base classes in 0.10 btw (not sure if they chain up or abuse the gboolean), and change it accordingly in 0.11.

Comment 9 Sebastian Dröge (slomo) 2012-03-30 09:42:31 UTC

* Check if the parsing support in video and audio decoder base classes works similar. Currently there's ::parse_data() and ::scan_for_sync() and the capture pattern stuff in the video decoder, and ::parse() only in the audio decoder. Might be unified a bit (which will need extensions in the audio decoder too).

Comment 10 Sebastian Dröge (slomo) 2012-03-30 10:35:53 UTC

* The headers pushing code in videoencoder is broken, it makes the refs inside the list invalid. Also it should set discont on the first header buffer instead of the next non-header buffer (see discont handling code a few lines above) and probably also include the headers in priv->bytes

Comment 11 Sebastian Dröge (slomo) 2012-03-30 12:56:40 UTC

* And the synchronization of the two segments/serialized events is not optimal yet. Serialized events are pushed on the next finish_frame() but this next finish_frame() could be for a frame that happened before these serialized events (and in the worst case are for a previous segment).

Comment 12 Sebastian Dröge (slomo) 2012-04-02 08:39:23 UTC

(In reply to comment #11)
> * And the synchronization of the two segments/serialized events is not optimal
> yet. Serialized events are pushed on the next finish_frame() but this next
> finish_frame() could be for a frame that happened before these serialized
> events (and in the worst case are for a previous segment).

Serialized events that are pending should probably be stored in the GstVideoCodecFrame and pushed downstream when that GstVideoCodecFrame is finished/dropped or any other GstVideoCodecFrame that was created after this frame.

Otherwise we're also reordering data/buffers. Can be done later though.

Comment 13 Mark Nauwelaerts 2012-04-02 10:57:12 UTC

(In reply to comment #9)
> * Check if the parsing support in video and audio decoder base classes works
> similar. Currently there's ::parse_data() and ::scan_for_sync() and the capture
> pattern stuff in the video decoder, and ::parse() only in the audio decoder.
> Might be unified a bit (which will need extensions in the audio decoder too).

Parsing in the video decoder currently seems (over)complicated, particularly when considering that even BaseParse basically gets by with a single call (handle_frame).  As such, either a (even more) simplified version (such as in the audio classes) should suffice or only using the current parse_data (and drop the mask stuff and scan_for_sync, not even sure how that one can do any scanning these days if it does not have access to the data/adapter).

Comment 14 Mark Nauwelaerts 2012-04-02 11:15:27 UTC

Additional comments:

[video encoder]
* almost unused variable stuff (e.g. priv->clipping always TRUE)
* at_eos set to TRUE upon EOS, set to FALSE when new segment (where afaik only FLUSH can clear eos state?)
* draining (e.g. call to gst_video_encoder_drain) happens only once where this happens more in e.g. audio encoder case, and some (or all) might also make sense here.  Also semantics/documentation are unclear there, since the subclass reset method is actually being used to perform draining, not a reset (as in post-seek, as suggested in .h documentation)
* lots of frame number stuff seems to involve almost unused fields, or badly used ones (e.g. system_frame_number is never really properly initialized, frame_decode_number is used to determine dts, but this may have no relation at all to the pts since it does not consider segment info at all).
Might also consider (maybe depending on a subclass settable setting) to obtain a frame's dts as the oldest queued pts.

[video decoder]
* see previous comment w.r.t. parsing
* roughly similar comments as for the encoder apply
(a.o. various frame number and ts issues, e.g. presentation_timestamp might get determined based on (a.o.) a never set base_picture_number, as noted in a FIXME).

Comment 15 Edward Hervey 2012-04-04 08:27:11 UTC

(In reply to comment #10)
> * The headers pushing code in videoencoder is broken, it makes the refs inside
> the list invalid. Also it should set discont on the first header buffer instead
> of the next non-header buffer (see discont handling code a few lines above) and
> probably also include the headers in priv->bytes

fixed

(In reply to comment #8)
> (In reply to comment #7)
> >  [...]
> > * The _event() vfuncs should require to chain up to the base class and the base
> > class should do all handling there instead of abusing the gboolean return
> > value. We changed that for all base classes in 0.11
> 
> For this keep the same behaviour as the audio base classes in 0.10 btw (not
> sure if they chain up or abuse the gboolean), and change it accordingly in
> 0.11.

So nothing to change in 0.10

(In reply to comment #7)
> gst_video_frame_set_coder_hook(frame, hook, destroy_notify). The problem here
> is with having the two fields in the struct and no obvious connection between
> them.

Done

> 
> Some new reviews :)
> 
> *
> http://cgit.freedesktop.org/~bilboed/gst-plugins-base/commit/?h=basevideo&id=5f76571bef4745340333f33f2985ea3f923fd383
> : Maybe rename GstVideoState to GstVideoCodecState too

Done

> * In general, use gst_object_ref/unref everywhere, otherwise twi will be
> unhappy. And myself too. When debugging refcount issues

Done

> * gst_video_decoder_set_output_state(), gst_video_encoder_set_output_state()
> should probably take the video-decoder streamlock, same probably elsewhere
> where the states are used. It's a recursive mutex btw. This might be necessary
> for things like gst-omx, where the sink and srcpad have separate streaming
> threads. Which is the only reason why these lock exists

 done

> *
> http://cgit.freedesktop.org/~bilboed/gst-plugins-base/commit/?h=basevideo&id=26a4ad1a95cc4c464d797cdf9a9adf2e4f0b1ddf
> : Remove the timestamp/offset stuff for headers completely. It's broken and
> useless

 Done

> * I think GST_FLOW_DROPPED is useless now, it was only used for EOS events and
> these can now be handled and not pushed by the base class correctly, right?


> * audio encoder/decoder base class has flush vfunc, video encoder/decoder has
> reset and finish vfuncs. Unify if possible
> * audio encoder base class has set_hard_resync(), video encoder has
> set_discont()  (is this the same at all?)

> * getters for setters in video encoder missing

done

> * Something like gst_audio_{de,en}coder_merge_tags() will be very useful to
> have

can be done later

> * No set_latency() in decoder


How do we want to handle this ? by frames, by min/max time ? By both ?
> * No timestamp tolerance in encoder/decoder (see audio base classes)
We haven't really needed this for any video decoder so far. I'd prefer to see that later on.

Comment 16 Sebastian Dröge (slomo) 2012-04-04 09:56:10 UTC

For latency use min/max time. Also I agree with all of Mark's comments :)

Some new though:
* gst_video_codec_frame_set_hook() should call the destroy notify if one was set previously before overwriting with the new values
* /* FIXME : We should change the buffer timestamp/duration to the clipped values */ ---> Yes ;)

Comment 17 Sebastian Dröge (slomo) 2012-04-05 07:45:34 UTC

* You added a flag for force-keyframe-headers but don't use it (and keep the old boolean field)
* imho set_latency_fields() should be removed
* the weird framenumber fields should be removed too I guess

Comment 18 Nicolas Dufresne (ndufresne) 2012-04-19 14:27:27 UTC

* DISCONT flags should not be forwared (the forward code is broken with encoder that has a queue)

Comment 19 Edward Hervey 2012-04-24 16:00:58 UTC

This is now pushed upstream along with port of:
jpeg, png, schroedinger, dirac, vp8, gst-ffmpeg.