Bug 642690 – [baseaudio] GstBaseAudioEncoder and GstBaseAudioDecoder class

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 642690 - [baseaudio] GstBaseAudioEncoder and GstBaseAudioDecoder class


Summary:	[baseaudio] GstBaseAudioEncoder and GstBaseAudioDecoder class


Status:	RESOLVED FIXED

Product:	GStreamer
Classification:	Platform
Component:	gst-plugins-base
Version:	git master
Hardware:	Other Linux

Importance:	Normal enhancement
Target Milestone:	0.10.36
Assigned To:	GStreamer Maintainers
QA Contact:	GStreamer Maintainers

URL:
Whiteboard:

Duplicates:	582569 (view as bug list)
Depends on:
Blocks:

Reported:	2011-02-18 16:32 UTC by Mark Nauwelaerts
Modified:	2011-12-06 11:56 UTC

See Also:
GNOME target:	---
GNOME version:	---

Description Mark Nauwelaerts 2011-02-18 16:32:38 UTC

See [*] for initial version, as well as a number of audio encoders ported to it.

The ported encoders illustrate how baseclass can be used in a few varying cases; other encoders are typically similar to one of the ported ones, or are somewhat different altogether (e.g. wavpack).

Ported encoders should behave as before (notably matching output ts and granule marking), at least in the "basic perfect stream case" (typically also matching ts a.o. in other cases, but maybe not granule, since existing encoders are hardly consistent in those cases).

Next step is to have a look at what is presented bug #582569, and see if/how that needs cleaning/merging/combining with this class.

Then hopefully it can all get into -base ASAP (rather than compete with e.g. GstCollectPads2 on collected dust/bitrot ...)

[*]
http://git.collabora.co.uk/?p=user/manauw/gst-plugins-bad.git;a=shortlog;h=refs/heads/baseaudioencoder

Comment 1 Stefan Sauer (gstreamer, gtkdoc dev) 2011-03-08 12:56:39 UTC

Cool stuff! A patch set for the basecalsses would be easier to review though. I hope you can match the context.

vorbiscommon seems to be only used by vorbis.

gstbaseaudioencoder.h
 101  * GstAudioState:
this needs a one liner describing the purpose below the fields. Is it really a state or GstAudioFormat. Should that go to gst-plugin-base/libs/audio (could ev. be used in the GstRingBuffer too).

 102  * @xint: whether sample data is int or float
"is_int" looks better to me.


 178  * GstBaseAudioEncoderClass:
 200  * @pre_push:       Optional.
not clear when one would need it. The ported encoders don't seem to use it.

gstbaseaudioencoder.c
 104  * <link linkend="GstBaseAudioEncoder--perfect-ts">perfect-ts</link>.
 108  * than <link linkend="GstBaseAudioEncoder--tolerance">tolerance</link>,
 116  * (see <link linkend="GstBaseAudioEncoder--hard-resync">hard-resync</link>)
you can write "GstBaseAudioEncoder:perfect-ts" "GstBaseAudioEncoder:tolerance" and so on

 231 GType
 232 gst_base_audio_encoder_get_type (void)
shoudl this be using G_DEFINE_ABSTRACT_TYPE_WITH_CODE() ?

 310       g_param_spec_boolean ("perfect-ts", "perfect-ts",
in the name we can spell it out more "perfect timestamps". Also on other properties.

The whole granule business might need some more explanation. People who don't know ogg might not be familiar with it.

 431 /** gst_base_audio_encoder_finish_frame:
the gtk-doc syntax is (several occurrences):
 /**
  * gst_base_audio_encoder_finish_frame:


 451  * Returns: a #GstFlowReturn that should be escalated to caller (of caller)
typo at the end of the line?

 600     // TODO return value ?
ha - its not only me doing that (c++ comments) :) (there are more of them)

 907 #define CHECK_VALUE(res, var, val) \
Use G_STMT_START/END, in the code below you create several ;; right now - just to avoid suprises if some one adds an if() there later on. Also either drop 'res' from the args (and use it implicitely) or also pass 'changed' :)

1181   enc = GST_BASE_AUDIO_ENCODER (gst_pad_get_parent (pad));
event handlers are protected by the streaming lock and thus the pad cannot disappear. enc = GST_BASE_AUDIO_ENCODER (GST_PAD_PARENT (pad)); should be enough.

Comment 2 Stefan Sauer (gstreamer, gtkdoc dev) 2011-03-09 08:52:05 UTC

(In reply to comment #1)
> 
> gstbaseaudioencoder.c
>  104  * <link linkend="GstBaseAudioEncoder--perfect-ts">perfect-ts</link>.
>  108  * than <link linkend="GstBaseAudioEncoder--tolerance">tolerance</link>,
>  116  * (see <link
> linkend="GstBaseAudioEncoder--hard-resync">hard-resync</link>)
> you can write "GstBaseAudioEncoder:perfect-ts" "GstBaseAudioEncoder:tolerance"
> and so on
> 
Correcting myself -> "#GstBaseAudioEncoder:perfect-ts".

Comment 3 Mark Nauwelaerts 2011-03-21 11:52:42 UTC

Updated work now in [*] as it includes both baseaudioencoder and baseaudioencoder, along with common/utils part.

[*] http://git.collabora.co.uk/?p=user/manauw/gst-plugins-bad.git;a=shortlog;h=refs/heads/baseaudio

(In reply to comment #1)
> Cool stuff! A patch set for the basecalsses would be easier to review though. I
> hope you can match the context.

Not entirely sure what is meant here.  If by that you mean patches for the ported elements, that is tricky stuff.  As baseclass is in -bad for now, it would be tricky to have branches all over the various repos, so the plan/hope is to have the ports in -bad for now, and when baseclasses make it up (to -base, albeit possibly with experimental marking), then these ports can be "slammed" (with caution) onto the elements there (thereby turning into a patch).  In particular, that should be possible for most of these elements, as the behaviour of the ports is arranged to match (practically identically) the original element (give or take rarely what would otherwise be fixes to the elements).  Roughly, only the ported mad is a bit distinct from the original one, as it no longer minds about tags etc (that belonging to parser).

That all being said, it's possible to first commit a verbatim copy in -bad audiocodecs, and then modify/port that copy (which should give more of an idea of the eventual patch in the origin repo) --> next update round.

> 
> vorbiscommon seems to be only used by vorbis.
> 
> gstbaseaudioencoder.h
>  101  * GstAudioState:
> this needs a one liner describing the purpose below the fields. Is it really a
> state or GstAudioFormat. Should that go to gst-plugin-base/libs/audio (could
> ev. be used in the GstRingBuffer too).

Jury/taste still out on this one.  video counterparts have a more elaborate state here.  In practice, this kind of "state" suffices (and seems more convenient by itself as a "format").

> 
>  102  * @xint: whether sample data is int or float
> "is_int" looks better to me.
Fixed.

> 
>  178  * GstBaseAudioEncoderClass:
>  200  * @pre_push:       Optional.
> not clear when one would need it. The ported encoders don't seem to use it.

Indeed they do not.  However, baseparse practice has shown it might become useful, and iirc there have been a few (porting) times it has come close to using it (or might be advisable/cleaner to do so).  For instance, it might come in handy having to do semi-hacks in the wavpack(enc) case when having to deal with the extra srcpad there.

> 
> gstbaseaudioencoder.c
>  104  * <link linkend="GstBaseAudioEncoder--perfect-ts">perfect-ts</link>.
>  108  * than <link linkend="GstBaseAudioEncoder--tolerance">tolerance</link>,
>  116  * (see <link
> linkend="GstBaseAudioEncoder--hard-resync">hard-resync</link>)
> you can write "GstBaseAudioEncoder:perfect-ts" "GstBaseAudioEncoder:tolerance"
> and so on
Fixed.

> 
>  231 GType
>  232 gst_base_audio_encoder_get_type (void)
> shoudl this be using G_DEFINE_ABSTRACT_TYPE_WITH_CODE() ?
Possibly, but current approach allows more coding freedom if needed.

> 
>  310       g_param_spec_boolean ("perfect-ts", "perfect-ts",
> in the name we can spell it out more "perfect timestamps". Also on other
> properties.
Done.

> 
> The whole granule business might need some more explanation. People who don't
> know ogg might not be familiar with it.
Some documentation added.

> 
>  431 /** gst_base_audio_encoder_finish_frame:
> the gtk-doc syntax is (several occurrences):
>  /**
>   * gst_base_audio_encoder_finish_frame:
Fixed.

> 
> 
>  451  * Returns: a #GstFlowReturn that should be escalated to caller (of
> caller)
> typo at the end of the line?
It is really meant that way 
(caller == typically subclass, caller of caller == typically baseclass,
which will then return upstream etc)

> 
>  600     // TODO return value ?
> ha - its not only me doing that (c++ comments) :) (there are more of them)
Fixed.

> 
>  907 #define CHECK_VALUE(res, var, val) \
> Use G_STMT_START/END, in the code below you create several ;; right now - just
> to avoid suprises if some one adds an if() there later on. Also either drop
> 'res' from the args (and use it implicitely) or also pass 'changed' :)
Fixed and moved to utils part.

> 
> 1181   enc = GST_BASE_AUDIO_ENCODER (gst_pad_get_parent (pad));
> event handlers are protected by the streaming lock and thus the pad cannot
> disappear. enc = GST_BASE_AUDIO_ENCODER (GST_PAD_PARENT (pad)); should be
> enough.
Fixed.

Comment 4 Mark Nauwelaerts 2011-03-22 13:20:43 UTC

[*] given above has been updated with a few fixes as well as intermediate decoder copy prior to port to highlight port patch difference.

However, the latter are likely not that informative (e.g. due to some massive code deleting that can occur), and the emphasis/intent of the ports is not so much the patch/difference, but to illustrate how a "final result" (using the baseclass) can look/feel/taste like.

Comment 5 Tim-Philipp Müller 2011-05-16 19:42:52 UTC

*** Bug 582569 has been marked as a duplicate of this bug. ***

Comment 6 Mark Nauwelaerts 2011-05-17 10:28:46 UTC

FYI/FWIW: [*] mentioned above has been updated to latest state (along with some stray demo basevideo ports).

Comment 7 Mark Nauwelaerts 2011-05-31 10:52:47 UTC

Any other comments/opinions ?  Would be good to have these up and running e.g. (well) before 0.11 is up and running.

Comment 8 Mark Nauwelaerts 2011-06-17 10:53:47 UTC

So, are we punting these baseclasses for now and only leaving them for 0.11 or can we still get this moving for 0.10 ?

FWIW, using the -bad compare element shows e.g. that the output of ported faac, lamemp3enc and vorbisenc elements is identical to the original ones (give or take minor jitter in ts and/or flags, where neither of the cases are "wrong", and least of all baseaudioencoder).

Comment 9 Stefan Sauer (gstreamer, gtkdoc dev) 2011-06-17 11:13:40 UTC

I see some merit in getting it merged. So +1 from my side.

Comment 10 Tim-Philipp Müller 2011-06-17 11:25:50 UTC

I would like to see these go into 0.10 as well, and was targetting this release cycle even.

I'm thinking encoders first, then decoders.

I'll try to have a look at this soonish if no one else is faster, feel free to remind me on IRC if I forget.

Comment 11 Sebastian Dröge (slomo) 2011-08-12 10:27:01 UTC

Looks good to me, only thing missing are boxed types for all the structs. Let's get this into the next release

Comment 12 Tim-Philipp Müller 2011-08-12 17:25:45 UTC

Some niggles:

 === baseaudioutils ===

 - should all be folded into audio.c (without the
   _base_ stuff) now that we're moving it into
   libgstaudio directly

 - GstAudioState isn't really a state, more like
   an audio format description

 - GstAudioState: some overlap with GstRingBufferSpec
   and the (slightly misnamed) GstBufferFormat and the
   buffer_parse_caps stuff. Not quite sure what follows
   from that, it just feels like we could do something
   better here.

 - GstAudioState: needs new/free or init/clear functions,
   since channel positions are g_malloc()ed.

 - _parse_caps(): is the changed parameter widely-used
    here? Feels cleaner to separate parsed + compare.

 - GTypes for the structs (as Sebastian already said)


 === baseaudioencoder ===

 - GstBaseAudioEncoderContext "deriving" from
    GstAudioState seems a bit weird - are they really
    related? It's going to be awkward for bindings,
    because if we just register boxed types we can't
    derive from another struct / boxed type.

 - in fact, I'm not sure I see the point of the
   Context struct at all - what does it buy us?

   It's not used in any part of the API directly, so
   it doesn't serve to ensure future extensibility
   like in the case of GstBaseParseFrame or
   GstVideoFrame. Seems to me that maybe
   we should just get rid of it and add
   _set_foo_bar() API for the various fields
   instead, to be used by the subclasses?

 - add setters+getters for the properties?

 - "granulepos" - maybe we can come up with
    a better property name? $verb-granulepos
    or so instead of $noun :)

 - "perfect-ts":  would perfect-timestamp[s]
   or "perfect-time" be better maybe? The
   rtp payloaders use perfect-rtptime and
   identity has check-perfect-timestamp,
   for comparison.

 - "tolerance": encodebin has "audio-jitter-tolerance",
   which is long, but more descriptive. Though the audio
   is somewhat implied here I guess. audiorate also
   uses "tolerance" of course. oggmux uses "max-tolerance".

 - the event vfunc returns a boolean, which makes it
   hard to differentiate between FALSE = "drop event"
   and "not handled, do with it what you want" or
   "handled, but return FALSE". Maybe subclass should
   be required to chain up to parent class event func
   explicitly to support these variations? (But then I
   guess it's not so important if you haven't found a 
   need for that during your plugin porting so far).


 === baseaudiodecoder ===

 - same comment as above regarding
    GstBaseAudioDecoderContext

 - "plc" => "packet-loss-concealment" ?

 - explicit getters/setters for properties?

 - re. event vfunc see above

 - re. "tolerance" see above

 - "latency" - if this is a minimum, maybe
   name it as such? (dunno)


I'm happy to help re-shuffle things if needed/desired.

Comment 13 Sebastian Dröge (slomo) 2011-08-15 06:19:31 UTC

I've to agree with all of Tim's comments

(In reply to comment #12)

>  - GstAudioState isn't really a state, more like
>    an audio format description
>
>  - GstAudioState: some overlap with GstRingBufferSpec
>    and the (slightly misnamed) GstBufferFormat and the
>    buffer_parse_caps stuff. Not quite sure what follows
>    from that, it just feels like we could do something
>    better here.

Let's call it GstAudioFormatInfo, like in 0.11 GstVideoFormatInfo. And the same for GstVideoState in the video base class. This way we might require less changes when moving to 0.11.
 
>  - GstBaseAudioEncoderContext "deriving" from
>     GstAudioState seems a bit weird - are they really
>     related? It's going to be awkward for bindings,
>     because if we just register boxed types we can't
>     derive from another struct / boxed type.

It's not strictly deriving but yes, this will confuse bindings and make everything harder for bindings. Better store a pointer to the state.

>  - in fact, I'm not sure I see the point of the
>    Context struct at all - what does it buy us?
> 
>    It's not used in any part of the API directly, so
>    it doesn't serve to ensure future extensibility
>    like in the case of GstBaseParseFrame or
>    GstVideoFrame. Seems to me that maybe
>    we should just get rid of it and add
>    _set_foo_bar() API for the various fields
>    instead, to be used by the subclasses?

Me neither, I think the context can go away too

>  - the event vfunc returns a boolean, which makes it
>    hard to differentiate between FALSE = "drop event"
>    and "not handled, do with it what you want" or
>    "handled, but return FALSE". Maybe subclass should
>    be required to chain up to parent class event func
>    explicitly to support these variations? (But then I
>    guess it's not so important if you haven't found a 
>    need for that during your plugin porting so far).

That's how we do it in the other base classes. It makes sense for consistency but has the problem you mentioned and also is not nice for bindings. We should probably keep it for 0.10 this way and for 0.11 change it (with the event functions returning GstFlowReturn this helps already), especially the event ownership depending on the return value is a problem.

Comment 14 Mark Nauwelaerts 2011-08-17 16:55:20 UTC

Updated version pushed to aforementioned branch, changes roughly involve:

* _get and _set for various context fields/parameters (including renamed GstAudioFormatInfo)

* also _get and _set for object properties.  Some properties renamed, others kept where believed consistent with other elements.  Taste may still vary.

* event vfunc kept as is for now (0.10)

* some changes in audio utils; separate _parse and _compare, and other _init/_clear etc helper code

It remains to add boxed type registration and actually move to -base (and then also merge with existing audio), but this way it can already be checked w.r.t. previous review and taste.

Comment 15 Wim Taymans 2011-08-17 17:57:42 UTC

I'm going to be making a GstAudioFormat enum and GstAudioFormatInfo in 0.11. This GstAudioFormatInfo is almost right. I would use flags to mark it as int/float and le/be and some other things.

It might be worth it to add this also directly to 0.10 and remove it from the audioutils.h here. Just a heads up.

Comment 16 Tim-Philipp Müller 2011-08-17 18:04:45 UTC

> It might be worth it to add this also directly to 0.10 and remove it from the
> audioutils.h here. Just a heads up.

The plan is to add it to libgstaudio in -base and move the audioutils.[ch] stuff to -base/gst-libs/gst/audio/audio.[ch]

Comment 17 Tim-Philipp Müller 2011-08-27 17:35:25 UTC

Ok, pushed this now, and added the 0.11 API to audio.h and ported everything to that. Haven't tested the implementation yet, so possible I introduced some bugs. Left unstable API defines in there for now. Also haven't added it to the docs build yet, because:

Question: why not just GstAudioDecoder and GstAudioEncoder? (pls don't hate me)

Comment 18 Mark Nauwelaerts 2011-08-29 10:22:17 UTC

Thanks for all that; guess will find out if something went wrong when porting something.

Typically don't really mind naming too much (as long as it's clear), but it seems a bit of a bold(/confusing?) move to have the first base class(es) that would have no 'Base' in it (but again; if it works).

Comment 19 Wim Taymans 2011-08-29 10:30:14 UTC

Ok, so for 0.11 I will do the event function with a default implementation that should be called by the subclass. IMO, it would make sense for 0.10 too.

I will have to make the gst_base_audio_decoder_src_setcaps() function public and would recommend to call this from the subclasses instead of doing gst_pad_set_caps() (it's gone in 0.11) or call it something like: gst_base_audio_decoder_set_outcaps() like the RTP payloader, dunno.

Comment 20 Tim-Philipp Müller 2011-08-29 20:37:51 UTC

> Typically don't really mind naming too much (as long as it's clear), but it
> seems a bit of a bold(/confusing?) move to have the first base class(es) that
> would have no 'Base' in it (but again; if it works).

True, but then again that's just because most of those classes are actually in libgstbase-0.10 (hence GstBaseXyz).

There are base classes without the 'Base' bit already, e.g. GstAudioSink, GstVideoSink, GstVideoFilter. But then, to add to the confusion, GstCddaBaseSrc rather than GstBaseCddaSrc (which at least maintains the GstCdda prefix for libgstcdda ;)).

I don't really mind either way, it's just that GstAudio{Decoder,Encoder} is shorter and it might be more obvious that it belongs to libgstaudio.

Comment 21 Tim-Philipp Müller 2011-08-29 20:41:30 UTC

> Ok, so for 0.11 I will do the event function with a default implementation that
> should be called by the subclass. IMO, it would make sense for 0.10 too.

You mean so the subclass has to decide whether to chain up or not?


> I will have to make the gst_base_audio_decoder_src_setcaps() function public
> and would recommend to call this from the subclasses instead of doing
> gst_pad_set_caps() (it's gone in 0.11) or call it something like:
> gst_base_audio_decoder_set_outcaps() like the RTP payloader, dunno.

Sounds like a plan. Could you fix both in 0.10 directly and then merge into 0.11?

Comment 22 Sebastian Dröge (slomo) 2011-11-24 10:31:38 UTC

Do we still want to do that in 0.10 now? It's changed in 0.11 already but not in 0.10 and all the encoders/decoders in 0.10 are using gst_pad_set_caps().

Anything else that should be done here before the release?

Comment 23 Tim-Philipp Müller 2011-11-25 12:26:46 UTC

gst_audio_decoder_set_outcaps() seems to also adjust the base_ts in case the sample rate changes, which might be useful (but then the chain function resets that on DISCONT anyway, which might be sufficient in practice..). Otherwise we may just as well leave it as it is now.

Comment 24 Tim-Philipp Müller 2011-12-04 16:06:46 UTC

One other thing that might be good to fix is this:

in GLib master the GStaticRecMutex API is deprecated, and we expose a GStaticRecMutex stream_lock in the public GstAudioDecoder/Encoder headers, including lock/unlock macros. These don't seem to be used anywhere though, so I wonder if we should just make those private ?

Comment 25 Sebastian Dröge (slomo) 2011-12-04 16:44:50 UTC

They are used in gst-omx and necessary when you want to have different streaming threads for the sinkpad and srcpad

Comment 26 Tim-Philipp Müller 2011-12-04 17:10:41 UTC

Ok, I guess then we have two options:

 a) not care, given that core API is littered with these as well

 b) make private and make lock/unlock use wrapper API

Comment 27 Sebastian Dröge (slomo) 2011-12-04 17:48:29 UTC

I'd vote for a), it's simpler and consistent with core's *_STREAM_LOCK() macros

Comment 28 Tim-Philipp Müller 2011-12-06 11:56:23 UTC

Ok, let's close this, I think we're all happy with how things are. The set_src_caps() function can still be added later if required.