After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 746617 - opusenc: headers are never sent
opusenc: headers are never sent
Status: RESOLVED OBSOLETE
Product: GStreamer
Classification: Platform
Component: gst-plugins-base
git master
Other All
: Normal major
: git master
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2015-03-22 19:05 UTC by Ilya Konstantinov
Modified: 2018-11-03 11:36 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Patch - Actually send headers (5.67 KB, patch)
2015-03-22 19:08 UTC, Ilya Konstantinov
none Details | Review
Patch - Actually send headers (5.60 KB, patch)
2015-03-22 20:19 UTC, Ilya Konstantinov
rejected Details | Review
rtpopuspay: Set the number of channels to 2 as per RFC draft (859 bytes, patch)
2015-03-24 18:09 UTC, Olivier Crête
committed Details | Review
rtpopuspay: Forward stereo preferences from caps upstream (3.78 KB, patch)
2015-03-24 18:09 UTC, Olivier Crête
committed Details | Review
rtpopusdepay: sprop-* are optional, no need for a WARNING (1.33 KB, patch)
2015-03-24 18:09 UTC, Olivier Crête
reviewed Details | Review
opusenc: Forward channel count preferences from downstream to source (3.83 KB, patch)
2015-03-24 18:09 UTC, Olivier Crête
needs-work Details | Review

Description Ilya Konstantinov 2015-03-22 19:05:08 UTC
opusenc is creating headers but they never get sent. Therefore opusdec resorts to fallback values, e.g. always 2 channels.

(e.g. in OpenWebRTC this results in wasteful audioconverts)
Comment 1 Ilya Konstantinov 2015-03-22 19:08:56 UTC
Created attachment 300088 [details] [review]
Patch - Actually send headers
Comment 2 Sebastian Dröge (slomo) 2015-03-22 19:44:27 UTC
Review of attachment 300088 [details] [review]:

Good catch :)

::: ext/opus/gstopusenc.c
@@ +984,3 @@
         gst_tag_setter_get_tag_list (GST_TAG_SETTER (enc)));
 
+    gst_audio_encoder_set_headers (benc, enc->headers);

This function takes ownership of the list of headers. The pointer is not valid anymore afterwards, so you would either need to deep-copy the list or set enc->headers to NULL
Comment 3 Sebastian Dröge (slomo) 2015-03-22 19:51:30 UTC
On a second thought... are the headers supposed to be in-stream? In all cases or only RTP (does this break Ogg muxing?)? The Opus RTP RFC seems to want headers in-stream, but that also seems stupid. You can lose the first packet with the headers and then would only be able to decode stereo again.

It seems like the information from the SDP would be enough for the depayloader or the decoder to actually re-generate the headers or at least do something more sensible. Especially the SDP will contain the channel count. So I think the decoder or depayloader also needs some changes to handle the case without headers more gracefully.
Comment 4 Ilya Konstantinov 2015-03-22 19:59:58 UTC
I agree. I also thought it strange that opusdec gives a single chance for the header to be present in the first packet, given that transport can be unreliable.

In OpenWebRTC, GstRTPOpusDepay gets 'application/x-rtp; ... channels:1' but this information isn't forwarded to GstOpusDec in any way.

Perhaps GstOpusDec should allow the fallback values to be configurable via properties?

(In reply to Sebastian Dröge (slomo) from comment #2)
> This function takes ownership of the list of headers. The pointer is not
> valid anymore afterwards, so you would either need to deep-copy the list or
> set enc->headers to NULL

I'll just set it to NULL and eliminate all of the unrefs.
Comment 5 Ilya Konstantinov 2015-03-22 20:02:43 UTC
Also, what's with this "gst_audio_encoder_set_headers" thing? I'm merely "setting" the headers and then gst_audio_encoder decides when to send them?

If so, does the 'enc->header_sent' member make sense? Shouldn't it be 'enc->header_set'?
Comment 6 Sebastian Dröge (slomo) 2015-03-22 20:06:01 UTC
(In reply to Ilya Konstantinov from comment #4)
> I agree. I also thought it strange that opusdec gives a single chance for
> the header to be present in the first packet, given that transport can be
> unreliable.

(note that other codecs like h264 or even Vorbis put the headers into the SDP for that reason...)

> In OpenWebRTC, GstRTPOpusDepay gets 'application/x-rtp; ... channels:1' but
> this information isn't forwarded to GstOpusDec in any way.
> 
> Perhaps GstOpusDec should allow the fallback values to be configurable via
> properties?

The SDP fields should become caps fields in application/x-rtp, which then are transformed by the depayloader in caps fields in audio/x-opus. Then the decoder can do something useful with them.

It seems like the Opus payloader and depayloader completely ignore all the SDP fields mentioned in the RFC.

> (In reply to Sebastian Dröge (slomo) from comment #2)
> > This function takes ownership of the list of headers. The pointer is not
> > valid anymore afterwards, so you would either need to deep-copy the list or
> > set enc->headers to NULL
> 
> I'll just set it to NULL and eliminate all of the unrefs.

I think the headers are also used to put them into the caps. They might still be needed.
Comment 7 Sebastian Dröge (slomo) 2015-03-22 20:07:33 UTC
(In reply to Ilya Konstantinov from comment #5)
> Also, what's with this "gst_audio_encoder_set_headers" thing? I'm merely
> "setting" the headers and then gst_audio_encoder decides when to send them?
> 
> If so, does the 'enc->header_sent' member make sense? Shouldn't it be
> 'enc->header_set'?

The base class sends them right away before the next real buffer, and also whenever requested by downstream.

Btw, see also the config-interval property on the payloaders (e.g. h264 and vorbis one).
Comment 8 Ilya Konstantinov 2015-03-22 20:11:08 UTC
(In reply to Sebastian Dröge (slomo) from comment #6)
> > I'll just set it to NULL and eliminate all of the unrefs.
> 
> I think the headers are also used to put them into the caps. They might
> still be needed.

The caps are also produced within gst_opus_enc_handle_frame, and no other code references 'enc->headers'. I can just make 'headers' a local variable within gst_opus_enc_handle_frame, pass it to gst_audio_encoder and forget about it.
Comment 9 Ilya Konstantinov 2015-03-22 20:19:33 UTC
Created attachment 300092 [details] [review]
Patch - Actually send headers

Changes:

* Turn headers into a local variable -- no longer unrefing it ourselves
Comment 10 Ilya Konstantinov 2015-03-22 20:22:54 UTC
(In reply to Sebastian Dröge (slomo) from comment #6)
> The SDP fields should become caps fields in application/x-rtp, which then
> are transformed by the depayloader in caps fields in audio/x-opus. Then the
> decoder can do something useful with them.
> 
> It seems like the Opus payloader and depayloader completely ignore all the
> SDP fields mentioned in the RFC.

Thanks, your direction is extremely valuable.

Let's land this fix and meanwhile I'll look at the payloader/depayloader.
Comment 11 Sebastian Dröge (slomo) 2015-03-22 20:26:14 UTC
I'll check tomorrow if sending the headers in-stream is correct for Opus in all cases, or only for RTP. If it's only for RTP, we should just let the payloader grab them from the caps and put them into the stream, and the depayloader parse them from the stream and put them into the caps.
Comment 12 Ilya Konstantinov 2015-03-22 20:32:48 UTC
In any case, this fix seems desirable. Perhaps gst_audio_encoder_set_headers should be conditional on a property, i.e.

 if (enc->inband_headers) {
   gst_audio_encoder_set_headers (benc, headers);
 } else {
   g_list_foreach (headers, (GFunc) gst_buffer_unref, NULL);
   g_list_free (headers);
 }
Comment 13 Sebastian Dröge (slomo) 2015-03-22 21:05:08 UTC
This should work automatically without setting any properties from the application :)
Comment 14 Ilya Konstantinov 2015-03-22 21:07:14 UTC
But if:
 - sending headers in-stream is forbidden only in RTP case, and
 - opusenc doesn't (want to) have intimate knowledge of who it's linked to
how would it know?
Comment 15 Sebastian Dröge (slomo) 2015-03-22 21:10:09 UTC
Caps. We could make a new "stream-format" for this case... or simpler: the RTP payloader could just extract the header from the caps. See last part of comment 11.
Comment 16 Ilya Konstantinov 2015-03-22 21:26:29 UTC
Ohh, you think it might be *only* for RTP! (I misread it the other way around.)

So:
- if it's not only for RTP, then gst_audio_encoder_set_headers is right
- if it's only for RTP, then modify

   1) gst_opus_header_create_caps
   2) gst_opus_header_create_caps_from_headers

  to allow 'headers' argument to be NULL (or just eliminate this GList completely).

Is opusenc intended for streams only, while e.g. file muxing is done by some other element? That's why you refer to the headers as "in-stream"?

The comment in gstopusheader.c says:
  /* Opus streams in Ogg begin with two headers; the initial header (with
     most of the codec setup parameters) which is mandated by the Ogg
     bitstream spec.  The second header holds any comment fields. */

Isn't it more likely that in RTP you don't have to send headers in-stream, since there's SDP? https://tools.ietf.org/html/draft-ietf-payload-rtp-opus-07 certainly seems to indicate that SDP is the place for information like whether the Opus stream is stereo.

I'll wait for your feedback tomorrow, as I get the feeling you know this subject much more intimately than I do.
Comment 17 Sebastian Dröge (slomo) 2015-03-22 21:37:37 UTC
opusenc already puts the headers into the caps, there's nothing to change there in that case :)

opusenc is intended for general Opus encoding, anything "container" specific should be done there. E.g. in WebM/Matroska and Ogg the headers are taken from the caps and placed in special places in the Matroska/Ogg headers.
Comment 18 Ilya Konstantinov 2015-03-22 21:45:06 UTC
(In reply to Sebastian Dröge (slomo) from comment #17)
> opusenc already puts the headers into the caps, there's nothing to change
> there in that case :)

I think we should then eliminate the 'enc->headers' member since it's misleading.

> opusenc is intended for general Opus encoding, anything "container" specific
> should be done there. E.g. in WebM/Matroska and Ogg the headers are taken
> from the caps and placed in special places in the Matroska/Ogg headers.

Right now GstRtpOpusPay gets its caps from props (rate, channels). Ideally:
 1) those props should be eliminated, and
 2) it should learn to parse the "streamheader" structure (from sink caps), and configure its src caps purely from the Opus header. This will require duplicating some parsing code from opusdec.
Comment 19 Ilya Konstantinov 2015-03-22 22:11:06 UTC
(In reply to Ilya Konstantinov from comment #18)
> Right now GstRtpOpusPay gets its caps from props (rate, channels).

Oops, I was wrong. GstRtpOpusPay has no such properties. Instead, it has fixed 48000 rate. In OpenWebRTC, the 'channels' value actually origins from the following send-rtp-capsfilter-2 -- the 'channels' aren't actually negotiated, it seems.

Ideally, GstRtpOpusPay would provide its own 'channels' (based on parsing the "streamheader") so that actual negotiation can take place.
Comment 20 Sebastian Dröge (slomo) 2015-03-23 09:23:59 UTC
The headers (and streamheader!) that we have there are for Ogg (and WebM apparently) only. They must not be in-stream, and for RTP we have to get the relevant info from the SDP and then put it into the caps. I'm going to fix that now :)
Comment 21 Sebastian Dröge (slomo) 2015-03-23 11:28:04 UTC
commit 856bb027f993413494ed76ef9f282ead99812075
Author: Sebastian Dröge <sebastian@centricular.com>
Date:   Mon Mar 23 12:24:55 2015 +0100

    opus: Handle sprop-stereo and sprop-maxcapturerate RTP caps fields
    
    https://bugzilla.gnome.org/show_bug.cgi?id=746617

commit a128502221352a82e23c5b143619917c76892669
Author: Sebastian Dröge <sebastian@centricular.com>
Date:   Mon Mar 23 12:09:25 2015 +0100

    opusdec: Take channels and sample rate from the caps if we have no stream header

commit c2f38cd0547c54018e819291c4877183113ddafc
Author: Sebastian Dröge <sebastian@centricular.com>
Date:   Mon Mar 23 12:07:52 2015 +0100

    opusdec: Reset the decoder if the caps change

commit bb5b0f2d127b14f8d80c6df041ea37e6fd1f0475
Author: Sebastian Dröge <sebastian@centricular.com>
Date:   Mon Mar 23 11:57:09 2015 +0100

    opusdec: Take output sample rate from the stream headers too
    
    This way we let opusdec do the resampling if needed and don't carry
    around buffers with a too high sample rate if not required.
    
    While Opus always uses 48kHz internally, this information from the
    header specifies which frequencies are safe to drop.

commit 716eaf765b1251bdddb8eb6c3a9ecc87967123a8
Author: Sebastian Dröge <sebastian@centricular.com>
Date:   Mon Mar 23 11:56:09 2015 +0100

    opusheader: Put number of channels and sample rate into the caps
    
    https://bugzilla.gnome.org/show_bug.cgi?id=746617
Comment 22 Olivier Crête 2015-03-24 02:30:13 UTC
I think setting the rate in the caps from the depayloader is incorrect, the rate is always 48khz for Opus. It's just a "hint", not a guarantee. Same for sprop-stereo, it just indicates that it can be safely downmixed, not that the data will never be stereo.

Also, in section 7, the draft says "the number of channels MUST be 2". 

Ie, please read the RFC before doing patches, and please just revert all of this. Opus was carefully designed to be used without headers with RTP, so the mistakes of Vorbis were not repeated.
Comment 23 Ilya Konstantinov 2015-03-24 02:48:18 UTC
(In reply to Olivier Crête from comment #22)
> Ie, please read the RFC before doing patches, and please just revert all of
> this.

Wouldn't it be still beneficial for opusenc to indicate 'channels' on its src pad, if only for the downstream rtpopuspay to ensure conformance to the RFC?
Comment 24 Olivier Crête 2015-03-24 05:15:39 UTC
You're right, it should indeed indicate it and negotiate it with up stream so that it is down mixed appropriately.
Comment 25 Sebastian Dröge (slomo) 2015-03-24 12:34:30 UTC
(In reply to Olivier Crête from comment #22)
> I think setting the rate in the caps from the depayloader is incorrect, the
> rate is always 48khz for Opus. It's just a "hint", not a guarantee. Same for
> sprop-stereo, it just indicates that it can be safely downmixed, not that
> the data will never be stereo.

What is the difference between "safely downmixed" (or "safely downsampled") and actually doing that?

What do you suggest should we do here?

> Also, in section 7, the draft says "the number of channels MUST be 2". 
> 
> Ie, please read the RFC before doing patches, and please just revert all of
> this.

I was reading the RFC and as you quote above it says you can safely downmix. That's what we do now to not produce useless data and cause unneeded processing overhead.

Why would anything put the sprop- things into the SDP if it wanted to send stereo or higher sample rates? And what would anybody be supposed to do with those if they mean nothing?

> Opus was carefully designed to be used without headers with RTP, so
> the mistakes of Vorbis were not repeated.

Right, and it also works without any headers and information from the SDP.

The only difference now is that we actually put this information into the RTP caps in the payloader so someone can do something useful with them (whatever that may be), what is missing in the payloader is negotiation (to tell upstream to produce e.g. mono if downstream only wants mono). And in the depayloader we put this information in the caps again so that the decoder can do something with it.


Also we now put the channels and original rate into the audio/x-opus caps now so that something like "opusenc ! opusdec" behaves like an identity for caps instead of resampling and upsamling. If the information from the SDP are supposed to be useless then the fix here would be to not have the depayloader translate them into caps fields for audio/x-opus.
Comment 26 Olivier Crête 2015-03-24 18:09:22 UTC
I'm sorry, I completely misread your patch :-( I'm not sure how it could ever produce useless data? You start with one or two channels, and you end up with 2 channels which may be the same.

Also attaching some improvement patches, so we can forward the receipts preferences to the source, ie implement the "stereo" property for the sender.
Comment 27 Olivier Crête 2015-03-24 18:09:33 UTC
Created attachment 300215 [details] [review]
rtpopuspay: Set the number of channels to 2 as per RFC draft
Comment 28 Olivier Crête 2015-03-24 18:09:37 UTC
Created attachment 300216 [details] [review]
rtpopuspay: Forward stereo preferences from caps upstream
Comment 29 Olivier Crête 2015-03-24 18:09:41 UTC
Created attachment 300217 [details] [review]
rtpopusdepay: sprop-* are optional, no need for a WARNING
Comment 30 Olivier Crête 2015-03-24 18:09:45 UTC
Created attachment 300218 [details] [review]
opusenc: Forward channel count preferences from downstream to source
Comment 31 Sebastian Dröge (slomo) 2015-03-24 18:21:36 UTC
Comment on attachment 300216 [details] [review]
rtpopuspay: Forward stereo preferences from caps upstream

Might also want to do something here with maxplaybackrate (that's a sample rate btw, even if it sounds different :) ).
Comment 32 Sebastian Dröge (slomo) 2015-03-24 18:22:56 UTC
Comment on attachment 300217 [details] [review]
rtpopusdepay: sprop-* are optional, no need for a WARNING

The warning should only be printed if the fields are there but contain something we don't understand. It seems like that's a good idea as we might then want to add support for whatever a new spec adds :)

If the field is just not there it shouldn't print a warning
Comment 33 Sebastian Dröge (slomo) 2015-03-24 18:26:02 UTC
Comment on attachment 300218 [details] [review]
opusenc: Forward channel count preferences from downstream to source

gst_audio_encoder_proxy_getcaps() should already do that, with less code.


For the multistream stuff, it doesn't look like it's properly implemented currently. In the encoder it needs some better channel-layout handling (currently it just fails to handle any non-Vorbis channel layout but doesn't put that on the caps), in the decoder we don't have any multistream handling at all AFAIU. I would prefer to keep channels=[1,2] until that is fixed.

Also I think we should put the channel mappings and other things (the stuff that is stored in the first Ogg header) into the audio/x-opus caps. They are needed by the decoder to properly decode multistream Opus (unless the caps contain the Ogg specific header).
Comment 34 Sebastian Dröge (slomo) 2015-03-24 18:27:29 UTC
(In reply to Olivier Crête from comment #26)
> I'm sorry, I completely misread your patch :-( I'm not sure how it could
> ever produce useless data?

I just meant that with how I understood your comments, the sprop-* SDP fields would be completely useless data and couldn't be used for anything (as compared to the stereo and maxplaybackrate fields without the sprop- prefix, which are useful for negotiation).
Comment 35 Sebastian Dröge (slomo) 2015-03-24 18:27:51 UTC
(In reply to Sebastian Dröge (slomo) from comment #34)
> (In reply to Olivier Crête from comment #26)
> > I'm sorry, I completely misread your patch :-( I'm not sure how it could
> > ever produce useless data?
> 
> I just meant that with how I understood your comments, the sprop-* SDP
> fields would be completely useless data and couldn't be used for anything
> (as compared to the stereo and maxplaybackrate fields without the sprop-
> prefix, which are useful for negotiation).

And by "understood" I mean "misunderstood" :)
Comment 36 Olivier Crête 2015-03-25 19:26:28 UTC
Comment on attachment 300215 [details] [review]
rtpopuspay: Set the number of channels to 2 as per RFC draft

Turns out we put the channels as a string, not an INT, fixed and committed.

commit c82a2d4aa07790456046184fa7d90fbd97624631
Author: Olivier Crête <olivier.crete@collabora.com>
Date:   Tue Mar 24 13:56:21 2015 -0400

    rtpopuspay: Set the number of channels to 2 as per RFC draft
    
    https://bugzilla.gnome.org/show_bug.cgi?id=746617
Comment 37 Olivier Crête 2015-03-25 19:27:25 UTC
Comment on attachment 300216 [details] [review]
rtpopuspay: Forward stereo preferences from caps upstream

Also pushed:

commit 45422791f7f244900f3f76e7e9d7497ef1b88075
Author: Olivier Crête <olivier.crete@collabora.com>
Date:   Tue Mar 24 13:57:54 2015 -0400

    rtpopuspay: Forward stereo preferences from caps upstream
    
    https://bugzilla.gnome.org/show_bug.cgi?id=746617
Comment 38 Olivier Crête 2015-03-25 19:31:38 UTC
If I understand correctly the "header" produced by opusenc is actually Ogg specific, so it should probably go into oggmux?

Also, I don't think we have a way in Gst 1.x to express channel ordering in caps anymore? Should we just always encode  in GStreame order and put the mask on the audio/x-opus caps ?
Comment 39 Ilya Konstantinov 2015-03-25 20:06:39 UTC
(In reply to Olivier Crête from comment #38)
> Also, I don't think we have a way in Gst 1.x to express channel ordering in
> caps anymore? Should we just always encode  in GStreamer order and put the
> mask on the audio/x-opus caps ?

opusenc cannot receive channels in any order but the GStreamer order, so why would it do anything else?

If to judge by other audio sources, opusdec should reorder if needed, to output GStreamer order.
Comment 40 Sebastian Dröge (slomo) 2015-03-25 21:09:11 UTC
(In reply to Olivier Crête from comment #38)
> If I understand correctly the "header" produced by opusenc is actually Ogg
> specific, so it should probably go into oggmux?

Yes, having it in opusdec and opusenc is hurting my eyes :) But I think this header is also used by matroskamux/demux, needs to be checked. Which seems weird... but oh well.

> Also, I don't think we have a way in Gst 1.x to express channel ordering in
> caps anymore? Should we just always encode  in GStreame order and put the
> mask on the audio/x-opus caps ?

I think for the audio/x-opus caps we will need to express everything that goes into the Ogg header somehow. Meaning that we also need to express arbitrary channel orders. We can just use a GST_TYPE_ARRAY of integers for that, just like it is stored in the Ogg header and also expected by the Opus API.
It's not nice but I think the best we can do.

opusdec of course has to reorder channels into the GStreamer channel order then, and also opusenc should potentially reorder channels if downstream requests a different "Opus channel map".

(In reply to Ilya Konstantinov from comment #39)
> (In reply to Olivier Crête from comment #38)
> > Also, I don't think we have a way in Gst 1.x to express channel ordering in
> > caps anymore? Should we just always encode  in GStreamer order and put the
> > mask on the audio/x-opus caps ?
> 
> opusenc cannot receive channels in any order but the GStreamer order, so why
> would it do anything else?
> 
> If to judge by other audio sources, opusdec should reorder if needed, to
> output GStreamer order.

All true, but see above. We need to be able to express other orders in the audio/x-opus caps, e.g. to handle multichannel streams coming from containers that define a different order (MPEGTS). opusenc/dec need to reorder then to make sure only the GStreamer order ever appears in audio/x-raw streams.
Comment 41 Tim-Philipp Müller 2016-03-03 17:44:17 UTC
Sebastian, is this still relevant after your opus multichannel work?
Comment 42 Edward Hervey 2018-01-15 10:56:28 UTC
still valid ?
Comment 43 GStreamer system administrator 2018-11-03 11:36:12 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/gstreamer/gst-plugins-base/issues/174.