GNOME Bugzilla – Bug 746617
opusenc: headers are never sent
Last modified: 2018-11-03 11:36:12 UTC
opusenc is creating headers but they never get sent. Therefore opusdec resorts to fallback values, e.g. always 2 channels. (e.g. in OpenWebRTC this results in wasteful audioconverts)
Created attachment 300088 [details] [review] Patch - Actually send headers
Review of attachment 300088 [details] [review]: Good catch :) ::: ext/opus/gstopusenc.c @@ +984,3 @@ gst_tag_setter_get_tag_list (GST_TAG_SETTER (enc))); + gst_audio_encoder_set_headers (benc, enc->headers); This function takes ownership of the list of headers. The pointer is not valid anymore afterwards, so you would either need to deep-copy the list or set enc->headers to NULL
On a second thought... are the headers supposed to be in-stream? In all cases or only RTP (does this break Ogg muxing?)? The Opus RTP RFC seems to want headers in-stream, but that also seems stupid. You can lose the first packet with the headers and then would only be able to decode stereo again. It seems like the information from the SDP would be enough for the depayloader or the decoder to actually re-generate the headers or at least do something more sensible. Especially the SDP will contain the channel count. So I think the decoder or depayloader also needs some changes to handle the case without headers more gracefully.
I agree. I also thought it strange that opusdec gives a single chance for the header to be present in the first packet, given that transport can be unreliable. In OpenWebRTC, GstRTPOpusDepay gets 'application/x-rtp; ... channels:1' but this information isn't forwarded to GstOpusDec in any way. Perhaps GstOpusDec should allow the fallback values to be configurable via properties? (In reply to Sebastian Dröge (slomo) from comment #2) > This function takes ownership of the list of headers. The pointer is not > valid anymore afterwards, so you would either need to deep-copy the list or > set enc->headers to NULL I'll just set it to NULL and eliminate all of the unrefs.
Also, what's with this "gst_audio_encoder_set_headers" thing? I'm merely "setting" the headers and then gst_audio_encoder decides when to send them? If so, does the 'enc->header_sent' member make sense? Shouldn't it be 'enc->header_set'?
(In reply to Ilya Konstantinov from comment #4) > I agree. I also thought it strange that opusdec gives a single chance for > the header to be present in the first packet, given that transport can be > unreliable. (note that other codecs like h264 or even Vorbis put the headers into the SDP for that reason...) > In OpenWebRTC, GstRTPOpusDepay gets 'application/x-rtp; ... channels:1' but > this information isn't forwarded to GstOpusDec in any way. > > Perhaps GstOpusDec should allow the fallback values to be configurable via > properties? The SDP fields should become caps fields in application/x-rtp, which then are transformed by the depayloader in caps fields in audio/x-opus. Then the decoder can do something useful with them. It seems like the Opus payloader and depayloader completely ignore all the SDP fields mentioned in the RFC. > (In reply to Sebastian Dröge (slomo) from comment #2) > > This function takes ownership of the list of headers. The pointer is not > > valid anymore afterwards, so you would either need to deep-copy the list or > > set enc->headers to NULL > > I'll just set it to NULL and eliminate all of the unrefs. I think the headers are also used to put them into the caps. They might still be needed.
(In reply to Ilya Konstantinov from comment #5) > Also, what's with this "gst_audio_encoder_set_headers" thing? I'm merely > "setting" the headers and then gst_audio_encoder decides when to send them? > > If so, does the 'enc->header_sent' member make sense? Shouldn't it be > 'enc->header_set'? The base class sends them right away before the next real buffer, and also whenever requested by downstream. Btw, see also the config-interval property on the payloaders (e.g. h264 and vorbis one).
(In reply to Sebastian Dröge (slomo) from comment #6) > > I'll just set it to NULL and eliminate all of the unrefs. > > I think the headers are also used to put them into the caps. They might > still be needed. The caps are also produced within gst_opus_enc_handle_frame, and no other code references 'enc->headers'. I can just make 'headers' a local variable within gst_opus_enc_handle_frame, pass it to gst_audio_encoder and forget about it.
Created attachment 300092 [details] [review] Patch - Actually send headers Changes: * Turn headers into a local variable -- no longer unrefing it ourselves
(In reply to Sebastian Dröge (slomo) from comment #6) > The SDP fields should become caps fields in application/x-rtp, which then > are transformed by the depayloader in caps fields in audio/x-opus. Then the > decoder can do something useful with them. > > It seems like the Opus payloader and depayloader completely ignore all the > SDP fields mentioned in the RFC. Thanks, your direction is extremely valuable. Let's land this fix and meanwhile I'll look at the payloader/depayloader.
I'll check tomorrow if sending the headers in-stream is correct for Opus in all cases, or only for RTP. If it's only for RTP, we should just let the payloader grab them from the caps and put them into the stream, and the depayloader parse them from the stream and put them into the caps.
In any case, this fix seems desirable. Perhaps gst_audio_encoder_set_headers should be conditional on a property, i.e. if (enc->inband_headers) { gst_audio_encoder_set_headers (benc, headers); } else { g_list_foreach (headers, (GFunc) gst_buffer_unref, NULL); g_list_free (headers); }
This should work automatically without setting any properties from the application :)
But if: - sending headers in-stream is forbidden only in RTP case, and - opusenc doesn't (want to) have intimate knowledge of who it's linked to how would it know?
Caps. We could make a new "stream-format" for this case... or simpler: the RTP payloader could just extract the header from the caps. See last part of comment 11.
Ohh, you think it might be *only* for RTP! (I misread it the other way around.) So: - if it's not only for RTP, then gst_audio_encoder_set_headers is right - if it's only for RTP, then modify 1) gst_opus_header_create_caps 2) gst_opus_header_create_caps_from_headers to allow 'headers' argument to be NULL (or just eliminate this GList completely). Is opusenc intended for streams only, while e.g. file muxing is done by some other element? That's why you refer to the headers as "in-stream"? The comment in gstopusheader.c says: /* Opus streams in Ogg begin with two headers; the initial header (with most of the codec setup parameters) which is mandated by the Ogg bitstream spec. The second header holds any comment fields. */ Isn't it more likely that in RTP you don't have to send headers in-stream, since there's SDP? https://tools.ietf.org/html/draft-ietf-payload-rtp-opus-07 certainly seems to indicate that SDP is the place for information like whether the Opus stream is stereo. I'll wait for your feedback tomorrow, as I get the feeling you know this subject much more intimately than I do.
opusenc already puts the headers into the caps, there's nothing to change there in that case :) opusenc is intended for general Opus encoding, anything "container" specific should be done there. E.g. in WebM/Matroska and Ogg the headers are taken from the caps and placed in special places in the Matroska/Ogg headers.
(In reply to Sebastian Dröge (slomo) from comment #17) > opusenc already puts the headers into the caps, there's nothing to change > there in that case :) I think we should then eliminate the 'enc->headers' member since it's misleading. > opusenc is intended for general Opus encoding, anything "container" specific > should be done there. E.g. in WebM/Matroska and Ogg the headers are taken > from the caps and placed in special places in the Matroska/Ogg headers. Right now GstRtpOpusPay gets its caps from props (rate, channels). Ideally: 1) those props should be eliminated, and 2) it should learn to parse the "streamheader" structure (from sink caps), and configure its src caps purely from the Opus header. This will require duplicating some parsing code from opusdec.
(In reply to Ilya Konstantinov from comment #18) > Right now GstRtpOpusPay gets its caps from props (rate, channels). Oops, I was wrong. GstRtpOpusPay has no such properties. Instead, it has fixed 48000 rate. In OpenWebRTC, the 'channels' value actually origins from the following send-rtp-capsfilter-2 -- the 'channels' aren't actually negotiated, it seems. Ideally, GstRtpOpusPay would provide its own 'channels' (based on parsing the "streamheader") so that actual negotiation can take place.
The headers (and streamheader!) that we have there are for Ogg (and WebM apparently) only. They must not be in-stream, and for RTP we have to get the relevant info from the SDP and then put it into the caps. I'm going to fix that now :)
commit 856bb027f993413494ed76ef9f282ead99812075 Author: Sebastian Dröge <sebastian@centricular.com> Date: Mon Mar 23 12:24:55 2015 +0100 opus: Handle sprop-stereo and sprop-maxcapturerate RTP caps fields https://bugzilla.gnome.org/show_bug.cgi?id=746617 commit a128502221352a82e23c5b143619917c76892669 Author: Sebastian Dröge <sebastian@centricular.com> Date: Mon Mar 23 12:09:25 2015 +0100 opusdec: Take channels and sample rate from the caps if we have no stream header commit c2f38cd0547c54018e819291c4877183113ddafc Author: Sebastian Dröge <sebastian@centricular.com> Date: Mon Mar 23 12:07:52 2015 +0100 opusdec: Reset the decoder if the caps change commit bb5b0f2d127b14f8d80c6df041ea37e6fd1f0475 Author: Sebastian Dröge <sebastian@centricular.com> Date: Mon Mar 23 11:57:09 2015 +0100 opusdec: Take output sample rate from the stream headers too This way we let opusdec do the resampling if needed and don't carry around buffers with a too high sample rate if not required. While Opus always uses 48kHz internally, this information from the header specifies which frequencies are safe to drop. commit 716eaf765b1251bdddb8eb6c3a9ecc87967123a8 Author: Sebastian Dröge <sebastian@centricular.com> Date: Mon Mar 23 11:56:09 2015 +0100 opusheader: Put number of channels and sample rate into the caps https://bugzilla.gnome.org/show_bug.cgi?id=746617
I think setting the rate in the caps from the depayloader is incorrect, the rate is always 48khz for Opus. It's just a "hint", not a guarantee. Same for sprop-stereo, it just indicates that it can be safely downmixed, not that the data will never be stereo. Also, in section 7, the draft says "the number of channels MUST be 2". Ie, please read the RFC before doing patches, and please just revert all of this. Opus was carefully designed to be used without headers with RTP, so the mistakes of Vorbis were not repeated.
(In reply to Olivier Crête from comment #22) > Ie, please read the RFC before doing patches, and please just revert all of > this. Wouldn't it be still beneficial for opusenc to indicate 'channels' on its src pad, if only for the downstream rtpopuspay to ensure conformance to the RFC?
You're right, it should indeed indicate it and negotiate it with up stream so that it is down mixed appropriately.
(In reply to Olivier Crête from comment #22) > I think setting the rate in the caps from the depayloader is incorrect, the > rate is always 48khz for Opus. It's just a "hint", not a guarantee. Same for > sprop-stereo, it just indicates that it can be safely downmixed, not that > the data will never be stereo. What is the difference between "safely downmixed" (or "safely downsampled") and actually doing that? What do you suggest should we do here? > Also, in section 7, the draft says "the number of channels MUST be 2". > > Ie, please read the RFC before doing patches, and please just revert all of > this. I was reading the RFC and as you quote above it says you can safely downmix. That's what we do now to not produce useless data and cause unneeded processing overhead. Why would anything put the sprop- things into the SDP if it wanted to send stereo or higher sample rates? And what would anybody be supposed to do with those if they mean nothing? > Opus was carefully designed to be used without headers with RTP, so > the mistakes of Vorbis were not repeated. Right, and it also works without any headers and information from the SDP. The only difference now is that we actually put this information into the RTP caps in the payloader so someone can do something useful with them (whatever that may be), what is missing in the payloader is negotiation (to tell upstream to produce e.g. mono if downstream only wants mono). And in the depayloader we put this information in the caps again so that the decoder can do something with it. Also we now put the channels and original rate into the audio/x-opus caps now so that something like "opusenc ! opusdec" behaves like an identity for caps instead of resampling and upsamling. If the information from the SDP are supposed to be useless then the fix here would be to not have the depayloader translate them into caps fields for audio/x-opus.
I'm sorry, I completely misread your patch :-( I'm not sure how it could ever produce useless data? You start with one or two channels, and you end up with 2 channels which may be the same. Also attaching some improvement patches, so we can forward the receipts preferences to the source, ie implement the "stereo" property for the sender.
Created attachment 300215 [details] [review] rtpopuspay: Set the number of channels to 2 as per RFC draft
Created attachment 300216 [details] [review] rtpopuspay: Forward stereo preferences from caps upstream
Created attachment 300217 [details] [review] rtpopusdepay: sprop-* are optional, no need for a WARNING
Created attachment 300218 [details] [review] opusenc: Forward channel count preferences from downstream to source
Comment on attachment 300216 [details] [review] rtpopuspay: Forward stereo preferences from caps upstream Might also want to do something here with maxplaybackrate (that's a sample rate btw, even if it sounds different :) ).
Comment on attachment 300217 [details] [review] rtpopusdepay: sprop-* are optional, no need for a WARNING The warning should only be printed if the fields are there but contain something we don't understand. It seems like that's a good idea as we might then want to add support for whatever a new spec adds :) If the field is just not there it shouldn't print a warning
Comment on attachment 300218 [details] [review] opusenc: Forward channel count preferences from downstream to source gst_audio_encoder_proxy_getcaps() should already do that, with less code. For the multistream stuff, it doesn't look like it's properly implemented currently. In the encoder it needs some better channel-layout handling (currently it just fails to handle any non-Vorbis channel layout but doesn't put that on the caps), in the decoder we don't have any multistream handling at all AFAIU. I would prefer to keep channels=[1,2] until that is fixed. Also I think we should put the channel mappings and other things (the stuff that is stored in the first Ogg header) into the audio/x-opus caps. They are needed by the decoder to properly decode multistream Opus (unless the caps contain the Ogg specific header).
(In reply to Olivier Crête from comment #26) > I'm sorry, I completely misread your patch :-( I'm not sure how it could > ever produce useless data? I just meant that with how I understood your comments, the sprop-* SDP fields would be completely useless data and couldn't be used for anything (as compared to the stereo and maxplaybackrate fields without the sprop- prefix, which are useful for negotiation).
(In reply to Sebastian Dröge (slomo) from comment #34) > (In reply to Olivier Crête from comment #26) > > I'm sorry, I completely misread your patch :-( I'm not sure how it could > > ever produce useless data? > > I just meant that with how I understood your comments, the sprop-* SDP > fields would be completely useless data and couldn't be used for anything > (as compared to the stereo and maxplaybackrate fields without the sprop- > prefix, which are useful for negotiation). And by "understood" I mean "misunderstood" :)
Comment on attachment 300215 [details] [review] rtpopuspay: Set the number of channels to 2 as per RFC draft Turns out we put the channels as a string, not an INT, fixed and committed. commit c82a2d4aa07790456046184fa7d90fbd97624631 Author: Olivier Crête <olivier.crete@collabora.com> Date: Tue Mar 24 13:56:21 2015 -0400 rtpopuspay: Set the number of channels to 2 as per RFC draft https://bugzilla.gnome.org/show_bug.cgi?id=746617
Comment on attachment 300216 [details] [review] rtpopuspay: Forward stereo preferences from caps upstream Also pushed: commit 45422791f7f244900f3f76e7e9d7497ef1b88075 Author: Olivier Crête <olivier.crete@collabora.com> Date: Tue Mar 24 13:57:54 2015 -0400 rtpopuspay: Forward stereo preferences from caps upstream https://bugzilla.gnome.org/show_bug.cgi?id=746617
If I understand correctly the "header" produced by opusenc is actually Ogg specific, so it should probably go into oggmux? Also, I don't think we have a way in Gst 1.x to express channel ordering in caps anymore? Should we just always encode in GStreame order and put the mask on the audio/x-opus caps ?
(In reply to Olivier Crête from comment #38) > Also, I don't think we have a way in Gst 1.x to express channel ordering in > caps anymore? Should we just always encode in GStreamer order and put the > mask on the audio/x-opus caps ? opusenc cannot receive channels in any order but the GStreamer order, so why would it do anything else? If to judge by other audio sources, opusdec should reorder if needed, to output GStreamer order.
(In reply to Olivier Crête from comment #38) > If I understand correctly the "header" produced by opusenc is actually Ogg > specific, so it should probably go into oggmux? Yes, having it in opusdec and opusenc is hurting my eyes :) But I think this header is also used by matroskamux/demux, needs to be checked. Which seems weird... but oh well. > Also, I don't think we have a way in Gst 1.x to express channel ordering in > caps anymore? Should we just always encode in GStreame order and put the > mask on the audio/x-opus caps ? I think for the audio/x-opus caps we will need to express everything that goes into the Ogg header somehow. Meaning that we also need to express arbitrary channel orders. We can just use a GST_TYPE_ARRAY of integers for that, just like it is stored in the Ogg header and also expected by the Opus API. It's not nice but I think the best we can do. opusdec of course has to reorder channels into the GStreamer channel order then, and also opusenc should potentially reorder channels if downstream requests a different "Opus channel map". (In reply to Ilya Konstantinov from comment #39) > (In reply to Olivier Crête from comment #38) > > Also, I don't think we have a way in Gst 1.x to express channel ordering in > > caps anymore? Should we just always encode in GStreamer order and put the > > mask on the audio/x-opus caps ? > > opusenc cannot receive channels in any order but the GStreamer order, so why > would it do anything else? > > If to judge by other audio sources, opusdec should reorder if needed, to > output GStreamer order. All true, but see above. We need to be able to express other orders in the audio/x-opus caps, e.g. to handle multichannel streams coming from containers that define a different order (MPEGTS). opusenc/dec need to reorder then to make sure only the GStreamer order ever appears in audio/x-raw streams.
Sebastian, is this still relevant after your opus multichannel work?
still valid ?
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/gstreamer/gst-plugins-base/issues/174.