GNOME Bugzilla – Bug 776983
opusenc does not discard silence if DTX enabled
Last modified: 2018-11-03 11:53:23 UTC
if dtx is enabled and silence is detected opusenc should only encode 1 frame each 400ms, as stated in RFC 6716, 2.1.9. in function gst_opus_enc_encode() 1 is returned from opus_multistream_encode(), signaling the frame should be discarded cause of detected silence. Each 400ms a value != 1 is returned from opus_multistream_encode(). I use the attached patch to discard the frame. can you tell me why this might be a bad idea? I use this pipeline for testing gst-launch-1.0 pulsesrc ! "audio/x-raw,channels=1,rate=48000" ! opusenc audio-type=2048 bandwidth=1103 bitrate=20000 bitrate-type=1 inband-fec=TRUE packet-loss-percentage=1 dtx=TRUE ! fakesink I came across this while measuring network bandwidth using opusenc with dtx and rtpopuspay. The packet rate was constant, one packet each 20ms. There where packets containing only one single RTP payload byte. With the attached patch the packet rate is irregular with a minimum of one packet each 400ms, always containing multiple bytes.
Created attachment 343091 [details] [review] discard frames marked as dtx
(In reply to Nubosch from comment #0) > if dtx is enabled and silence is detected opusenc should only encode 1 frame > each 400ms, as stated in RFC 6716, 2.1.9. > > in function gst_opus_enc_encode() 1 is returned from > opus_multistream_encode(), signaling the frame should be discarded cause of > detected silence. > Each 400ms a value != 1 is returned from opus_multistream_encode(). > > I use the attached patch to discard the frame. > can you tell me why this might be a bad idea? gathering enough information is not as easy as I thought. there was some discussion touching this topic here https://patches.libav.org/patch/27718/ in fact the non multi stream api tells: "If the return value is 2 bytes or less, then the packet does not need to be transmitted (DTX)" I was not able to find a reasonable information about the multi stream api. so I played around with the amount of channels and encoding parameters. It looks like, if opus is able to actually perform dtx it adds a TOC and a 0 length octet, skipping the length octet for the last stream. - is checking for dtx property enough? - does it make sense to check for outsize > (2*(enc->n_channels - enc->n_stereo_streams)) imo >= should not be needed, as adding another length octet would require a payload to exists. > I use this pipeline for testing > gst-launch-1.0 pulsesrc ! "audio/x-raw,channels=1,rate=48000" ! opusenc > audio-type=2048 bandwidth=1103 bitrate=20000 bitrate-type=1 inband-fec=TRUE > packet-loss-percentage=1 dtx=TRUE ! fakesink > > I came across this while measuring network bandwidth using opusenc with dtx > and rtpopuspay. > The packet rate was constant, one packet each 20ms. > There where packets containing only one single RTP payload byte. > With the attached patch the packet rate is irregular with a minimum of one > packet each 400ms, always containing multiple bytes. RFC 7587 does not include multiple streams as also rtpopuspay does not include it
Created attachment 343112 [details] [review] v02 discard frames marked as dtx
Created attachment 343113 [details] [review] v03 discard frames marked as dtx remove debug output, remove experiment setting outsize 0, use -1 sorry for the noise :/
Review of attachment 343113 [details] [review]: You should push a GAP event instead of a buffer. It seems that the base class has nothing to help, so something like ret = gst_pad_push (encoder->srcpad, timestamp, duration); should be enough if the finish frame didnt return an error.
many thanks for your review. the buffer wrapped around the data in the adapter does not carry the time information. Should this be generated by gst_audio_encoder_push_buffers? Maybe it is a good idea to go the way speexenc did? Just marking the buffer with GST_BUFFER_FLAG_GAP, and dropping it in rtpopuspay? Another option would be to add the event to gst_audio_encoder_finish_frame where all required information seems to be available, but I'm unsure about this. Sorry for bothering you with another question. Do you think the dtx-drop feature should be optional? Without a property this may break existing implementations. The comfort noise generation in opus is not triggered, also the jitterbuffer goes crazy, maybe dropping valid audio data. This is the same situation for speex, but it may be acceptable depending on the use-case. having a dtx aware jitterbuffer would be cool, but this is another story, and much more complicated than this simple changes...
Marking it as FLAG_GAP, and then dropping it either in the payloader or pre_push() both seem fine. The question is how muxing to Ogg (or other containers) is affected by dropping DTX frames. Comfort noise generation should be triggered when the jitterbuffer emits an event because an expected buffer is missing. I'm not sure what a drop-dtx property would do more than just turning off DTX in practice, for Opus, the codec compresses enough that I would expect that the overhead of sending "empty" dtx packets would make you lose half of the benefit anyway. If the jitterbuffer goes crazy in presence of DTX, this is definitely something that should be fixed in the JB. DTX is a real world feature we should support.
(In reply to Olivier Crête from comment #7) > Marking it as FLAG_GAP, and then dropping it either in the payloader or > pre_push() both seem fine. The question is how muxing to Ogg (or other > containers) is affected by dropping DTX frames. I decided to drop in the payloader, so everything on a local machine is not affected by this change. I expect the issues in a muxer will be the same as for e.g. pulsesink "discontinuity in audio" > Comfort noise generation should be triggered when the jitterbuffer emits an > event because an expected buffer is missing. see below > I'm not sure what a drop-dtx property would do more than just turning off > DTX in practice, for Opus, the codec compresses enough that I would expect > that the overhead of sending "empty" dtx packets would make you lose half of > the benefit anyway. 100% ACK. > If the jitterbuffer goes crazy in presence of DTX, this is definitely > something that should be fixed in the JB. DTX is a real world feature we > should support. I'm in doubt I won't have the time to look into this in the near future. At least I opened a new bug report documenting what I know so far. https://bugzilla.gnome.org/show_bug.cgi?id=777319 I really hope I can come back to this after solving my 1001 network issues not related to gstreamer. many thanks for your help!
Created attachment 343547 [details] [review] opusenc: discard frames marked as dtx v04 instead of dropping the frame mark it as gap, so it can be dropped in the payloader
Created attachment 343548 [details] [review] gstrtpopuspay: discard frames marked as dtx
Created attachment 343635 [details] [review] gstrtpopuspay: discard frames marked as dtx do I really have to do all the silly mistakes possible? v01 fix memory leak
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/gstreamer/gst-plugins-base/issues/324.