GNOME Bugzilla – Bug 757153
opus: Handle start/end trimming
Last modified: 2015-11-03 18:39:52 UTC
See bug #757049, bug #742643, bug #727305 and bug #729950. We need to handle for Opus the start/end trimming and seek preroll. This is also related to bug #746109 and especially bug #740196. The Opus encoder tells us with OPUS_GET_LOOKAHEAD_REQUEST how many samples have to be thrown away from the beginning of the stream. At the end of the stream the last Opus frame will be padded with some more samples if the stream length is not an integer multiple of the frame size. This could also be used for lossless cutting of Opus streams, where you might want to cut in the middle of the frame. For Ogg the start trimming is in the OpusHead, the end trimming is given on the last packet. For Matroska/WebM it's in the container too, done exactly the same just for a different container. In MP4 the start trimming is the same, the end trimming is given directly in the headers from the duration of the stream. In MPEG-TS it's given per packet how much of this specific packet has to be trimmed (note that this means that you never get a value bigger than the framesize then, you'll get values for each frame!). We need to a) pass this information from demuxer to decoder, and b) from encoder to muxer. In a way that we can handle all the variants (complete start trimming vs. start trimming per frame, complete end trimming by duration vs. end trimming per frame). Then c) we also need to handle it in the decoder properly to throw away the data and make sure to not break A/V sync because of that. The thrown away data is not part of the segment! Any ideas how to do this, other than having 1) the start trimming in the OpusHead to get it from the beginning and 2) additionally having a generic GstMeta (this affects all audio codecs in the end) to have per frame start/end trimming? For GstAudioDecoder we should probably have a generic setting on the base class and then let the base class handle that. That might be what the "lookahead" currently is for, but it seems to not do what it should (see bug #754226).
Couldn't this be done by having the relevant samples just fall out of the segment and the audiodecoder class will then just clip it before pushing ?
No, because having it in the segment requires you to a) know the complete amount of padding at the beginning (you don't in MPEGTS) and b) to know the segment stop position accurately (you don't for MPEGTS, raw MP3, etc). I think a combined approach with using the segment if possible, and otherwise having some kind of meta (which has a flag to say if clipping this specific padding was removed via the segment already) would be the best solution here.
Implementing something here now
Created attachment 314539 [details] [review] opusenc: Put lookahead/pre-skip into the OpusHead header
Created attachment 314540 [details] [review] opusenc: Encode exactly the amount of samples we got as input and put correct timestamps on it The first frame has lookahead less samples, the last frame might have some padding or we might have to encode another frame of silence to get all our input into the encoded data. This is because of a) the lookahead at the beginning of the encoding, which shifts all data by that amount of samples and b) the padding needed to fill the very last frame completely. Ideally we would use LPC to calculate something better than silence for the padding, and would also add some more samples in the very beginning with LPC to make the encoding as smooth as possible. With this we get exactly the same amount of samples again in an opusenc ! opusdec pipeline.
Created attachment 314541 [details] [review] opusenc: Place 48kHz first in the caps For all the other sample rates the encoder will have to resample internally.
Created attachment 314542 [details] [review] opusenc: Add some FIXME comments about calculating padding with LPC
Created attachment 314543 [details] [review] audio: Round up for removing samples at the end of the buffer in gst_audio_buffer_clip() The stop position is not part of the buffer while the start position is. This ensures that we don't have too many samples in the stream in the end.
First step, not this has to be fixed up again for Ogg... and then the beforementioned meta can be implemented and the same can be used in other containers.
Actually Ogg does seem to do it correct, but seems more like an accident ;) Will have to check that in detail... also off-by-one with end-clipping and it can't work with unseekable streams (duration not known then). The meta would fix that all.
Created attachment 314609 [details] [review] oggdemux: Allow start clipping for Opus The granulepos does not have the pre-skip subtracted while timestamps do, and the last granulepos will be shorter by the number of samples that should be dropped because of padding in the end. As such, extrapolating the granule of the beginning of the first frame will lead to a negative value, which is not a problem but intentional.
Comment on attachment 314543 [details] [review] audio: Round up for removing samples at the end of the buffer in gst_audio_buffer_clip() We should try to not clip too much in any case, if sample accurate clipping is needed the upcoming meta should be used.
Created attachment 314610 [details] [review] opusenc: Disable granule position calculations by the base class It is doing the wrong thing because of the Opus pre-skip: while the timestamps are shifted by the pre-skip, the granule positions are not shifted. oggmux is doing the right thing here already.
Created attachment 314642 [details] [review] audio: Add GstAudioClippingMeta for specifying clipping on encoded audio buffers
Created attachment 314659 [details] [review] audio: Add GstAudioClippingMeta for specifying clipping on encoded audio buffers
Created attachment 314660 [details] [review] opusenc: Add GstAudioClippingMeta to buffers that need to be clipped
Created attachment 314661 [details] [review] opusdec: Handle GstAudioClippingMeta in addition to the pre-skip field of the OggHead The OggHead is ignored if the meta is used.
Created attachment 314667 [details] [review] tsdemux/mux: Add support for GstAudioClippingMeta for Opus
Created attachment 314703 [details] [review] oggdemux: Add GstAudioClippingMeta for Opus for accurate start/end clipping
Created attachment 314704 [details] [review] opusdec: Handle GstAudioClippingMeta instead of the pre-skip field in the OpusHead oggdemux is outputting the meta now, and only outputs if it should really apply to the current buffer. Previously we would skip N samples also if we started the decoder in the middle of the stream.
Created attachment 314705 [details] [review] opusdec: Handle GstAudioClippingMeta instead of the pre-skip field in the OpusHead oggdemux is outputting the meta now, and only outputs if it should really apply to the current buffer. Previously we would skip N samples also if we started the decoder in the middle of the stream.
Created attachment 314708 [details] [review] oggmux: Use GstAudioClippingMeta for Opus for accurate end clipping ... instead of relying on the segment. For the clipping at the start we assume a proper value in the OpusHead, as generated by opusparse or opusenc. Transmuxing in general is not guaranteed to produce the correct values, or even have a OpusHead (e.g. when having RTP input).
Created attachment 314709 [details] [review] opusparse: Fix up pre-skip in OpusHead if upstream using GstAudioClippingMeta Makes transmuxing from e.g. MPEG-TS to Ogg sample accurate.
All merged now