GNOME Bugzilla – Bug 754226
speexenc: don't set lookahead
Last modified: 2018-11-03 15:03:26 UTC
Created attachment 310167: patch
Also introducing a test-suite for speex.
No // comments, we use /* */
Also, what does it do wrong? What did you expect different?
(In reply to Olivier Crête from comment #1)
> Also, what does it do wrong? What did you expect different?

When giving speexenc 20ms of 16000Hz audio (320 samples) and decoding the resulting buffers with speexdec, it is clear that there is 11ms of silence before the audio starts, which is the number set as "lookahead".

Now, running

    audiotestsrc samplesperbuffer=320 num-buffers=3 ! audio/x-raw,format=S16LE,rate=16000 ! speexenc ! fakesink -v

you get:

    80 bytes, dts: none, pts: none, duration: 0:00:00.000000000
    70 bytes, dts: none, pts: none, duration: 0:00:00.000000000
    70 bytes, dts: none, pts: none, duration: 0:00:00.011062501
    70 bytes, dts: 0:00:00.011062500, pts: 0:00:00.011062500
    70 bytes, dts: 0:00:00.031062500, pts: 0:00:00.031062500

I have several problems with this. First off, the pts is none on the first audio buffer (and the 2 headers) because it calculates a negative timestamp. Secondly, the duration of the first buffer is not 11ms; it is 20ms when decoded. The fact that there are some zeros at the start does not change that. You could argue that there is only 11ms of *usable* audio in that buffer, but that is not what duration is about.

Discussing this with slomo in #gstreamer, he seemed to think using segments would be the proper way of doing this, but in the meantime it feels a lot better to have speexenc output buffers starting from pts:0 with a duration of 20ms per buffer.
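To make the numbers in the log above easier to follow, here is a minimal Python sketch of the arithmetic. It is a model reverse-engineered from the log, not code taken from speexenc or the audio encoder base class. The lookahead of 177 samples is an assumption inferred from 0.0110625 s at 16000 Hz, and the function names are hypothetical.

```python
NS_PER_SECOND = 1_000_000_000


def samples_to_ns(samples, rate):
    """Convert a sample count to nanoseconds (integer division)."""
    return samples * NS_PER_SECOND // rate


def model_timestamps(num_buffers, frame_samples, lookahead_samples, rate):
    """Assumed model of the logged behaviour: buffer i nominally starts at
    (i - 1) * frame + lookahead samples. A negative start yields no PTS and
    a duration covering only the part that lies after time 0."""
    result = []
    for i in range(num_buffers):
        start = (i - 1) * frame_samples + lookahead_samples
        end = start + frame_samples
        if start < 0:
            result.append((None, samples_to_ns(end, rate)))
        else:
            result.append((samples_to_ns(start, rate),
                           samples_to_ns(frame_samples, rate)))
    return result


def proposed_timestamps(num_buffers, frame_samples, rate):
    """Behaviour asked for in this comment: PTS starts at 0 and every
    buffer carries the full frame duration."""
    frame_ns = samples_to_ns(frame_samples, rate)
    return [(i * frame_ns, frame_ns) for i in range(num_buffers)]
```

With 3 buffers of 320 samples at 16000 Hz, the model gives no PTS and a duration of 11062500 ns for the first buffer, then PTS values of 11062500 ns and 31062500 ns, which lines up with the log apart from the 1 ns rounding difference in the logged duration. The proposed variant gives PTS values of 0, 20 and 40 ms with 20 ms durations.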
Yes, these initial 9ms of samples per buffer should be clipped away by having an appropriate segment configured so that a) the first 9ms are before segment.start and b) PTS==9ms is mapped to stream and running time 0. See also bug #620323
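As a rough illustration of the segment idea in the comment above, the following Python sketch clips a buffer against a segment start and maps PTS to running time. It is a simplification that assumes a playback rate of 1.0 and no segment offset, with the 9 ms value taken from the comment. The helper names are hypothetical and are not GStreamer API.

```python
MS = 1_000_000  # nanoseconds per millisecond


def clip_to_segment(pts_ns, duration_ns, segment_start_ns):
    """Clip a buffer so it does not start before segment_start_ns.
    Returns (pts, duration), or None if the buffer lies entirely before
    the segment start."""
    end_ns = pts_ns + duration_ns
    if end_ns <= segment_start_ns:
        return None
    clipped_pts = max(pts_ns, segment_start_ns)
    return clipped_pts, end_ns - clipped_pts


def running_time(pts_ns, segment_start_ns, base_ns=0):
    """Simplified running time for rate 1.0: (pts - start) + base.
    Returns None for positions before the segment start."""
    if pts_ns < segment_start_ns:
        return None
    return pts_ns - segment_start_ns + base_ns
```

With a segment start of 9 ms, a first buffer at PTS 0 with a 20 ms duration is clipped to PTS 9 ms with 11 ms left, and PTS 9 ms maps to running time 0. The second buffer at PTS 20 ms then maps to running time 11 ms.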
Håvard, what should we do about this? This patch does not seem complete to me because of what I said in comment 3
(In reply to Sebastian Dröge (slomo) from comment #4)
> Håvard, what should we do about this? This patch does not seem complete to
> me because of what I said in comment 3

IMO this patch makes it go from "just all over wrong" to "what most other encoders are doing", but not perfect. Since a proper solution would need more thought (involving a new way of doing this in the base class), I propose this as an intermediate step, as it is an improvement.
This relates to bug #757153 for Opus, right?

So this also means that there are a few things to do here:
- the first 11ms of silence need to be clipped away in one way or another
- we need to know if it's always 11ms or if it depends on the encoder and settings
- at the end of stream we also likely need to drop some samples, we need to know how many and clip them
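The clipping described in the list above can be sketched in Python on a list of decoded samples. This is only an illustration: how many samples to skip at the start and drop at the end is exactly the open question here, so `skip_start` and `input_length` are hypothetical parameters that the caller must supply.

```python
def trim_decoded(decoded, skip_start, input_length):
    """Drop skip_start leading samples (the codec delay) and keep at most
    input_length samples, so trailing padding is removed as well."""
    return decoded[skip_start:skip_start + input_length]


def trailing_samples(decoded_length, skip_start, input_length):
    """Number of samples that would be dropped at the end of the stream."""
    return max(0, decoded_length - skip_start - input_length)
```

For example, if 4 input samples come back as 8 decoded samples with 2 samples of leading delay, `trim_decoded(decoded, 2, 4)` keeps the middle 4 and `trailing_samples(8, 2, 4)` reports 2 samples dropped at the end.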
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/gstreamer/gst-plugins-good/issues/214.