GNOME Bugzilla – Bug 765115
alsasink: single segment audio glitch after 200ms at startup
Last modified: 2018-11-03 11:46:19 UTC
with some audio sinks (alsasink), we can get a single segment glitch at startup after processing 200ms of data, because the GStreamer ringbuffer is sized exactly the same as the target hardware ringbuffer. alsasink starts pushing samples to ALSA after the ringbuffer fills up / prerolls with 200ms (default buffer-size) of samples. At that point, it writes samples as quickly as possible to ALSA until the write() call to ALSA blocks. What that means is we read and write 20 10ms segments *very rapidly* to the audio device, and then on the 21st write it will block for ~10ms until the hardware plays 1 segment. What gets written in the 21st segment *may* be correct audio, if upstream was given a timeslice to write those samples, but commonly (especially on single core machines), alsasink gets to the 21st segment and it's not ready yet, so it generates a silence fragment and *then* blocks writing that out. If upstream is then woken up, it will find that it is too late and drop that 1 segment. The simple fix is to allocate 1 more segment in the GStreamer ringbuffer for upstream to write into, which uses slightly more memory but otherwise doesn't affect overall playback latency since the ringbuffer in the hardware is still configured with 20 segments. Most audio sinks need that extra segment to avoid this glitch. I think pulsesink where we write directly to their ringbuffer is the only one that doesn't. I propose we make the ringbuffer expand spec->segtotal += 1 in gst_audio_ring_buffer_parse_caps() when indicated by setting a flag using a new gst_audio_ring_buffer_enable_extra_segment () function (or so).
We also discussed that with Wim. I believe this should be configurable. We are missing a third size, we have latency-time, which is the size of write(), buffer-time, which is both the size of the ring buffer and the latency. But I think the latency/start-threshold should be in-between the size of a write() and the ring buffer size, and that should be configurable.
(In reply to Nicolas Dufresne (stormer) from comment #1) > We also discussed that with Wim. I believe this should be configurable. We > are missing a third size, we have latency-time, which is the size of > write(), buffer-time, which is both the size of the ring buffer and the > latency. But I think the latency/start-threshold should be in-between the > size of a write() and the ring buffer size, and that should be configurable. That's only possible with pulsesink that maps the output ringbuffer and writes into it directly. With almost all the other audio sinks that write the first batch of data as rapidly as possibly, the ringbuffer start threshold *has* to be the full size or it can glitch. pulsesink has a custom commit function, so it can arrange any start threshold it wants. In any case, I think that enhancement is orthogonal to this bug.
(In reply to Jan Schmidt from comment #2) > (In reply to Nicolas Dufresne (stormer) from comment #1) > > We also discussed that with Wim. I believe this should be configurable. We > > are missing a third size, we have latency-time, which is the size of > > write(), buffer-time, which is both the size of the ring buffer and the > > latency. But I think the latency/start-threshold should be in-between the > > size of a write() and the ring buffer size, and that should be configurable. > > That's only possible with pulsesink that maps the output ringbuffer and > writes into it directly. With almost all the other audio sinks that write > the first batch of data as rapidly as possibly, the ringbuffer start > threshold *has* to be the full size or it can glitch. > See bug #657076 for why this is so.
Maybe we could fill the bottom part of the ring buffer with silence. That would be of the duration of the configure latency. Then fill the other part with data and then start the ring buffer. Or follow your plan, but in fact allow for the configured latency to be pre-buffered rather then dropped.
I guess I just don't see the point. If you want to store less data in ALSA / pulseaudio (have smaller latency) you'd just set buffer-time smaller? Anyway again - it's completely separate to this bug.
I think tt's not, you're proposed solution is to allocate extra data (and hide it). What I'm saying is that buffers are dropped because fill entirely the ringbuffer before we start. That has the side effect of having the live buffer arrive right at the moment they should be rendered, hence we get frequent drops. It's quite related.
No, you've misunderstood the bug here. This problem is much simpler - alsasink reads a fragment too many from the ringbuffer. It claims the 21st fragment from the ringbuffer before it tries to write it. It doesn't find out the hardware ringbuffer is full until then. At that point, our GStreamer ringbuffer can actually accept 20 fragments more from upstream.
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/gstreamer/gst-plugins-base/issues/266.