GNOME Bugzilla – Bug 781998
directsound: increase thread priority read/write to avoid stutter and glitches
Last modified: 2018-11-03 14:08:09 UTC
For as long as I can remember [1], GStreamer playback on Windows stutters whenever there is enough load on the system (among various other triggers). Other media players on Windows, for example Windows Media Player (WMP), don't exhibit these issues. An easy way to reproduce the problem is to run Windows in a VM and do things like rapidly moving around windows that redraw frequently (like Process Explorer).
The experiment I did today was running gst-play (msys2 gst 1.10.4) and WMP, and comparing their performance under load.
* It's very easy to get gst-play to stutter, even under very light load
* It's much more difficult to get WMP to stutter, even under heavy load
A brief examination of the processes with Process Explorer reveals that the thread priorities in WMP are significantly higher than the thread priorities in the gst-play process. More digging reveals that there's a Windows API [2] for telling the system that you're doing important multimedia processing. I theorize that if GStreamer calls the relevant APIs around its processing (AvSetMmThreadCharacteristics followed by AvRevertMmThreadCharacteristics), performance will become similar to WMP.
I note that Clementine has a hack to set realtime priority on OSX [3]: it sets the thread priority when it detects a GstTask starting. Perhaps this is a good place to do this? If not, where? I'm very interested in getting rid of this stutter issue on Windows, and would be willing to implement/test something if someone will give me some pointers on the best place to implement it. I note that the Clementine audio player also has an open bug report for this issue [4].
[1] https://github.com/exaile/exaile/issues/76
[2] https://msdn.microsoft.com/en-us/library/ms684247(VS.85).aspx
[3] https://github.com/clementine-player/Clementine/commit/df21da786e74a504adbdbd74d9d7c7577c6e52ed
[4] https://github.com/clementine-player/clementine/issues/2570
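For readers unfamiliar with the API pair named in the report, here is a minimal sketch of calling AvSetMmThreadCharacteristics and AvRevertMmThreadCharacteristics around a block of work on the current thread. It is not taken from any GStreamer patch. The `MmcssScope` type and the `mmcss_enter`/`mmcss_leave` helpers are hypothetical names invented for illustration, and the non-Windows branch is only a no-op stub so the file compiles elsewhere.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
#include <avrt.h>   /* link with avrt.lib (MSVC) or -lavrt (MinGW) */
#endif

/* Hypothetical helper type: remembers the MMCSS registration of the
 * calling thread so it can be reverted later. */
typedef struct {
  void *handle;   /* HANDLE returned by AvSetMmThreadCharacteristics, or NULL */
  bool boosted;   /* true only if the registration succeeded */
} MmcssScope;

/* Register the calling thread with MMCSS under task_name
 * (for example "Audio" or "Pro Audio"). */
static MmcssScope mmcss_enter(const char *task_name)
{
  MmcssScope scope = { NULL, false };
  assert(task_name != NULL);
#ifdef _WIN32
  DWORD task_index = 0;   /* 0 lets MMCSS assign a new task index */
  scope.handle = AvSetMmThreadCharacteristicsA(task_name, &task_index);
  scope.boosted = (scope.handle != NULL);
#else
  (void) task_name;       /* stub: no MMCSS outside Windows */
#endif
  return scope;
}

/* Undo mmcss_enter(). Must be called from the same thread. */
static void mmcss_leave(MmcssScope *scope)
{
#ifdef _WIN32
  if (scope->boosted)
    AvRevertMmThreadCharacteristics((HANDLE) scope->handle);
#endif
  scope->handle = NULL;
  scope->boosted = false;
}
```

A streaming thread would call `mmcss_enter("Audio")` once when it starts, do its processing, and call `mmcss_leave()` before it exits. If the registration fails, `boosted` stays false and the thread simply runs at its normal priority.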
I have a simple proof of concept that sets the thread priorities for all GStreamer task threads. Unfortunately, it doesn't make any difference. Examining the thread properties shows that all of the threads with > 0% CPU are set to high priority except one, which appears to be the directsoundsink ring buffer thread. I'm going to see if I can get a handle to that. https://gist.github.com/virtuald/3acbb802b26cc595bbe6690346ece556
Stuttering can be caused by many factors, including bugs. It would be good to investigate. Maybe the sink latency is being reported too low, or we didn't configure the buffering right?
I don't know much about audio or GStreamer internals, but I'm willing to try things if you have suggestions. :)
I think you're right that the buffering is incorrect. Setting GST_DEBUG="directsoundsink:5,audiobasesink:5" produces output that basically shows a few samples being queued, followed by sink_write being called A LOT of times. Presumably the two should be (mostly) one-to-one?
This looks like the "ALSA" model. In ALSA you accumulate data (preroll), and at the end of the preroll you start the ring buffer. As the ring buffer is in the kernel (we don't do mmap in gst), we quickly write all the data to fill it up. After that, though, the writes should be evenly spaced. It would be nice to compare against how another, working player does audio playback; an open source one would be easiest.
Audacity comes to mind? https://github.com/audacity/audacity/blob/17afc51644b2b327e173a23d6066dde598838c03/lib-src/portaudio-v19/src/hostapi/dsound/pa_win_ds.c
Almost forgot, Chrome is open source too! They have a bunch of audio code in there, looks like they use WASAPI or WaveOutput: https://cs.chromium.org/chromium/src/media/audio/win/
Alright, after more playing around, a long time spent reading other sinks, and a look at the old 0.10.x directsound ringbuffer patch in #584980, I've hit upon a combination of hacks that seems to perform as well as WMP, without rewriting the sink. There are two pieces:
* I changed my sync handler to listen for stream status ENTER instead, and to call AvSetMmThreadCharacteristics when that occurs. Since stream status ENTER is emitted by the ring buffer thread in addition to the other threads, all threads that seem to be consuming CPU get an elevated priority, and the stutter mostly goes away.
* The other piece that helps is modifying the sink to call GST_DSOUND_UNLOCK before the sleep and GST_DSOUND_LOCK again after it. I got this idea after noticing a comment in the ringbuffer patch [1] that one shouldn't hold the object lock while sleeping.
With the two combined, the output seems to be as difficult to make stutter as WMP, which is good enough for now IMHO. The CPU usage seems pretty minimal too.
So where do we go from here? Ideally, anywhere that GStreamer creates a thread, we should call the magic API to get more processing power. Perhaps the sink could listen to its own bus for these stream status ENTER events and bump the priority there. Or perhaps on Windows, creating a GstTask could automatically bump the priority? I did try bumping the priority only on the sink thread, and there were still stutter issues.
[1] https://github.com/psi-im/psimedia/blob/master/gstprovider/gstelements/directsound_sinkonly/gstdirectsoundringbuffer.c#L729
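The second hack, releasing the lock around the sleep, can be illustrated with a small self-contained sketch. This is not the directsoundsink code: a POSIX mutex stands in for GST_DSOUND_LOCK/GST_DSOUND_UNLOCK, and `FakeSink`, `wait_for_space` and `free_space_thread` are hypothetical names made up for the example. The point it shows is that another thread needing the lock can make progress while the writer sleeps.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for the sink: one lock and a free-space counter. */
typedef struct {
  pthread_mutex_t lock;   /* plays the role of GST_DSOUND_LOCK */
  int free_bytes;         /* space currently available in the device buffer */
} FakeSink;

static void sleep_ms(long ms)
{
  struct timespec ts = { .tv_sec = ms / 1000,
                         .tv_nsec = (ms % 1000) * 1000000L };
  nanosleep(&ts, NULL);
}

/* Wait until at least `needed` bytes are free.
 * The lock must be held on entry and is held again on return, but it is
 * dropped for the duration of every sleep so other threads can take it.
 * Returns the number of times the function slept. */
static int wait_for_space(FakeSink *sink, int needed, long poll_ms)
{
  int sleeps = 0;
  assert(needed >= 0 && poll_ms > 0);
  while (sink->free_bytes < needed) {
    pthread_mutex_unlock(&sink->lock);   /* "UNLOCK before the sleep" */
    sleep_ms(poll_ms);
    pthread_mutex_lock(&sink->lock);     /* "LOCK again after the sleep" */
    sleeps++;
  }
  return sleeps;
}

/* Simulates another thread that needs the same lock to free up space. */
static void *free_space_thread(void *arg)
{
  FakeSink *sink = arg;
  pthread_mutex_lock(&sink->lock);
  sink->free_bytes += 4096;
  pthread_mutex_unlock(&sink->lock);
  return NULL;
}
```

If `wait_for_space` kept the lock while sleeping, `free_space_thread` could never acquire it and the loop would spin forever. Whether this is what actually happens inside directsoundsink is not established in this thread.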
Can you please try the "Fix high CPU usage and sleep for at least 10ms while looping" patch in https://bugzilla.gnome.org/show_bug.cgi?id=773681? It should have the same effect as your fix.
It won't address the stutter issues (see above about the priority API stuff and the locking problem), but yes, it seems like it would fix the high CPU usage issue.
Ok, so I'm back at it again tonight, and this is very strange. It looks like the only thing the lock protects is the write function (which is always called by the same thread) and the reset function, which presumably isn't called very often. So in theory removing the lock should have no effect at all, and yet the effect is very consistent. Will dig some more. In case anyone has thoughts, I'm using msys2 on Windows 7 in a VirtualBox VM for my testing.
Done for now I think. I've updated the gist with something that performs slightly better, even without doing an unlock during the sleep. Two improvements:
* I use the bus sync signal stuff and connect to sync-message::stream-status so that I only get the stream status messages.
* I used the "Pro Audio" priority category, which gets the threads an even higher priority, and the stutter is significantly more difficult to cause than previously.
I'm not really a GStreamer expert, but there doesn't seem to be a good way to receive the stream status messages from the sink. My hope was that I could just receive those messages from the gstdirectsoundsink and then bump the priorities of all threads that are feeding into the sink. While that's a bit of a hack, it would work. It seems like the only other alternative is to inject some Windows-specific code into GstAudioSink and GstTask. If developers would be open to such a patch or have a better suggestion, I can put it together.
A more difficult alternative is to rewrite the sink and create a custom ring buffer implementation, or to improve the WASAPI sink and then do some priority stuff there (though I think you'll still want to bump the priority of all multimedia threads or you'll run into the same issues).
Can you provide some code for what you're doing? :)
All of the GStreamer code that's fit to use is over in the other bug. I tried reorganizing the write function to deal with errors better, but I realized I had introduced some race conditions, and it didn't result in anything better than the patches over there anyway. The example program that increases the priority of all streaming threads can be found in the gist referenced above: https://gist.github.com/virtuald/3acbb802b26cc595bbe6690346ece556 . The method works (Process Explorer shows that all threads that are doing something get a priority boost); I just don't have the knowledge needed to integrate it into GStreamer itself effectively.
There is example code on how to set thread priorities for stutter-free playback (under "pro-audio") here: https://msdn.microsoft.com/en-us/library/windows/desktop/dd370844(v=vs.85).aspx https://docs.microsoft.com/en-us/windows-hardware/drivers/audio/low-latency-audio
WASAPI glitch-free playback and capture is being handled in: https://bugzilla.gnome.org/show_bug.cgi?id=793289 Reducing severity because with those changes, WASAPI will have better performance, latency, and code-quality than directsound. The only things missing are automatic stream management (bug 793059) and support for encoded formats like dts, ac3, alaw, etc. All those are better added to WASAPI anyway since directsound was deprecated more than a decade ago.
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/issues/553.