GNOME Bugzilla – Bug 337799
Audio stuttering
Last modified: 2006-04-21 13:59:37 UTC
Windows has issues with 16kHz.
Created attachment 63020: Michael Rickmann's solution
The patch looks pretty good, but it raises the question: should we not switch ekiga to 8kHz and 3 buffers on all platforms, to avoid playing with #ifdefs?
Actually it is a 2x2 matrix of questions: Opal and audio setup on one side versus frequency and buffers on the other. Ekiga already contains the SPEEX 16 kHz codec, so Opal and 8 kHz are out. As far as the number of buffers is concerned, we have to cope with the latencies of Windows Multimedia. Should Linux suffer from that? That leaves the combination of audio setup and frequency. I hope that things will change and that some work gets done in pwlib. I regard the frequency part of this patch as a temporary measure and will watch closely for when it can be reverted, if necessary investing some work myself, starting with the pwlib audio sample. Regards Michael
Michael, Snark, thanks for your excellent work. The #ifdef for the number of buffers is also present in OPAL. Adding more buffers adds more latency. However, if it is only for sound events, it is not a problem. But the 16kHz bug is weird. We shouldn't go back to 8kHz for all platforms, and we shouldn't hide the bug on WIN32 behind an #ifdef. If sound events do not work at 16kHz, then I'm sure that SPEEX WideBand won't work either. The reason could be in pwlib. Does the WIN32 audio plugin support being opened at 16kHz?
Hmmm... so I could apply the part of the patch for the number of buffers. That would perhaps make things at least a little better?
Yes you can apply it. It has only an impact on the sound events.
Ok, applied that part. I have no idea about the 16kHz support in the win32 sound driver...
The 3 buffers did the trick. With last evening's CVS the stuttering is gone, even at 16 kHz. What remains is that the ring buffer is still a bit small. In GMAudioTester::Main () it is allocated by "buffer_ring = (char *) malloc (8000 /*Hz*/ * 5 /*s*/ * 2 /*16bits*/);" and in GMAudioRP::Main () it wraps around at "if (buffer_pos >= 80000)". At 16 kHz this is good for 2.5 seconds, not for the 4 s announced by the druid. This is really a very minor issue. On Linux the beginning of talking is cut off and then the delay is about 2.5 s, i.e. I can say 1, 2, 3 before I hear myself again. On Windows things appear faster. I had to increase the ring buffer to 192000 bytes to get an approximate delay of 4 s (sitting there with my watch). I wonder whether it is really 16 kHz. Nevertheless, the stuttering has been resolved. I also moved Speex 16 kHz to first place on both the Windows and the Linux box; Ekiga reported Speex on both when connected, and communication was crystal clear. Regards Michael
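The buffer arithmetic in this comment can be checked with a small sketch. It assumes 16-bit mono PCM (2 bytes per sample), as the /*16bits*/ comment in the quoted malloc suggests; the helper names are hypothetical and are not part of Ekiga or pwlib.

```cpp
#include <cstddef>

// Assumption: 16-bit mono PCM, i.e. 2 bytes per sample.
constexpr std::size_t kBytesPerSample = 2;

// Milliseconds of audio that a ring buffer of `bytes` holds at `sample_rate_hz`.
constexpr std::size_t ring_buffer_ms(std::size_t bytes, std::size_t sample_rate_hz) {
    return bytes * 1000 / (sample_rate_hz * kBytesPerSample);
}

// Bytes needed to hold `ms` milliseconds of audio at `sample_rate_hz`.
constexpr std::size_t ring_buffer_bytes(std::size_t ms, std::size_t sample_rate_hz) {
    return sample_rate_hz * kBytesPerSample * ms / 1000;
}

// The existing 80000-byte buffer holds 5 s at 8 kHz but only 2.5 s at 16 kHz.
static_assert(ring_buffer_ms(80000, 8000) == 5000, "5 s at 8 kHz");
static_assert(ring_buffer_ms(80000, 16000) == 2500, "2.5 s at 16 kHz");
```

Under the same assumption, ring_buffer_bytes (4000, 16000) gives 128000 bytes for the 4 s announced by the druid, and a 192000-byte buffer has room for 6 s at 16 kHz.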
Hmmm... then what do we do? ;-)
Fix the audio buffer size... Change the 8000 to 16000 and the 5 to 4.
I see which 8000 you're talking about, but which 5 should become 4?
"buffer_ring = (char *) malloc (8000 /*Hz*/ * 5 /*s*/ * 2 /*16bits*/);"
I checked it on Win32 during the initial delay, in audio.cpp after the PThread::Current ()->Sleep (100);. Audio input is a bit slow during the first 200 ms, but after that input is 3200 bytes per 100 ms, exactly 16 kHz. I regard the stuttering audio as fixed. On Win32, at most 126080 bytes were read during the initial delay. Thus the 80000-byte ring buffer had wrapped around once and I was hearing myself with a delay of ~1.5 sec. On Linux, in identical tests, never more than 127360 bytes were read, better than Windows but less than the theoretical 128000. Adding a 100 ms margin gives 130560, which is just the usable amount (chunks of 640) in (1 << 17). But being stingy here makes things worse. Therefore, I suggest doubling the ring buffer as follows. Regards Michael
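The figures in this comment are consistent with each other, as the following sketch shows. It assumes 16 kHz, 16-bit mono input and the 640-byte chunk size mentioned above; the constant and function names are made up for illustration.

```cpp
#include <cstddef>

// Assumptions: 16 kHz, 16-bit mono capture, read in chunks of 640 bytes.
constexpr std::size_t kSampleRateHz = 16000;
constexpr std::size_t kBytesPerSample = 2;
constexpr std::size_t kChunkBytes = 640;

// Bytes delivered in `ms` milliseconds of capture.
constexpr std::size_t bytes_for_ms(std::size_t ms) {
    return kSampleRateHz * kBytesPerSample * ms / 1000;
}

// Milliseconds of audio represented by `bytes`.
constexpr std::size_t ms_for_bytes(std::size_t bytes) {
    return bytes * 1000 / (kSampleRateHz * kBytesPerSample);
}

// Largest whole number of chunks that fits in `capacity`, in bytes.
constexpr std::size_t usable_bytes(std::size_t capacity) {
    return capacity / kChunkBytes * kChunkBytes;
}

static_assert(bytes_for_ms(100) == 3200, "3200 bytes per 100 ms is 16 kHz");
static_assert(bytes_for_ms(4000) == 128000, "theoretical amount for 4 s");
// Linux maximum plus a 100 ms margin matches the usable part of (1 << 17).
static_assert(127360 + bytes_for_ms(100) == usable_bytes(1 << 17), "130560 bytes");
```

With the same helpers, the Win32 case works out as 126080 % 80000 = 46080 bytes left after one wrap of the 80000-byte buffer, and ms_for_bytes (46080) is 1440 ms, which matches the observed delay of roughly 1.5 s.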
Created attachment 63474: double druid's audio ring buffer
Hmmm... could anybody remind me why I haven't committed that patch yet?!
I do not know; it fixes the delay for Linux and Win32 and works for both. What has not been discussed here is that the condition if ((PTime () - now).GetSeconds () > 3) only becomes true at (4 > 3), so the delay is 4 sec as the druid announces, and the patch gives an additional second of safety against overrun. That is apparently how it was designed for 8 kHz. Regards Michael
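The point about the 4 sec delay can be illustrated with standard C++ rather than pwlib. The sketch assumes that GetSeconds () yields whole seconds with the fraction dropped, and that "doubling" means going from 80000 to 160000 bytes; neither is confirmed by the thread, and the function names are hypothetical.

```cpp
#include <chrono>
#include <cstddef>

// Stand-in for "(PTime () - now).GetSeconds () > 3", assuming whole seconds
// with the fractional part truncated.
constexpr bool delay_elapsed(std::chrono::milliseconds elapsed) {
    return std::chrono::duration_cast<std::chrono::seconds>(elapsed).count() > 3;
}

// Whole seconds of 16-bit mono audio that `bytes` can hold at `sample_rate_hz`.
constexpr std::size_t seconds_held(std::size_t bytes, std::size_t sample_rate_hz) {
    return bytes / (sample_rate_hz * 2);
}

// The condition stays false until a full 4 s have passed.
static_assert(!delay_elapsed(std::chrono::milliseconds(3999)), "still waiting");
static_assert(delay_elapsed(std::chrono::milliseconds(4000)), "fires at 4 s");

// A doubled buffer holds 5 s at 16 kHz, one second more than the 4 s delay,
// the same ratio the original 80000 bytes had at 8 kHz.
static_assert(seconds_held(2 * 80000, 16000) == 5, "5 s at 16 kHz");
static_assert(seconds_held(80000, 8000) == 5, "5 s at 8 kHz");
```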
Applied. Thanks!