GNOME Bugzilla – Bug 744922
osxaudiosrc: iOS resampling is stuttering
Last modified: 2015-02-24 14:55:49 UTC
This issue is best exhibited by configuring AVAudioSession to a sample rate other than 44100, e.g.:

  [[AVAudioSession sharedInstance] setSampleRate:48000 error:nil]

This will cause the hardware sample rate (48,000) to be different from the osxaudiosrc sample rate (currently hardcoded to 44,100 due to bug 743758).

            +---------------+             +-------------+
            |   Remote IO   |             | osxaudiosrc |
 --48khz--> | input element |--44.1khz--> |   caps =    |--44.1khz-->
   input    |               |   output    | rate:44100  |   src pad
   scope    +---------------+   scope     +-------------+
Created attachment 297544 [details] [review]
Patch

The issue was caused by our misuse of the Core Audio API. See the comment in the patch.
The current patch has:

  abl.mBuffers[0].mNumberChannels = 1;

I need to understand what I should really put there -- probably something from buf->spec?
That is correct, we should be using the ringbuffer spec (haven't had the chance to do a full review yet)
I'm not sure I understand Core Audio: The AudioStreamBasicDescription ("ASBD") has member 'mChannelsPerFrame', and supposedly we've already set up our stream format. Why does AudioBuffer specifically contain 'mNumberChannels' but not other format properties?
Review of attachment 297544 [details] [review]:

If the only thing that changes between calls is mDataByteSize, would it not be sufficient to just reset it each time we get the callback, before calling AudioUnitRender()?
I don't really understand why the information from the ASBD is repeated in the ABL. FWIW, the documentation at https://developer.apple.com/library/ios/documentation/AudioUnit/Reference/AUComponentServicesReference/index.html#//apple_ref/c/func/AudioUnitRender mentions: "The AudioBufferList that you provide on input must match the topology for the current audio format for the given bus."
(In reply to Arun Raghavan from comment #5)
> If the only thing that changes between calls is mDataByteSize, would it not
> be sufficient to just reset it each time we get the callback, before calling
> AudioUnitRender()?

From the standpoint of keeping this patch smaller, it would be better. From the standpoint of keeping the code nice, I disagree.

a) There's little to no performance incentive in keeping the AudioBufferList as part of our heap structure as opposed to the function's stack. Initializing this structure is quick, and it's just the scaffolding you need in order to call AudioUnitRender. It's an O(1) structure. In a way, it's just like passing arguments.

b) Keeping the AudioBufferList between calls confuses the reader, who will wonder about the duplication: why does AudioBuffer contain mDataByteSize while we keep a separate copy of it? In the end, it's a matter of code documenting itself through common conventions. Remember that our code is, in a way, documenting Core Audio to somebody who didn't read Apple's docs too closely.

c) We keep our state on the heap in a plain, non-Apple-y way, as something obvious every audio developer should understand, and only use Apple-y conventions right before calling Apple functions, so it's clear whom they serve. If Apple chooses to "clobber" its own structure, that's an Apple-y problem; it doesn't clobber *anything* on our GST side of things.

This was a long treatise over this patch, but I hope it was convincing :)
(In reply to Arun Raghavan from comment #6)
> I don't really understand why the information from the ASBD is repeated in
> the ABL.

Luckily, someone from the Core Audio team just answered me:
http://lists.apple.com/archives/coreaudio-api/2015/Feb/msg00027.html

Quoting here:

  You’re right that it is a bit of redundant information. The correct value
  of mNumberChannels for an AudioBuffer can be derived from mChannelsPerFrame
  and the interleaved flag. For non-interleaved formats, mNumberChannels is
  always 1. For interleaved formats, mNumberChannels is equal to
  mChannelsPerFrame.

  The reason for mNumberChannels' existence may be that AudioBufferList
  slightly preceded AudioStreamBasicDescription in Core Audio's development.
  It has a slightly earlier check-in date. These are 15-year-old structs by
  now.

  The channel assignments for the buffers are determined by an
  AudioChannelLayout for multichannel, or in the case of stereo, the order
  is always buffer 0 is left and buffer 1 is right.

My original question:
http://lists.apple.com/archives/coreaudio-api/2015/Feb/msg00024.html
OK, for non-interleaved data, it seems like we need a variable-sized AudioBufferList. We can still do alloca, though. We still need to O(n)-initialize it before every call to AudioUnitRender.
However, from a cursory look, alsasrc didn't try to provide non-interleaved formats at all, despite ALSA itself being capable of it. So far as I can see, non-interleaved formats require creating pads dynamically, which is a complexity that's out of the scope of this change (and probably not worthwhile to invest effort into at this point).
(In reply to Ilya Konstantinov from comment #7)
> From the standpoint of keeping the code nice, I disagree.

Actually, scratch that. Preallocating the whole AudioBufferList ends up much nicer and reads simpler. My revised patch will be much smaller :)
Created attachment 297727 [details] [review]
Revised patch

Supports both interleaved and non-interleaved buffer lists. We don't offer non-interleaved caps, so that path is untested for the time being, but I thought it good to document the knowledge.
Review of attachment 297727 [details] [review]:

Thanks for pinning this down.

::: sys/osxaudio/gstosxaudiosrc.c
@@ +404,3 @@
+  /* TODO: To support non-interleaved audio, go over all mBuffers,
+   * not just the first one. */

Wouldn't the mDataByteSize be the same for all buffers (it'd be pretty weird to get a different number of buffers for each channel)?

::: sys/osxaudio/gstosxcoreaudiocommon.c
@@ +155,3 @@
   AudioBufferList *list;
+  gsize list_size;
+  UInt32 num_buffers, n;

Any reason to change this to UInt32? (And it'd be guint32, to be consistent and glib-y.)

@@ +170,3 @@
     list->mBuffers[n].mDataByteSize = size;
     list->mBuffers[n].mData = g_malloc (size);
+    if (list->mBuffers[n].mData == NULL) {

I'm pretty sure we don't actually deal with g_malloc() failing anywhere, and if it did, the app would have a bigger catastrophe on its hands. The NULL check is likely just going to be dead code, in that case, and is better just skipped.

@@ +182,3 @@
 buffer_list_free (AudioBufferList * list)
 {
+  UInt32 n;

Again, any particular reason for changing the type?

@@ +186,2 @@
+  for (n = 0; n < list->mNumberBuffers; ++n) {
+    g_free (list->mBuffers[n].mData);

g_free() is a no-op if the data is NULL.
(In reply to Ilya Konstantinov from comment #10)
> However, from a cursory look, alsasrc didn't try to provide non-interleaved
> formats at all, despite ALSA itself being capable of it.
>
> So far as I can see, non-interleaved formats require creating pads
> dynamically, which is a complexity that's out of the scope of this change
> (and probably not worthwhile to invest effort into at this point).

You don't need dynamic pads; it's the same mechanism as interleaved, just that the data you are passing is in a different format.

That said, osxaudiosrc template caps are fixed to layout=interleaved, so we'll never negotiate to non-interleaved right now. Support for non-interleaved formats can be added later if required.
Don't worry about non-interleaved audio for now, almost no element supports it yet :) Let's get this bug fixed first and then we can worry about non-interleaved audio at a later time. (But it works like Arun said, just a different value for the layout field in the caps, not additional pads)
(In reply to Arun Raghavan from comment #14)
> You don't need dynamic pads, it's the same mechanism as interleaved, just
> that the data you are passing is a different format.

What is the format? I couldn't find any reference, so I naively assumed it's done by having a different pad send every channel.

> Support for non-interleaved formats can be added later if required.

Agree.
(In reply to Sebastian Dröge (slomo) from comment #15)
> Don't worry about non-interleaved audio for now, almost no element supports
> it yet :) Let's get this bug fixed first and then we can worry about
> non-interleaved audio at a later time.

Right. As I wrote in the comment: (This patch addresses some non-interleaved audio concerns, but at this moment the elements do not support non-interleaved audio and non-interleaved is untested.)

(In reply to Arun Raghavan from comment #13)
> +  /* TODO: To support non-interleaved audio, go over all mBuffers,
> +   * not just the first one. */
>
> Wouldn't the mDataByteSize be the same for all buffers (it'd be pretty weird
> to get a different number of buffers for each channel).

I was just referring to the fact that the current code says "mBuffers[0]", while non-interleaved will have one buffer per channel. As to whether it makes sense for different buffers to have different mDataByteSize (i.e. number of bytes returned) values, I guess you're right.

>   AudioBufferList *list;
> + gsize list_size;
> + UInt32 num_buffers, n;
>
> Any reason to change this to UInt32? (and it'd be guint32, to be consistent
> and glib-y)

I'm using UInt32 *only* for Core Audio-derived values, the same as I use gsize and not size_t for GLib.

> @@ +170,3 @@
>     list->mBuffers[n].mDataByteSize = size;
>     list->mBuffers[n].mData = g_malloc (size);
> +   if (list->mBuffers[n].mData == NULL) {
>
> I'm pretty sure we don't actually deal with g_malloc() failing anywhere, and
> if it did, the app would have a bigger catastrophe in its hands. The NULL
> check is likely just going to be deadcode, in that case, and is better just
> skipped.

Actually, it's my bad: g_malloc aborts on failure; g_try_malloc is the one that returns NULL. I'll remove the check -- it is in fact dead code.

> @@ +182,3 @@
>  buffer_list_free (AudioBufferList * list)
>  {
> +  UInt32 n;
>
> Again, any particular reason for changing the type?

Basically, it eliminated the (int) cast in:

  for (n = 0; n < (int) list->mNumberBuffers; ++n) {

> @@ +186,2 @@
> +  for (n = 0; n < list->mNumberBuffers; ++n) {
> +    g_free (list->mBuffers[n].mData);
>
> g_free() is a noop if the data is NULL.

Which is exactly why I removed the "if (list->mBuffers[n].mData)" test.
I guess I misread the last diff, sorry. If it's only the dead code that needs removal, I can just make that change and push this out.
Created attachment 297745 [details] [review]
Revised patch 2

Eliminated the NULL checks, since g_malloc must succeed. Also made buffer_list_free more free()-like by accepting NULL there as well, which eliminated more NULL checks.
Comment on attachment 297745 [details] [review]
Revised patch 2

Pushed this with the minor addition of a comment stating that non-interleaved isn't actually used.
> > You don't need dynamic pads, it's the same mechanism as interleaved, just
> > that the data you are passing is a different format.
>
> What is the format? I couldn't find any reference, so I naively assumed it's
> done by having a different pad send every channel.

2 channels, interleaved:

  LRLRLRLRLR

2 channels, non-interleaved:

  LLLLLRRRRR
(In reply to Tim-Philipp Müller from comment #21)
> 2 channels, interleaved:
>
> LRLRLRLRLR
>
> 2 channels, non-interleaved:
>
> LLLLLRRRRR

Oh, so it's pretty simple.

One side effect is that even if we allocate the Core Audio buffers as one consecutive memory area, the Audio Unit is likely to return it half-full and then it'll be:

  LLLL___RRRR___

so we still need to memcpy, but we memcpy today anyway.
You could probably do something without a copy (two GstMemory objects created via gst_memory_share() in one buffer), but chances are the copy is cheap enough, and other code probably doesn't handle separate memories yet anyway, so it would have to copy to merge them when mapping the buffer. Anyway, something for some other bug, I guess.