GNOME Bugzilla – Bug 743758
osxaudiosrc supports only 44100 sample rate on iOS
Last modified: 2015-08-16 13:40:52 UTC
Unlike plugins such as alsa, the osxaudio plugin reports fixed audio capabilities for sources. As a result, we don't use the hardware to its full extent and may require CPU-intensive resampling.

OS X PLATFORM: On OS X, we can probably use kAudioStreamPropertyAvailableVirtualFormats and kAudioStreamPropertyAvailablePhysicalFormats, as explained in http://lists.apple.com/archives/coreaudio-api/2005/Nov/msg00156.html. They're already used in _audio_stream_get_formats (gstosxcoreaudiohal.c), albeit only to detect SPDIF availability.

IOS PLATFORM: On iOS, the Core Audio HAL is absent and we use Remote IO, which doesn't report supported formats. Instead, the app developer configures the AVAudioSession (e.g. for her preferred sample rate), and we should retrieve the AVAudioSession's sampleRate property.

The current version always uses the F32LE format with a GST_AUDIO_DEF_RATE (= 44100) sampling rate.
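For illustration, a minimal sketch of the kind of HAL probing suggested above, listing a stream's available physical formats. This is not code from the plugin; 'stream' is assumed to be a valid AudioStreamID obtained from the device, and error handling is trimmed.

#include <CoreAudio/CoreAudio.h>
#include <stdio.h>
#include <stdlib.h>

static void
print_available_physical_formats (AudioStreamID stream)
{
  AudioObjectPropertyAddress addr = {
    kAudioStreamPropertyAvailablePhysicalFormats,
    kAudioObjectPropertyScopeGlobal,
    kAudioObjectPropertyElementMaster
  };
  UInt32 size = 0;
  AudioStreamRangedDescription *formats;
  UInt32 i, n;

  if (AudioObjectGetPropertyDataSize (stream, &addr, 0, NULL, &size) != noErr)
    return;

  n = size / sizeof (AudioStreamRangedDescription);
  formats = malloc (size);

  if (AudioObjectGetPropertyData (stream, &addr, 0, NULL, &size,
          formats) == noErr) {
    for (i = 0; i < n; i++) {
      /* each entry carries an ASBD plus the sample-rate range it supports */
      printf ("bits: %u, rates: %.0f-%.0f\n",
          (unsigned int) formats[i].mFormat.mBitsPerChannel,
          formats[i].mSampleRateRange.mMinimum,
          formats[i].mSampleRateRange.mMaximum);
    }
  }
  free (formats);
}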
Created attachment 295845 [details] [review] Patch Comments inside. Please review.
This is an interim patch; I'm still researching this issue. Also, I just realized the HAL part of the patch is untested, as I've only tested on iOS. In the meantime, documenting a useful blog post: http://www.subfurther.com/blog/2009/04/28/an-iphone-core-audio-brain-dump/
So I guess things that are left here are (a) testing with the HAL backend, and (b) probing formats. Anything else? And are you still looking at this?
(a) I haven't tested yet – not even sure it builds, so that's why I asked not to commit yet.

(b) HAL: Yes, supposedly we can probe formats. This can be done in a separate commit, though.

Remote IO: I can't find any API for probing. The input seems happy to provide any requested format (that can be expressed in an ASBD), but only one sample rate (the session's). I've posted a question about this on coreaudio-api, but so far no replies: http://lists.apple.com/archives/coreaudio-api/2015/Feb/msg00000.html

The output is happy to accept any format and any sample rate. It also prefers the session's sample rate, but will convert as necessary. (Between GstAudioResample and Core Audio, I'd trust the latter to be more effective.)

iOS used to prefer "noninterleaved linear PCM with 8.24-bit fixed-point samples" (the so-called canonical format), but starting with iOS 8 that's obsolete -- see http://asciiwwdc.com/2014/sessions/501:

" ... So in the past we've had this concept of canonical formats. And this concept goes back to about 2002 in Mac OS 10.0 or 10.1 or so. So this format was floating-point, 32-bit, de-interleaved, but then we got along to iOS, and we couldn't really recommend using float everywhere because we didn't have the greatest floating-point performance initially. So for a while canonical was 8.24 fixed-point. But because of that schism we want to reunite under something new now. We've deprecated the concept of canonical formats. Now we have what we call a standard format. We're back to non-interleaved 32-bit floats on both platforms."
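For reference, a minimal sketch of an ASBD describing that "standard format" (non-interleaved 32-bit float). The 44100 Hz / stereo values are just example assumptions, not what the element negotiates.

#include <CoreAudio/CoreAudioTypes.h>

AudioStreamBasicDescription asbd = {
  .mSampleRate = 44100.0,               /* example rate only */
  .mFormatID = kAudioFormatLinearPCM,
  .mFormatFlags = kAudioFormatFlagIsFloat | kAudioFormatFlagIsPacked |
      kAudioFormatFlagIsNonInterleaved,
  .mBytesPerPacket = sizeof (Float32),  /* per channel when non-interleaved */
  .mFramesPerPacket = 1,
  .mBytesPerFrame = sizeof (Float32),
  .mChannelsPerFrame = 2,
  .mBitsPerChannel = 32,
};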
OK, so this needs some testing on OS X, plus adding probing on OS X, and then it can probably go in?
I can test on OS X and do the probing, unless Ilya already has an update on this.
Created attachment 297348 [details] [review] Patch Slight fix to make OS X variant build. This time it's tested on OS X too.
Created attachment 297486 [details] [review] Same patch with revised comment Please review patch with revised comment. Also, please apply with comment (git am).
Thanks for the work on this, code looks good, and I should be able to sanity test and merge this in a day or two (once I've got access to my OS X/iOS hardware again).
Stop the press! It looks like this is, after all, an error on our side that caused us not to get a zero sample rate from AudioUnitGetProperty. I'm still looking into this.
OK, the "stuttering" issue which made me, at first, believe that iOS doesn't perform input resampling for us, turns out to be a different bug -- bug 744922. As for the hardcoded 44100, I guess we should follow alsasrc's example with permissive pad template caps. AU is happy to provide whatever we ask of it, and we can assume it's pretty efficient at what it does. The only thing that baffles me are the channel layout translation into caps **for source**? Is anyone using that stuff in source?
And of course: DO NOT MERGE CURRENT PATCH! :)
Comment on attachment 297486 [details] [review] Same patch with revised comment This is a good way to make it clear ;-P
Summarising conversation on IRC, the plan going forwards is to expose what caps can be handled by the AudioUnit (including what it would resample / convert), and let negotiation select things based on that. If we can determine the "native" format/rate, we can always put that up ahead in the caps so that is preferred in negotiation. So this bug would be fixed by exposing the correct caps (and not using the stream format of the AudioUnit). I believe ikonst is working on this (correct me if I'm wrong).
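A minimal sketch of the "native format first" idea, under the assumption that probing can produce the device's preferred structure. The names 'native_structure' and 'template_caps' are illustrative, not identifiers from the patches.

/* Put the probed native format ahead of the generic template caps so that
 * negotiation prefers it; gst_caps_merge() skips structures the result
 * already expresses. */
GstCaps *caps = gst_caps_new_empty ();

gst_caps_append_structure (caps, gst_structure_copy (native_structure));
caps = gst_caps_merge (caps, gst_caps_copy (template_caps));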
Created attachment 298195 [details] [review] osxaudio: Overhaul probing caps -- PATCH OF DOOM!

I'd intended this to be much smaller, but ended up with a 47K patch. On the upside, this is a lot of the things we wanted:
- preferred caps + template caps
- channel layout support on both OS X and iOS
- respecting hardware format changes (via property listeners)

REVIEW
I'm looking for feedback, including but not limited to "holy shit, it's unreviewable!". I'll be available on IRC to discuss.

TESTING
I gave special attention to channel layouts, and exercised them thoroughly on OS X with a Soundflower 64-channel virtual device -- in both unpositioned and positioned (5.1) modes. I used the built-in Audio MIDI Setup utility app on OS X to change speaker setups on the Soundflower device, then re-tested. To make sure the audio actually flows into Soundflower, use AU Lab and link Soundflower to Built-in Output.

TESTING TOOLS
- Soundflower -- https://rogueamoeba.com/freebies/soundflower/
- AU Lab -- get "Audio Tools for Xcode" through https://developer.apple.com/downloads/index.action?name=for%20Xcode%20-
Created attachment 298197 [details] [review] osxaudio: Overhaul probing caps -- PATCH OF DOOM! [fixed an error in the channel label-position mapping]
BTW, I have qualms about _set_ring_buffer_osx_channel_order and specifically gst_audio_ring_buffer_set_channel_positions. Since we're *setting* a channel layout on the AU anyway (see gst_core_audio_set_channels_layout), and it's always done in the "GStreamer channel order", the AU will:
1) skip over channels we don't support
2) reorder channels
and GStreamer will never have to reorder. In fact, this might be a double channel reordering bug, since what gst_audio_ring_buffer_set_channel_positions receives is the hardware channel ordering.
Please hold the review until I attach a new patch that: 1) applies cleanly 2) handles "wrong"-ordered channels correctly (see comment 17)
Created attachment 298847 [details] [review] osxaudio: Overhaul probing caps -- PATCH OF DOOM!

Since the last patch:
1) Applies cleanly on HEAD
2) We never need to reorder channels with Core Audio. Only *warn* if our source/sink hardware doesn't support all the requested channels.

TESTING
I've done some more multi-channel testing. A trick to test multi-channel if you only have a stereo setup: in Audio MIDI Setup, create an aggregate device of Built-in Output + Soundflower 64ch. Go into Speaker Setup and assign interesting channels to your Built-in Output.

Some tests:
* http://www-mmsp.ece.mcgill.ca/documents/AudioFormats/WAVE/Samples/Microsoft/6_Channel_ID.wav
  5.1-surround should be chosen for the Aggregate Device.
* http://www-mmsp.ece.mcgill.ca/documents/AudioFormats/WAVE/Samples/SoundCardAttrition/4ch.wav
  Hexagonal multi-channel should be chosen for the Aggregate Device. "Surround" is the Rear Surround channel.
Created attachment 298920 [details] [review] Patch 1 - remove unused finalize
Created attachment 298921 [details] [review] Patch 2 - function rename
Created attachment 298922 [details] [review] Patch 3 - type check macro
Created attachment 298923 [details] [review] Patch 4 - fix spaces
Created attachment 298925 [details] [review] Patch 5 - SPDIF-only fields
Created attachment 298926 [details] [review] Patch 6 - AudioUnitInitialize on open
Created attachment 298927 [details] [review] Patch 7 - Overhaul of probing caps
Created attachment 298928 [details] [review] Patch 8 - Invalidate cached caps on format change
Created attachment 298929 [details] [review] Patch 9 - warn on missing channels
Review of attachment 298926 [details] [review]:

This one looks a bit tricky. From the documentation at https://developer.apple.com/library/ios/documentation/AudioUnit/Reference/AUComponentServicesReference/index.html#//apple_ref/c/func/AudioUnitInitialize:

"""
Usually, the state of an audio unit (such as its I/O formats and memory allocations) cannot be changed while an audio unit is initialized.
"""

If I'm reading this right, we can't really set a different format after initialising, which breaks the semantics of open (viz. open the device, but don't fixate the format, which happens at acquire time). We might need to do an initialise and uninitialise just for probing, to deal with this.
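A minimal sketch of that probe-then-uninitialize idea, assuming 'unit' is an already-created AUHAL/Remote IO unit configured for input. This illustrates the suggestion only; it is not code from the patches.

#include <AudioUnit/AudioUnit.h>

/* Probe the hardware-side (outer scope) format of an input unit by briefly
 * initializing it, then uninitializing so open() doesn't pin the format
 * before acquire time. */
static OSStatus
probe_hardware_format (AudioUnit unit, AudioStreamBasicDescription * out_asbd)
{
  UInt32 size = sizeof (*out_asbd);
  OSStatus status = AudioUnitInitialize (unit);

  if (status != noErr)
    return status;

  /* outer scope of an input unit: input scope, element 1 */
  status = AudioUnitGetProperty (unit, kAudioUnitProperty_StreamFormat,
      kAudioUnitScope_Input, 1, out_asbd, &size);

  AudioUnitUninitialize (unit);
  return status;
}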
Review of attachment 298927 [details] [review]:

::: sys/osxaudio/gstosxaudiosink.c
@@ +338,3 @@
+ GST_DEBUG_OBJECT (sink, "no ring buffer, returning NULL caps");
+ return GST_BASE_SINK_CLASS (parent_class)->get_caps (sink, filter);
+ }

Should return template caps, not NULL.

@@ +358,3 @@
+ caps = gst_core_audio_probe_caps (osxbuf->core_audio, template_caps);
+ gst_caps_replace (&osxbuf->core_audio->cached_caps, caps);
+ osxbuf->core_audio->cached_caps_valid = TRUE;

Can we replace the boolean with a NULL check on cached_caps?

@@ +363,3 @@
+ } else {
+ GST_DEBUG_OBJECT (sink, "ring buffer not open, returning NULL caps");
+ caps = NULL;

Same comment as before about using template caps.

::: sys/osxaudio/gstosxaudiosrc.c
@@ +285,3 @@
+ GST_DEBUG_OBJECT (src, "ring buffer not open, using template caps");
+ caps = GST_BASE_SRC_CLASS (parent_class)->get_caps (src, filter);
+ filter = NULL;

Could just pass NULL to the base class and do the filtering ourselves (though I'm fine with either way).

::: sys/osxaudio/gstosxcoreaudio.c
@@ +202,3 @@
+/* Does the channel have at least one positioned channel?
+ (GStreamer is more strict than Core Audio, in that it requires either
+ all channels to be positioned, or all unpositioned.) */

Do we really need to support the partially-positioned case? And if yes, maybe it makes more sense to just downgrade that to unpositioned, rather than only picking up the positioned channels?

@@ +489,3 @@
+ /* Get the ASBD of the outer scope (i.e. input scope of Input,
+ output scope of Output).
+ This ASBD indicates the hardware format. */

GStreamer comment style is:

/* ...
 * ... */

or

/* ...
 * ...
 * ... */

@@ +566,3 @@
+ * so we ensure it ourselves in _set_ring_buffer_osx_channel_order
+ * when the ring buffer is acquired.
+ */

Do we need to have both the with-channel-mask and without-channel-mask variants, then? Can't we just have the version without?

::: sys/osxaudio/gstosxcoreaudiocommon.c
@@ +263,3 @@
+ /* Construct AudioChannelLayout */
+ layoutSize =
+ G_STRUCT_OFFSET (AudioChannelLayout, mChannelDescriptions[channels]);

While I like this for being quite clever, IMO the previous way of allocating the AudioChannelLayout was easier to read.

@@ +446,3 @@
+ * 'channel' is used for warnings only. */
+GstAudioChannelPosition
+coreaudio_channel_label_to_gst_audio_channel_position (AudioChannelLabel label,

I wouldn't be against just calling this gst_core_audio_channel_label_to_gst(), which is also long, but shorter, and a bit more consistent with the rest of the code (can also make the other function gst_audio_channel_position_to_core_audio()).

@@ +505,3 @@
+ case kAudioChannelLabel_RearSurroundLeft:
+ case kAudioChannelLabel_RearSurroundRight:
+ case kAudioChannelLabel_TopCenterSurround:

I'm not quite sure what these 3 are supposed to map to. Do they overlap with any of the others?
Review of attachment 298928 [details] [review]:

::: sys/osxaudio/gstosxcoreaudio.c
@@ +72,3 @@
+
+ /* Assume AudioUnitRemovePropertyListenerWithUserData will block
+ * until the last listener finishes, so core_audio remains valid. */

Can we confirm this assumption (which sounds reasonable), and if not, deal with it however appropriate?

@@ +75,3 @@
+ core_audio = GST_CORE_AUDIO (inRefCon);
+ g_assert (inUnit == core_audio->audiounit);
+ parent = core_audio->osxbuf->parent; /* osxbuf is our owner, no need to ref it */

What is the function of the parent here?

@@ +151,3 @@
+ kAudioUnitProperty_AudioChannelLayout, _audio_unit_property_listener,
+ core_audio);
+ if (status) {

Not too important, but we check status != noErr in the rest of the code.
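For context, a hedged sketch of the property-listener API being reviewed: registration, and the removal call whose blocking behaviour is in question. Identifiers here are illustrative; the actual patch may differ.

#include <AudioUnit/AudioUnit.h>

/* illustrative listener: mark cached caps as stale when the format changes */
static void
on_format_changed (void *inRefCon, AudioUnit inUnit,
    AudioUnitPropertyID inID, AudioUnitScope inScope,
    AudioUnitElement inElement)
{
  /* e.g. invalidate cached caps here */
}

static void
watch_format (AudioUnit unit, void *user_data)
{
  /* registration, typically at open time */
  AudioUnitAddPropertyListener (unit, kAudioUnitProperty_StreamFormat,
      on_format_changed, user_data);
}

static void
unwatch_format (AudioUnit unit, void *user_data)
{
  /* removal at close time; assumed (but not documented) to wait for
   * in-flight callbacks before returning */
  AudioUnitRemovePropertyListenerWithUserData (unit,
      kAudioUnitProperty_StreamFormat, on_format_changed, user_data);
}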
Review of attachment 298929 [details] [review]: Do we really need to warn about this? We know that what we're providing is being remapped appropriately to fit what the hardware provides, so there isn't really anything to warn about, IMO.
(In reply to Arun Raghavan from comment #29)
> Review of attachment 298926 [details] [review] [review]:
>
> This one looks a bit tricky. From the documentation at
> https://developer.apple.com/library/ios/documentation/AudioUnit/Reference/AUComponentServicesReference/index.html#//apple_ref/c/func/AudioUnitInitialize:
>
> """
> Usually, the state of an audio unit (such as its I/O formats and memory
> allocations) cannot be changed while an audio unit is initialized.
> """

I've read this too, but changing the formats "mid-flight" still works just fine. You have the simple test app which you wrote for bug 743925.

> We might need to do an initialise and uninitialise just for probing, to deal
> with this.

... if it doesn't actually work fine as-is :)
(In reply to Arun Raghavan from comment #30)
> Review of attachment 298927 [details] [review] [review]:
>
> ::: sys/osxaudio/gstosxaudiosink.c
> @@ +338,3 @@
> + GST_DEBUG_OBJECT (sink, "no ring buffer, returning NULL caps");
> + return GST_BASE_SINK_CLASS (parent_class)->get_caps (sink, filter);
> + }
>
> Should return template caps, not NULL.

GstBaseSrc::get_caps returns template caps. GstBaseSink::get_caps returns NULL. I don't know why they default to different behaviors; I assumed they were written by smart people and followed their example. (In the GST_DEBUG_OBJECT, I document what the base class does, but in reality I should have just written "no ring buffer, calling base class".)

> @@ +358,3 @@
> + caps = gst_core_audio_probe_caps (osxbuf->core_audio, template_caps);
> + gst_caps_replace (&osxbuf->core_audio->cached_caps, caps);
> + osxbuf->core_audio->cached_caps_valid = TRUE;
>
> Can we replace the boolean with a NULL check on cached_caps?

Actually, I should have done without cached_caps_valid here and then introduced it in attachment 298928 [details] [review], where I introduce _audio_unit_property_listener. The problem was that in _audio_unit_property_listener I couldn't safely GST_OBJECT_LOCK (osxbuf), since sometimes _audio_unit_property_listener would be called on the same thread with osxbuf already locked.

In reality, I think _audio_unit_property_listener is only called on the same thread in reaction to *me* changing inner-scope formats. Outer-scope changes (i.e. hardware format changes) are reported on an auxiliary system thread. So perhaps if we only lock osxbuf for outer-scope changes, we can do without cached_caps_valid. What do you think?

> ::: sys/osxaudio/gstosxaudiosrc.c
> @@ +285,3 @@
> + GST_DEBUG_OBJECT (src, "ring buffer not open, using template caps");
> + caps = GST_BASE_SRC_CLASS (parent_class)->get_caps (src, filter);
> + filter = NULL;
>
> Could just pass NULL to the base class and do filtering ourselves (though
> I'm fine with either way).

I suppose. Maybe it's clearer.

> ::: sys/osxaudio/gstosxcoreaudio.c
> @@ +202,3 @@
> +/* Does the channel have at least one positioned channel?
> + (GStreamer is more strict than Core Audio, in that it requires either
> + all channels to be positioned, or all unpositioned.) */
>
> Do we really need to support the partially-positioned case? And if yes,
> maybe it makes more sense to just downgrade that to unpositioned, rather
> than only picking up the positioned channels?

I like your proposal better. Since there's no "right" thing to do here, we might as well choose the easier-to-understand option. HOWEVER, no matter what we decide, we should still skip over kAudioChannelLabel_Unknown channels. For example, in a fully-positioned scenario, the Soundflower 64-channel device reports kAudioChannelLabel_Unknown for all unassigned channels.

> @@ +489,3 @@
> + /* Get the ASBD of the outer scope (i.e. input scope of Input,
> + output scope of Output).
> + This ASBD indicates the hardware format. */
>
> GStreamer comment style is:

OK, will change.

> @@ +566,3 @@
> + * so we ensure it ourselves in _set_ring_buffer_osx_channel_order
> + * when the ring buffer is acquired.
> + */
>
> Do we need to have both the with-channel-mask and without-channel-mask
> variants, then? Can't we just have the version without?

I think you meant to comment about gst_core_audio_probe_caps. But yes, agreed. In any case, this is why I *warn* -- because if we omit channel-mask, we might have an element sending (L,R) to a sink accepting (RL,RR), etc. I'm not going to bail out over this, since the AU will do its own reordering *and negotiation* -- meaning, for (L,R)->(RL,RR) nothing will be played. I'm just warning.

> ::: sys/osxaudio/gstosxcoreaudiocommon.c
> @@ +263,3 @@
> + /* Construct AudioChannelLayout */
> + layoutSize =
> + G_STRUCT_OFFSET (AudioChannelLayout, mChannelDescriptions[channels]);
>
> While I like this for being quite clever, IMO the previous way of allocating
> the AudioChannelLayout was easier to read.

That's a pretty common pattern for "flexible array member" structs. I've seen it in the Windows NT sources, and Apple also uses it in their AU sample code (with the non-GLib offsetof). I wish GLib had a more indicative macro, like:

#define G_VAR_ARRAY_SIZE(Type, VarSizeMember, Length) G_STRUCT_OFFSET (Type, VarSizeMember[Length])

but anyway, I feel strongly about keeping it :) I can comment on it if you wish.

> @@ +446,3 @@
> + * 'channel' is used for warnings only. */
> +GstAudioChannelPosition
> +coreaudio_channel_label_to_gst_audio_channel_position (AudioChannelLabel label,
>
> I wouldn't be against just calling this
> gst_core_audio_channel_label_to_gst(), which is also long, but shorter, and
> a bit more consistent with the rest of the code (can also make the other
> function gst_audio_channel_position_to_core_audio()).

Agreed.

> @@ +505,3 @@
> + case kAudioChannelLabel_RearSurroundLeft:
> + case kAudioChannelLabel_RearSurroundRight:
> + case kAudioChannelLabel_TopCenterSurround:
>
> I'm not quite sure what these 3 are supposed to map to. Do they overlap with
> any of the others?

I've listed all the known positional labels that I found no suitable GST mapping for. It's mostly for documentation's sake, I guess. Do you think it confuses more than it helps?
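For illustration, the flexible-array allocation pattern being defended might look roughly like this; it is a sketch, not the patch's exact code.

#include <CoreAudio/CoreAudioTypes.h>
#include <glib.h>

/* Allocate an AudioChannelLayout with room for 'channels' descriptions;
 * the offset of the element one past the last gives the total size. */
static AudioChannelLayout *
alloc_channel_layout (guint channels)
{
  gsize layoutSize =
      G_STRUCT_OFFSET (AudioChannelLayout, mChannelDescriptions[channels]);
  AudioChannelLayout *layout = g_malloc0 (layoutSize);

  layout->mChannelLayoutTag = kAudioChannelLayoutTag_UseChannelDescriptions;
  layout->mNumberChannelDescriptions = channels;
  /* caller fills in layout->mChannelDescriptions[i].mChannelLabel */
  return layout;
}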
(In reply to Arun Raghavan from comment #31)
> Review of attachment 298928 [details] [review] [review]:
>
> ::: sys/osxaudio/gstosxcoreaudio.c
> @@ +72,3 @@
> +
> + /* Assume AudioUnitRemovePropertyListenerWithUserData will block
> + * until the last listener finishes, so core_audio remains valid. */
>
> Can we confirm this assumption (which sounds reasonable), and if not, deal
> with it however appropriate?

I don't have access to Apple's sources :( I've asked on the coreaudio-api list but have yet to receive a response: http://lists.apple.com/archives/coreaudio-api/2015/Feb/msg00028.html

I can't think of an appropriate way to deal with it. If you have zero guarantees as to when a function can be called, there's little you can do. However, I think we can just assume it doesn't get called after AudioUnitUninitialize and remove this alarming comment.

> @@ +75,3 @@
> + core_audio = GST_CORE_AUDIO (inRefCon);
> + g_assert (inUnit == core_audio->audiounit);
> + parent = core_audio->osxbuf->parent; /* osxbuf is our owner, no need to ref it */
>
> What is the function of the parent here?

Oops, remains of old code. Will remove.

> @@ +151,3 @@
> + kAudioUnitProperty_AudioChannelLayout, _audio_unit_property_listener,
> + core_audio);
> + if (status) {
>
> Not too important, but we check status != noErr in the rest of the code.

OK. BTW, we have plenty of both "(status)" and "(status != noErr)" in the code. Personally I prefer "(status != noErr)" for being more explicit, so I'll change this particular instance.
I've hacked gst_core_audio_probe_caps to report 9 channels and tried playing a 5.1 test file. Result: HAL maps LEFT to LEFT, RIGHT to RIGHT, and omits all other channels (e.g. REAR LEFT is not heard at all). I haven't tested with iOS due to the cumbersomeness of running things on it -- might do it later -- but I think this tells us that the AU only performs simple reordering, omitting channels and inserting silence channels.
Created attachment 299061 [details] [review] Patch 7 - Overhaul of probing caps Changes: * G_STRUCT_OFFSET remarks * Remove cached_caps_valid * Fix channel-mask bug * Correct comment style and some code style
Created attachment 299163 [details] [review] Patch 7 - Overhaul of probing caps (reuploading with Bugzilla URL in comment)
Created attachment 299168 [details] [review] Patch 8 - Invalidate cached caps on format change Changes: * Eliminate cached_caps_valid, lock on osxbuf in _audio_unit_property_listener
Created attachment 299169 [details] [review] Patch 9 - Warn on missing channels Changes: * Update so it will apply
(In reply to Ilya Konstantinov from comment #36)
> I've hacked gst_core_audio_probe_caps to report 9 channels and tried playing
> a 5.1 test file.
>
> Result: HAL maps LEFT to LEFT, RIGHT to RIGHT, and omits all other channels
> (e.g. REAR LEFT is not heard at all).
>
> I haven't tested with iOS due to the cumbersomeness of running things on it
> -- might do it later -- but I think this tells us that the AU only performs
> simple reordering, omitting channels and inserting silence channels.

In this case, I think we should report exactly what the sink or source supports and not try to use anything else. If the channel count or configuration doesn't match, it _should_ be remapped to match the actual output, and if that's not happening, we should let negotiation cause an audioconvert to be used for this if present (or else fail entirely).
(In reply to Arun Raghavan from comment #41)
> In this case, I think we should report exactly what the sink or source
> supports and not try to use anything else. If the channel count or
> configuration doesn't match, it _should_ be remapped to match the actual
> output, and if that's not happening, we should let negotiation cause an
> audioconvert to be used for this if present (or else fail entirely).

Why not expose the Audio Unit functionality as is? The warning I've implemented should give an indication of the missing channels. If you wish, we can turn the warning into an error.

I don't see any reason to fail when someone is pushing (L,R) into a 5.1 sink, and no reason to force him to use audioconvert or whatever.
(In reply to Ilya Konstantinov from comment #42)
> (In reply to Arun Raghavan from comment #41)
> > In this case, I think we should report exactly what the sink or source
> > supports and not try to use anything else. If the channel count or
> > configuration doesn't match, it _should_ be remapped to match the actual
> > output, and if that's not happening, we should let negotiation cause an
> > audioconvert to be used for this if present (or else fail entirely).
>
> Why not expose the Audio Unit functionality as is? The warning I've
> implemented should give an indication of the missing channels. If you wish,
> we can turn the warning into an error.
>
> I don't see any reason to fail when someone is pushing (L,R) into a 5.1
> sink, and no reason to force him to use audioconvert or whatever.

On systems that I'm familiar with (namely Linux with PulseAudio), you either configure your system for stereo output or 5.1 or 7.1 or whatever, and from that point on, you expect the output to match what you've configured it for. Streams that play out a different channel count get remapped/upmixed/downmixed appropriately. IMO this makes sense with respect to what the user expects.

So I guess I'd model what we do based on what the "standard" behaviour on the platform is -- with the higher-level frameworks, does playing a 2ch stream on a 5.1ch output result in upmixing? If yes, then that's what developers and end-users expect, and playing out a partial set of channels would be incorrect. Conversely, if the native frameworks do what you've implemented now, I think that's fine as it is (though I'd be a bit surprised, since it seems a bit counter-intuitive to me).
(In reply to Arun Raghavan from comment #43)
> On systems that I'm familiar with (namely Linux with PulseAudio), you either
> configure your system for stereo output or 5.1 or 7.1 or whatever, and from
> that point on, you expect the output to match what you've configured it for.

On the system level, the user defines his speaker setup (through the Audio MIDI Setup app). On the app level, osxaudio configures the inner scope.

> So I guess I'd model what we do based on what the "standard" behaviour on
> the platform is -- with the higher-level frameworks, does playing a 2ch
> stream on a 5.1ch output result in upmixing? If yes, then that's what
> developers and end-users expect, and playing out a partial set of channels
> would be incorrect.

I don't know the other higher-level frameworks on OS X, but we might as well regard GStreamer as *the* higher-level framework here, since we're implementing a drop-in system source/sink element.

In general, I'm fine with forcing people to use $some_element to upmix/downmix, or to remove channels or to add silence channels (is there an element for that?), as long as it doesn't come with performance implications. If there's a performance implication, let's try to find a way to use the Core Audio infrastructure to the maximum extent possible.

For example, we can consider only announcing caps with a channel-mask, and:
* adding a 'channel-mask' property that'll only accept subsets of the hardware channel mask, and/or
* adding a 'force-channel-mask' property that'll make osxaudio accept an arbitrary channel mask, and/or
* adding a 'no-channel-mask' property to enable my current behavior

I just want to be careful not to force unneeded overhead.
Arun, I just tested on iOS and we have a problem with GST_OBJECT_LOCK (core_audio->osxbuf) in _audio_unit_property_listener. See the stack trace with my comments:

TestSendReceive`_audio_unit_property_listener(inRefCon=0x0000000103a26ee0, inUnit=0x0000000170423d40, inID=8, inScope=1, inElement=1) + 192 at gstosxcoreaudio.c:87
  ^ trying to recursively lock osxbuf here (gst lock is non-recursive)
AudioToolbox`AUBase::PropertyChanged(unsigned int, unsigned int, unsigned int) + 92
AudioToolbox`AUBase::ChangeStreamFormat(unsigned int, unsigned int, CAStreamBasicDescription const&, CAStreamBasicDescription const&) + 296
AudioToolbox`AUConverterBase::ChangeStreamFormat(unsigned int, unsigned int, CAStreamBasicDescription const&, CAStreamBasicDescription const&) + 44
AudioToolbox`AURemoteIO::ChangeHardwareFormats(AudioStreamBasicDescription const*, AudioStreamBasicDescription const*, unsigned int, AudioChannelLayout const*, unsigned int&, SharableMemoryBlock::ClientToken&, SharableMemoryBlock::ClientToken&) + 492
  ^ initializing Remote IO sets up the outer scope's hardware formats
AudioToolbox`AURemoteIO::Initialize() + 1196
AudioToolbox`AUBase::DoInitialize() + 48
AudioToolbox`AUMethodInitialize(void*) + 76
TestSendReceive`gst_core_audio_open(core_audio=0x0000000103a26ee0) + 384 at gstosxcoreaudio.c:163
TestSendReceive`gst_osx_audio_ring_buffer_open_device(buf=0x0000000103cfc130) + 96 at gstosxaudioringbuffer.c:148
  ^ osxbuf is already locked here

What do you suggest I do:
* go back to the 'cached_caps_valid' boolean flag (which doesn't necessitate locking to modify it),
* introduce a new recursive mutex,
* somehow pend handling of the property change (in fact, equivalent to signalling with the cached_caps_valid boolean flag),
* or some other solution?
Created attachment 300058 [details] [review] Patch 9 - Fix lockup in _audio_unit_property_listener Bringing back cached_caps_valid.
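A hedged sketch of the cached_caps_valid approach being brought back, assuming the flag lives on GstCoreAudio as a gint so the listener can clear it without taking the non-recursive osxbuf lock. Names mirror the thread; the actual patch may differ.

static void
_audio_unit_property_listener (void *inRefCon, AudioUnit inUnit,
    AudioUnitPropertyID inID, AudioUnitScope inScope,
    AudioUnitElement inElement)
{
  GstCoreAudio *core_audio = GST_CORE_AUDIO (inRefCon);

  /* This may run on the current thread (with osxbuf already locked) or on
   * a Core Audio worker thread, so don't take the osxbuf lock here; just
   * clear the atomic flag. */
  g_atomic_int_set (&core_audio->cached_caps_valid, 0);
}

/* ... and in get_caps, re-probe when the flag has been cleared:
 *
 *   if (!g_atomic_int_get (&core_audio->cached_caps_valid)) {
 *     caps = gst_core_audio_probe_caps (core_audio, template_caps);
 *     ...
 *   }
 */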
I'm seeing one problem (in OWR): osxaudiosrc fixates on F32LE even though downstream elements (osxaudiosrc -> ... -> intersaudiosink -> ... -> audioconvert -> filter) require S16LE. audioconvert comes into play, which is of course undesirable. This seems to happen with armv7 but didn't happen on arm64 (?!). I'm currently researching.
Created attachment 300065 [details] [review] Patch 7 - Overhaul of probing caps

Changes:
* gst_structure_set (out_s, "channels", GST_TYPE_INT_RANGE, 1, channels, NULL) is invalid when channels == 1, so added an 'if' for this
* need to cast UInt32 before printf'ing, since UInt32 (the Darwin type, not C99 uint32_t) is 'unsigned long' (?!) on 32-bit architectures
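A sketch of the guard described above ('out_s' and 'channels' are assumed to exist in the probing code): a GST_TYPE_INT_RANGE with equal bounds is invalid, so the mono case needs a plain integer instead.

if (channels > 1)
  gst_structure_set (out_s, "channels", GST_TYPE_INT_RANGE, 1, channels, NULL);
else
  gst_structure_set (out_s, "channels", G_TYPE_INT, 1, NULL);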
Created attachment 300073 [details] [review] Patch - Fix string format warning on 32-bit UInt32 (Darwin's, not C99 uint32_t) is 'unsigned long' on 32-bit.
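For illustration, the kind of cast that fix implies; the 'asbd' variable is just an assumed example.

/* Darwin's UInt32 is 'unsigned long' on 32-bit, so cast explicitly before
 * using a %u conversion to keep the format string portable. */
GST_DEBUG ("channels per frame: %u", (unsigned int) asbd.mChannelsPerFrame);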
(In reply to Ilya Konstantinov from comment #47)
> This seems to happen with armv7 but didn't happen on arm64 (?!).

Actually, this happens on both armv7 and arm64. It's just that on arm64, gst_audio_quantize_quantize_signed_tpdf_none is somehow devilishly efficient (0.9%) while on armv7 it's very inefficient (22.6%). I don't understand what could cause such a difference.

In any case, for osxaudio we can:
* add a property that disables preferred caps, or
* expect apps (e.g. OpenWebRTC) to know that osxaudiosrc is capable of conversion/resampling, and either avoid inserting audioconvert/resample or add another filter to force osxaudiosrc to a specific format

Arun, please advise (here or on IRC).
Created attachment 300194 [details] [review] Patch 7 - Overhaul of probing caps

Changes:
* No longer offering [1, n] channel range
* No longer offering caps without channel-mask (except for mono, of course)
* Explicitly offering stereo-mono upmixing/downmixing, since HAL and Remote IO do it
* Comment about removing gst_audio_ring_buffer_set_channel_positions
I've noticed that while the output HAL element (i.e. osxaudiosink) gets resampling, the input element (osxaudiosrc) behaves really strangely. For non-multiples of the sample rate, AudioUnitRender returns an error. For multiples (e.g. Internal Input Microphone set to 44,100, osxaudiosrc set to 22,050), the audio is corrupted. (I wonder if it's something we're doing wrong.)
(On iOS, resampling works well on all sides.)
Arun, please review: https://github.com/ikonst/gst-plugins-good/commits/master
* osxaudiosrc: no resampling on OS X
* osxaudio: stereo can have channel_mask == 0
* osxaudiosrc: avoid get_channel_layout
Arun, not sure how I missed it, but your patch http://cgit.freedesktop.org/~arun/gst-plugins-good/commit/?h=osxaudio&id=2356335a9873479c46e17bc31e3cff6441000a63 makes the mistake of replacing GST_AUDIO_CHANNEL_POSITION_MASK (FRONT_LEFT) with GST_AUDIO_CHANNEL_POSITION_FRONT_LEFT. The former is the mask value, the latter is just an enum.
Another bug: http://cgit.freedesktop.org/~arun/gst-plugins-good/commit/?h=osxaudio&id=3e75c439e5764ad7ac0de4fc3845ec31a1bc2985

In gst_core_audio_initialize_impl, you've done:

  GST_DEBUG_OBJECT (core_audio, "osxbuf ring buffer acquired");
- return TRUE;

so it always returns FALSE.

-----

Also, generally I'm not sure about your "osxaudio: Avoid making a duplicate structure in caps for mono/stereo case" change. It makes things more compact, but what does it actually mean to have:

channels:[1,2], channel-mask:LEFT|RIGHT

If the sink element wants "channels:1" (without a channel-mask, obviously), the negotiated pipeline will have the fixated caps:

channels:1, channel-mask:LEFT|RIGHT

which is strange and confusing.
Pushed along after a review iteration on IRC.