GNOME Bugzilla – Bug 736071
audiobasesink: Don't hold object lock while calling into other objects like the clock
Last modified: 2014-10-22 17:53:04 UTC
At times while doing the Pipeline state transition from PAUSED to PLAYING i see a deadlock between the PulseAudio Main Loop thread and AudioBaseSink Render thread. The PA thread holds the pulseaudio mainloop lock and is trying to obtain the pulsesink element lock in a call to gst_element_post_message_default(). The AudioBaseSink render thread holds the pulsesink element lock and is trying to obtain the pulseaudio mainloop lock in a call to gst_pulsesink_get_time(). Stack-trace of the 2 threads: Thread 1
+ Trace 234054
Possible fix (maybe): In gst_audio_base_sink_skew_slaving inside gst-plugins-base/gst-libs/gst/audio/gstaudiobasesink.c we call gst_audio_clock_get_time on sink->provided_clock without acquiring the GST_OBJECT_LOCK(sink). Likewise in gst_audio_base_sink_setcaps. Can we do the same in this case also? Something like in the attached patch.
Created attachment 285401 [details] [review] Possible fix to the problem. Release object lock on sink before calling gst_audio_clock_get_time.
Review of attachment 285401 [details] [review]: Can you attach the patch in "git format-patch" format? ::: gst-libs/gst/audio/gstaudiobasesink.c @@ +1482,1 @@ itime = gst_audio_clock_adjust (sink->provided_clock, itime); Move the "etime = ..." line below this, and keep the lock unlocked for both "itime = ..." lines. Releasing the lock here seems safe considering the other code using the clock and the code above these lines
Created attachment 285511 [details] [review] New Patch after suggested changes.
Review of attachment 285401 [details] [review]: Please take a look at the new patch. Thanks for your review.
Created attachment 285529 [details] [review] Please look at this one for review. Thanks. Updated the email address. Please look at this one for review. Thanks.
Please also put your full name into the patch next time, thanks!
Created attachment 287358 [details] [review] pulsesink: Make emitting stream status messages synchronous The stream status messages are emitted in the PA mainloop thread, which means the mainloop lock is taken, followed by the Gst object lock (by gst_element_post_message()). In all other locations, the order of locking is reversed (this is unavoidable in a bunch of cases where the object lock is taken by GstBaseSink or GstAudioBaseSink, and then we get control to take the mainloop lock). The only way to guarantee that the defer callback for stream status messages doesn't deadlock is to either stop posting those messages, or make sure that the message emission is completed before we proceed to any point that might take the object lock before the mainloop lock (which is what we do after this patch).
Reopening because the original fix was not complete. Attached patch should fix this properly.
Created attachment 287410 [details] [review] pulsesink: Make emitting stream status messages synchronous The stream status messages are emitted in the PA mainloop thread, which means the mainloop lock is taken, followed by the Gst object lock (by gst_element_post_message()). In all other locations, the order of locking is reversed (this is unavoidable in a bunch of cases where the object lock is taken by GstBaseSink or GstAudioBaseSink, and then we get control to take the mainloop lock). The only way to guarantee that the defer callback for stream status messages doesn't deadlock is to either stop posting those messages, or make sure that the message emission is completed before we proceed to any point that might take the object lock before the mainloop lock (which is what we do after this patch).
Comment on attachment 287410 [details] [review] pulsesink: Make emitting stream status messages synchronous Updated and pushed the patch with a few more comments about what was done.
Hi, it seems like this patch introduced a regression, when a position query is done on the sink before it emits the stream-status message a deadlock happens:
+ Trace 234182
Thread 6 (Thread 0x7f4e0ddb8700 (LWP 12951))
Thread 1 (Thread 0x7f4e74279940 (LWP 12920))
Okay, so it is not possible to emit this message within the PulseAudio thread context while maintaining locking order (GST_OBJECT_LOCK before pa_threaded_mainloop_lock). This being the case, I see the following alternatives: 1. Create the message in the PA mainloop thread (since we need the thread id), but emit the enter/leave status messages from another thread (without the mainloop lock held) 2. Stop emitting the enter/leave status messages altogether I'm leaning towards (1) unless someone sees a reason not to (the only thing I can think of is that there might be synchronous listeners of the stream status message that hope to be called in the audio mainloop thread context, but it's possible that this is not meaningful).
The stream status messages are usually picked up with a sync handler to set thread priorities or somesuch, no?
That's the impression I got from the documentation. I can't imagine that PulseAudio is alone in using a lock to protect the loop thread vs. other threads, so this should be a more general problem?
We talked about this at GstConf and the conclusion was: 1. For now, I'm going to comment out the stream status messages 2. I'm going to try to add PA API to allow execution of a callback in the mainloop without the mainloop lock held 3. Once 2. is done, we can use that to reintroduce the stream status messages
The following fix has been pushed: 1631557 pulsesink: Temporarily disable stream status posting
Created attachment 289154 [details] [review] pulsesink: Temporarily disable stream status posting We need a mechanism in PulseAudio to allow running code outside the mainloop lock. Then we'd be able to post to the bus (taking the GST_OBJECT_LOCK), without worrying about locking order with the mainloop lock, which is the current cause of deadlocks while trying to post the stream status messages.