GNOME Bugzilla – Bug 783301
basesrc: Should not hold LIVE_LOCK when calling GstBaseSrcClass::create
Last modified: 2018-03-07 10:44:24 UTC
Created attachment 352965 [details] Test that demonstrate the issue Inside the virtual method ::create() the live lock is held. As pushing caps may block on a queue, or on some downstream element doing allocation query toward a queue, this can lead to yet another appsrc deadlock. I'm attaching an example application to demonstrate one way to trigger a deadlock. I do believe this could be reproduce with just one appsrc, two sinks and a queue, but the initial bug was reported to me with a similar setup to this test.
Where should it do that instead? AFAIK other basesrc subclasses also set caps from ::create(), e.g. if the format for the current buffer changes.
(In reply to Sebastian Dröge (slomo) from comment #1) > Where should it do that instead? It is safe when done from negotiate(), in the case it's called by the base_src_lopp() function. This is called when the reconfigure flag is set on the pad. When that happens, we don't yet hold the live lock. It's quite a limitation of the base class. I'll be fixing this today. > AFAIK other basesrc subclasses also set > caps from ::create(), e.g. if the format for the current buffer changes. It would be nice to share which other live sources does that. This problem only affect live sources, as in non-live, the LIVE_LOCK is only taking back in the PAUSED->READY state, and the pipeline has been unlocked already at this point.
Created attachment 353019 [details] New test, without a sleep and without aggregator This test also demonstrate the issue, but only with elements from -base. With hope I can turn this into a unit test. It also no longer rely on a sleep.
Created attachment 353020 [details] [review] basesrc: Protect access to pool and allocator This was only partly protected by the object lock. Always take the object lock to access the currently configured pool and allocator.
Created attachment 353021 [details] [review] basesrc: Don't hold LIVE_LOCK in create/alloc/fill Holding this lock on live source prevents the source from changing the caps in ::create() without risking a deadlock. Note that this makes the public function gst_base_src_wait_playing() useless as we will wait after create, when we take back the lock.
I took in consideration that decklinkvideosrc is a live source also set the caps inside create() call, and decided to drop the live lock from create/alloc/fill virtual method in basesrc. Overall I'm quite happy with the results, as it's better to not hold any private locks while calling virtual functions. It also remove some limitations that were not documented in basesrc base class. As a side effect though, we should probably deprecate gst_base_src_wait_playing() and remove that call from the audio base class. It's insane to wait in the middle of capturing a buffer, as this endup hiding a time gap. BaseSrc will now wait if needed after create(), which will let the sync code in gst figure-out if the buffer should be dropped or not.
Review of attachment 353021 [details] [review]: That will need a new version. I missed that basesrc was relying on the LIVE_LOCK to determin that the create() virtual have returned. For the gst_basesrc_wait_playing(), that worked, because the function is checking the basesrc flushing state before returning.
Created attachment 354154 [details] [review] basesrc: Don't hold LIVE_LOCK in create/alloc/fill Holding this lock on live source prevents the source from changing the caps in ::create() without risking a deadlock. This has consequences as the LIVE_LOCK was replacing the STREAM_LOCK in many situation. As a side effect: - We no longer need to unlock when doing play/pause as the LIVE_LOCK isn't held. We then let the create() call finish, but will block if the state have changed meanwhile. This has the benefit that wait_preroll() calls in subclass is no longer needed. - We no longer need to change the state to unlock, simplifying the set_flushing() interface - We need different handling for EOS depending if we are in push or pull mode. This patch also document the locking of each private class member and the locking order.
Nothing is simple, this patch demonstrate it. This version now passes gst-validate, I did 10 runs to be sure, I also ran all unit tests I could find. It's quite possible that more reduction can be done (it's most likely possible to replace the LIVE_LOCK with the object lock now, would need someone to try). But this already simplify the base class.
Review of attachment 353020 [details] [review]: Looks good
Review of attachment 354154 [details] [review]: Looks good! ::: libs/gst/base/gstbasesrc.c @@ +255,2 @@ /* QoS *//* with LOCK */ + gboolean qos_enabled; /* unused */ Should just remove qos_enabled as it's a private structure anyway.
(In reply to Olivier Crête from comment #11) > Review of attachment 354154 [details] [review] [review]: > > Looks good! > > ::: libs/gst/base/gstbasesrc.c > @@ +255,2 @@ > /* QoS *//* with LOCK */ > + gboolean qos_enabled; /* unused */ > > Should just remove qos_enabled as it's a private structure anyway. Ah, thanks, I wanted to put that in a seperate patch, but forgot.
Attachment 353020 [details] pushed as 946622e - basesrc: Protect access to pool and allocator Attachment 354154 [details] pushed as 523de1a - basesrc: Don't hold LIVE_LOCK in create/alloc/fill
I've simplified further my test, and pushed a unit test. Author: Nicolas Dufresne <nicolas.dufresne@collabora.com> Date: Mon Jul 3 21:08:02 2017 -0400 test-appsrc: Test state when blocked in caps Event In GStreamer 1.12 and older, the GstBaseSrc live lock used to be held while create() virtual function was called. As appsrc pushes serialized event in that virtual function, we ended up with some deadlock while setting the state to NULL. This test simulates this situation. https://bugzilla.gnome.org/show_bug.cgi?id=783301