After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 797223 - Race condition in ipcpipelinesrc during flush
Race condition in ipcpipelinesrc during flush
Status: RESOLVED OBSOLETE
Product: GStreamer
Classification: Platform
Component: gst-plugins-bad
git master
Other Linux
: Normal normal
: git master
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2018-09-28 11:24 UTC by Mike Dyer
Modified: 2018-11-03 14:32 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
[PATCH] Fix race condition during flush (3.03 KB, patch)
2018-09-28 11:24 UTC, Mike Dyer
needs-work Details | Review

Description Mike Dyer 2018-09-28 11:24:02 UTC
Created attachment 373802 [details] [review]
[PATCH] Fix race condition during flush

A race condition exists when handling flush-start and flush-stop events in intersrc.
    
When receiving a flush-start event, the event handler calls gst_inter_src_stop_loop() which sets the flag src->flushing.  In the case of pausing the task, it does not wait for gst_inter_src_loop() to pause itself.
    
gst_inter_src_loop() reads the value of src->flushing while holding src->comm.mutex and stores it in a local variable.  After releasing the mutex, gst_inter_src_loop() then proceeds to clear the queue and then set itself to paused, based on the value of the local variable.
    
During the queue clearing, it is possible that a flush-stop event could arrive.  This will clear src->flushing and call gst_pad_start_task() to schedule gst_inter_src_loop().  However, the task hasn't paused yet, so this call does nothing.  gst_inter_src_loop() finishes clearing the queue and sets itself to paused.  Because this occurs after receipt of the flush-stop, the loop remains paused and is not restarted.
    
To prevent this, we can pause the task from gst_inter_src_stop_loop(). This will then wait for the gst_inter_src_loop() to complete.  We can also clear the queue after it completes, removing the vulnerable period (gst_inter_src_cancel_queued() cannot be called from inside gst_inter_src_loop() with the mutex held, as it locks this mutex itself).
    
We must also propagate the flush-start event before attempting to pause gst_inter_src_loop() as it may be blocked pushing to downstream elements.
Comment 1 Olivier Crête 2018-10-01 16:05:17 UTC
Review of attachment 373802 [details] [review]:

I'm pretty sure this patch is not the right solution. I think that on flush-stop, it should probably do whatever is needed with locking (as this is a serialized event) to ensure that the ordering is right. Or maybe after forwarding the flush start, it should take the STREAM lock to ensure proper serialization after receiving the flush-start.

::: sys/ipcpipeline/gstipcpipelinesrc.c
@@ +375,3 @@
+    gst_pad_pause_task (src->srcpad);
+
+  gst_ipc_pipeline_src_cancel_queued (src);

I wonder if doing the cancel queued here before we're certain that the loop is stopped is safe. Especially that the cancel function modifies the list without locking, so it's not safe to run it while the loop is still running.

@@ -693,1 +692,5 @@
     case GST_EVENT_FLUSH_START:
+      GST_DEBUG_OBJECT (src,
+          "Forwarding flush-start immediately");
+      gst_element_call_async (GST_ELEMENT (src), do_oob_event, event,
+          (GDestroyNotify) gst_event_unref);

This is equally racy.. This call delegates the pyshing to another thread. So it could get push before or after the gst_ipc_pipeline_src_stop_loop() call unpredictably.
Comment 2 Mike Dyer 2018-10-08 15:27:36 UTC
(In reply to Olivier Crête from comment #1)
> Review of attachment 373802 [details] [review] [review]:
> 
> I'm pretty sure this patch is not the right solution. I think that on
> flush-stop, it should probably do whatever is needed with locking (as this
> is a serialized event) to ensure that the ordering is right. Or maybe after
> forwarding the flush start, it should take the STREAM lock to ensure proper
> serialization after receiving the flush-start.

Perhaps take the stream lock before starting the task again in gst_ipc_pipeline_src_start_loop()?  That would ensure the loop isn't running before attempting to start it again.
Comment 3 GStreamer system administrator 2018-11-03 14:32:23 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/issues/792.