GNOME Bugzilla – Bug 372071
gstbasesrc : race condition upon gst_base_src_deactivate crashes in gst_base_src_get_range
Last modified: 2006-11-13 17:58:36 UTC
Overview Description: A crash can occur in gstbasesrc when a pipeline is interupted before normal EOS by changing its state to NULL. The unprotected call to gst_pad_stop_task in gst_base_src_deactivate can lead to unrefs of member objects still used in another thread. Steps to Reproduce: This was tested with gstreamer-0.10.6, but the behaviour hasn't changed in HEAD so the problem should remain... This is impossible to reproduce using gst-launch-0.10, it occurs in the scope of a program which dynamically builds a pipeline, runs, and halts it upon receiving the first buffers from a fakesink's handoff callback. The program uses gstreamer for media analysis by peeking a few decoded buffers and their caps. The main thread of the program has the same behaviour as gst-launch-0.10 for polling the bus in a loop. When the loop catches and EOS message, it stops and changes the state of the pipeline bin to NULL. The callback function, when deciding it has received enough buffers will post and EOS message on the bus. This way, no change state is triggered within any thread. 1) As the state is changed to NULL, a pipeline containing a filesrc will call gst_base_src_deactivate. This will unref the mmap objects of the filesrc, regardless of what it's currently doing. 2) On some rare occasion, the loop thread of a demuxer will execute gst_base_src_get_range at the exact same time, pass the checks for pad activity and end-up with an unreffed buffer moments later. Actual Results: This of course leads to various side effects ranging from gst and glib warnings to plain seg fault. Expected Results: All data flow handling should be protected from element state changes. Patch/Fix: Locking the basesrc object before triggering gst_pad_stop_task fixes the problem... This simple patch has been created for HEAD : --- original/gstbasesrc.c 2006-11-07 10:36:53.000000000 -0500 +++ patched/gstbasesrc.c 2006-11-07 10:36:01.000000000 -0500 @@ -1872,7 +1872,9 @@ result = gst_base_src_unlock (basesrc); /* step 2, make sure streaming finishes */ + GST_OBJECT_LOCK(basesrc); result &= gst_pad_stop_task (pad); + GST_OBJECT_UNLOCK(basesrc); return result; } Debug Traces: (gdb) bt
+ Trace 84032
Okay the fix was a bit hasty, I retested for a while and the problem still happened, only much less often... Locking the mutex like this much change the timing a lot. I'm investigating locking the pad in gst_base_src_get_range instead since it's this object the gst_pad_pause_task will try to lock first. Will test throughly and post a new patch if things look better.
you are not allowed to change the state of a plugin/pipeline from the streaming thread. we try to catch and warn for those cases but it's not always possible. please use the gstbus to marshall the state change to the main thread.
Yes, that's how it's implemented in my app. From the callback I post an EOS message to the bus, the message is caught in the main thread then the change state is called. I know changing the state of an element from a thread poses problems... I've retested much with GStreamer CVS HEAD and could not reproduce. However I didn't see a significant change since 10.6 in gstbasesrc.c which could have fixed this, conceptualy speaking the problem looks to be still there, unless the fix lays at a different level.
OK, then it should work fine, it might be related to other problems indeed (like a sink not changing state properly). Also conceptually it is not possible to have a problem since the resources are only freed when the streaming task is finished. This is ensured by grabbing the stream lock before calling _stop(). I'm closing, please reopen if there is still a problem.