GNOME Bugzilla – Bug 547352
race condition with GstPadBufferAllocFunction
Last modified: 2011-08-11 12:54:29 UTC
There is a race condition when gstreamer call an element GstPadBufferAllocFunction. There is a possibility that the element is doing a state change from paused to ready at the same time. I got the problem in my older version of gstreamer, but I see the problem is still there in head. I can see this with this pipeline and some modif in the code gst-launch videotestsrc num-buffers=2000 ! videoscale ! "video/x-raw-yuv, format=(fourcc)YUY2, width=(int)320, height=(int)240, framerate=(fraction)30/1" ! fakesink And doing ctrl-c. The modif I made to see the problem occur are in gstbasetransform. I just set flags and some g_usleep to make the problem occur more often. I can see that gst_base_transform_activate is called (and code after the STREAM_LOCK/UNLOCK run while at the same time, some code is running on the same element in gst_base_transform_buffer_alloc. (on my older gstreamer, this cause a problem with the variable trans->have_same_caps being changed and causing a gwarning since I run my app with --gst-fatal-warnings). The fix is simple and consist of putting a stream lock in gstpad.c GST_PAD_STREAM_LOCK (pad); ret = bufferallocfunc (pad, offset, size, caps, buf); GST_PAD_STREAM_UNLOCK (pad); This seems to work correct with my application (and my not up to date gstreamer). However, I am wondering if this can cause some problem.
This should work fine but adding a lock will cause the pad_alloc to be synchronized with the streaming threads and we don't want to enforce this in the core. I see two solutions, a specific lock for this in core or using a new lock in basetransform. I would like to have this fixed at the core level eventually. Where did you put the sleep to trigger the problem more reliably?
Created attachment 116880 [details] [review] modif in basetransform to show the problem
Created attachment 116881 [details] [review] patch for this bug
I put my modif for basetransform with the sleep and some printf to show the potential race condition. With the pipeline and doing ctrl-c, you should be able to see that sometime, the change state is occuring while the buffer alloc function is also running. With the patch, things look correct. About the patch : I am not sure it will synchronize anything new, since the patch do a stream lock on the next element pad (from the current chain function of an upstream element). So, I think this is the same thread anyway that do the buffer alloc and the chain itself.
It could introduce additional locking from different threads if the pad alloc is performed over a queue, like in ... ! decoder ! queue ! sink
By adding in gstpad.c:gst_pad_alloc_buffer_full (GstPad GST_PAD_STREAM_LOCK (peer); ret = bufferallocfunc (peer, offset, size, caps, buf); GST_PAD_STREAM_UNLOCK (peer); When basetransform calls pad_bufferalloc on the queue srcpad the queue tries to call it on it's src pad too and thus it calls gst_pad_alloc_buffer_full with the queue src pad this blocks since it requires the src pad lock and it's busy doing a push... This in turn causes the queue not to work properly since it will always wait for the current element to return from it's push before the preceding element can allocate a buffer and send it to the queue. So the queue blocks as if there was no queue. We need a better fix...
I recently update my code to gstreamer 0.10.20 and found a better and much less intrusive fix. In file gstbasetransform.c, there is a trans->transform_lock used in the gst_base_transform_buffer_alloc function. I simply use this mutex in the gst_base_transform_activate function when activate is stopping. This is much less intrusive than the previous patch in gstpad.c I took some time to look in the code of head and see the problem still there too. however, most of the code has change compared to the version I am using : the mutex is not used anymore in the alloc function. So I have no patch for head right now. If someone need, I can update the modif in basetransform that show the problem in 0.10.20 and Head (similar than in comment #2).
Created attachment 127868 [details] [review] patch for 0.10.20
Wim, could you take a look at this patch? :)
Wim? Is the patch correct? Apart from this, in 0.11 there's no gst_pad_alloc_buffer() anymore and this problem simply can't happen. Ok to close this?
ping!
Closing this bug report as no further information has been provided. Please feel free to reopen this bug if you can provide the information asked for. Thanks!