GNOME Bugzilla – Bug 738102
v4l2bufferpool: cleanly handle streamon failure for output device
Last modified: 2014-10-29 20:55:56 UTC
Created attachment 287976 [details] GStreamer v4l2 log on streamon failure Currently, for an output device, if streamon returns in errors, we end up with a bad assert: GLib (gthread-posix.c): Unexpected error from C library during 'pthread_mutex_lock': Invalid argument. Aborting. Aborted This error is caused by a try to lock a v4l2allocator which has been finalized. I attached a v4l2*:6,bufferpool:6 trace with some additional message. Basically, v4l2 bufferpool is active and we finalize it. In the finalize method, we unref the allocator so it's finalize and then we call parent finalize method, ie gst_buffer_pool_finalize(). This method try to deactivate pool so it call v4l2_buffer_pool_stop() whoch try to lock v4l2_allocator which did not exist anymore. This can be improved by setting allocator to NULL after unref it. However, this issue is caused by the fact that v4l2 bufferpool can't be stopped. Here is my analysis on that issue: We are in v4l2_buffer_pool_process() in V4L2_BUF_TYPE_VIDEO_OUTPUT/GST_V4L2_IO_MMAP case. We successfully queue the first buffer into the driver so pool->buffers[index] == buf, and buf refcount == 2 Then, since we are not streaming, we call STREAMON which returns in error so we unref the current buffer and returns in error. (buf refcount == 1) Then the chain function unref the buffer, so buf is released to the pool using the gst_v4l2_buffer_pool_release_buffer(). But this function won't chain up to the parent release_buffer, ie default_release_buffer, because pool->buffers[index] == NULL condition is false. So, later in gst_v4l2_buffer_pool_stop, the baseclass bufferpool stop call will always return FALSE because the previous buffer has not been release from the baseclass point of view. So I wrote a patch that on streamon failure set pool->buffers[index] to NULL in order to have the buffer really released. Removing the buffer unref on failure will solve crash but we will leak v4l2 bufferpool and allocator because bufferpool has one outstanding buffer which will never be released. Another solution could be to allow the gst_v4l2_buffer_pool_release_buffer to release buffer if pool->buffers[index] is NULL and if we are not streaming.
Created attachment 287979 [details] [review] v4l2bufferpool: set pointers to NULL after some unref in finalize() Just set some pointers to NULL after unref them.
Created attachment 287980 [details] [review] v4l2bufferpool: cleanly handle streamon failure for output device Proposal patch to handle streamon failure.
Review of attachment 287979 [details] [review]: That is not supposed to be useful, finalize is calle once, but! Unrefing object is suppose to happen in dispose() call, not finalize(), and in this case can be called multiple time. The proper fix is to also implement dispose, and anything that is object shall be unref and set to NULL there.
Review of attachment 287980 [details] [review]: No comment for now, need to put more head time on this.
Created attachment 288036 [details] [review] v4l2bufferpool: implement dispose method Implement dispose method for v4l2bufferpool. I move allocator, vallocator and other_pool unref to dispose method but not the gst_object_unref (pool->obj->element) since I don't think it's a good idea to set pool->obj->element to NULL here.
Created attachment 288037 [details] [review] v4l2bufferpool: check that allocator is non null when stopping pool Otherwise, we could dereference NULL allocator when the stop method is called by the GstBufferPool finalize method.
Review of attachment 288036 [details] [review]: ::: sys/v4l2/gstv4l2bufferpool.c @@ +1356,2 @@ /* FIXME Is this required to keep around ? */ gst_object_unref (pool->obj->element); You forgot that one, specially that this one is a circular ref.
Review of attachment 287980 [details] [review]: I think this unref() is completly wrong, the buffer is queued ::: sys/v4l2/gstv4l2bufferpool.c @@ +1705,3 @@ + * so it should not fail */ + g_assert_not_reached (); + } This check isn't useful, _qbuf would have failed if it was not the case. @@ +1711,1 @@ gst_buffer_unref (to_queue); The problem here is that it's not correct to call unref() in the first place, the buffer is no longer owned by this code since qbuf() have succeed. But as you mention there is no logic that would let us go through flush_stop() logic, cause we'll never reach streamon state. That means we need to restore the "not streaming" state, which is assumed to have no buffer queued. Performing the flushing logic should be the way forward imho. Considering this is OUTPUT, and there is only 1 process thread allowed, order does not matter, even though the convention is to set the pointer to null and then decrements the counter. Now, you are missing an important step, the allocator still thinks the buffer is queued, gst_v4l2_allocator_flush (pool->vallocator) will fix that. Finally, to avoid breaking USERPTR/DMABUF-IMPORT, cleanup the qdata: gst_mini_object_set_qdata (GST_MINI_OBJECT (buffer), GST_V4L2_IMPORT_QUARK, NULL, NULL); See gst_v4l2_buffer_pool_flush_stop() for how this cleanup shall be done, take note this case is simpler, you are guarantied to only have 1 buffer queued. So basically that code is right, but you also need to cleanup quarks and allocator state. Good catch btw.
Review of attachment 288037 [details] [review]: Good catch, looks like a really early failure case. I've been chasing these a lot.
Created attachment 288141 [details] [review] v4l2bufferpool: implement dispose method Update commit message and add a comment about non moving gst_object_unref (pool->obj->element) to dispose method. pool->obj->element can't be set to NULL here since it's part of the v4l2object and dispose could be called multiple times.
Created attachment 288235 [details] [review] v4l2bufferpool: cleanly handle streamon failure for output device Updated patch wich perform the flush logic on the queued buffer when streamon failed.
Review of attachment 288235 [details] [review]: Looks good, nice comment.
Review of attachment 288141 [details] [review]: Looks good. I think it's worth having all this in 1.4, what do you think ?
(In reply to comment #13) > Review of attachment 288141 [details] [review]: > > Looks good. I think it's worth having all this in 1.4, what do you think ? I think it's a good idea to target them to 1.4.4 because streamon failure will lead the pipeline/process to keep the v4l2 device busy.
Merged in 1.4: commit f501a9d54aec1a164eba075a82b50693b0da7819 Author: Aurélien Zanelli <aurelien.zanelli@parrot.com> Date: Tue Oct 7 15:29:33 2014 +0200 v4l2bufferpool: cleanly handle streamon failure for output device On streamon failure, the queued buffer is not released from the bufferpool class point of view because it is queued to the driver and the flush logic is not performed since we are not in streaming state. It causes the v4l2 bufferpool to always return that stop method failed and to leak v4l2 objects and buffers. commit f501a9d54aec1a164eba075a82b50693b0da7819 Author: Aurélien Zanelli <aurelien.zanelli@parrot.com> Date: Tue Oct 7 15:29:33 2014 +0200 v4l2bufferpool: cleanly handle streamon failure for output device On streamon failure, the queued buffer is not released from the bufferpool class point of view because it is queued to the driver and the flush logic is not performed since we are not in streaming state. It causes the v4l2 bufferpool to always return that stop method failed and to leak v4l2 objects and buffers. This commit solve this by performing the flush logic in error case, ie flushing the allocator and restoring queued buffer state to non-queued. https://bugzilla.gnome.org/show_bug.cgi?id=738102 commit d953e75c8dc0fc22ac4691ec5f7021020433a409 Author: Aurélien Zanelli <aurelien.zanelli@parrot.com> Date: Wed Oct 8 10:31:21 2014 +0200 v4l2bufferpool: implement dispose method Unref objects in dispose method rather than in finalize in order to prevent circular reference. https://bugzilla.gnome.org/show_bug.cgi?id=738102 commit f6d8d950da7111868898ee7ea9723ea37938a846 Author: Aurélien Zanelli <aurelien.zanelli@parrot.com> Date: Wed Oct 8 10:35:14 2014 +0200 v4l2bufferpool: check that allocator is non null when stopping pool Otherwise, we could dereference NULL allocator when the stop method is called by the GstBufferPool's finalize method. https://bugzilla.gnome.org/show_bug.cgi?id=738102 And in 1.5 as commits 1dcc88 8e9c75 7ed27c