After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 738102 - v4l2bufferpool: cleanly handle streamon failure for output device
v4l2bufferpool: cleanly handle streamon failure for output device
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gst-plugins-good
git master
Other Linux
: Normal normal
: 1.4.4
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2014-10-07 16:27 UTC by Aurélien Zanelli
Modified: 2014-10-29 20:55 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
GStreamer v4l2 log on streamon failure (134.49 KB, text/x-log)
2014-10-07 16:27 UTC, Aurélien Zanelli
  Details
v4l2bufferpool: set pointers to NULL after some unref in finalize() (1.10 KB, patch)
2014-10-07 16:34 UTC, Aurélien Zanelli
needs-work Details | Review
v4l2bufferpool: cleanly handle streamon failure for output device (1.58 KB, patch)
2014-10-07 16:34 UTC, Aurélien Zanelli
needs-work Details | Review
v4l2bufferpool: implement dispose method (2.12 KB, patch)
2014-10-08 12:10 UTC, Aurélien Zanelli
needs-work Details | Review
v4l2bufferpool: check that allocator is non null when stopping pool (1.30 KB, patch)
2014-10-08 12:13 UTC, Aurélien Zanelli
committed Details | Review
v4l2bufferpool: implement dispose method (2.47 KB, patch)
2014-10-09 16:12 UTC, Aurélien Zanelli
committed Details | Review
v4l2bufferpool: cleanly handle streamon failure for output device (2.04 KB, patch)
2014-10-10 16:10 UTC, Aurélien Zanelli
committed Details | Review

Description Aurélien Zanelli 2014-10-07 16:27:01 UTC
Created attachment 287976 [details]
GStreamer v4l2 log on streamon failure

Currently, for an output device, if streamon returns in errors, we end up with a bad assert:
GLib (gthread-posix.c): Unexpected error from C library during 'pthread_mutex_lock': Invalid argument.  Aborting.
Aborted

This error is caused by a try to lock a v4l2allocator which has been finalized.
I attached a v4l2*:6,bufferpool:6 trace with some additional message.

Basically, v4l2 bufferpool is active and we finalize it. In the finalize method, we unref the allocator so it's finalize and then we call parent finalize method, ie gst_buffer_pool_finalize(). This method try to deactivate pool so it call v4l2_buffer_pool_stop() whoch try to lock v4l2_allocator which did not exist anymore.
This can be improved by setting allocator to NULL after unref it.

However, this issue is caused by the fact that v4l2 bufferpool can't be stopped.
Here is my analysis on that issue:

We are in v4l2_buffer_pool_process() in V4L2_BUF_TYPE_VIDEO_OUTPUT/GST_V4L2_IO_MMAP case. We successfully queue the first buffer into the driver so pool->buffers[index] == buf, and buf refcount == 2
Then, since we are not streaming, we call STREAMON which returns in error so we unref the current buffer and returns in error. (buf refcount == 1)
Then the chain function unref the buffer, so buf is released to the pool using the gst_v4l2_buffer_pool_release_buffer(). But this function won't chain up to the parent release_buffer, ie default_release_buffer, because pool->buffers[index] == NULL condition is false.
So, later in gst_v4l2_buffer_pool_stop, the baseclass bufferpool stop call will always return FALSE because the previous buffer has not been release from the baseclass point of view.


So I wrote a patch that on streamon failure set pool->buffers[index] to NULL in order to have the buffer really released. Removing the buffer unref on failure will solve crash but we will leak v4l2 bufferpool and allocator because bufferpool has one outstanding buffer which will never be released.

Another solution could be to allow the gst_v4l2_buffer_pool_release_buffer to release buffer if pool->buffers[index] is NULL and if we are not streaming.
Comment 1 Aurélien Zanelli 2014-10-07 16:34:01 UTC
Created attachment 287979 [details] [review]
v4l2bufferpool: set pointers to NULL after some unref in finalize()

Just set some pointers to NULL after unref them.
Comment 2 Aurélien Zanelli 2014-10-07 16:34:47 UTC
Created attachment 287980 [details] [review]
v4l2bufferpool: cleanly handle streamon failure for output device

Proposal patch to handle streamon failure.
Comment 3 Nicolas Dufresne (ndufresne) 2014-10-07 17:12:45 UTC
Review of attachment 287979 [details] [review]:

That is not supposed to be useful, finalize is calle once, but! Unrefing object is suppose to happen in dispose() call, not finalize(), and in this case can be called multiple time. The proper fix is to also implement dispose, and anything that is object shall be unref and set to NULL there.
Comment 4 Nicolas Dufresne (ndufresne) 2014-10-07 17:14:41 UTC
Review of attachment 287980 [details] [review]:

No comment for now, need to put more head time on this.
Comment 5 Aurélien Zanelli 2014-10-08 12:10:15 UTC
Created attachment 288036 [details] [review]
v4l2bufferpool: implement dispose method

Implement dispose method for v4l2bufferpool.

I move allocator, vallocator and other_pool unref to dispose method but not the gst_object_unref (pool->obj->element) since I don't think it's a good idea to set pool->obj->element to NULL here.
Comment 6 Aurélien Zanelli 2014-10-08 12:13:12 UTC
Created attachment 288037 [details] [review]
v4l2bufferpool: check that allocator is non null when stopping pool

Otherwise, we could dereference NULL allocator when the stop method is called by the GstBufferPool finalize method.
Comment 7 Nicolas Dufresne (ndufresne) 2014-10-09 15:10:21 UTC
Review of attachment 288036 [details] [review]:

::: sys/v4l2/gstv4l2bufferpool.c
@@ +1356,2 @@
   /* FIXME Is this required to keep around ? */
   gst_object_unref (pool->obj->element);

You forgot that one, specially that this one is a circular ref.
Comment 8 Nicolas Dufresne (ndufresne) 2014-10-09 15:34:30 UTC
Review of attachment 287980 [details] [review]:

I think this unref() is completly wrong, the buffer is queued

::: sys/v4l2/gstv4l2bufferpool.c
@@ +1705,3 @@
+               * so it should not fail */
+              g_assert_not_reached ();
+            }

This check isn't useful, _qbuf would have failed if it was not the case.

@@ +1711,1 @@
             gst_buffer_unref (to_queue);

The problem here is that it's not correct to call unref() in the first place, the buffer is no longer owned by this code since qbuf() have succeed. But as you mention there is no logic that would let us go through flush_stop() logic, cause we'll never reach streamon state. That means we need to restore the "not streaming" state, which is assumed to have no buffer queued. Performing the flushing logic should be the way forward imho.

Considering this is OUTPUT, and there is only 1 process thread allowed, order does not matter, even though the convention is to set the pointer to null and then decrements the counter. Now, you are missing an important step, the allocator still thinks the buffer is queued, gst_v4l2_allocator_flush (pool->vallocator) will fix that. Finally, to avoid breaking USERPTR/DMABUF-IMPORT, cleanup the qdata:

gst_mini_object_set_qdata (GST_MINI_OBJECT (buffer),
              GST_V4L2_IMPORT_QUARK, NULL, NULL);

See gst_v4l2_buffer_pool_flush_stop() for how this cleanup shall be done, take note this case is simpler, you are guarantied to only have 1 buffer queued. So basically that code is right, but you also need to cleanup quarks and allocator state.

Good catch btw.
Comment 9 Nicolas Dufresne (ndufresne) 2014-10-09 15:35:54 UTC
Review of attachment 288037 [details] [review]:

Good catch, looks like a really early failure case. I've been chasing these a lot.
Comment 10 Aurélien Zanelli 2014-10-09 16:12:43 UTC
Created attachment 288141 [details] [review]
v4l2bufferpool: implement dispose method

Update commit message and add a comment about non moving gst_object_unref (pool->obj->element) to dispose method.

pool->obj->element can't be set to NULL here since it's part of the v4l2object and dispose could be called multiple times.
Comment 11 Aurélien Zanelli 2014-10-10 16:10:57 UTC
Created attachment 288235 [details] [review]
v4l2bufferpool: cleanly handle streamon failure for output device

Updated patch wich perform the flush logic on the queued buffer when streamon failed.
Comment 12 Nicolas Dufresne (ndufresne) 2014-10-10 17:17:40 UTC
Review of attachment 288235 [details] [review]:

Looks good, nice comment.
Comment 13 Nicolas Dufresne (ndufresne) 2014-10-10 17:18:51 UTC
Review of attachment 288141 [details] [review]:

Looks good. I think it's worth having all this in 1.4, what do you think ?
Comment 14 Aurélien Zanelli 2014-10-10 17:52:03 UTC
(In reply to comment #13)
> Review of attachment 288141 [details] [review]:
> 
> Looks good. I think it's worth having all this in 1.4, what do you think ?

I think it's a good idea to target them to 1.4.4 because streamon failure will lead the pipeline/process to keep the v4l2 device busy.
Comment 15 Nicolas Dufresne (ndufresne) 2014-10-29 20:54:45 UTC
Merged in 1.4:
commit f501a9d54aec1a164eba075a82b50693b0da7819
Author: Aurélien Zanelli <aurelien.zanelli@parrot.com>
Date:   Tue Oct 7 15:29:33 2014 +0200

    v4l2bufferpool: cleanly handle streamon failure for output device
    
    On streamon failure, the queued buffer is not released from the
    bufferpool class point of view because it is queued to the driver and
    the flush logic is not performed since we are not in streaming state.
    It causes the v4l2 bufferpool to always return that stop method failed
    and to leak v4l2 objects and buffers.
    
commit f501a9d54aec1a164eba075a82b50693b0da7819
Author: Aurélien Zanelli <aurelien.zanelli@parrot.com>
Date:   Tue Oct 7 15:29:33 2014 +0200

    v4l2bufferpool: cleanly handle streamon failure for output device
    
    On streamon failure, the queued buffer is not released from the
    bufferpool class point of view because it is queued to the driver and
    the flush logic is not performed since we are not in streaming state.
    It causes the v4l2 bufferpool to always return that stop method failed
    and to leak v4l2 objects and buffers.
    
    This commit solve this by performing the flush logic in error case, ie
    flushing the allocator and restoring queued buffer state to non-queued.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=738102

commit d953e75c8dc0fc22ac4691ec5f7021020433a409
Author: Aurélien Zanelli <aurelien.zanelli@parrot.com>
Date:   Wed Oct 8 10:31:21 2014 +0200

    v4l2bufferpool: implement dispose method
    
    Unref objects in dispose method rather than in finalize in order to
    prevent circular reference.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=738102

commit f6d8d950da7111868898ee7ea9723ea37938a846
Author: Aurélien Zanelli <aurelien.zanelli@parrot.com>
Date:   Wed Oct 8 10:35:14 2014 +0200

    v4l2bufferpool: check that allocator is non null when stopping pool
    
    Otherwise, we could dereference NULL allocator when the stop method is
    called by the GstBufferPool's finalize method.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=738102

And in 1.5 as commits 1dcc88 8e9c75 7ed27c