Bug 761940 - v4l2: deadlock when using dmabuf-import (zero-copy) between two V4L2 devices
Status: RESOLVED OBSOLETE
Product: GStreamer
Classification: Platform
Component: gst-plugins-good
Version: 1.4.5
Hardware/OS: Other Linux
Importance: Normal normal
Target Milestone: git master
Assigned To: GStreamer Maintainers
QA Contact: GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2016-02-12 13:16 UTC by Carlos Alberto Lopez Perez
Modified: 2016-06-10 15:56 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
gst-inspect-1.0 v4l2video2h264enc (4.31 KB, text/plain)
2016-02-12 13:17 UTC, Carlos Alberto Lopez Perez
gst-inspect-1.0 v4l2src (9.16 KB, text/plain)
2016-02-12 13:18 UTC, Carlos Alberto Lopez Perez
Patch that makes zero-copy work between two V4L2 devices. (1.42 KB, patch)
2016-02-12 13:25 UTC, Carlos Alberto Lopez Perez
needs-work

Description Carlos Alberto Lopez Perez 2016-02-12 13:16:53 UTC
Hi,

I'm working on a Freescale i.MX6 board that has a hardware video encoder (CODA)
that is exposed as a V4L2 device (/dev/video2).

The board also has a camera connected, which is exposed as another V4L2 device
(/dev/video0).

I'm using GStreamer to encode and stream video from the camera.
The pipeline I'm using is this:

1. On the board: To record, encode and stream the video:
$ gst-launch-1.0 v4l2src ! video/x-raw,width=1280,height=720,framerate=30/1 ! v4l2video2h264enc ! h264parse ! rtph264pay pt=96 config-interval=5 ! udpsink host=${my.local.ip.addr} port=5555 


2. On my PC to receive the video and measure the FPS at the same time:
$ gst-launch-1.0 -e -v udpsrc port=5555 ! application/x-rtp, payload=96 ! rtpjitterbuffer ! rtph264depay ! avdec_h264 ! fpsdisplaysink sync=false text-overlay=false 


Everything works fine: on my PC I get video playback at 30 FPS without problems.


The issue is that this takes around 45% CPU on the board, due to the memory copies of raw 720p video from v4l2src0:pool:src to v4l2video2h264enc0:pool:sink.
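
For a sense of scale, here is a rough estimate of how much data one copy of every frame moves per second. This is only a sketch: the report does not say which raw format was negotiated, so the two pixel formats below (I420 and YUY2) and the helper name `copy_bandwidth` are assumptions for illustration.

```python
# Rough estimate of the memory traffic caused by copying every raw frame once.
# ASSUMPTION: the pixel formats below are examples; the negotiated format
# is not stated in the report.
BYTES_PER_PIXEL = {"I420": 1.5, "YUY2": 2.0}


def copy_bandwidth(width, height, fps, bytes_per_pixel):
    """Return the bytes per second moved by one copy of each frame."""
    return int(width * height * bytes_per_pixel * fps)


for name, bpp in BYTES_PER_PIXEL.items():
    rate = copy_bandwidth(1280, 720, 30, bpp)
    print(f"{name}: {rate / 1e6:.1f} MB/s per copy")
```

Under those assumptions a single copy moves roughly 41 to 55 MB/s, which is a noticeable load for a CPU-driven memcpy on an embedded board.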

To fix that, I want to use zero-copy (via dmabuf-import).

So I changed the pipeline on the board to:


$ gst-launch-1.0 v4l2src io-mode=dmabuf ! video/x-raw,width=1280,height=720,framerate=30/1 ! v4l2video2h264enc output-io-mode=dmabuf-import ! h264parse ! rtph264pay pt=96 config-interval=5 ! udpsink host=${my.local.ip.addr} port=5555

The issue is that it is not working. I get a deadlock.

There are 3 bufferpools at play here:


  - v4l2src0:pool:src             -> The source of the video from the camera V4L2 device /dev/video0
  - v4l2video2h264enc0:pool:sink  -> The input of the encoder (the sink pad of the V4L2 hardware encoder at /dev/video2). It matches the output-io-mode property of v4l2video2h264enc
  - v4l2video2h264enc0:pool:src   -> The output of the encoder (the src pad of the V4L2 hardware encoder at /dev/video2). It matches the capture-io-mode property of v4l2video2h264enc
  
  See the attached files properties_of_v4l2src.txt and properties_of_v4l2video2h264enc.txt for more info
  

Some comments about the versions used:
 - I'm using gstreamer and gstreamer-plugins-good 1.4.5
 - For the v4l2video encoder element of gstreamer-plugins-good I'm using the patches available in bug 728438 <https://gitlab.com/veo-labs/gst-plugins-good>
Comment 1 Carlos Alberto Lopez Perez 2016-02-12 13:17:46 UTC
Created attachment 320972
gst-inspect-1.0 v4l2video2h264enc
Comment 2 Carlos Alberto Lopez Perez 2016-02-12 13:18:15 UTC
Created attachment 320973
gst-inspect-1.0 v4l2src
Comment 3 Carlos Alberto Lopez Perez 2016-02-12 13:21:22 UTC
I have been debugging this, and I have observed that if I raise GST_V4L2_MIN_BUFFERS from 2 to 8 I get video playback working for ~1 second and then it deadlocks again.

After more debugging, I think that the deadlock happens when the bufferpool is full.
Comment 4 Carlos Alberto Lopez Perez 2016-02-12 13:22:58 UTC
Finally, I applied a small patch that prevents the bufferpool from getting full, and now everything works as expected.

Video playback is working fine and stable (30FPS without issue) using zero-copy (dmabuf-import).

CPU usage drops to roughly 3-15%, which is much better than without zero-copy.


I'm attaching the patch for comments.
Comment 5 Carlos Alberto Lopez Perez 2016-02-12 13:25:03 UTC
Created attachment 320975
Patch that makes zero-copy work between two V4L2 devices.
Comment 6 Nicolas Dufresne (ndufresne) 2016-02-12 17:02:44 UTC
(In reply to Carlos Alberto Lopez Perez from comment #0)
> Some comments about the versions used:
>  - I'm using gstreamer and gstreamer-plugins-good 1.4.5

Please try with 1.6 or master, so we know whether this has been fixed in a recent version.

>  - For the v4l2video encoder element of gstreamer-plugins-good I'm using the
> patches available in bug 728438
> <https://gitlab.com/veo-labs/gst-plugins-good>

Obviously, if there is a bug in that element, we can only guide you. There is quite a large list of TODOs in the review bug for this element. I started addressing them in my free time, but my free time has been limited lately.
Comment 7 Nicolas Dufresne (ndufresne) 2016-02-12 17:04:16 UTC
(In reply to Carlos Alberto Lopez Perez from comment #3)
> I have been debugging this, and I have observed that if I raise
> GST_V4L2_MIN_BUFFERS from 2 to 8 I get video playback working for ~1 second
> and then it deadlocks again.
> 
> After more debugging, I think that the deadlocks happens when the bufferpool
> is full.

We call this a stall, not a deadlock, btw. It happens when you exhaust a buffer pool's capacity. This would indicate that the encoder is not requesting enough buffers from upstream (see propose_allocation) or is not releasing those buffers properly.
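
The stall described above can be illustrated with a toy model. This is not GStreamer code: `frames_before_stall`, `pool_size` and `held_by_encoder` are hypothetical names, and the model simply assumes that the downstream element keeps a fixed number of buffers before it starts returning the oldest one.

```python
import queue


def frames_before_stall(pool_size, held_by_encoder, total_frames=100):
    """Count how many frames are pushed before the source finds its pool empty.

    pool_size: number of buffers in the upstream pool (hypothetical).
    held_by_encoder: buffers the downstream element holds on to before it
    starts returning the oldest one (hypothetical).
    Returns total_frames if the pool never runs dry.
    """
    pool = queue.Queue()
    for i in range(pool_size):
        pool.put(i)

    in_flight = []  # buffers currently held downstream
    for frame in range(total_frames):
        try:
            # A real pool would block here; that blocked acquire is the stall.
            buf = pool.get_nowait()
        except queue.Empty:
            return frame
        in_flight.append(buf)
        if len(in_flight) > held_by_encoder:
            pool.put(in_flight.pop(0))  # oldest buffer goes back to the pool
    return total_frames


print(frames_before_stall(pool_size=2, held_by_encoder=4))
print(frames_before_stall(pool_size=8, held_by_encoder=8))
print(frames_before_stall(pool_size=8, held_by_encoder=4))
```

In this model a larger pool only postpones the stall as long as downstream holds at least as many buffers as the pool contains; the pool keeps flowing once it is larger than what downstream retains.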
Comment 8 Nicolas Dufresne (ndufresne) 2016-06-06 21:30:59 UTC
Review of attachment 320975:

I think a better solution would be to implement a try_dqbuf. Just like dqbuf except it would be non-blocking. This way, for every queued buffer, we'd run a loop to release anything no longer in use. This would reduce the amount of buffer waste that we have atm.
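
A minimal sketch of that idea, as a toy model rather than the actual v4l2 plugin code. `ToyDevice`, `queue_buffer` and the list-based pool are hypothetical stand-ins; in real V4L2 the non-blocking behaviour would come from a device opened with O_NONBLOCK, where VIDIOC_DQBUF fails with EAGAIN when nothing is ready.

```python
import queue


class ToyDevice:
    """Stand-in for a V4L2 buffer queue; NOT the real API."""

    def __init__(self):
        self.done = queue.Queue()  # buffers the hardware has finished with

    def dqbuf(self):
        """Blocking dequeue: waits until a finished buffer is available."""
        return self.done.get()

    def try_dqbuf(self):
        """Non-blocking dequeue: returns None when nothing is finished."""
        try:
            return self.done.get_nowait()
        except queue.Empty:
            return None


def queue_buffer(device, free_pool, buf, queued):
    """Queue one buffer, then release everything that is no longer in use."""
    queued.append(buf)
    while True:
        finished = device.try_dqbuf()
        if finished is None:
            break  # nothing more to reclaim right now, do not block
        queued.remove(finished)
        free_pool.append(finished)
```

Usage, under the same assumptions: each call to `queue_buffer` drains whatever the device has already finished, so buffers return to the pool as early as possible instead of waiting for a blocking dequeue.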
Comment 9 Nicolas Dufresne (ndufresne) 2016-06-07 18:24:33 UTC
Is there still any interest here?