GNOME Bugzilla – Bug 761940
v4l2: deadlock when using dmabuf-import (zero-copy) between two V4L2 devices
Last modified: 2016-06-10 15:56:07 UTC
Hi,

I'm working on a Freescale i.MX6 board that has a hardware video encoder (CODA) exposed as a V4L2 device (/dev/video2). The board also has a camera connected, which is exposed as another V4L2 device (/dev/video0).

I'm using GStreamer to encode and stream video from the camera. The pipelines I'm using are these:

1. On the board, to record, encode and stream the video:

$ gst-launch-1.0 v4l2src ! video/x-raw,width=1280,height=720,framerate=30/1 ! v4l2video2h264enc ! h264parse ! rtph264pay pt=96 config-interval=5 ! udpsink host=${my.local.ip.addr} port=5555

2. On my PC, to receive the video and measure the FPS at the same time:

$ gst-launch-1.0 -e -v udpsrc port=5555 ! application/x-rtp, payload=96 ! rtpjitterbuffer ! rtph264depay ! avdec_h264 ! fpsdisplaysink sync=false text-overlay=false

Everything works fine: on my PC I get video playback at 30 FPS without problems.

The issue is that this takes around 45% CPU on the board, due to the memcopies of raw 720p video from v4l2src0:pool:src to v4l2video2h264enc0:pool:sink.

To fix that, I want to use zero-copy (via dmabuf-import), so I changed the pipeline on the board to:

$ gst-launch-1.0 v4l2src io-mode=dmabuf ! video/x-raw,width=1280,height=720,framerate=30/1 ! v4l2video2h264enc output-io-mode=dmabuf-import ! h264parse ! rtph264pay pt=96 config-interval=5 ! udpsink host=${my.local.ip.addr} port=5555

The issue is that it is not working: I get a deadlock.

There are 3 bufferpools at play here:

- v4l2src0:pool:src -> The source of the video from the camera V4L2 device /dev/video0.
- v4l2video2h264enc0:pool:sink -> The input of the encoder (the sink pad of the V4L2 hardware encoder at /dev/video2). It matches the output-io-mode property of v4l2video2h264enc.
- v4l2video2h264enc0:pool:src -> The output of the encoder (the src pad of the V4L2 hardware encoder at /dev/video2). It matches the capture-io-mode property of v4l2video2h264enc.

See the attached files properties_of_v4l2src.txt and properties_of_v4l2video2h264enc.txt for more info.

Some comments about the versions used:

- I'm using gstreamer and gstreamer-plugins-good 1.4.5
- For the v4l2video encoder element of gstreamer-plugins-good I'm using the patches available in bug 728438 <https://gitlab.com/veo-labs/gst-plugins-good>
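
For a sense of scale, the cost of copying raw 720p frames at 30 FPS can be estimated with a short calculation. This is only a rough sketch: the report does not state the negotiated pixel format, so I420 and YUY2 below are assumptions, and it counts a single copy of each frame.

```python
# Rough memcpy bandwidth for raw 1280x720 video at 30 FPS.
# The pixel format is NOT given in the report; I420 and YUY2 are assumptions.
WIDTH, HEIGHT, FPS = 1280, 720, 30

# Average bytes per pixel for two common raw formats (hypothetical choices).
BYTES_PER_PIXEL = {"I420": 1.5, "YUY2": 2.0}


def copy_rate_bytes_per_second(bytes_per_pixel):
    """Bytes copied per second if every frame is copied exactly once."""
    frame_size = int(WIDTH * HEIGHT * bytes_per_pixel)
    return frame_size * FPS


rates = {name: copy_rate_bytes_per_second(bpp)
         for name, bpp in BYTES_PER_PIXEL.items()}

for name, rate in rates.items():
    print(f"{name}: {rate / 1e6:.1f} MB/s")
```

Under these assumptions the copy moves roughly 41 to 55 MB every second, which is the work that passing dmabuf handles between the two devices is meant to avoid.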
Created attachment 320972 [details] gst-inspect-1.0 v4l2video2h264enc
Created attachment 320973 [details] gst-inspect-1.0 v4l2src
I have been debugging this, and I have observed that if I raise GST_V4L2_MIN_BUFFERS from 2 to 8, video playback works for ~1 second and then it deadlocks again. After more debugging, I think the deadlock happens when the bufferpool is full.
Finally, I applied a small patch that doesn't let the bufferpool get full, and suddenly everything works as expected. Video playback is fine and stable (30 FPS without issue) using zero-copy (dmabuf-import), and the CPU usage drops to ~[3-15]%, which is much better than without zero-copy. I'm attaching the patch for comments.
Created attachment 320975 [details] [review] Patch that makes zero-copy work between two V4L2 devices.
(In reply to Carlos Alberto Lopez Perez from comment #0)
> Some comments about the versions used:
> - I'm using gstreamer and gstreamer-plugins-good 1.4.5

Please try with 1.6 or master, so we know whether this has been fixed in a recent version or not.

> - For the v4l2video encoder element of gstreamer-plugins-good I'm using the
> patches available in bug 728438
> <https://gitlab.com/veo-labs/gst-plugins-good>

Obviously, if there is a bug in that element, we can only guide you. There is quite a large list of TODOs in the review bug for this element. I started addressing them in my free time, but my free time has been limited lately.
(In reply to Carlos Alberto Lopez Perez from comment #3)
> I have been debugging this, and I have observed that if I raise
> GST_V4L2_MIN_BUFFERS from 2 to 8 I get video playback working for ~1 second
> and then it deadlocks again.
>
> After more debugging, I think that the deadlocks happens when the bufferpool
> is full.

We call this a stall, not a deadlock, btw. It happens if you exhaust a buffer pool's capacity. This would indicate that the encoder is not requesting enough buffers from upstream (see propose_allocation) or is not releasing those buffers properly.
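
The stall described above can be reproduced with a toy model. This is a minimal sketch, not GStreamer code: `BufferPool`, `run` and the `release_downstream` flag are hypothetical names, and the timeout only exists so the example terminates instead of blocking forever as a real stalled pool would.

```python
import queue


class BufferPool:
    """Toy fixed-size pool; a stand-in, not the GstBufferPool API."""

    def __init__(self, size):
        self._free = queue.Queue()
        for index in range(size):
            self._free.put(index)

    def acquire(self, timeout):
        # Blocks until a buffer is free; raises queue.Empty on timeout.
        return self._free.get(timeout=timeout)

    def release(self, index):
        self._free.put(index)


def run(pool, frames, release_downstream):
    """Acquire one buffer per frame; return how many succeeded before a stall."""
    acquired = 0
    for _ in range(frames):
        try:
            buf = pool.acquire(timeout=0.05)
        except queue.Empty:
            return acquired  # pool exhausted: the producer would stall here
        acquired += 1
        if release_downstream:
            pool.release(buf)  # downstream hands the buffer back
    return acquired


stalled_after = run(BufferPool(2), frames=10, release_downstream=False)
healthy = run(BufferPool(2), frames=10, release_downstream=True)
print(stalled_after, healthy)
```

When downstream never returns buffers, the producer stops as soon as the pool size is reached, no matter how many frames remain. A larger pool only delays that point, which is consistent with the stall coming back after GST_V4L2_MIN_BUFFERS was raised.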
Review of attachment 320975 [details] [review]: I think a better solution would be to implement a try_dqbuf. Just like dqbuf except it would be non-blocking. This way, for every queued buffer, we'd run a loop to release anything no longer in use. This would reduce the amount of buffer waste that we have atm.
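
The idea of a non-blocking dequeue can be sketched as follows. This is an illustration under stated assumptions, not the gst-plugins-good implementation: `ToyDevice`, `finish_one` and `queue_and_reclaim` are hypothetical names, and the "driver" is simulated with two in-memory queues.

```python
from collections import deque


class ToyDevice:
    """Stand-in for one V4L2 buffer queue; all names here are hypothetical."""

    def __init__(self):
        self.queued = deque()  # buffers currently owned by the driver
        self.done = deque()    # buffers the driver has finished with

    def qbuf(self, index):
        self.queued.append(index)

    def finish_one(self):
        # Simulates the hardware completing the oldest queued buffer.
        if self.queued:
            self.done.append(self.queued.popleft())

    def try_dqbuf(self):
        # Non-blocking: return a finished buffer index, or None if none is ready.
        return self.done.popleft() if self.done else None


def queue_and_reclaim(device, index, free_list):
    """Queue one buffer, then release every buffer the driver no longer uses."""
    device.qbuf(index)
    while True:
        finished = device.try_dqbuf()
        if finished is None:
            break
        free_list.append(finished)


device = ToyDevice()
free_list = []
queue_and_reclaim(device, 0, free_list)  # nothing has finished yet
device.finish_one()                      # hardware finishes buffer 0
queue_and_reclaim(device, 1, free_list)  # reclaims buffer 0 without blocking
print(free_list, list(device.queued))
```

On a real device, the equivalent behaviour comes from opening the node with O_NONBLOCK, in which case VIDIOC_DQBUF fails with EAGAIN instead of waiting when no buffer is ready.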
Is there still any interest here?