GNOME Bugzilla – Bug 767176
vaapiencode: implement flush() vmethod
Last modified: 2016-07-11 17:50:03 UTC
Hi! 'x264enc' plugin seems to hang forever when seeking. This can be easily reproduced using gst-launch-1.0 a 'navseek' element to do the seeking. The following pipeline works, even when using the arrow keys to seek backward and forward: gst-launch-1.0 filesrc location=video.mp4 ! decodebin ! navseek ! autovideosink However, if we add a 'x264enc' element, it does not work after pressing an arrow key to seek, the pipeline hangs forever: gst-launch-1.0 filesrc location=video.mp4 ! decodebin ! navseek ! x264enc ! decodebin ! autovideosink
Seems to work fine for me. Could you get a stack trace with gdb when it hangs?
Also, what decoder is used in your case? pass -v to gst-launch.
with gstreamer-vaapi decoders this hangs (never prerolls the first time): gst-launch-1.0 uridecodebin uri=file:///home/tpm/samples/misc/the-great-gatsby-720p-trailer.mov ! navseek ! videoconvert ! x264enc ! decodebin ! videoconvert ! xvimagesink while this works fine (incl. seeking) gst-launch-1.0 uridecodebin uri=file:///home/tpm/samples/misc/the-great-gatsby-720p-trailer.mov ! navseek ! videoconvert ! x264enc tune=zerolatency ! decodebin ! videoconvert ! xvimagesink
Created attachment 329001 [details] pipeline 1 log gst-launch-1.0 -v filesrc location=video.mp4 ! decodebin ! navseek ! x264enc ! decodebin ! autovideosink
Created attachment 329002 [details] pipeline 2 log gst-launch-1.0 -v filesrc location=video.mp4 ! decodebin ! navseek ! vaapih264enc ! decodebin ! autovideosink
Created attachment 329003 [details] pipeline 2 graph gst-launch-1.0 filesrc location=video.mp4 ! decodebin ! navseek ! vaapih264enc ! decodebin ! autovideosink
Hi Tim, thanks for your quick reply. Sorry for not mentioning it before but yes, I am using gstreamer-vaapi decoders. If I do not use them, both of my 2 pipelines work fine so it does not seem to be an issue with x264enc but with vaapi. Also, if I run both of your pipelines I get the same results as you: the first one hangs and the second one works fine, seeking inclusive. So to clarify, this is what I experience using gstreamer-vaapi with these 3 pipelines: 1) hangs forever (never prerolls the first time): gst-launch-1.0 filesrc location=video.mp4 ! decodebin ! navseek ! x264enc ! decodebin ! autovideosink 2) works fine but hangs forever when seeking: gst-launch-1.0 filesrc location=video.mp4 ! decodebin ! navseek ! vaapih264enc ! decodebin ! autovideosink 3) works fine, even when seeking: gst-launch-1.0 filesrc location=video.mp4 ! decodebin ! navseek ! x264enc tune=zerolatency ! decodebin ! autovideosink I have attached gst-launch's log of *1* (output.txt) and *2* (output-vaapi.txt). I also generated the pipeline graph of *2* as it plays fine when not seeking. I cannot get a gdb backtrace now as my home pc has gstreamer installed without debugging symbols but I will get one and upload it tomorrow using my work-station.
Created attachment 329029 [details] pipeline *1* backtrace pipeline *1* backtrace
Created attachment 329030 [details] pipeline *2* backtrace pipeline *2* backtrace
Hi there, So I run both pipelines *1* and *2* with gdb and I got a very interesting backtrace. Both pipelines seems to hang in the same function. Actually, I think the deadlock may be in *gst_vaapidecode_handle_frame* function (gst/vaapi/gstvaapidecode.c:524) when waiting in the conditional variable. Does that makes sense?
The deadlock seems to be in *thread 4*
That would make a lot of sense :)
Tracing gst-launch-1.0 -v filesrc location= ~/patterns/517748282_4.mp4 ! decodebin ! navseek ! vaapih264enc ! decodebin ! vaapisink I see that when a navseek operation is done, the flush() and purge() vmethods are called as expected, and the element start to pushing buffers again, but they never got released by downstream, so the buffers got exhausted reaching the deadlock. This looks similar like the problem with reverse playback.
(In reply to Víctor Manuel Jáquez Leal from comment #13) > Tracing > > gst-launch-1.0 -v filesrc location= ~/patterns/517748282_4.mp4 ! decodebin ! > navseek ! vaapih264enc ! decodebin ! vaapisink > > I see that when a navseek operation is done, the flush() and purge() > vmethods are called as expected, and the element start to pushing buffers > again, but they never got released by downstream, so the buffers got > exhausted reaching the deadlock. > > This looks similar like the problem with reverse playback. This is because pad_task of srcpad, which is to push frame to next element, is paused when seek performs and then never start again. IMHO, we should implement sink_event vmethod or flush vmethod in vaapiencode. By the way, on git master, the pipeline below doesn't work. This is regression or intended? gst-launch-1.0 filesrc location=video.mp4 ! decodebin ! autovideosink Of course, this pipeline works. gst-launch-1.0 filesrc location=video.mp4 ! decodebin ! vaapisink
Created attachment 329754 [details] [review] vaapiencode: implement flush vmethod on vaapiencode I tested with mpeg2,vp8,h264,h265 and confirmed it's fine :)
Comment on attachment 329754 [details] [review] vaapiencode: implement flush vmethod on vaapiencode Sorry. I spoke too fast. The patch is not 100% good. According to the documentation the flush vmethod: flushes all remaining data from the encoder *without pushing it downstream* And your patch is pushing the frames to downstream, so the flow_ret could have different values to GST_FLOW_OK leading to the same deadlock.
Review of attachment 329754 [details] [review]: ::: gst/vaapi/gstvaapiencode.c @@ +602,3 @@ + GstFlowReturn flow_ret = GST_FLOW_OK; + + if (!encode->encoder) { Being picky, remove these brackets.
Created attachment 329835 [details] [review] vaapiencode: implement flush vmethod on vaapiencode Following victor's comment.
@Hyunjun How did you test your patch? If I run gst-launch-1.0 -v filesrc location= ~/patterns/517748282_4.mp4 ! decodebin ! navseek ! vaapih264enc ! decodebin ! vaapisink And press right arrow several times, I reach the deadlock.
(In reply to Hyunjun Ko from comment #14) > By the way, on git master, the pipeline below doesn't work. This is > regression or intended? > > gst-launch-1.0 filesrc location=video.mp4 ! decodebin ! autovideosink You're right, it is a regression. I'll file a bug.
Created attachment 329894 [details] [review] vaapiencode: implement flush vmethod on vaapiencode Victor: could you test with this patch. I found a deadlock, and I doubt if that is same as yours. And this couldn't be merged even if deadlock doesn't happen any more. Because huge memory leak happens in every time that seek performs. I guess, we should look into flush method in each encoder.
Review of attachment 329894 [details] [review]: I'm surprise you don't enable flushing on your buffer pool. Otherwise, please share a backtrace that describe the deadlock.
Review of attachment 329894 [details] [review]: ::: gst/vaapi/gstvaapiencode.c @@ +624,3 @@ + GST_VIDEO_ENCODER_STREAM_LOCK (encode); + GST_VIDEO_ENCODER_STREAM_LOCK (encode); + This is not an option :( We need to look for a better way to handle this.
Review of attachment 329894 [details] [review]: ::: gst/vaapi/gstvaapiencode.c @@ +624,3 @@ + GST_VIDEO_ENCODER_STREAM_LOCK (encode); + GST_VIDEO_ENCODER_STREAM_LOCK (encode); + Yep, this code is very wrong. To implement flush, you often need to implement a FLUSH_START handler. Make sure you understand all blocking operation in your encoder, and then make sure that in that handler you unblock all those.
Created attachment 330058 [details] [review] vaapiencode: implement flush vmethod on vaapiencode Try another way accroding to Nicolas's advice, and release bufferpool on sinkpad in encoder.
Comment on attachment 330058 [details] [review] vaapiencode: implement flush vmethod on vaapiencode This causes another problem when testing with rtsp-server. I should re-work on it
Hi Hyunjun, Thanks a lot for the patch, I have not tested it with the rtsp-server but it seems to fix the hanging issue when using vaapih264enc. This now works: 1) gst-launch-1.0 filesrc location=video.mp4 ! decodebin ! navseek ! vaapih264enc ! decodebin ! autovideosink However, if we do not use vaapih264enc and use x264enc instead, the pipeline still hangs at the beginning: 2) gst-launch-1.0 filesrc location=video.mp4 ! decodebin ! navseek ! x264enc ! decodebin ! autovideosink It looks to me that there is also another issue outside of vaapih264enc. I will keep testing and try to narrow it down.
(In reply to Julian Bouzas from comment #27) > However, if we do not use vaapih264enc and use x264enc instead, the pipeline > still hangs at the beginning: > 2) gst-launch-1.0 filesrc location=video.mp4 ! decodebin ! navseek ! x264enc > ! decodebin ! autovideosink > > It looks to me that there is also another issue outside of vaapih264enc. I > will keep testing and try to narrow it down. This is working with x264enc on my laptop, while it's very slow. You can try on git master.
(In reply to Hyunjun Ko from comment #26) > Comment on attachment 330058 [details] [review] [review] > vaapiencode: implement flush vmethod on vaapiencode > > This causes another problem when testing with rtsp-server. > I should re-work on it The problem is that it's not working on the scenario "SEEK after PAUSE". This patch causes another deadlock if it's waiting on sink event method. Because processing of data in srcpad loop is not done, since flush-start event is not passed to downstream (push mode) on PAUSED state.
Created attachment 330413 [details] [review] vaapiencode: implement flush vmethod on vaapiencode Propose another solution. I guess it fixes all deadlocks and leaks.
Hi Hyunjun, Thanks a lot for your help. I just tested the patch with git master and it fixes all issues I was having. I cannot see any deadlock anymore so, from my point of view, the bug is fixed :)
Review of attachment 330413 [details] [review]: I've been playing with this patch for a while. I suspect it can be simplified. The trick is to recognize that it is necessary to reset the VA encoder, as Hyunjun discovered. ::: gst/vaapi/gstvaapiencode.c @@ +404,3 @@ g_return_val_if_fail (klass->alloc_encoder, FALSE); + if (encode->encoder && !gst_vaapiencode_reset (encode)) NAK. The purpose of ensure_encode() is to create the encoder *if* it doesn't exist. But it *shall not* destroy it if there's already one. It would be semantically incorrect. @@ +455,3 @@ +{ + gst_vaapi_encoder_replace (&encode->encoder, NULL); + gst_vaapi_plugin_base_close (GST_VAAPI_PLUGIN_BASE (encode)); In my opinion there is no need to "close" the plugin. To close the plugin means to destroy the buffer pools, allocators, caps configuration, etc. And we don't want that, we only need to "reset" (the replace to NULL) the encoder.
Created attachment 330915 [details] [review] vaapidencode: simplification of flush() patch
@Hyunjun, Please review and test the attachment 330915 [details] [review] It goes above your patch. Let me know how it works with your tests. During my tests, I found deadlocks, but they don't happen in the encoder, but in the multiqueue before the demuxer (qtdmeux, asfdemux)
Review of attachment 330413 [details] [review]: ::: gst/vaapi/gstvaapiencode.c @@ +645,3 @@ + if (ret && type == GST_EVENT_FLUSH_START) { + GST_LOG_OBJECT (encode, "pausing task"); + gst_pad_pause_task (GST_VAAPI_PLUGIN_BASE_SRC_PAD (encode)); I think pause is not what we need here, but gst_pad_stop_task(), since we are destroying the VA encoder
All in all, I feel we need to re-work the pad task handling in encoders.
(In reply to Víctor Manuel Jáquez Leal from comment #34) > Let me know how it works with your tests. > > During my tests, I found deadlocks, but they don't happen in the encoder, > but in the multiqueue before the demuxer (qtdmeux, asfdemux) I guess the deadlock you met is same as my case (PAUSE and SEEK) That's why I call gst_pad_puase_task after send_event. (This kind of logic can be found in other plugin...) It's ok that calling gst_pad_stop_task instead of gst_pad_pause_task but it also waits for finishing the task to join the task-thread inside gst_pad_stop_task. As far as I tested, if I call it after send_event, working fine. One more small thing. IMHO, calling gst_pad_pause_task is also OK, even we destroy VA encoder when flushing. Because it's not destroying encoder plugin and plugin's infrastructure including gst_pad. Is this same sense as that you don't want to close plugin? :)
(In reply to Hyunjun Ko from comment #37) > (In reply to Víctor Manuel Jáquez Leal from comment #34) > > > Let me know how it works with your tests. > > > > During my tests, I found deadlocks, but they don't happen in the encoder, > > but in the multiqueue before the demuxer (qtdmeux, asfdemux) > > I guess the deadlock you met is same as my case (PAUSE and SEEK) > That's why I call gst_pad_puase_task after send_event. > (This kind of logic can be found in other plugin...) > > It's ok that calling gst_pad_stop_task instead of gst_pad_pause_task but it > also waits for finishing the task to join the task-thread inside > gst_pad_stop_task. > > As far as I tested, if I call it after send_event, working fine. > > One more small thing. > IMHO, calling gst_pad_pause_task is also OK, even we destroy VA encoder when > flushing. Because it's not destroying encoder plugin and plugin's > infrastructure including gst_pad. > Is this same sense as that you don't want to close plugin? :) D'accord. Let's keep the pause but don't destroy the VA encoder at ensure_encoder(). Hyunjun, Can you cook the patch?
Created attachment 330945 [details] [review] vaapiencode: implement flush vmethod Tested with pipelines below. gst-launch-1.0 filesrc location=video-h264.mkv ! decodebin ! navseek ! vaapih264enc ! decodebin ! vaapisink gst-launch-1.0 filesrc location=video-h265.mkv ! decodebin ! navseek ! vaapih265enc ! decodebin ! vaapisink gst-launch-1.0 filesrc location=video-mpeg2.mkv ! decodebin ! navseek ! vaapimpeg2enc ! decodebin ! vaapisink gst-launch-1.0 filesrc location=video-vp8-webm.mkv ! decodebin ! navseek ! vaapivp8enc ! decodebin ! vaapisink