GNOME Bugzilla – Bug 777506
vaapidecoder_h264: Recursive loop and segfault after a few seconds
Last modified: 2017-02-07 11:09:36 UTC
Created attachment 343830 [details] Backtrace Hi, We just realized that the vaapi h264 decoder cannot decode a specific rtsp source with the following pipeline: gst-launch-1.0 rtspsrc location=rtsp://... ! rtph264depay ! h264parse ! vaapidecodebin ! fakesink It segfaults after a few seconds. The backtrace shows many *gst_vaapi_picture_destroy* being called recursively tons of times and the segfault happens in glib (probably because of OOM). I have attached a reduced version of the backtrace showing the beginning and end of it. Also, adding a caps filter between rtph264depay and h264parse to change the alignment from *nal* to *au* seems to solve the problem: gst-launch-1.0 rtspsrc ! rtph264depay ! video/x-h264,stream-format=byte-stream,alignment=au ! h264parse ! vaapidecode ! autovideosink It would be great for the decoder to decode any kind of alignment without applying a capsfilter. Please, let us know if you need more information. Thanks.
Hi, Thanks! a debug log would be useful too GST_DEBUG=vaapi*:5
Created attachment 343887 [details] GST_DEBUG Log with GST_DEBUG=vaapi*:5
There you go :)
Seems that there is a problem in the loop in gstvaapidecoder_h264.c:3170 for(;;) { // lost picture has parent ref, which is prev_picture lost_picture = gst_vaapi_picture_h264_new_clone (prev_picture); ... // replace prev_picture with lost_picture which has parent(prev_picture) gst_vaapi_picture_replace (&prev_picture, lost_picture); ... } But I don't know how to handle this to fix this issue. And i can't reproduce the case that is being through this loop. Julian, can you share the media you are streaming?
Hi Hyunjun, We are going to check if we can share the media with you so you can reproduce the issue. We will get back to you soon. Thanks!
(In reply to Hyunjun Ko from comment #4) > Seems that there is a problem in the loop in gstvaapidecoder_h264.c:3170 > > for(;;) { > // lost picture has parent ref, which is prev_picture > lost_picture = gst_vaapi_picture_h264_new_clone (prev_picture); > ... > // replace prev_picture with lost_picture which has parent(prev_picture) > gst_vaapi_picture_replace (&prev_picture, lost_picture); > ... > } > > But I don't know how to handle this to fix this issue. > And i can't reproduce the case that is being through this loop. > > Julian, can you share the media you are streaming? Hi Hyunjun The camera that shows this issue is available on: rtsp://71.252.214.80:855/trackID=1 The username is admin The password is 11223 This is not our camera so please let us know when you're done with it as the end user wants to close it off. Note that this is on a Ubiquiti wireless link - we suspect that is part of the issue. But since it plays with h264dec, it's odd it does not work with vaapi h264 dec. Thanks a lot! Ben
(In reply to Ben White from comment #6) > Hi Hyunjun > > The camera that shows this issue is available on: > > rtsp://71.252.214.80:855/trackID=1 > > The username is admin > The password is 11223 > > This is not our camera so please let us know when you're done with it as the > end user wants to close it off. > > Note that this is on a Ubiquiti wireless link - we suspect that is part of > the issue. But since it plays with h264dec, it's odd it does not work with > vaapi h264 dec. > > Thanks a lot! > Ben Thanks Ben! I've done dupmping h264 data through this camera. And I've seen the same thing as what this issue says. Seems that it's caused by some data-loss during streaming but I'm not sure.
Created attachment 344837 [details] [review] libs: decoder: h264: reduce frame number of gaps Reduce frame num gaps so that we don't have to create unnecessary dummy pictures, just to throw them away
Ben, Julian, can you test with this patch? Victor, I refered to libav source code, https://github.com/libav/libav/blob/master/libavcodec/h264_slice.c#L1312 I think we have to handle these cases. Due to, I guess, loss of data through network leads to this kind of issues.
Review of attachment 344837 [details] [review]: nice! ::: gst-libs/gst/vaapi/gstvaapidecoder_h264.c @@ +3168,3 @@ priv->frame_num = priv->prev_ref_frame_num; for (;;) { + gint32 tmp_prev_ref_frame_num = priv->frame_num; I wonder if we should execute this block only if if (priv->frame_num != slice_hdr->fram_num) { ... @@ +3179,3 @@ + if (tmp_prev_ref_frame_num < 0) + tmp_prev_ref_frame_num += MaxFrameNum; + priv->prev_ref_frame_num = tmp_prev_ref_frame_num; this assignation seems to be spurious, since the assignation is done unconditionally below.
Hi Hyunjun, Your patch works like a charm, the segfault does not happen any more :) If you guys are happy with that patch, we will turn off the rtsp source and mark the bug as fixed. Thanks.
(In reply to Julian Bouzas from comment #11) > Hi Hyunjun, > > Your patch works like a charm, the segfault does not happen any more :) \o/ > If you guys are happy with that patch, we will turn off the rtsp source and > mark the bug as fixed. Just let us to mark it as fixed when we merge it ;)
Created attachment 345007 [details] [review] libs: decoder: h264: reduce frame number of gaps Reduce frame num gaps so that we don't have to create unnecessary dummy pictures, just to throw them away Sanitize previous patch following to Victor's comments.
Created attachment 345008 [details] [review] libs: decoder: h264: reduce frame number of gaps Reduce frame num gaps so that we don't have to create unnecessary dummy pictures, just to throw them away Sanitize previous patch following to Victor's comments.
Review of attachment 345008 [details] [review]: ::: gst-libs/gst/vaapi/gstvaapidecoder_h264.c @@ +3172,3 @@ + priv->frame_num -= MaxFrameNum; + + if (slice_hdr->frame_num - priv->frame_num - 1 > sps->num_ref_frames) is that -1 correct? it wasn't before, nor in the libav code. @@ +3176,3 @@ + + if (priv->frame_num < 0) + priv->frame_num += MaxFrameNum; in the libav this conditional is nested in the above if... is this correct?
(In reply to Víctor Manuel Jáquez Leal from comment #15) > Review of attachment 345008 [details] [review] [review]: > > ::: gst-libs/gst/vaapi/gstvaapidecoder_h264.c > @@ +3172,3 @@ > + priv->frame_num -= MaxFrameNum; > + > + if (slice_hdr->frame_num - priv->frame_num - 1 > sps->num_ref_frames) > > is that -1 correct? it wasn't before, nor in the libav code. I found if (slice_hdr->frame_num - priv->frame_num - 1) equals sps->num_ref_frames, it doesn't need to reset prev_ref_frame_num. But without -1, it's working likewise too, though it reset prev_ref_frame_num. > > @@ +3176,3 @@ > + > + if (priv->frame_num < 0) > + priv->frame_num += MaxFrameNum; > > in the libav this conditional is nested in the above if... is this correct? This is a bit tricky, but I think it's correct. And I don't think it's nested, since priv->frame_num is reset already. Let's say, slice hdr frame is 1, sps->num ref frames is 1. Then prev frame is -1, so it should be reset to assuming previous value which is positive.
Created attachment 345085 [details] [review] libs: decoder: h264: reduce frame number of gaps Reduce frame num gaps so that we don't have to create unnecessary dummy pictures, just to throw them away
Attachment 345085 [details] pushed as d89a3bd - libs: decoder: h264: reduce frame number of gaps
Also pushed in branch 1.10 commit de29f03361cd1c55195322be571582c3d4d0392b Author: Hyunjun Ko <zzoon@igalia.com> Date: Tue Feb 7 16:17:39 2017 +0900 libs: decoder: h264: reduce frame number of gaps Reduce frame num gaps so that we don't have to create unnecessary dummy pictures, just throw them away. Signed-off-by: Víctor Manuel Jáquez Leal <victorx.jaquez@intel.com> https://bugzilla.gnome.org/show_bug.cgi?id=777506