After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 777506 - vaapidecoder_h264: Recursive loop and segfault after a few seconds
vaapidecoder_h264: Recursive loop and segfault after a few seconds
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gstreamer-vaapi
1.10.1
Other Linux
: Normal normal
: 1.10.4
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2017-01-19 18:26 UTC by Julian Bouzas
Modified: 2017-02-07 11:09 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Backtrace (63.89 KB, text/plain)
2017-01-19 18:26 UTC, Julian Bouzas
  Details
GST_DEBUG (113.98 KB, text/plain)
2017-01-20 09:54 UTC, Julian Bouzas
  Details
libs: decoder: h264: reduce frame number of gaps (1.75 KB, patch)
2017-02-03 09:30 UTC, Hyunjun Ko
none Details | Review
libs: decoder: h264: reduce frame number of gaps (1.66 KB, patch)
2017-02-06 02:56 UTC, Hyunjun Ko
none Details | Review
libs: decoder: h264: reduce frame number of gaps (1.65 KB, patch)
2017-02-06 03:03 UTC, Hyunjun Ko
none Details | Review
libs: decoder: h264: reduce frame number of gaps (2.07 KB, patch)
2017-02-07 07:44 UTC, Hyunjun Ko
committed Details | Review

Description Julian Bouzas 2017-01-19 18:26:33 UTC
Created attachment 343830 [details]
Backtrace

Hi,

We just realized that the vaapi h264 decoder cannot decode a specific rtsp source with the following pipeline:

gst-launch-1.0 rtspsrc location=rtsp://... ! rtph264depay ! h264parse !
 vaapidecodebin ! fakesink

It segfaults after a few seconds. The backtrace shows many *gst_vaapi_picture_destroy* being called recursively tons of times and the segfault happens in glib (probably because of OOM). I have attached a reduced version of the backtrace showing the beginning and end of it.

Also, adding a caps filter between rtph264depay and h264parse to change the  alignment from *nal* to *au* seems to solve the problem:

gst-launch-1.0 rtspsrc ! rtph264depay ! video/x-h264,stream-format=byte-stream,alignment=au ! h264parse ! vaapidecode ! autovideosink

It would be great for the decoder to decode any kind of alignment without applying a capsfilter.

Please, let us know if you need more information. Thanks.
Comment 1 Víctor Manuel Jáquez Leal 2017-01-19 20:54:00 UTC
Hi, 

Thanks! a debug log would be useful too GST_DEBUG=vaapi*:5
Comment 2 Julian Bouzas 2017-01-20 09:54:55 UTC
Created attachment 343887 [details]
GST_DEBUG

Log with GST_DEBUG=vaapi*:5
Comment 3 Julian Bouzas 2017-01-20 09:55:25 UTC
There you go :)
Comment 4 Hyunjun Ko 2017-01-24 09:38:11 UTC
Seems that there is a problem in the loop in gstvaapidecoder_h264.c:3170

for(;;) {
  // lost picture has parent ref, which is prev_picture
  lost_picture = gst_vaapi_picture_h264_new_clone (prev_picture);
  ...
  // replace prev_picture with lost_picture which has parent(prev_picture)
  gst_vaapi_picture_replace (&prev_picture, lost_picture);
  ...
}

But I don't know how to handle this to fix this issue.
And i can't reproduce the case that is being through this loop.

Julian, can you share the media you are streaming?
Comment 5 Julian Bouzas 2017-01-24 16:36:26 UTC
Hi Hyunjun,

We are going to check if we can share the media with you so you can reproduce the issue. We will get back to you soon. Thanks!
Comment 6 Ben White 2017-02-02 10:00:52 UTC
(In reply to Hyunjun Ko from comment #4)
> Seems that there is a problem in the loop in gstvaapidecoder_h264.c:3170
> 
> for(;;) {
>   // lost picture has parent ref, which is prev_picture
>   lost_picture = gst_vaapi_picture_h264_new_clone (prev_picture);
>   ...
>   // replace prev_picture with lost_picture which has parent(prev_picture)
>   gst_vaapi_picture_replace (&prev_picture, lost_picture);
>   ...
> }
> 
> But I don't know how to handle this to fix this issue.
> And i can't reproduce the case that is being through this loop.
> 
> Julian, can you share the media you are streaming?

Hi Hyunjun

The camera that shows this issue is available on:

rtsp://71.252.214.80:855/trackID=1

The username is admin
The password is 11223

This is not our camera so please let us know when you're done with it as the end user wants to close it off.

Note that this is on a Ubiquiti wireless link - we suspect that is part of the issue. But since it plays with h264dec, it's odd it does not work with vaapi h264 dec.

Thanks a lot!
Ben
Comment 7 Hyunjun Ko 2017-02-02 10:43:22 UTC
(In reply to Ben White from comment #6)
> Hi Hyunjun
> 
> The camera that shows this issue is available on:
> 
> rtsp://71.252.214.80:855/trackID=1
> 
> The username is admin
> The password is 11223
> 
> This is not our camera so please let us know when you're done with it as the
> end user wants to close it off.
> 
> Note that this is on a Ubiquiti wireless link - we suspect that is part of
> the issue. But since it plays with h264dec, it's odd it does not work with
> vaapi h264 dec.
> 
> Thanks a lot!
> Ben

Thanks Ben!
I've done dupmping h264 data through this camera.
And I've seen the same thing as what this issue says.
Seems that it's caused by some data-loss during streaming but I'm not sure.
Comment 8 Hyunjun Ko 2017-02-03 09:30:48 UTC
Created attachment 344837 [details] [review]
libs: decoder: h264: reduce frame number of gaps

Reduce frame num gaps so that we don't have to create unnecessary
dummy pictures, just to throw them away
Comment 9 Hyunjun Ko 2017-02-03 09:36:10 UTC
Ben, Julian, can you test with this patch?

Victor, I refered to libav source code, https://github.com/libav/libav/blob/master/libavcodec/h264_slice.c#L1312

I think we have to handle these cases.
Due to, I guess, loss of data through network leads to this kind of issues.
Comment 10 Víctor Manuel Jáquez Leal 2017-02-03 10:22:22 UTC
Review of attachment 344837 [details] [review]:

nice!

::: gst-libs/gst/vaapi/gstvaapidecoder_h264.c
@@ +3168,3 @@
   priv->frame_num = priv->prev_ref_frame_num;
   for (;;) {
+    gint32 tmp_prev_ref_frame_num = priv->frame_num;

I wonder if we should execute this block only if 

if (priv->frame_num != slice_hdr->fram_num) {
...

@@ +3179,3 @@
+      if (tmp_prev_ref_frame_num < 0)
+        tmp_prev_ref_frame_num += MaxFrameNum;
+      priv->prev_ref_frame_num = tmp_prev_ref_frame_num;

this assignation seems to be spurious, since the assignation is done unconditionally below.
Comment 11 Julian Bouzas 2017-02-03 11:08:19 UTC
Hi Hyunjun,

Your patch works like a charm, the segfault does not happen any more :)

If you guys are happy with that patch, we will turn off the rtsp source and mark the bug as fixed.

Thanks.
Comment 12 Víctor Manuel Jáquez Leal 2017-02-03 11:18:45 UTC
(In reply to Julian Bouzas from comment #11)
> Hi Hyunjun,
> 
> Your patch works like a charm, the segfault does not happen any more :)

\o/

> If you guys are happy with that patch, we will turn off the rtsp source and
> mark the bug as fixed.

Just let us to mark it as fixed when we merge it ;)
Comment 13 Hyunjun Ko 2017-02-06 02:56:27 UTC
Created attachment 345007 [details] [review]
libs: decoder: h264: reduce frame number of gaps

Reduce frame num gaps so that we don't have to create unnecessary
dummy pictures, just to throw them away


Sanitize previous patch following to Victor's comments.
Comment 14 Hyunjun Ko 2017-02-06 03:03:34 UTC
Created attachment 345008 [details] [review]
libs: decoder: h264: reduce frame number of gaps

Reduce frame num gaps so that we don't have to create unnecessary
dummy pictures, just to throw them away


Sanitize previous patch following to Victor's comments.
Comment 15 Víctor Manuel Jáquez Leal 2017-02-06 18:18:06 UTC
Review of attachment 345008 [details] [review]:

::: gst-libs/gst/vaapi/gstvaapidecoder_h264.c
@@ +3172,3 @@
+      priv->frame_num -= MaxFrameNum;
+
+    if (slice_hdr->frame_num - priv->frame_num - 1 > sps->num_ref_frames)

is that -1 correct? it wasn't before, nor in the libav code.

@@ +3176,3 @@
+
+    if (priv->frame_num < 0)
+      priv->frame_num += MaxFrameNum;

in the libav this conditional is nested in the above if... is this correct?
Comment 16 Hyunjun Ko 2017-02-07 05:48:06 UTC
(In reply to Víctor Manuel Jáquez Leal from comment #15)
> Review of attachment 345008 [details] [review] [review]:
> 
> ::: gst-libs/gst/vaapi/gstvaapidecoder_h264.c
> @@ +3172,3 @@
> +      priv->frame_num -= MaxFrameNum;
> +
> +    if (slice_hdr->frame_num - priv->frame_num - 1 > sps->num_ref_frames)
> 
> is that -1 correct? it wasn't before, nor in the libav code.
I found if (slice_hdr->frame_num - priv->frame_num - 1) equals sps->num_ref_frames, it doesn't need to reset prev_ref_frame_num. But without -1, it's working likewise too, though it reset prev_ref_frame_num.

> 
> @@ +3176,3 @@
> +
> +    if (priv->frame_num < 0)
> +      priv->frame_num += MaxFrameNum;
> 
> in the libav this conditional is nested in the above if... is this correct?
This is a bit tricky, but I think it's correct.
And I don't think it's nested, since priv->frame_num is reset already.

Let's say, slice hdr frame is 1, sps->num ref frames is 1.
Then prev frame is -1, so it should be reset to assuming previous value which is positive.
Comment 17 Hyunjun Ko 2017-02-07 07:44:51 UTC
Created attachment 345085 [details] [review]
libs: decoder: h264: reduce frame number of gaps

Reduce frame num gaps so that we don't have to create unnecessary
dummy pictures, just to throw them away
Comment 18 Víctor Manuel Jáquez Leal 2017-02-07 10:43:51 UTC
Attachment 345085 [details] pushed as d89a3bd - libs: decoder: h264: reduce frame number of gaps
Comment 19 Víctor Manuel Jáquez Leal 2017-02-07 11:09:36 UTC
Also pushed in branch 1.10

commit de29f03361cd1c55195322be571582c3d4d0392b
Author: Hyunjun Ko <zzoon@igalia.com>
Date:   Tue Feb 7 16:17:39 2017 +0900

    libs: decoder: h264: reduce frame number of gaps

    Reduce frame num gaps so that we don't have to create unnecessary
    dummy pictures, just throw them away.

    Signed-off-by: Víctor Manuel Jáquez Leal <victorx.jaquez@intel.com>

    https://bugzilla.gnome.org/show_bug.cgi?id=777506