After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 787124 - vaapih264dec: segmentation fault when decoding kodi sample
vaapih264dec: segmentation fault when decoding kodi sample
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gstreamer-vaapi
git master
Other Linux
: Normal normal
: 1.15.1
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2017-09-01 13:30 UTC by Orestis Floros
Modified: 2018-08-28 22:08 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
libs: decoder: h264: reset context when the number of view is increased (1.15 KB, patch)
2017-09-14 05:32 UTC, Hyunjun Ko
committed Details | Review
Valgrind logs (1.28 MB, text/x-log)
2018-07-25 20:15 UTC, Nicolas Dufresne (ndufresne)
  Details
h264decoder: Fail decoding slice with missing inter-view reference (5.20 KB, patch)
2018-07-25 21:07 UTC, Nicolas Dufresne (ndufresne)
committed Details | Review
vaapidecoder_h264: Avoid use after free (1.56 KB, patch)
2018-08-28 00:44 UTC, Nicolas Dufresne (ndufresne)
reviewed Details | Review
libs: decoder: h264: Avoid using picture after it has been free (1.53 KB, patch)
2018-08-28 22:07 UTC, Nicolas Dufresne (ndufresne)
committed Details | Review

Description Orestis Floros 2017-09-01 13:30:52 UTC
From http://kodi.wiki/view/Samples I downloaded file "MVC 3D MKV" (https://drive.google.com/file/d/0BwxFVkl63-lETWMzM1dRZF9XMDA/view?usp=sharing), run:

gst-launch-1.0 -q filesrc location=3D-full-MVC.mkv ! matroskademux ! h264parse ! vaapih264dec base-only=FALSE ! vaapisink

The segfault seems to happen in gstvaapidecoder_h264.c:4010:

> f0 = fs->buffers[0];

The video plays with base-only=TRUE.
Comment 1 Hyunjun Ko 2017-09-14 05:32:06 UTC
Created attachment 359753 [details] [review]
libs: decoder: h264: reset context when the number of view is  increased

Usually in case of MVC decoding, dpb size is increasedi if subset sps.
That's why it resets context without this patch.
But for some media it doesn't increase dpb size. Even in this case we
should reset context to deal with MVC decoding.
Otherwise, it leads to assert.
Comment 2 Hyunjun Ko 2017-09-14 05:37:25 UTC
This is because there isn't prev_frames[1] for view 1 since it didn't reset context even though it found out it's mvc decoded.

But after this patch, it seems that there's still another problem to decode this sample, which I don't know.
Nonetheless we need to avoid this crash at least.
Comment 3 Hyunjun Ko 2017-09-15 09:36:22 UTC
Another problem is that decoder sends a bunch of warnings as below during decoding and decoded images are distorted.

gstvaapidecoder_h264.c:2374:find_inter_view_reference: failed to find inter-view reference picture for view_id: 0
gstvaapidecoder_h264.c:2934:exec_picture_refs_modification_1: list 0 entry 1 is empty
Comment 4 Víctor Manuel Jáquez Leal 2018-05-04 16:15:57 UTC
Comment on attachment 359753 [details] [review]
libs: decoder: h264: reset context when the number of view is  increased

the patch fixes the segmentation fault, but I am not sure if it is the correct fix for a broken/extreme sample (where the sps id is not correct). Perhaps the problem comes from the h264parser
Comment 5 Fei 2018-07-16 05:57:28 UTC
@Victor, I am looking at this issue. I have tested Hunjun's patch with some MVC clips, and there is no regression of this patch. I think we can merge this patch. Since the stream is a broken one, we just need to make sure there is no segment fault issue. Decoded images are distorted can be acceptable. After the patch merged, we can close this issue. How do you think of this?
Comment 6 Víctor Manuel Jáquez Leal 2018-07-16 11:27:52 UTC
@Fei, as far as I understand, the patch hides the problem. That's why I was delaying it in order to find some time to look a real fix.

We could apply this patch, close this bug, but we should open a new one, claiming for the real fix. What do you think?
Comment 7 Fei 2018-07-17 01:24:56 UTC
(In reply to Víctor Manuel Jáquez Leal from comment #6)
> @Fei, as far as I understand, the patch hides the problem. That's why I was
> delaying it in order to find some time to look a real fix.
> 
> We could apply this patch, close this bug, but we should open a new one,
> claiming for the real fix. What do you think?

Hi Victor, what is your expect of real fix? Do you expect to decode the stream correctly when set base-only as false? Or just can fix the segment fault issue without hide issue? By the way, is there cmdline to reproduce the hide issue?
Comment 8 Víctor Manuel Jáquez Leal 2018-07-19 09:44:29 UTC
(In reply to Fei from comment #7)
> (In reply to Víctor Manuel Jáquez Leal from comment #6)
> > @Fei, as far as I understand, the patch hides the problem. That's why I was
> > delaying it in order to find some time to look a real fix.
> > 
> > We could apply this patch, close this bug, but we should open a new one,
> > claiming for the real fix. What do you think?
> 
> Hi Victor, what is your expect of real fix? 

I have no time frames or expectations now so far.

> Do you expect to decode the stream correctly when set base-only as false?

It should! as ffmpeg does with vaapi

>  Or just can fix the segment
> fault issue without hide issue? By the way, is there cmdline to reproduce
> the hide issue?

As I told you, we could commit the patch to avoid the segfault, close this bug report, and open a new one, to handle this type of bad formatted streams.
Comment 9 Fei 2018-07-20 06:09:43 UTC
(In reply to Víctor Manuel Jáquez Leal from comment #8)
> (In reply to Fei from comment #7)
> > (In reply to Víctor Manuel Jáquez Leal from comment #6)
> > > @Fei, as far as I understand, the patch hides the problem. That's why I was
> > > delaying it in order to find some time to look a real fix.
> > > 
> > > We could apply this patch, close this bug, but we should open a new one,
> > > claiming for the real fix. What do you think?
> > 
> > Hi Victor, what is your expect of real fix? 
> 
> I have no time frames or expectations now so far.
> 
> > Do you expect to decode the stream correctly when set base-only as false?
> 
> It should! as ffmpeg does with vaapi
> 
> >  Or just can fix the segment
> > fault issue without hide issue? By the way, is there cmdline to reproduce
> > the hide issue?
> 
> As I told you, we could commit the patch to avoid the segfault, close this
> bug report, and open a new one, to handle this type of bad formatted streams.

Thanks Victor, could you help to merge the patch, I don't have access to merge it.
Comment 10 Nicolas Dufresne (ndufresne) 2018-07-23 01:44:46 UTC
vlc plays it cleanly, but spams this warning:

[h264 @ 0x7ffa88c30e00] sps_id 1 out of range
Comment 11 Nicolas Dufresne (ndufresne) 2018-07-23 01:53:35 UTC
Patch is not good. If you run with valgrind, you still see a lot of use after free before playback becomes being jittery.
Comment 12 Víctor Manuel Jáquez Leal 2018-07-25 19:30:09 UTC
I had some time to test this. A couple comments

1. With the patches from bug 796169 and the proposed patch here, the artifacts are lesser than before (the stream was impossible to see)

2. Perhaps I doing something wrong, but I don't perceive the jittery as stated in comment 11

3. Neither the "use after free" in valgrind (again, I might not doing it correctly)

4. Oddly playing this stream, with the patch, nothing is rendered using glimagesink, meanwhile using vaapisink or xvimagesink it is OK.

The problem with the stream is that, in certain point it announces two views, but they are not available, only one. Thus, imho, the patch looks correct but we need to ignore max_views changes when there aren't those views available, if this is possible.
Comment 13 Nicolas Dufresne (ndufresne) 2018-07-25 20:15:20 UTC
Created attachment 373166 [details]
Valgrind logs

I retested and the jittery issue is gone (different PC though). But valgrind still complains of read after free, so I just attached the logs.

On FFMPEG side, they have a single clean warning:

  libav :0:: sps_id 1 out of range

We don't have an equivalent warning, combined with valgrind logs, I assume we have no bound checking.
Comment 14 Nicolas Dufresne (ndufresne) 2018-07-25 20:16:56 UTC
(The glimagesink issue should be filed seperately, it seems some streams render in glimagesink, some don't, and on EGL it flicker to black a lot).
Comment 15 Nicolas Dufresne (ndufresne) 2018-07-25 21:07:16 UTC
Created attachment 373168 [details] [review]
h264decoder: Fail decoding slice with missing inter-view reference

Similarly to previous patch, we have no error concealment. As a side
effect, it's better to skip slices with missing references then passing
NULL pointers to the accelerator. Passing NULL pointer would lead to
major visual artifact, a behaviour that is likely undefined.
Comment 16 Nicolas Dufresne (ndufresne) 2018-07-25 21:11:55 UTC
This one just fixed the rendering issue, with small freeze moment, it's on top of the reset_context change. It's all still in pretty bad shape, but renders ok I guess. The main concern I have is this DPB use after free, and the decreasing timestamps.
Comment 17 Nicolas Dufresne (ndufresne) 2018-08-01 23:48:15 UTC
Comment on attachment 373168 [details] [review]
h264decoder: Fail decoding slice with missing inter-view reference

The code is correct, the ref has been given to the unit already.
Comment 18 Nicolas Dufresne (ndufresne) 2018-08-01 23:48:39 UTC
Comment on attachment 373168 [details] [review]
h264decoder: Fail decoding slice with missing inter-view reference

Wrong patch sorry.
Comment 19 Nicolas Dufresne (ndufresne) 2018-08-01 23:50:12 UTC
Comment on attachment 373168 [details] [review]
h264decoder: Fail decoding slice with missing inter-view reference

Attachment 373168 [details] pushed as 7a120c7 - h264decoder: Fail decoding slice with missing inter-view reference
Comment 20 Nicolas Dufresne (ndufresne) 2018-08-01 23:56:32 UTC
Comment on attachment 359753 [details] [review]
libs: decoder: h264: reset context when the number of view is  increased

commit 038c62011f009453d1609a4a79aeb4ccd78107eb
Author: Hyunjun Ko <zzoon@igalia.com>
Date:   Thu Sep 14 14:25:41 2017 +0900

    libs: decoder: h264: reset context when the number of view is increased
    
    Usually in case of MVC decoding, dpb size is increasedi if subset sps.
    That's why it resets context without this patch.
    But for some media it doesn't increase dpb size. Even in this case we
    should reset context to deal with MVC decoding.
    Otherwise, it leads to assert.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=787124
Comment 21 Nicolas Dufresne (ndufresne) 2018-08-02 02:10:47 UTC
Leaving open, as the use after free need to be looked at. Anyone else can reproduce ?
Comment 22 Nicolas Dufresne (ndufresne) 2018-08-28 00:44:01 UTC
Created attachment 373468 [details] [review]
vaapidecoder_h264: Avoid use after free

In some cases, the found_picture ended up being evicted and freed, which
would lead to a use after free when accessing picture->base.poc. In this
fix, we take a ref on the picture before calling dpb_evict.
Comment 23 Víctor Manuel Jáquez Leal 2018-08-28 15:32:56 UTC
Review of attachment 373468 [details] [review]:

Great! Thanks!

Just a couple nitpicks:

in the commit summary, we have being using a format. In this case it would be: "libs: decoder: h264: avoid use picture after free it"

::: gst-libs/gst/vaapi/gstvaapidecoder_h264.c
@@ +943,3 @@
     return FALSE;
 
+  gst_vaapi_mini_object_ref (GST_VAAPI_MINI_OBJECT (found_picture));

what about using gst_vaapi_picture_ref() for sake of homogeneity. Later, if we decide to change the gstvaapiminiobject to gstminiobject, we could search and replace easily.

@@ +958,3 @@
+
+done:
+  gst_vaapi_mini_object_unref (GST_VAAPI_MINI_OBJECT (found_picture));

ditto, gst_vaapi_picture_unref()
Comment 24 Nicolas Dufresne (ndufresne) 2018-08-28 18:40:29 UTC
Oh, just missed that define. I'll follow your commit convention.
Comment 25 Nicolas Dufresne (ndufresne) 2018-08-28 22:07:45 UTC
Created attachment 373487 [details] [review]
libs: decoder: h264: Avoid using picture after it has been free

In some cases, the found_picture ended up being evicted and freed, which
would lead to a use after free when accessing picture->base.poc. In this
fix, we take a ref on the picture before calling dpb_evict.
Comment 26 Nicolas Dufresne (ndufresne) 2018-08-28 22:08:02 UTC
Comment on attachment 373487 [details] [review]
libs: decoder: h264: Avoid using picture after it has been free

Attachment 373487 [details] pushed as 2922439 - libs: decoder: h264: Avoid using picture after it has been free