GNOME Bugzilla – Bug 787124
vaapih264dec: segmentation fault when decoding kodi sample
Last modified: 2018-08-28 22:08:47 UTC
From http://kodi.wiki/view/Samples I downloaded file "MVC 3D MKV" (https://drive.google.com/file/d/0BwxFVkl63-lETWMzM1dRZF9XMDA/view?usp=sharing), run: gst-launch-1.0 -q filesrc location=3D-full-MVC.mkv ! matroskademux ! h264parse ! vaapih264dec base-only=FALSE ! vaapisink The segfault seems to happen in gstvaapidecoder_h264.c:4010: > f0 = fs->buffers[0]; The video plays with base-only=TRUE.
Created attachment 359753 [details] [review] libs: decoder: h264: reset context when the number of view is increased Usually in case of MVC decoding, dpb size is increasedi if subset sps. That's why it resets context without this patch. But for some media it doesn't increase dpb size. Even in this case we should reset context to deal with MVC decoding. Otherwise, it leads to assert.
This is because there isn't prev_frames[1] for view 1 since it didn't reset context even though it found out it's mvc decoded. But after this patch, it seems that there's still another problem to decode this sample, which I don't know. Nonetheless we need to avoid this crash at least.
Another problem is that decoder sends a bunch of warnings as below during decoding and decoded images are distorted. gstvaapidecoder_h264.c:2374:find_inter_view_reference: failed to find inter-view reference picture for view_id: 0 gstvaapidecoder_h264.c:2934:exec_picture_refs_modification_1: list 0 entry 1 is empty
Comment on attachment 359753 [details] [review] libs: decoder: h264: reset context when the number of view is increased the patch fixes the segmentation fault, but I am not sure if it is the correct fix for a broken/extreme sample (where the sps id is not correct). Perhaps the problem comes from the h264parser
@Victor, I am looking at this issue. I have tested Hunjun's patch with some MVC clips, and there is no regression of this patch. I think we can merge this patch. Since the stream is a broken one, we just need to make sure there is no segment fault issue. Decoded images are distorted can be acceptable. After the patch merged, we can close this issue. How do you think of this?
@Fei, as far as I understand, the patch hides the problem. That's why I was delaying it in order to find some time to look a real fix. We could apply this patch, close this bug, but we should open a new one, claiming for the real fix. What do you think?
(In reply to Víctor Manuel Jáquez Leal from comment #6) > @Fei, as far as I understand, the patch hides the problem. That's why I was > delaying it in order to find some time to look a real fix. > > We could apply this patch, close this bug, but we should open a new one, > claiming for the real fix. What do you think? Hi Victor, what is your expect of real fix? Do you expect to decode the stream correctly when set base-only as false? Or just can fix the segment fault issue without hide issue? By the way, is there cmdline to reproduce the hide issue?
(In reply to Fei from comment #7) > (In reply to Víctor Manuel Jáquez Leal from comment #6) > > @Fei, as far as I understand, the patch hides the problem. That's why I was > > delaying it in order to find some time to look a real fix. > > > > We could apply this patch, close this bug, but we should open a new one, > > claiming for the real fix. What do you think? > > Hi Victor, what is your expect of real fix? I have no time frames or expectations now so far. > Do you expect to decode the stream correctly when set base-only as false? It should! as ffmpeg does with vaapi > Or just can fix the segment > fault issue without hide issue? By the way, is there cmdline to reproduce > the hide issue? As I told you, we could commit the patch to avoid the segfault, close this bug report, and open a new one, to handle this type of bad formatted streams.
(In reply to Víctor Manuel Jáquez Leal from comment #8) > (In reply to Fei from comment #7) > > (In reply to Víctor Manuel Jáquez Leal from comment #6) > > > @Fei, as far as I understand, the patch hides the problem. That's why I was > > > delaying it in order to find some time to look a real fix. > > > > > > We could apply this patch, close this bug, but we should open a new one, > > > claiming for the real fix. What do you think? > > > > Hi Victor, what is your expect of real fix? > > I have no time frames or expectations now so far. > > > Do you expect to decode the stream correctly when set base-only as false? > > It should! as ffmpeg does with vaapi > > > Or just can fix the segment > > fault issue without hide issue? By the way, is there cmdline to reproduce > > the hide issue? > > As I told you, we could commit the patch to avoid the segfault, close this > bug report, and open a new one, to handle this type of bad formatted streams. Thanks Victor, could you help to merge the patch, I don't have access to merge it.
vlc plays it cleanly, but spams this warning: [h264 @ 0x7ffa88c30e00] sps_id 1 out of range
Patch is not good. If you run with valgrind, you still see a lot of use after free before playback becomes being jittery.
I had some time to test this. A couple comments 1. With the patches from bug 796169 and the proposed patch here, the artifacts are lesser than before (the stream was impossible to see) 2. Perhaps I doing something wrong, but I don't perceive the jittery as stated in comment 11 3. Neither the "use after free" in valgrind (again, I might not doing it correctly) 4. Oddly playing this stream, with the patch, nothing is rendered using glimagesink, meanwhile using vaapisink or xvimagesink it is OK. The problem with the stream is that, in certain point it announces two views, but they are not available, only one. Thus, imho, the patch looks correct but we need to ignore max_views changes when there aren't those views available, if this is possible.
Created attachment 373166 [details] Valgrind logs I retested and the jittery issue is gone (different PC though). But valgrind still complains of read after free, so I just attached the logs. On FFMPEG side, they have a single clean warning: libav :0:: sps_id 1 out of range We don't have an equivalent warning, combined with valgrind logs, I assume we have no bound checking.
(The glimagesink issue should be filed seperately, it seems some streams render in glimagesink, some don't, and on EGL it flicker to black a lot).
Created attachment 373168 [details] [review] h264decoder: Fail decoding slice with missing inter-view reference Similarly to previous patch, we have no error concealment. As a side effect, it's better to skip slices with missing references then passing NULL pointers to the accelerator. Passing NULL pointer would lead to major visual artifact, a behaviour that is likely undefined.
This one just fixed the rendering issue, with small freeze moment, it's on top of the reset_context change. It's all still in pretty bad shape, but renders ok I guess. The main concern I have is this DPB use after free, and the decreasing timestamps.
Comment on attachment 373168 [details] [review] h264decoder: Fail decoding slice with missing inter-view reference The code is correct, the ref has been given to the unit already.
Comment on attachment 373168 [details] [review] h264decoder: Fail decoding slice with missing inter-view reference Wrong patch sorry.
Comment on attachment 373168 [details] [review] h264decoder: Fail decoding slice with missing inter-view reference Attachment 373168 [details] pushed as 7a120c7 - h264decoder: Fail decoding slice with missing inter-view reference
Comment on attachment 359753 [details] [review] libs: decoder: h264: reset context when the number of view is increased commit 038c62011f009453d1609a4a79aeb4ccd78107eb Author: Hyunjun Ko <zzoon@igalia.com> Date: Thu Sep 14 14:25:41 2017 +0900 libs: decoder: h264: reset context when the number of view is increased Usually in case of MVC decoding, dpb size is increasedi if subset sps. That's why it resets context without this patch. But for some media it doesn't increase dpb size. Even in this case we should reset context to deal with MVC decoding. Otherwise, it leads to assert. https://bugzilla.gnome.org/show_bug.cgi?id=787124
Leaving open, as the use after free need to be looked at. Anyone else can reproduce ?
Created attachment 373468 [details] [review] vaapidecoder_h264: Avoid use after free In some cases, the found_picture ended up being evicted and freed, which would lead to a use after free when accessing picture->base.poc. In this fix, we take a ref on the picture before calling dpb_evict.
Review of attachment 373468 [details] [review]: Great! Thanks! Just a couple nitpicks: in the commit summary, we have being using a format. In this case it would be: "libs: decoder: h264: avoid use picture after free it" ::: gst-libs/gst/vaapi/gstvaapidecoder_h264.c @@ +943,3 @@ return FALSE; + gst_vaapi_mini_object_ref (GST_VAAPI_MINI_OBJECT (found_picture)); what about using gst_vaapi_picture_ref() for sake of homogeneity. Later, if we decide to change the gstvaapiminiobject to gstminiobject, we could search and replace easily. @@ +958,3 @@ + +done: + gst_vaapi_mini_object_unref (GST_VAAPI_MINI_OBJECT (found_picture)); ditto, gst_vaapi_picture_unref()
Oh, just missed that define. I'll follow your commit convention.
Created attachment 373487 [details] [review] libs: decoder: h264: Avoid using picture after it has been free In some cases, the found_picture ended up being evicted and freed, which would lead to a use after free when accessing picture->base.poc. In this fix, we take a ref on the picture before calling dpb_evict.
Comment on attachment 373487 [details] [review] libs: decoder: h264: Avoid using picture after it has been free Attachment 373487 [details] pushed as 2922439 - libs: decoder: h264: Avoid using picture after it has been free