After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 750754 - h264 decoder: One too many unref of a codecframe
h264 decoder: One too many unref of a codecframe
Status: RESOLVED OBSOLETE
Product: GStreamer
Classification: Platform
Component: gstreamer-vaapi
git master
Other Linux
: High normal
: git master
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks: 758397
 
 
Reported: 2015-06-11 02:21 UTC by Olivier Crête
Modified: 2018-11-03 15:47 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
vaapidecoder_h264: Don't keep dropped frames (1.86 KB, patch)
2015-06-12 00:02 UTC, Olivier Crête
reviewed Details | Review
vaapidecoder_h264: Don't try to access invalid prev_picture (1.06 KB, patch)
2015-06-12 00:03 UTC, Olivier Crête
rejected Details | Review

Description Olivier Crête 2015-06-11 02:21:33 UTC
The most obvious symptom is:
** (mmfd:10989): CRITICAL **: gst_video_codec_frame_unref: assertion 'frame->ref_count > 0' failed


With some interlaced H.264 streams, in the presence of packet loss (simulated with identity drop-probablity=0.01), the first field decoding fails in dpb_dump() (from gst_vaapi_decoder_decode->do_decode->do_decode_1->end_frame->decode_current_picture->dpb_add), this results in gst_vaapi_decoder_decode returning an error and gst_video_decoder_drop_frame() being called.

The problem is that before failing in dpb_bump(), the picture has been stored in a framestore which has been added to priv->prev_frames. And from prev_frames, find_first_field() retrieves it on the next frame and uses it as a base, which will cause it to push outputted by pushing it to gst_video_decoder_finish_frame.

This caues a problem, as when handle_frame is called, a single reference is given to the subclass, calling drop_frame() drops this reference... But calling finish_frame expects another reference!

I'm not sure how to solve that problem exactly.
Comment 1 Gwenole Beauchesne 2015-06-11 12:26:28 UTC
Hi, could you please report the git master hash you are using? Normally, outputted frames are no longer pushed again with the "ghost" frame mechanism, when reconstructing a frame solely used for decoding. Thanks.
Comment 2 Gwenole Beauchesne 2015-06-11 15:24:29 UTC
I could not reproduce this issue with identity drop-probability=0.01. However, I got a collateral issue (crash) from the previous fixes with drop-probability=0.05. This is unrelated to what you are relating here, but worth mentioning this is the only issue I found so far.
Comment 3 Olivier Crête 2015-06-11 23:20:12 UTC
With today's git master, I indeed need to bump up the drop probability to get crashes. I still get the gst_video_codec_frame_unref() assertion, but not as often and I'm having a hard time debugging it.

That said, I got onto another crash, in fill_picture_gaps, at line 3478, it uses prev_picture to get the poc, but prev_picture can be NULL, so it segfaults!
Comment 4 Olivier Crête 2015-06-12 00:02:44 UTC
Created attachment 305103 [details] [review]
vaapidecoder_h264: Don't keep dropped frames

This is meant to fix the original problem in this bug, but it feels like a workaround, I don't understand the rest of the code well enough to write the correct patch, so please review like 5 times before merging anything!

This prevent frames that couldn't be inserted in the DPB from being
re-vived through the prev_frames array. Ideally, we should instead
figure out if a frame is in prev_frames and delay dropping it until later...
Comment 5 Olivier Crête 2015-06-12 00:03:02 UTC
Created attachment 305104 [details] [review]
vaapidecoder_h264: Don't try to access invalid prev_picture

I have no idea if it's the correct fix or not, but it does prevent the crash!
Comment 6 Gwenole Beauchesne 2015-06-12 09:02:00 UTC
(In reply to Olivier Crête from comment #3)
> That said, I got onto another crash, in fill_picture_gaps, at line 3478, it
> uses prev_picture to get the poc, but prev_picture can be NULL, so it
> segfaults!

Yes, that was the other issue I was talking about. :)
Comment 7 Gwenole Beauchesne 2015-06-12 09:08:06 UTC
(In reply to Olivier Crête from comment #4)
> Created attachment 305103 [details] [review] [review]
> vaapidecoder_h264: Don't keep dropped frames
> 
> This is meant to fix the original problem in this bug, but it feels like a
> workaround, I don't understand the rest of the code well enough to write the
> correct patch, so please review like 5 times before merging anything!

The right would probably to move down the replacement of the previous frame store, after the current (new) frame store was correctly added, at the end of the function. This would, however, require additional regression testing.
Comment 8 sreerenj 2015-06-23 10:07:03 UTC
How could I reproduce this issue, ? bump up the drop-probability is not hitting any issue for me...

Olivier, could you please share some interlaced content which I can use to reproduce the issue???
Comment 9 sreerenj 2015-06-23 10:11:36 UTC
Gwenole, are you working on this??

Honestly I am bit worried to push the fix for  0.6 release..specifically anything which changes dpb and ref_pic management... :)
I wonder whether we can consider it as a corner case,,,
Will it be a problem for you guys if I postpone it for next release??
Comment 10 Gwenole Beauchesne 2015-06-23 11:15:59 UTC
(In reply to sreerenj from comment #9)
> Gwenole, are you working on this??

I was, and IIRC, locally part 1 of the fix was to assign a correct POC to missing frames, and the other part is not implemented. In my testing, I were not too successful at reproducing the second issue either (by increasing drop-probability). Will try again with another clip.

> Honestly I am bit worried to push the fix for  0.6 release..specifically
> anything which changes dpb and ref_pic management... :)
> I wonder whether we can consider it as a corner case,,,
> Will it be a problem for you guys if I postpone it for next release??

gstreamer-vaapi is the only VA enabled stack that almost correctly supports lost frames, so this is already a great improvement that is worth testing and that last corner could possibly wait for 0.6.1.
Comment 11 Olivier Crête 2015-06-23 16:30:50 UTC
Would it be possible to make the replacement frames not green (like gray or something?). Also, it would be nice to flag frames that referenced a fake frame so that we can later mark the resulthing buffers with GST_BUFFER_FLAG_CORRUPTED  like the libav decoder does.
Comment 12 Gwenole Beauchesne 2015-06-24 08:56:58 UTC
(In reply to Olivier Crête from comment #11)
> Would it be possible to make the replacement frames not green (like gray or
> something?).

No, this is the default color that results from having bits set to 0, when allocated. Better option is to see why "green frames" still occur.

> Also, it would be nice to flag frames that referenced a fake
> frame so that we can later mark the resulthing buffers with
> GST_BUFFER_FLAG_CORRUPTED  like the libav decoder does.

OK.
Comment 13 Gwenole Beauchesne 2015-06-24 15:15:31 UTC
Updates:
- There seems to be a leak of codec frames under some occasions ;
- Some "green frames" can occur when previous picture with the last recorded PrevRefFrameNum disappeared.

We could:
- Retain the last picture used as reference (+) ;
- Find whatever picture with probable POC and use it (-).

BTW,
I am not too convinced to get into the init_picture_poc() path for prev_picture == NULL case, but we need to derive a convincing enough POC value for the missing frame. Possibility is to use Min(POC(frames in DPB)) - 2, so that the fake picture is evicted soonish if no longer referenced (this is what matters).

The "Don't keep dropped frames" patch is somewhat acceptable but is missing one case where return FALSE is possible. An easy change could be to s/dpb_add/dpb_add_1/ and make dpb_add() reset prev_frames[] to NULL if dpb_add_1() returned FALSE.
Comment 14 Nicolas Dufresne (ndufresne) 2018-09-06 02:27:09 UTC
I would like to verify if this issue still exist, but the report does not provide steps to reproduce. Would it be possible to document how to reproduce ?
Comment 15 GStreamer system administrator 2018-11-03 15:47:09 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/gstreamer/gstreamer-vaapi/issues/33.