GNOME Bugzilla – Bug 705598
regression h264parse: incorrect keyframe/delta-unit detection
Last modified: 2013-08-29 10:42:50 UTC
In the legacy h264parser (pre-baseparse), detecting whether a frame/buffer was a delta-unit or not was based on the slice_type whereas in the new parser it is based on whether the input contains a SPS or PPS or IDR. The new behaviour is wrong since you can have streams that have new PPS on every frame, resulting in all output buffers being marked as keyframes. The previous (correct) behaviour was to check the slice_type, and if it was 2,4,7 or 9 it was considered as a keyframe, else as a delta unit. Amongst other things this breaks (re)muxing of h264 in matroska (all frames are considered keyframes and only one cue/index entry is created).
Created attachment 251090 [details] [review] h264parse: Use slice type to determine if frame is keyframe This is the same behaviour as pre-baseparse-refactoring
Review of attachment 251090 [details] [review]: ::: gst/videoparsers/gsth264parse.c @@ -477,3 @@ /* we have a peek as well */ nal_type = nalu->type; - h264parse->keyframe |= NAL_TYPE_IS_KEY (nal_type); Maybe also get rid of that confusing and wrong macro then? ;)
commit 8074a48594acd17eaa2a2e7ea3fd03d85f41a664 Author: Edward Hervey <bilboed@bilboed.com> Date: Wed Aug 7 09:04:39 2013 +0200 h264parse: Use slice type to determine if frame is keyframe This is the same behaviour as pre-baseparse-refactoring https://bugzilla.gnome.org/show_bug.cgi?id=705598
Two questions: - why not use the macros and enums from gsth264parse.h ? e.g. GST_H264_IS_I_SLICE() and GST_H264_*_SLICE - does a slice always start at the top (line 0)? If not, should we only mark those slices as keyframes?
> Two questions: Edward: ping? ^^
(In reply to comment #4) > Two questions: > > - why not use the macros and enums from gsth264parse.h ? > e.g. GST_H264_IS_I_SLICE() and GST_H264_*_SLICE Because I didn't realize those macros existed :) > > - does a slice always start at the top (line 0)? If not, should > we only mark those slices as keyframes? Good question, I'm not sure. That being said, marking only some slices as keyframes sound a bit weird (key-frame, not key-slice).
> > - does a slice always start at the top (line 0)? If not, should > > we only mark those slices as keyframes? > > Good question, I'm not sure. That being said, marking only some slices as > keyframes sound a bit weird (key-frame, not key-slice). I think if a key unit is split across multiple buffers, we would/should only mark the first buffer as a key unit, not all of them. Think of things like multifdsink that basically keeps data back to the last key unit so it can burst on a client connect, for example. If we marked all buffers as key units, then it would probably throw away the first one and only keep the last chunk, meaning the client will only get a partial frame as key unit on connect.
Maybe we need an additional flag to mark other re-sync units that aren't full keyframes, e.g. for the intra-refresh case - then multifdsink could always keep enough buffers to deliver a whole fresh frame on burst, but a decoder in a live videoconferencing scenario could start decoding on the intra-refresh already to show some pixels.
All not full-sync-points should have the DELTA flag set. Everything else will break assumptions everywhere right now. Adding additional flags for describing more fine-grained sync units would be useful nonetheless. Or a new meta? :) Make that a different bug though. Edward, can you update your patch to use the macros? :)