GNOME Bugzilla – Bug 788918
vaapi: supports hierarchical prediction encoding
Last modified: 2017-11-08 12:59:31 UTC
This is to support hierarchical prediction encoding for H264. I bring patches from https://cgit.freedesktop.org/~sree/gstreamer-vaapi/log/?h=advanced_enc and fix something major and minor. This includes: - Implements Temporal scalable encoding - Implements hierarchical P frames. - Implements hierarchical B frames. For more information, see https://bugzilla.gnome.org/show_bug.cgi?id=725536#c9
Created attachment 361486 [details] [review] libs: encoder: objects: Add a reference flag We can have p-frame as non-ref and also b-frame as ref which are not supported yet. Reference flag is the first machinery needed for more advanced reference picture selection modes.
Created attachment 361487 [details] [review] libs: encoder: h264: Add property "temporal-levels" Adds new property "temporal-levels" to select the number of temporal levels to be included in the encoded stream.
Created attachment 361488 [details] [review] libs: encoder: objects: Add a global field to track the temporal level Adds a temporal_id field in encoder picture base class to track the temporal level each frame belongs to
Created attachment 361489 [details] [review] libs: encoder: h264: Add machinery for implementing hierarchical-prediction Adds some basic building blocks to ease the implementation of hierarchical prediction modes. -- add an utility method to find temporal level of each frame -- define max_ref_frame count based on temporal level count -- add temporal_level_div[] for finding temporal level each frame to be encoded. -- find ip_period based on temporal level count
Created attachment 361490 [details] [review] libs: encoder: h264: Add new property "prediction-type" Adds new property "prediction-type" to select different reference picture selection modes like hierarchical-p, hierarchical-b etc.
Created attachment 361492 [details] [review] libs: encoder: h264: Fix frame_num generation The frame_num generation was not correctly implemented. According to h264 spec, frame_num should get incremented for each frame if previous frame is a referece frame. For eg: IPBPB sequece should have the frame numbers 0,1,2,2,3
Created attachment 361493 [details] [review] libs: encoder: h264: Add Hierarchical-P encode Frames are encoded as differnt layers. A frame in a particular layer will use pictures in lower or same layer as references. Which means decoder can drop the frames in upper layer ,but still decode lower layer frames. eg: with 3 temporal layers T3: P1 P3 P5 P7 T2: P2 P6 T1: P0 P4 P8 T1, T2, T3: Temporal Layers P1...pn: P-Frames: P0->P1 , P0->P2, P2->P3, P0->P4......repeat
Created attachment 361494 [details] [review] libs: encoder: h264: Add Hierarchical-B encode Frames are encoded as differnt layers. Frame in a particular layer will use pictures in lower or same layer as references. Which means decoder can drop the frames in upper layer ,but still decode lower layer frames. B-frames, except the one in top most layer are reference frames, all th base layer frames are I or P. eg: with 3 temporal layers T3: B1 B3 B5 B7 T2: B2 B6 T1: I0 P4 P8 T1, T2, T3: Temporal Layers P1...Pn: P-Frames: B1...Bn: B-frames: T1: I0->P4 , P4->P8 etc.. T2: I0--> B2 <-- P4 T3: I0--> B1 <-- B2, B2 --> B3 <-- P4
Review of attachment 361486 [details] [review]: lgtm
Review of attachment 361487 [details] [review]: ::: gst-libs/gst/vaapi/gstvaapiencoder_h264.c @@ +2995,3 @@ encoder->num_views = 1; encoder->view_idx = 0; + encoder->temporal_levels = 1; what about use MIN_TEMPORAL_LEVELS ?
Review of attachment 361488 [details] [review]: this patch doesn't make sense to me. It should be merged to other patch where logic is added, or just removed.
Review of attachment 361489 [details] [review]: lgtm
Review of attachment 361490 [details] [review]: ::: gst-libs/gst/vaapi/gstvaapiencoder_h264.c @@ +114,3 @@ + GST_VAAPI_ENCODER_H264_PREDICTION_DEFAULT = 0, + GST_VAAPI_ENCODER_H264_PREDICTION_HIERARCHICAL_P = 1, + GST_VAAPI_ENCODER_H264_PREDICTION_HIERARCHICAL_B = 2 AFAIU in this case there is no need to set a manual value to the enums, since that is the default @@ +132,3 @@ + {GST_VAAPI_ENCODER_H264_PREDICTION_HIERARCHICAL_B, + "Hierarchical B frame encode", + "hierarchical-b"} The GEnumValue array must be terminated by a struct with all members being 0. {0, NULL, NULL},
Review of attachment 361492 [details] [review]: ::: gst-libs/gst/vaapi/gstvaapiencoder_h264.c @@ +2852,3 @@ + reorder_pool->prev_frame_is_ref = TRUE; + else + reorder_pool->prev_frame_is_ref = FALSE; this would be simpler as reorder_pool->prev_frame_is_ref = GST_VAAPI_ENC_PICTURE_IS_REFRENCE (picture); or even better reorder_pool->prev_frame_is_ref = GST_VAAPI_ENC_PICTURE_IS_REFRENCE (picture) || GST_VAAPI_ENC_PICTURE_IS_IDR (picture); removing the up assignation
Review of attachment 361493 [details] [review]: ::: gst-libs/gst/vaapi/gstvaapiencoder_h264.c @@ +1931,3 @@ + + if (picture->type != GST_VAAPI_PICTURE_TYPE_B) + return TRUE; these two lines don't make sense. Perhaps they should be merged in the next patch
Review of attachment 361494 [details] [review]: also, please correct the log message: differnt => different, comma usage
Review of attachment 361493 [details] [review]: check also the commit log message
Hyunjun, thanks for bringing these patches to Bugzilla! BTW please make sure the low-power encoding is disabled in Hierarchical-B (B-frames are not supported by the intel driver in VDENC/low-power). @Victor: Thanks for the review :)
Created attachment 361945 [details] [review] libs: encoder: h264: Add property "temporal-levels" Adds new property "temporal-levels" to select the number of temporal levels to be included in the encoded stream.
Created attachment 361946 [details] [review] libs: encoder: h264: Add machinery for implementing hierarchical-prediction Adds some basic building blocks to ease the implementation of hierarchical prediction modes. -- add an utility method to find temporal level of each frame -- define max_ref_frame count based on temporal level count -- add temporal_level_div[] for finding temporal level each frame to be encoded. -- find ip_period based on temporal level count Signed-off-by: Sreerenj Balachandran <sreerenj.balachandran@intel.com> https://bugzilla.gnome.org/show_bug.cgi?id=788918
Created attachment 361947 [details] [review] libs: encoder: h264: Add new property "prediction-type" Adds new property "prediction-type" to select different reference picture selection modes like hierarchical-p, hierarchical-b etc.
Created attachment 361948 [details] [review] libs: encoder: h264: Fix frame_num generation The frame_num generation was not correctly implemented. According to h264 spec, frame_num should get incremented for each frame if previous frame is a referece frame. For eg: IPBPB sequece should have the frame numbers 0,1,2,2,3
Created attachment 361949 [details] [review] libs: encoder: h264: Add Hierarchical-P encode Frames are encoded as different layers. A frame in a particular layer will use pictures in lower or same layer as references. Which means decoder can drop the frames in upper layer but still decode lower layer frames. eg: with 3 temporal layers T3: P1 P3 P5 P7 T2: P2 P6 T1: P0 P4 P8 T1, T2, T3: Temporal Layers P1...pn: P-Frames: P0->P1 , P0->P2, P2->P3, P0->P4......repeat
Created attachment 361950 [details] [review] libs: encoder: h264: Add Hierarchical-B encode Frames are encoded as different layers. Frame in a particular layer will use pictures in lower or same layer as references. Which means decoder can drop the frames in upper layer but still decode lower layer frames. B-frames, except the one in top most layer, are reference frames. All the base layer frames are I or P. eg: with 3 temporal layers T3: B1 B3 B5 B7 T2: B2 B6 T1: I0 P4 P8 T1, T2, T3: Temporal Layers P1...Pn: P-Frames: B1...Bn: B-frames: T1: I0->P4 , P4->P8 etc.. T2: I0--> B2 <-- P4 T3: I0--> B1 <-- B2, B2 --> B3 <-- P4
(In reply to Víctor Manuel Jáquez Leal from comment #14) > Review of attachment 361492 [details] [review] [review]: > > ::: gst-libs/gst/vaapi/gstvaapiencoder_h264.c > @@ +2852,3 @@ > + reorder_pool->prev_frame_is_ref = TRUE; > + else > + reorder_pool->prev_frame_is_ref = FALSE; > > this would be simpler as > > reorder_pool->prev_frame_is_ref = GST_VAAPI_ENC_PICTURE_IS_REFRENCE > (picture); > > or even better > > reorder_pool->prev_frame_is_ref = GST_VAAPI_ENC_PICTURE_IS_REFRENCE > (picture) || GST_VAAPI_ENC_PICTURE_IS_IDR (picture); > > removing the up assignation Thanks for simpler suggesion. I think GST_VAAPI_ENC_PICTURE_IS_IDR is not necessary to see if it's reference or not since it should be set if it's IDR picture beforehand. I modified according to your first suggession.:)
(In reply to sreerenj from comment #18) > Hyunjun, thanks for bringing these patches to Bugzilla! > > > BTW please make sure the low-power encoding is disabled in Hierarchical-B > (B-frames are not supported by the intel driver in VDENC/low-power). > Sure. It disables Hierarchical-B if it's low power encoding mode here. @@ -2695,6 +2778,10 @@ reset_properties (GstVaapiEncoderH264 * encoder) if (base_encoder->max_num_ref_frames_1 < 1 && encoder->num_bframes > 0) { GST_WARNING ("Disabling b-frame since the driver doesn't support it"); encoder->num_bframes = 0; + + if (encoder->prediction_type == + GST_VAAPI_ENCODER_H264_PREDICTION_HIERARCHICAL_B) + encoder->prediction_type = GST_VAAPI_ENCODER_H264_PREDICTION_DEFAULT; }
Attachment 361486 [details] pushed as 89717a4 - libs: encoder: objects: Add a reference flag Attachment 361945 [details] pushed as e49859f - libs: encoder: h264: Add property "temporal-levels" Attachment 361946 [details] pushed as dc58345 - libs: encoder: h264: Add machinery for implementing hierarchical-prediction Attachment 361947 [details] pushed as ff91d62 - libs: encoder: h264: Add new property "prediction-type" Attachment 361948 [details] pushed as 904931d - libs: encoder: h264: Fix frame_num generation Attachment 361949 [details] pushed as 53c8369 - libs: encoder: h264: Add Hierarchical-P encode Attachment 361950 [details] pushed as 6a3f30a - libs: encoder: h264: Add Hierarchical-B encode