After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 788918 - vaapi: supports hierarchical prediction encoding
vaapi: supports hierarchical prediction encoding
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gstreamer-vaapi
git master
Other All
: Normal normal
: 1.13.1
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2017-10-13 08:51 UTC by Hyunjun Ko
Modified: 2017-11-08 12:59 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
libs: encoder: objects: Add a reference flag (1.82 KB, patch)
2017-10-13 08:58 UTC, Hyunjun Ko
committed Details | Review
libs: encoder: h264: Add property "temporal-levels" (3.74 KB, patch)
2017-10-13 08:58 UTC, Hyunjun Ko
none Details | Review
libs: encoder: objects: Add a global field to track the temporal level (1.03 KB, patch)
2017-10-13 08:59 UTC, Hyunjun Ko
none Details | Review
libs: encoder: h264: Add machinery for implementing hierarchical-prediction (4.16 KB, patch)
2017-10-13 08:59 UTC, Hyunjun Ko
none Details | Review
libs: encoder: h264: Add new property "prediction-type" (4.34 KB, patch)
2017-10-13 09:00 UTC, Hyunjun Ko
none Details | Review
libs: encoder: h264: Fix frame_num generation (6.25 KB, patch)
2017-10-13 09:01 UTC, Hyunjun Ko
none Details | Review
libs: encoder: h264: Add Hierarchical-P encode (10.89 KB, patch)
2017-10-13 09:02 UTC, Hyunjun Ko
none Details | Review
libs: encoder: h264: Add Hierarchical-B encode (12.72 KB, patch)
2017-10-13 09:02 UTC, Hyunjun Ko
none Details | Review
libs: encoder: h264: Add property "temporal-levels" (3.76 KB, patch)
2017-10-20 13:22 UTC, Hyunjun Ko
committed Details | Review
libs: encoder: h264: Add machinery for implementing hierarchical-prediction (4.67 KB, patch)
2017-10-20 13:24 UTC, Hyunjun Ko
committed Details | Review
libs: encoder: h264: Add new property "prediction-type" (4.37 KB, patch)
2017-10-20 13:25 UTC, Hyunjun Ko
committed Details | Review
libs: encoder: h264: Fix frame_num generation (6.14 KB, patch)
2017-10-20 13:26 UTC, Hyunjun Ko
committed Details | Review
libs: encoder: h264: Add Hierarchical-P encode (10.84 KB, patch)
2017-10-20 13:27 UTC, Hyunjun Ko
committed Details | Review
libs: encoder: h264: Add Hierarchical-B encode (12.84 KB, patch)
2017-10-20 13:28 UTC, Hyunjun Ko
committed Details | Review

Description Hyunjun Ko 2017-10-13 08:51:43 UTC
This is to support hierarchical prediction encoding for H264.

I bring patches from https://cgit.freedesktop.org/~sree/gstreamer-vaapi/log/?h=advanced_enc and fix something major and minor.

This includes:
- Implements Temporal scalable encoding
- Implements hierarchical P frames.
- Implements hierarchical B frames.

For more information, see https://bugzilla.gnome.org/show_bug.cgi?id=725536#c9
Comment 1 Hyunjun Ko 2017-10-13 08:58:25 UTC
Created attachment 361486 [details] [review]
libs: encoder: objects: Add a reference flag

We can have p-frame as non-ref and also b-frame as ref 
which are not supported yet. Reference flag
is the first machinery needed for more advanced
reference picture selection modes.
Comment 2 Hyunjun Ko 2017-10-13 08:58:59 UTC
Created attachment 361487 [details] [review]
libs: encoder: h264: Add property "temporal-levels"

Adds new property "temporal-levels" to select the number of
temporal levels to be included in the encoded stream.
Comment 3 Hyunjun Ko 2017-10-13 08:59:28 UTC
Created attachment 361488 [details] [review]
libs: encoder: objects: Add a global field to track the   temporal level

Adds a temporal_id field in encoder picture base class to track
the temporal level each frame belongs to
Comment 4 Hyunjun Ko 2017-10-13 08:59:57 UTC
Created attachment 361489 [details] [review]
libs: encoder: h264: Add machinery for implementing  hierarchical-prediction

Adds some basic building blocks to ease the implementation
of hierarchical prediction modes.

-- add an utility method to find temporal level of each frame
-- define max_ref_frame count based on temporal level count
-- add temporal_level_div[] for finding temporal level each frame
to be encoded.
-- find ip_period based on temporal level count
Comment 5 Hyunjun Ko 2017-10-13 09:00:40 UTC
Created attachment 361490 [details] [review]
libs: encoder: h264: Add new property "prediction-type"

Adds new property "prediction-type" to select different reference
picture selection modes like hierarchical-p, hierarchical-b etc.
Comment 6 Hyunjun Ko 2017-10-13 09:01:19 UTC
Created attachment 361492 [details] [review]
libs: encoder: h264: Fix frame_num generation

The frame_num generation was not correctly implemented.
According to h264 spec, frame_num should get incremented
for each frame if previous frame is a referece frame.

For eg: IPBPB sequece should have the frame numbers 0,1,2,2,3
Comment 7 Hyunjun Ko 2017-10-13 09:02:03 UTC
Created attachment 361493 [details] [review]
libs: encoder: h264: Add Hierarchical-P encode

Frames are encoded as differnt layers. A frame in a particular
layer will use pictures in lower or same layer as references.
Which means decoder can drop the frames in upper layer ,but still
decode lower layer frames.

eg: with 3 temporal layers

T3:             P1            P3              P5              P7

T2:                   P2                              P6

T1:   P0                                P4                        P8

T1, T2, T3: Temporal Layers
P1...pn:   P-Frames:
P0->P1 , P0->P2, P2->P3, P0->P4......repeat
Comment 8 Hyunjun Ko 2017-10-13 09:02:36 UTC
Created attachment 361494 [details] [review]
libs: encoder: h264: Add Hierarchical-B encode

Frames are encoded as differnt layers. Frame in a particular
layer will use pictures in lower or same layer as references.
Which means decoder can drop the frames in upper layer ,but still
decode lower layer frames.

B-frames, except the one in top most layer are reference frames, all 
th base layer frames are I or P.

eg: with 3 temporal layers

    T3:             B1            B3              B5              B7

    T2:                   B2                              B6

    T1:   I0                                P4                        P8

    T1, T2, T3: Temporal Layers
    P1...Pn:   P-Frames:
    B1...Bn:   B-frames:
    T1: I0->P4 , P4->P8 etc..
    T2: I0--> B2 <-- P4
    T3: I0--> B1 <-- B2, B2 --> B3 <-- P4
Comment 9 Víctor Manuel Jáquez Leal 2017-10-19 13:28:34 UTC
Review of attachment 361486 [details] [review]:

lgtm
Comment 10 Víctor Manuel Jáquez Leal 2017-10-19 13:29:52 UTC
Review of attachment 361487 [details] [review]:

::: gst-libs/gst/vaapi/gstvaapiencoder_h264.c
@@ +2995,3 @@
   encoder->num_views = 1;
   encoder->view_idx = 0;
+  encoder->temporal_levels = 1;

what about use MIN_TEMPORAL_LEVELS ?
Comment 11 Víctor Manuel Jáquez Leal 2017-10-19 13:30:48 UTC
Review of attachment 361488 [details] [review]:

this patch doesn't make sense to me. It should be merged to other patch where logic is added, or just removed.
Comment 12 Víctor Manuel Jáquez Leal 2017-10-19 13:33:58 UTC
Review of attachment 361489 [details] [review]:

lgtm
Comment 13 Víctor Manuel Jáquez Leal 2017-10-19 13:46:52 UTC
Review of attachment 361490 [details] [review]:

::: gst-libs/gst/vaapi/gstvaapiencoder_h264.c
@@ +114,3 @@
+  GST_VAAPI_ENCODER_H264_PREDICTION_DEFAULT = 0,
+  GST_VAAPI_ENCODER_H264_PREDICTION_HIERARCHICAL_P = 1,
+  GST_VAAPI_ENCODER_H264_PREDICTION_HIERARCHICAL_B = 2

AFAIU in this case there is no need to set a manual value to the enums, since that is the default

@@ +132,3 @@
+      {GST_VAAPI_ENCODER_H264_PREDICTION_HIERARCHICAL_B,
+            "Hierarchical B frame encode",
+          "hierarchical-b"}

The GEnumValue array must be terminated by a struct with all members being 0. 

{0, NULL, NULL},
Comment 14 Víctor Manuel Jáquez Leal 2017-10-19 13:53:08 UTC
Review of attachment 361492 [details] [review]:

::: gst-libs/gst/vaapi/gstvaapiencoder_h264.c
@@ +2852,3 @@
+    reorder_pool->prev_frame_is_ref = TRUE;
+  else
+    reorder_pool->prev_frame_is_ref = FALSE;

this would be simpler as

reorder_pool->prev_frame_is_ref = GST_VAAPI_ENC_PICTURE_IS_REFRENCE (picture);

or even better

reorder_pool->prev_frame_is_ref = GST_VAAPI_ENC_PICTURE_IS_REFRENCE (picture) || GST_VAAPI_ENC_PICTURE_IS_IDR (picture);

removing the up assignation
Comment 15 Víctor Manuel Jáquez Leal 2017-10-19 14:31:21 UTC
Review of attachment 361493 [details] [review]:

::: gst-libs/gst/vaapi/gstvaapiencoder_h264.c
@@ +1931,3 @@
+
+  if (picture->type != GST_VAAPI_PICTURE_TYPE_B)
+    return TRUE;

these two lines don't make sense. Perhaps they should be merged in the next patch
Comment 16 Víctor Manuel Jáquez Leal 2017-10-19 14:44:44 UTC
Review of attachment 361494 [details] [review]:

also, please correct the log message: differnt => different, comma usage
Comment 17 Víctor Manuel Jáquez Leal 2017-10-19 14:44:58 UTC
Review of attachment 361493 [details] [review]:

check also the commit log message
Comment 18 sreerenj 2017-10-20 12:53:08 UTC
Hyunjun, thanks for bringing these patches to Bugzilla!


BTW please make sure the low-power encoding is disabled in Hierarchical-B (B-frames are not supported by the intel driver in VDENC/low-power).

@Victor: Thanks for the review :)
Comment 19 Hyunjun Ko 2017-10-20 13:22:52 UTC
Created attachment 361945 [details] [review]
libs: encoder: h264: Add property "temporal-levels"

Adds new property "temporal-levels" to select the number of
temporal levels to be included in the encoded stream.
Comment 20 Hyunjun Ko 2017-10-20 13:24:41 UTC
Created attachment 361946 [details] [review]
libs: encoder: h264: Add machinery for implementing   hierarchical-prediction

Adds some basic building blocks to ease the implementation
of hierarchical prediction modes.

-- add an utility method to find temporal level of each frame
-- define max_ref_frame count based on temporal level count
-- add temporal_level_div[] for finding temporal level each frame
to be encoded.
-- find ip_period based on temporal level count

Signed-off-by: Sreerenj Balachandran <sreerenj.balachandran@intel.com>

https://bugzilla.gnome.org/show_bug.cgi?id=788918
Comment 21 Hyunjun Ko 2017-10-20 13:25:55 UTC
Created attachment 361947 [details] [review]
libs: encoder: h264: Add new property "prediction-type"

Adds new property "prediction-type" to select different reference
picture selection modes like hierarchical-p, hierarchical-b etc.
Comment 22 Hyunjun Ko 2017-10-20 13:26:56 UTC
Created attachment 361948 [details] [review]
libs: encoder: h264: Fix frame_num generation

The frame_num generation was not correctly implemented.
According to h264 spec, frame_num should get incremented
for each frame if previous frame is a referece frame.

For eg: IPBPB sequece should have the frame numbers 0,1,2,2,3
Comment 23 Hyunjun Ko 2017-10-20 13:27:33 UTC
Created attachment 361949 [details] [review]
libs: encoder: h264: Add Hierarchical-P encode

Frames are encoded as different layers. A frame in a particular
layer will use pictures in lower or same layer as references.
Which means decoder can drop the frames in upper layer but still
decode lower layer frames.

eg: with 3 temporal layers

T3:             P1            P3              P5              P7

T2:                   P2                              P6

T1:   P0                                P4                        P8

T1, T2, T3: Temporal Layers
P1...pn:   P-Frames:
P0->P1 , P0->P2, P2->P3, P0->P4......repeat
Comment 24 Hyunjun Ko 2017-10-20 13:28:07 UTC
Created attachment 361950 [details] [review]
libs: encoder: h264: Add Hierarchical-B encode


Frames are encoded as different layers. Frame in a particular
layer will use pictures in lower or same layer as references.
Which means decoder can drop the frames in upper layer but still
decode lower layer frames.

B-frames, except the one in top most layer, are reference frames.
All the base layer frames are I or P.

eg: with 3 temporal layers

    T3:             B1            B3              B5              B7

    T2:                   B2                              B6

    T1:   I0                                P4                        P8

    T1, T2, T3: Temporal Layers
    P1...Pn:   P-Frames:
    B1...Bn:   B-frames:
    T1: I0->P4 , P4->P8 etc..
    T2: I0--> B2 <-- P4
    T3: I0--> B1 <-- B2, B2 --> B3 <-- P4
Comment 25 Hyunjun Ko 2017-10-20 13:30:35 UTC
(In reply to Víctor Manuel Jáquez Leal from comment #14)
> Review of attachment 361492 [details] [review] [review]:
> 
> ::: gst-libs/gst/vaapi/gstvaapiencoder_h264.c
> @@ +2852,3 @@
> +    reorder_pool->prev_frame_is_ref = TRUE;
> +  else
> +    reorder_pool->prev_frame_is_ref = FALSE;
> 
> this would be simpler as
> 
> reorder_pool->prev_frame_is_ref = GST_VAAPI_ENC_PICTURE_IS_REFRENCE
> (picture);
> 
> or even better
> 
> reorder_pool->prev_frame_is_ref = GST_VAAPI_ENC_PICTURE_IS_REFRENCE
> (picture) || GST_VAAPI_ENC_PICTURE_IS_IDR (picture);
> 
> removing the up assignation

Thanks for simpler suggesion. I think GST_VAAPI_ENC_PICTURE_IS_IDR is not necessary to see if it's reference or not since it should be set if it's IDR picture beforehand. I modified according to your first suggession.:)
Comment 26 Hyunjun Ko 2017-10-20 13:41:00 UTC
(In reply to sreerenj from comment #18)
> Hyunjun, thanks for bringing these patches to Bugzilla!
> 
> 
> BTW please make sure the low-power encoding is disabled in Hierarchical-B
> (B-frames are not supported by the intel driver in VDENC/low-power).
> 

Sure. It disables Hierarchical-B if it's low power encoding mode here.

@@ -2695,6 +2778,10 @@ reset_properties (GstVaapiEncoderH264 * encoder)
   if (base_encoder->max_num_ref_frames_1 < 1 && encoder->num_bframes > 0) {
     GST_WARNING ("Disabling b-frame since the driver doesn't support it");
     encoder->num_bframes = 0;
+
+    if (encoder->prediction_type ==
+        GST_VAAPI_ENCODER_H264_PREDICTION_HIERARCHICAL_B)
+      encoder->prediction_type = GST_VAAPI_ENCODER_H264_PREDICTION_DEFAULT;
   }
Comment 27 Víctor Manuel Jáquez Leal 2017-11-08 12:57:54 UTC
Attachment 361486 [details] pushed as 89717a4 - libs: encoder: objects: Add a reference flag
Attachment 361945 [details] pushed as e49859f - libs: encoder: h264: Add property "temporal-levels"
Attachment 361946 [details] pushed as dc58345 - libs: encoder: h264: Add machinery for implementing hierarchical-prediction
Attachment 361947 [details] pushed as ff91d62 - libs: encoder: h264: Add new property "prediction-type"
Attachment 361948 [details] pushed as 904931d - libs: encoder: h264: Fix frame_num generation
Attachment 361949 [details] pushed as 53c8369 - libs: encoder: h264: Add Hierarchical-P encode
Attachment 361950 [details] pushed as 6a3f30a - libs: encoder: h264: Add Hierarchical-B encode