Bug 795462 – video: Add support for the 10bit yuv pixel format from the Rockchip SoC platform

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 795462 - video: Add support for the 10bit yuv pixel format from the Rockchip SoC platform


Summary:	video: Add support for the 10bit yuv pixel format from the Rockchip SoC platform


Status:	RESOLVED FIXED

Product:	GStreamer
Classification:	Platform
Component:	gst-plugins-base
Version:	git master
Hardware:	Other Linux

Importance:	Normal enhancement
Target Milestone:	1.15.1
Assigned To:	GStreamer Maintainers
QA Contact:	GStreamer Maintainers

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2018-04-22 13:44 UTC by Randy Li (ayaka)
Modified:	2018-05-19 17:37 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
Only unpack work version (7.08 KB, patch) 2018-04-22 13:44 UTC, Randy Li (ayaka)	none	Details \| Review
A state machine implementation sample (7.35 KB, patch) 2018-04-22 19:53 UTC, Randy Li (ayaka)	none	Details \| Review
Both pack and unpack work (8.14 KB, patch) 2018-04-24 07:07 UTC, Randy Li (ayaka)	none	Details \| Review
The final version? (8.17 KB, patch) 2018-04-24 19:57 UTC, Randy Li (ayaka)	none	Details \| Review
Fix the compiler warnings (8.19 KB, patch) 2018-04-25 01:07 UTC, Randy Li (ayaka)	none	Details \| Review
video: Add NV12_10LE40 pixel format (8.56 KB, patch) 2018-04-29 00:24 UTC, Nicolas Dufresne (ndufresne)	reviewed	Details \| Review
Fix the unpack function, it won't shift chroma value twice (8.77 KB, patch) 2018-05-14 02:59 UTC, Randy Li (ayaka)	committed	Details \| Review

Description Randy Li (ayaka) 2018-04-22 13:44:45 UTC

Created attachment 371229 [details] [review]
Only unpack work version

This pixel format is a variant of P010_10LE, without any filled bits between pixels in a stride.


You can download the raw dump here, and i420_10le from the software decoder. Also the P010_10LEC to i420_10le results are uploaded.
https://drive.google.com/drive/folders/1VceFGn67E4e_zzq5---85caotY_h3REp?usp=sharing

Comment 1 Nicolas Dufresne (ndufresne) 2018-04-22 14:21:32 UTC

Review of attachment 371229 [details] [review]:

Ok, so this is fully packet 10bit (no padding). Please correct me, but for the Y plane you'll have, 4 pixels per 5bytes:

5bytes:  Y1 0-9, Y2 10-19, Y3 20-29, Y4 20-39
5bytes:  Y5 ...

And for UV plane, you'll have 2 pixels per 5 bytes:
5bytes:  U1V1: 0-19, U2V2: 20-39
5bytes:  U3V3 ...

If that is correct, my suggestion is to create a state machine with 5 states, and then combine to reduce the algo. That's what I did for the 32bit packed variants. I didn't unroll the loops though, as it's just for debugging. Then the state machine can be revered for the pack function.

::: gst-libs/gst/video/video-format.c
@@ +5348,3 @@
       GST_MAKE_FOURCC ('X', 'V', '2', '0'), DPTH10_10_10, PSTR0, PLANE011,
       OFFS001, SUB422, PACK_NV16_10LE32),
+  MAKE_YUV_C_LE_FORMAT (P010_10LEC, "raw video", 0x00000000, DPTH10_10_10,

For the fourcc, as you can see for the Xilinx variant, we have used the fourcc used by the vendor (X is for Xilinx). May I suggest using something like RV20, or RK20, or if you prefer it could be RV12, as long as it's not already used. I like the idea to use two letter RK.., makes the origin pretty clear ;-P

::: gst-libs/gst/video/video-format.h
@@ +114,3 @@
  * @GST_VIDEO_FORMAT_Y444_12BE: planar 4:4:4 YUV, 12 bits per channel (Since: 1.12)
  * @GST_VIDEO_FORMAT_Y444_12LE: planar 4:4:4 YUV, 12 bits per channel (Since: 1.12)
+ * @GST_VIDEO_FORMAT_P010_10LEC: planar 4:2:0 YUV with interleaved UV plane, 10 bits per channel, a compat variant of P010

I'd use the "Fully packed variant of .." when describing this. Though, P010 is already a variant of NV12, so may I proposed the following name ?

  NV12_10LE40

Which reads NV12 layout (one Y plane and one UV plane), 10bit per component, packed over 40bits (hence no padding).

::: gst-libs/gst/video/video-info.c
@@ +1030,3 @@
       break;
+    case GST_VIDEO_FORMAT_P010_10LEC:
+      info->stride[0] = GST_ROUND_UP_64 (width * 10 / 8);

Shouldn't this roundup to 5, while rounding up to 64 is hardware specific ? Extra alignment could be used by default if it more performant for the pack/unpack functions. Though, these function are likely just for debugging, the HW will take care of it in normal usage.

@@ +1034,3 @@
+      info->offset[0] = 0;
+      info->offset[1] = info->stride[0] * GST_ROUND_UP_16 (height);
+      cr_h = GST_ROUND_UP_16 (height) / 2;

CbCr plane (or UV plane) only need to be ROUND_UP_2 (so the odd last line of the Y plane have a line of UV), 16 seems like an H264/H265 specific (16x16 macro blocks).

Comment 2 Randy Li (ayaka) 2018-04-22 19:53:33 UTC

Created attachment 371243 [details] [review]
A state machine implementation sample

I wonder whether it would get a better performance with the helper of the state machine, but the code looks better.
About the packing function, unless writing 4 pixel at the same time, you need to read back the data written.

Comment 3 Nicolas Dufresne (ndufresne) 2018-04-22 20:29:18 UTC

Review of attachment 371243 [details] [review]:

It's nicer, but not very efficient if I read correctly, since you are reading each 16bit short twice.

Comment 4 Randy Li (ayaka) 2018-04-24 07:07:54 UTC

Created attachment 371312 [details] [review]
Both pack and unpack work

Would you accept this version?

Comment 5 Nicolas Dufresne (ndufresne) 2018-04-24 18:33:15 UTC

Review of attachment 371312 [details] [review]:

::: gst-libs/gst/video/video-info.c
@@ +1030,3 @@
       break;
+    case GST_VIDEO_FORMAT_NV12_10LE40:
+      info->stride[0] = ((width * 5 >> 2) + 4) & (~4);

As per IRC, you said you wanted to round up by 4, but this roundup technique only applies to power of two. You'll need * / instead.

Comment 6 Nicolas Dufresne (ndufresne) 2018-04-24 18:47:58 UTC

Review of attachment 371312 [details] [review]:

::: gst-libs/gst/video/video-format.h
@@ +114,3 @@
  * @GST_VIDEO_FORMAT_Y444_12BE: planar 4:4:4 YUV, 12 bits per channel (Since: 1.12)
  * @GST_VIDEO_FORMAT_Y444_12LE: planar 4:4:4 YUV, 12 bits per channel (Since: 1.12)
+ * @GST_VIDEO_FORMAT_NV12_10LE40: Fully packed variant of NV12_10LE32

As you made me notice, (Since 1.16) please.

Comment 7 Randy Li (ayaka) 2018-04-24 19:57:40 UTC

Created attachment 371346 [details] [review]
The final version?

Comment 8 Nicolas Dufresne (ndufresne) 2018-04-24 20:35:47 UTC

Review of attachment 371346 [details] [review]:

Looks good, thanks.

Comment 9 Nicolas Dufresne (ndufresne) 2018-04-24 20:51:50 UTC

commit 35d0783fca504997e1be30cfebfa8de042ea49ab (HEAD -> master, origin/master, origin/HEAD)
Author: ayaka <ayaka@soulik.info>
Date:   Sun Mar 26 04:54:42 2017 +0800

    video: Add NV12_10LE40 pixel format
    
    This pixel format is a fully packed variant of NV12, a luma
    pixel would take 10bits in memory, without any filled bits
    between pixels in a stride. The color range follows
    the BT.2020 standard.
    
    In order to get a performance in hardware memory
    operation, it may expend the stride, append zero data at the
    end of echo lines.
    
    Signed-off-by: ayaka <ayaka@soulik.info>
    
    https://bugzilla.gnome.org/show_bug.cgi?id=795462

Comment 10 Nicolas Dufresne (ndufresne) 2018-04-24 21:06:15 UTC

I forget to run make check, it fails the generic unit test:

commit 9a3ee4838f2ab3af85ac003eab22be27f4c0392d (HEAD -> master, origin/master, origin/HEAD)
Author: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Date:   Tue Apr 24 17:05:17 2018 -0400

    Revert "video: Add NV12_10LE40 pixel format"
    
    This reverts commit 35d0783fca504997e1be30cfebfa8de042ea49ab.

Comment 11 Randy Li (ayaka) 2018-04-25 01:07:44 UTC

Created attachment 371357 [details] [review]
Fix the compiler warnings

Comment 12 Randy Li (ayaka) 2018-04-27 03:03:59 UTC

0:02:07.808667125 11220       0x646430 ERROR             video-info video-info.c:694:fill_planes: Frame size 2147483647x1073741823
 would overflow                                                                                                                   
                                                                                                                                  
Breakpoint 1, fill_planes (info=0x7fffffffd5f0) at video-info.c:694                                                               
694         GST_ERROR ("Frame size %ux%u would overflow", info->width, info->height);                                             
(gdb) print info->finfo->format                                                                                                   
$7 = GST_VIDEO_FORMAT_ARGB                                                                                                        
(gdb) continue                                                                                                                    
Continuing.                                                                                                                       
0:02:10.782590916 11220       0x646430 ERROR             video-info video-info.c:694:fill_planes: Frame size 1073741823x1073741823
 would overflow                                                                                                                   
                                                                                                                                  
Breakpoint 1, fill_planes (info=0x7fffffffd5f0) at video-info.c:694                                                               
694         GST_ERROR ("Frame size %ux%u would overflow", info->width, info->height);                                             
(gdb) print info->finfo->format                                                                                                   
$8 = GST_VIDEO_FORMAT_ARGB                                                                                                        
(gdb) continue                                                                                                                    
Continuing.                                                                                                                       
0:02:13.972354418 11220       0x646430 ERROR             video-info video-info.c:694:fill_planes: Frame size 2147483647x214748364$ would overflow                 

Breakpoint 1, fill_planes (info=0x7fffffffd5f0) at video-info.c:694                                                               
694         GST_ERROR ("Frame size %ux%u would overflow", info->width, info->height);                                             
(gdb) print info->finfo->format 
$9 = GST_VIDEO_FORMAT_ARGB      
(gdb) continue                  
Continuing.                     
0:02:17.165945971 11220       0x646430 ERROR             video-info video-info.c:694:fill_planes: Frame size 2147483647x214748364$ would overflow                 

Breakpoint 1, fill_planes (info=0x7fffffffd5f0) at video-info.c:694                                                               
694         GST_ERROR ("Frame size %ux%u would overflow", info->width, info->height);                                             
(gdb) print info->finfo->format 
$10 = GST_VIDEO_FORMAT_ARGB     
(gdb) continue                  
Continuing.                     
0:02:22.672242129 11220       0x646430 ERROR             video-info video-info.c:694:fill_planes: Frame size 1073741824x1 would o$erflow 

Does log show that it is not my pixel format cause the bug?

Comment 13 Nicolas Dufresne (ndufresne) 2018-04-27 07:08:19 UTC

These are expected errors in the test. What fails is a binary diff, and it dumps the values.

Comment 14 Nicolas Dufresne (ndufresne) 2018-04-29 00:24:07 UTC

Created attachment 371507 [details] [review]
video: Add NV12_10LE40 pixel format

This pixel format is a fully packed variant of NV12_10LE32,
a luma pixel would take 10bits in memory, without any
filled bits between pixels in a stride. The color range
follows the BT.2020 standard.

In order to get a better performance in hardware memory
operation, it may expend the stride, append zero data at the
end of echo lines.

Pack function by Nicolas Dufresne <nicolas@ndufresne.ca>

Comment 15 Randy Li (ayaka) 2018-04-29 12:55:19 UTC

Review of attachment 371507 [details] [review]:

4989	    if (!(flags & GST_VIDEO_PACK_FLAG_TRUNCATE_RANGE)) {
			4990	      Yn |= Yn >> 10;
			4991	      Un |= Un >> 10;
			4992	      Vn |= Vn >> 10;
			4993	    }
In the case 1 and 3, the Yn has modified with preivous operation, then first 10 bits may be modified again.
That is why comment those code when I sent it to you, I want solve this problem later.

About pack function part, it would still generate a different result than hardware decoder and I convert from a i420_10le manually.
Maybe it won't be a problem in a small resolution.

Comment 16 Randy Li (ayaka) 2018-05-14 02:59:22 UTC

Created attachment 371987 [details] [review]
Fix the unpack function, it won't shift chroma value twice

The way to fill truncate range is strange.
Yn |= Yn >> 10;
It would make the 4bit from MSB to and with the 4bit above LSB.
That is why the unpack and pack manual test never match the result from the software decoder.
But it looks all the previous pixel format are ok with that. So I won't fix it this time.
And it passes the test.

Comment 17 Nicolas Dufresne (ndufresne) 2018-05-14 16:29:38 UTC

Review of attachment 371987 [details] [review]:

I believe this is ready now, thanks a lot!

Comment 18 Nicolas Dufresne (ndufresne) 2018-05-19 17:36:59 UTC

commit 388b48511e90e008138e1842640b76934bd891dc (HEAD -> master)
Author: ayaka <ayaka@soulik.info>
Date:   Sat Apr 28 20:22:31 2018 -0400

    video: Add NV12_10LE40 pixel format
    
    This pixel format is a fully packed variant of NV12_10LE32,
    a luma pixel would take 10bits in memory, without any
    filled bits between pixels in a stride. The color range
    follows the BT.2020 standard.
    
    In order to get a better performance in hardware memory
    operation, it may expend the stride, append zero data at the
    end of echo lines.
    
    Pack function by Nicolas Dufresne.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=795462
    
    Signed-off-by: Nicolas Dufresne <nicolas@ndufresne.ca>
    Signed-off-by: ayaka <ayaka@soulik.info>