After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 781537 - NVDEC - Nvidia Decoder plugin
NVDEC - Nvidia Decoder plugin
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gst-plugins-bad
git master
Other All
: Normal enhancement
: 1.13.1
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2017-04-20 13:59 UTC by Benjamin
Modified: 2018-01-19 13:56 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
NVDEC - Nvidia Decoder plugin (33.43 KB, patch)
2017-05-10 13:12 UTC, atopilski@gmail.com
needs-work Details | Review
Change copyrights (6.00 KB, patch)
2017-05-10 15:39 UTC, atopilski@gmail.com
none Details | Review
2-commits (39.44 KB, patch)
2017-05-10 17:55 UTC, atopilski@gmail.com
none Details | Review
Fix picture buffer leak (47.48 KB, patch)
2017-05-11 17:21 UTC, atopilski@gmail.com
none Details | Review
FIX_INVALID_CUDA_CONTEXT (53.90 KB, patch)
2017-05-13 10:42 UTC, atopilski@gmail.com
none Details | Review
nvdec: New plugin using NVIDIA nvcuvid decoder API (33.51 KB, patch)
2017-05-25 09:46 UTC, Sebastian Dröge (slomo)
none Details | Review
nvdec: New plugin using NVIDIA nvcuvid decoder API (35.62 KB, patch)
2017-05-25 09:48 UTC, Sebastian Dröge (slomo)
needs-work Details | Review
Fixes after review (8.29 KB, patch)
2017-05-25 16:22 UTC, atopilski@gmail.com
none Details | Review
nvdec: New plugin for NVIDIA hardware video decode (42.27 KB, patch)
2017-05-30 07:24 UTC, Per-Erik Brodin
none Details | Review
nvdec: New plugin for NVIDIA hardware video decode (42.79 KB, patch)
2017-05-30 18:41 UTC, Per-Erik Brodin
none Details | Review
nvdec: New plugin for NVIDIA hardware video decode (44.12 KB, patch)
2017-06-05 06:25 UTC, Per-Erik Brodin
none Details | Review
nvdec: New plugin for NVIDIA hardware video decode (43.87 KB, patch)
2017-06-06 06:28 UTC, Per-Erik Brodin
none Details | Review
nvdec: New plugin for NVIDIA hardware video decode (44.39 KB, patch)
2017-06-27 00:03 UTC, Per-Erik Brodin
committed Details | Review

Description Benjamin 2017-04-20 13:59:51 UTC
I see there is no plugin for Nvidia Decoder, It would be great to have this plugin for Gstreamer.

as of right now if you try to transcode something with NVidia's GPU you must use CPU to decode, can't take advantage of GPU's decoder.

I see FFMPEG has it called "h264_cuvid" maybe a wrapper can be done for it to save time and not develop it from scratch.

Or if it has to be done based on NVdecoder API here is a guide.
http://developer2.download.nvidia.com/designworks/video-codec-sdk/secure/7.1/01/NVDEC_VideoDecoder_API_ProgGuide.pdf?aSJOwwYMXlACrYcwv0LSRlqNRXumKJdn_iw_30ZHJefdBydvRZu-8fxL_Yhva5WNhkUQ-qQuf3XvmEJiVAmUONdZX71YLCo5oT-3o0nlRGb0lRDZXboo9GfURffv8YPBexyP7kgPBhiezRBHGBDNvwHdWK41xyCwquc9sHTt-EzF1ra_4r2Qd4iY2aTm
Comment 1 Olivier Crête 2017-04-24 18:20:14 UTC
How is that different from using VDPAU to decode?
Comment 2 Tim-Philipp Müller 2017-04-24 18:31:57 UTC
I believe vdpau is deprecated/dead?
Comment 3 Sebastian Dröge (slomo) 2017-04-25 09:28:29 UTC
It's the (only?) supported, and recommended way of doing video decoding by NVIDIA.
Comment 4 Benjamin 2017-04-27 18:19:25 UTC
Yes correct VDPAU is dead and it only supports old chips of Nvidia Maxwell and Kepler, no support for Pascal and other newer chips.

Sebastian that's correct NVDEC is the only way to use Nvidia Decoding.
Comment 5 Víctor Manuel Jáquez Leal 2017-04-28 10:25:36 UTC
VDPAU is still supported by radeon driver in mesa and by nouveau. But for Nvidia propietary driver, NVDEC is the way to go.
Comment 6 Benjamin 2017-05-02 12:39:33 UTC
Hi, guys we started developing a plugin for NVDEC we got it to work we decode successfully and wanted to share what we have so far. 

It's our first plugin we are wiring so please don't judge if you see something funny in the code lol

I'm not sure what it takes if or if this is ready to be posted as a plugin in Gstreamer side for other developers to review and contribute to make it better but if this is good enough to be part of Gstreamer we can take down our repo and continue to contribute in Gstreamer's official repo.


https://github.com/bldsoft/gst-plugin-bad-nvdec/
Comment 7 Sebastian Dröge (slomo) 2017-05-02 13:15:01 UTC
That's great, thanks! Can you provide a patch (in "git format-patch" format) against gst-plugins-bad here, then we can review it and get it merged.
Comment 8 Benjamin 2017-05-03 15:52:13 UTC
to do that we would probably need help we have done this plugin based on http://siilo.dyndns.org/wiki/index.php/Creating_a_Plugin_template

and depends on:

gstreamer-1.0 >= $GST_REQUIRED
gstreamer-base-1.0 >= $GST_REQUIRED
gstreamer-controller-1.0 >= $GST_REQUIRED
gstreamer-video-1.0 >= $GST_REQUIRED

so it was done more like an extension.
Comment 9 Sebastian Dröge (slomo) 2017-05-08 14:17:32 UTC
Basically you would move your files into gst-plugins-bad/sys/nvdec, add things to sys/Makefile.am like the existing stuff there already, and then checks to configure.ac (like you already have now).

Just try it, and I'll help you with anything where you get stuck / review that patch then.
Comment 10 atopilski@gmail.com 2017-05-10 12:42:34 UTC
Hi Sebastian, i don't like old patch style and i created pull request https://github.com/GStreamer/gst-plugins-bad/pull/3 please review it.
Comment 11 Sebastian Dröge (slomo) 2017-05-10 12:57:20 UTC
We don't use github for patch review (or anything), it's just a mirror of our git repositories. Please check here: https://gstreamer.freedesktop.org/documentation/contribute/#how-to-submit-patches

tl;dr is: "git format-patch" style patch attached to this Bugzilla ticket :) Sorry for the inconvenience.
Comment 12 atopilski@gmail.com 2017-05-10 13:12:02 UTC
Created attachment 351544 [details] [review]
NVDEC - Nvidia Decoder plugin
Comment 13 atopilski@gmail.com 2017-05-10 13:15:53 UTC
Hi Sebastian, i created patch please review.
Comment 14 Sebastian Dröge (slomo) 2017-05-10 13:24:37 UTC
Great, thanks! I'll go through that some time this week or early next week, it's on my list :)
Comment 15 Sebastian Dröge (slomo) 2017-05-10 13:26:09 UTC
Review of attachment 351544 [details] [review]:

Build system parts look good, didn't look at the code but:

::: sys/nvdec/gstnvh264plugin.c
@@ +3,3 @@
+    Rixjob is free software: you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation, either version 3 of the License, or

Can you make this available under the LGPL 2.1+? We generally don't merge GPL code, even less GPL3.
Comment 16 atopilski@gmail.com 2017-05-10 13:49:15 UTC
Sure, we will update license notes.
Comment 17 atopilski@gmail.com 2017-05-10 13:58:51 UTC
I should add new patch file, or need update previous?
Comment 18 Sebastian Dröge (slomo) 2017-05-10 14:21:11 UTC
Just a new patch. Bugzilla can create diffs between them as needed.
Comment 19 atopilski@gmail.com 2017-05-10 15:39:56 UTC
Created attachment 351562 [details] [review]
Change copyrights

Updated copyrights
Comment 20 Sebastian Dröge (slomo) 2017-05-10 17:18:12 UTC
Please attach a full patch with all changes at once, not an incremental patch :) Looks good though, thanks!
Comment 21 atopilski@gmail.com 2017-05-10 17:55:08 UTC
Created attachment 351573 [details] [review]
2-commits
Comment 22 atopilski@gmail.com 2017-05-11 17:21:54 UTC
Created attachment 351650 [details] [review]
Fix picture buffer leak
Comment 23 atopilski@gmail.com 2017-05-13 10:42:58 UTC
Created attachment 351772 [details] [review]
FIX_INVALID_CUDA_CONTEXT
Comment 24 Victor Toso 2017-05-16 18:18:11 UTC
Review of attachment 351772 [details] [review]:

Hi,

- I have a m2000 and K620 here with me so I plan to help out with testing as much as possible.
- Do you have your code in some git to make it easier to review/test new changes?
- This bug has duplicated patches, when you attach a new patch please mark the older ones as obsolete :)

I tried building this, I have installed cuda-8.0, I hope this is correct!
Warnings:

gstnvh264dec.c: In function ‘pop_cuda_ctx’:
gstnvh264dec.c:160:24: error: passing argument 1 of ‘cuCtxPopCurrent_v2’ from incompatible pointer type [-Werror=incompatible-pointer-types]
       cuCtxPopCurrent (pnvh264dec->cuda_ctx));

gstnvh264dec.c: In function ‘gst_nvh264dec_handle_picture_decode’:
gstnvh264dec.c:314:3: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement]
   gboolean is_ok = IS_CUDA_CALL_SUCCSESS (nvh264dec,
   ^~~~~~~~

gstnvh264dec.c:153:1: error: ‘pop_cuda_ctx’ defined but not used [-Werror=unused-function]
 pop_cuda_ctx (GstNvh264dec * nvh264dec)
 ^~~~~~~~~~~~

pop_cuda_ctx() seems to be commented out for some reason and seems that you can fix the first one by passing &pnvh264dec->cuda_ctx

I hope to provide better feedback later this week.
Comment 25 atopilski@gmail.com 2017-05-17 10:40:58 UTC
Hi sure we have public repository, you can see here: https://github.com/bldsoft/gst-plugins-bad . I updated according your style.
Comment 26 Snir 2017-05-25 09:18:38 UTC
Hi

I was able to build and use the plugin on k620 and it works pretty well!!
i had some small issues but maybe i did something wrong while running\building it.. 

-Sometimes it shows only the first frame and freezes, so i have to start it again (i'm using gst-launch-1.0 filesrc location=/tmp/file.h264 ! nvh264dec ! glimagesink (or ! videoconvert ! xvimagesink))

-I tried to play two movies one had 1365*1024 resolution and the other one 2560*1450, both were encoded same way using Nvidia encoder, but the first one gives:
gstnvh264dec.c:763:gst_nvh264dec_send_decoded_frame:<nvh264dec0> Cuda buffer size: 2359296, not equal gstreamer buffer size: 2101248

-And one thing that i had to change is 
--- a/sys/nvdec/gstnvh264plugin.c
+++ b/sys/nvdec/gstnvh264plugin.c
@@ -35,7 +35,7 @@ plugin_init (GstPlugin * plugin)
 
 GST_PLUGIN_DEFINE (GST_VERSION_MAJOR,
     GST_VERSION_MINOR,
-    nvh264dec,
+    nvdec,
     "Nvidia cuda decoder plugin",
otherwise the .so file is blacklisted

Did you encounter these issues?

Which Nvidia tools are needed in order to build it? currently my driver is 381.22 and i'm using cuda packages from non-official repo, but i would prefer to use only Nvidia proprietary tools (maybe that's why i had this issues? i'm not sure :/)

(btw I'm building on fedora, the cuda pkgs were called differently so i had to change this too:
-    PKG_CHECK_MODULES([CUDA], [cuda-8.0 cudart-8.0],, [
+    PKG_CHECK_MODULES([CUDA], [cuda cudart],, [
)


Thank you!
Comment 27 Sebastian Dröge (slomo) 2017-05-25 09:38:41 UTC
(In reply to Snir from comment #26)

> -Sometimes it shows only the first frame and freezes, so i have to start it
> again (i'm using gst-launch-1.0 filesrc location=/tmp/file.h264 ! nvh264dec
> ! glimagesink (or ! videoconvert ! xvimagesink))

You should use h264parse after the filesrc

> -I tried to play two movies one had 1365*1024 resolution and the other one
> 2560*1450, both were encoded same way using Nvidia encoder, but the first
> one gives:
> gstnvh264dec.c:763:gst_nvh264dec_send_decoded_frame:<nvh264dec0> Cuda buffer
> size: 2359296, not equal gstreamer buffer size: 2101248

This needs some changes: either use the GstVideoMeta to configure the correct plane offsets and strides, or copy internally to a newly allocated GStreamer buffer.

> -And one thing that i had to change is 
> --- a/sys/nvdec/gstnvh264plugin.c
> +++ b/sys/nvdec/gstnvh264plugin.c
> @@ -35,7 +35,7 @@ plugin_init (GstPlugin * plugin)
>  
>  GST_PLUGIN_DEFINE (GST_VERSION_MAJOR,
>      GST_VERSION_MINOR,
> -    nvh264dec,
> +    nvdec,
>      "Nvidia cuda decoder plugin",
> otherwise the .so file is blacklisted

That's newly required since 1.13.0 and correct.

> (btw I'm building on fedora, the cuda pkgs were called differently so i had
> to change this too:
> -    PKG_CHECK_MODULES([CUDA], [cuda-8.0 cudart-8.0],, [
> +    PKG_CHECK_MODULES([CUDA], [cuda cudart],, [
> )

Maybe we should check for all the different versions that are supported here and use whatever is available?
Comment 28 Sebastian Dröge (slomo) 2017-05-25 09:46:40 UTC
Created attachment 352563 [details] [review]
nvdec: New plugin using NVIDIA nvcuvid decoder API
Comment 29 Sebastian Dröge (slomo) 2017-05-25 09:48:10 UTC
Created attachment 352564 [details] [review]
nvdec: New plugin using NVIDIA nvcuvid decoder API
Comment 30 Sebastian Dröge (slomo) 2017-05-25 09:49:50 UTC
Updated the patch from the GIT repository with the latest version, and wrote a proper commit message, and rebased it against master. No other changes
Comment 31 Sebastian Dröge (slomo) 2017-05-25 10:03:07 UTC
Review of attachment 352564 [details] [review]:

Sorry for the late review! Some comments below

::: configure.ac
@@ +2036,3 @@
+    PKG_CHECK_MODULES([CUDA], [cuda-7.5 cudart-7.5],, [
+      PKG_CHECK_MODULES([CUDA], [cuda-7.0 cudart-7.0],, [
+        PKG_CHECK_MODULES([CUDA], [cuda-6.5 cudart-6.5],, [

Maybe make this a for loop over all the versions, also check for unversioned (and then check the module's version number instead). And 8.0

::: sys/nvdec/gstnvh264dec.c
@@ +206,3 @@
+      GST_DEBUG_FUNCPTR (gst_nvh264dec_set_format);
+  video_decoder_class->drain = GST_DEBUG_FUNCPTR (gst_nvh264dec_drain);
+  video_decoder_class->reset = GST_DEBUG_FUNCPTR (gst_nvh264dec_reset);

reset() is deprecated, finish() / flush() / drain() replace it

@@ +224,3 @@
+{
+  GstNvh264decPrivate *pnvh264dec = GST_NVH264DEC_GET_PRIVATE (nvh264dec);
+  IS_CUDA_CALL_SUCCSESS (nvh264dec, cuInit (0));

Can this be done multiple times?

@@ +233,3 @@
+  pnvh264dec->host_data_size = 0;
+  pnvh264dec->width = 0;
+  pnvh264dec->height = 0;

Usually you would just store the output GstVideoCodecState and use the width/height from the GstVideoInfo in there

@@ +234,3 @@
+  pnvh264dec->width = 0;
+  pnvh264dec->height = 0;
+  g_mutex_init (&pnvh264dec->queue_mutex);

Need to clear() mutex (and probably other things) in GObject::finalize()

@@ +317,3 @@
+  pnvh264dec->is_frame_in_use[cuviddisp->picture_index] = TRUE;
+  // Wait until we have a free entry in the display queue (should never block if we have enough
+  // entries)

Instead of a loop without waiting here, maybe use a GCond? Also maybe these queues should be of configurable size? You could use a GstQueueArray

@@ +623,3 @@
+  if (gst_structure_get_int (pad_struct, "width", &width)) {
+  }
+  if (gst_structure_get_int (pad_struct, "height", &height)) {

These should already be in the GstVideoInfo in state

@@ +626,3 @@
+  }
+
+  codec_data = gst_structure_get_string (pad_struct, "codec_data");

It's never a string but a GstBuffer, and will only exist for stream-format=avc

@@ +753,3 @@
+  }
+
+  memcpy (omap_info.data, pnvh264dec->host_data, pnvh264dec->host_data_size);

Why do you memcpy twice here? Once with cuMemcpyDtoH(), once with memcpy()

Ideally (in a later version) we would somehow implement zerocopy here if downstream can support it

@@ +822,3 @@
+      && gst_nvh264dec_dequeue_frame (pnvh264dec, &cuviddisp)) {
+    GstFlowReturn ret =
+        gst_nvh264dec_send_decoded_frame (decoder, &cuviddisp, frame);

This seems to assume that coding and display order are the same (which is not the case), and that the frame that is currently output is the same as the input frame (which is also not always the case). You need to get the correct GstVideoCodecFrame here that corresponds to the input frame that we're going to output now.

There's various API in GstVideoDecoder for tracking these.

::: sys/nvdec/gstnvh264plugin.c
@@ +36,3 @@
+GST_PLUGIN_DEFINE (GST_VERSION_MAJOR,
+    GST_VERSION_MINOR,
+    nvh264dec,

Has to be nvdec as that's the plugin filename

@@ +39,3 @@
+    "Nvidia cuda decoder plugin",
+    plugin_init,
+    VERSION, "BSD", "Setplex GStreamer plugins", "http://www.setplex.com")

VERSION, GST_LICENSE, GST_PACKAGE_NAME, GST_PACKAGE_ORIGIN
Comment 32 atopilski@gmail.com 2017-05-25 15:31:56 UTC
Review of attachment 352564 [details] [review]:

::: configure.ac
@@ +2036,3 @@
+    PKG_CHECK_MODULES([CUDA], [cuda-7.5 cudart-7.5],, [
+      PKG_CHECK_MODULES([CUDA], [cuda-7.0 cudart-7.0],, [
+        PKG_CHECK_MODULES([CUDA], [cuda-6.5 cudart-6.5],, [

agree

::: sys/nvdec/gstnvh264dec.c
@@ +206,3 @@
+      GST_DEBUG_FUNCPTR (gst_nvh264dec_set_format);
+  video_decoder_class->drain = GST_DEBUG_FUNCPTR (gst_nvh264dec_drain);
+  video_decoder_class->reset = GST_DEBUG_FUNCPTR (gst_nvh264dec_reset);

I don't know about this, please do it.

@@ +224,3 @@
+{
+  GstNvh264decPrivate *pnvh264dec = GST_NVH264DEC_GET_PRIVATE (nvh264dec);
+  IS_CUDA_CALL_SUCCSESS (nvh264dec, cuInit (0));

Yes, we tested on many streams on one machine, cuInit can be called many times.

@@ +234,3 @@
+  pnvh264dec->width = 0;
+  pnvh264dec->height = 0;
+  g_mutex_init (&pnvh264dec->queue_mutex);

agree

@@ +317,3 @@
+  pnvh264dec->is_frame_in_use[cuviddisp->picture_index] = TRUE;
+  // Wait until we have a free entry in the display queue (should never block if we have enough
+  // entries)

agree

@@ +623,3 @@
+  if (gst_structure_get_int (pad_struct, "width", &width)) {
+  }
+  if (gst_structure_get_int (pad_struct, "height", &height)) {

Please do it, i don't know about this struct.

@@ +626,3 @@
+  }
+
+  codec_data = gst_structure_get_string (pad_struct, "codec_data");

agree

@@ +753,3 @@
+  }
+
+  memcpy (omap_info.data, pnvh264dec->host_data, pnvh264dec->host_data_size);

Yes, should allocate omap_info struct and copy from cuda memory.

@@ +822,3 @@
+      && gst_nvh264dec_dequeue_frame (pnvh264dec, &cuviddisp)) {
+    GstFlowReturn ret =
+        gst_nvh264dec_send_decoded_frame (decoder, &cuviddisp, frame);

In mostly you are right, but i don't know how to get decoded frame in same thread where received input data, if play with push/pop context after some execution this calls will destroy cuda context(all cuda calls should be done in one thread). If you know how to do decoding(in mostly only coping from cuda memory) and displaying in same thread please do it.

::: sys/nvdec/gstnvh264plugin.c
@@ +36,3 @@
+GST_PLUGIN_DEFINE (GST_VERSION_MAJOR,
+    GST_VERSION_MINOR,
+    nvh264dec,

May be better rename target, because cuda cards can decode other stream types.

@@ +39,3 @@
+    "Nvidia cuda decoder plugin",
+    plugin_init,
+    VERSION, "BSD", "Setplex GStreamer plugins", "http://www.setplex.com")

agree
Comment 33 atopilski@gmail.com 2017-05-25 16:22:23 UTC
Created attachment 352587 [details] [review]
Fixes after review
Comment 34 atopilski@gmail.com 2017-05-25 16:39:31 UTC
Also updated in our public repository, i don't know how to fix this line: +    VERSION, "BSD", "Setplex GStreamer plugins", "http://www.setplex.com")
Comment 35 atopilski@gmail.com 2017-05-25 16:45:50 UTC
(In reply to Snir from comment #26)
> Hi
> 
> I was able to build and use the plugin on k620 and it works pretty well!!
> i had some small issues but maybe i did something wrong while
> running\building it.. 
> 
> -Sometimes it shows only the first frame and freezes, so i have to start it
> again (i'm using gst-launch-1.0 filesrc location=/tmp/file.h264 ! nvh264dec
> ! glimagesink (or ! videoconvert ! xvimagesink))
> 
> -I tried to play two movies one had 1365*1024 resolution and the other one
> 2560*1450, both were encoded same way using Nvidia encoder, but the first
> one gives:
> gstnvh264dec.c:763:gst_nvh264dec_send_decoded_frame:<nvh264dec0> Cuda buffer
> size: 2359296, not equal gstreamer buffer size: 2101248
> 
> -And one thing that i had to change is 
> --- a/sys/nvdec/gstnvh264plugin.c
> +++ b/sys/nvdec/gstnvh264plugin.c
> @@ -35,7 +35,7 @@ plugin_init (GstPlugin * plugin)
>  
>  GST_PLUGIN_DEFINE (GST_VERSION_MAJOR,
>      GST_VERSION_MINOR,
> -    nvh264dec,
> +    nvdec,
>      "Nvidia cuda decoder plugin",
> otherwise the .so file is blacklisted
> 
> Did you encounter these issues?
> 
> Which Nvidia tools are needed in order to build it? currently my driver is
> 381.22 and i'm using cuda packages from non-official repo, but i would
> prefer to use only Nvidia proprietary tools (maybe that's why i had this
> issues? i'm not sure :/)
> 
> (btw I'm building on fedora, the cuda pkgs were called differently so i had
> to change this too:
> -    PKG_CHECK_MODULES([CUDA], [cuda-8.0 cudart-8.0],, [
> +    PKG_CHECK_MODULES([CUDA], [cuda cudart],, [
> )
> 
> 
> Thank you!

Hi, I removed check of different video buffer, this happens in ths lines
pnvh264dec->width = dec_param.ulTargetWidth = PAD_ALIGN(width, 0x3F) * 4 / 3;
pnvh264dec->height = dec_param.ulTargetHeight = PAD_ALIGN(height, 0x0F) * 4 / 3;
 you have non standard video size and round algorithm not working properly.

Also in new version doing only 1 copying and after that spend less 2X memory.
Comment 36 Sebastian Dröge (slomo) 2017-05-29 08:23:06 UTC
(In reply to atopilski@gmail.com from comment #34)
> Also updated in our public repository, i don't know how to fix this line: + 
> VERSION, "BSD", "Setplex GStreamer plugins", "http://www.setplex.com")

Replace it with:
VERSION, GST_LICENSE, GST_PACKAGE_NAME, GST_PACKAGE_ORIGIN
Comment 37 Sebastian Dröge (slomo) 2017-05-29 08:29:39 UTC
(In reply to atopilski@gmail.com from comment #32)
> ::: sys/nvdec/gstnvh264dec.c
> @@ +206,3 @@
> +      GST_DEBUG_FUNCPTR (gst_nvh264dec_set_format);
> +  video_decoder_class->drain = GST_DEBUG_FUNCPTR (gst_nvh264dec_drain);
> +  video_decoder_class->reset = GST_DEBUG_FUNCPTR (gst_nvh264dec_reset);
> 
> I don't know about this, please do it.

I don't have the hardware to test this, so either you'll have to do it or someone else with the hardware.

But basically you just replace reset() with flush() in your case, that's all.
 
> @@ +623,3 @@
> +  if (gst_structure_get_int (pad_struct, "width", &width)) {
> +  }
> +  if (gst_structure_get_int (pad_struct, "height", &height)) {
> 
> Please do it, i don't know about this struct.

Same as above, I can't do that.

Just look at the "info" member of the GstVideoCodec state you get passed in here. You can e.g. do GST_VIDEO_INFO_WIDTH(&state->info) to get the width, it will be 0 if unset.

> @@ +822,3 @@
> +      && gst_nvh264dec_dequeue_frame (pnvh264dec, &cuviddisp)) {
> +    GstFlowReturn ret =
> +        gst_nvh264dec_send_decoded_frame (decoder, &cuviddisp, frame);
> 
> In mostly you are right, but i don't know how to get decoded frame in same
> thread where received input data, if play with push/pop context after some
> execution this calls will destroy cuda context(all cuda calls should be done
> in one thread). If you know how to do decoding(in mostly only coping from
> cuda memory) and displaying in same thread please do it.

I don't know, and I don't have the hardware to test anything. You could check the code in ffmpeg for example.

There's probably a way to attach some opaque information to the frames you pass to the decoder, and you later get that same information back in the decoded frame. Like a frame number, or some pointer where you can store something.

You could then use the system_frame_number field of the GstVideoCodecFrame. Store that in there, and get it back with gst_video_decoder_get_frame().


Without doing this correctly, at least playback of video streams with B-frames will be broken.

> ::: sys/nvdec/gstnvh264plugin.c
> @@ +36,3 @@
> +GST_PLUGIN_DEFINE (GST_VERSION_MAJOR,
> +    GST_VERSION_MINOR,
> +    nvh264dec,
> 
> May be better rename target, because cuda cards can decode other stream
> types.

Yes
Comment 38 Per-Erik Brodin 2017-05-30 06:15:39 UTC
Ericsson has an implementation that we can contribute. It uses GstGL and CUDA-OpenGL interoperability so that decoded frames remain in GPU memory. I can provide a patch here if you are interested.
Comment 39 Sebastian Dröge (slomo) 2017-05-30 07:20:02 UTC
Sure, that would be great. Thanks! Sounds like your version is already much more advanced than this one.
Comment 40 Per-Erik Brodin 2017-05-30 07:24:02 UTC
Created attachment 352853 [details] [review]
nvdec: New plugin for NVIDIA hardware video decode
Comment 41 Sebastian Dröge (slomo) 2017-05-30 07:35:04 UTC
Review of attachment 352853 [details] [review]:

Just some initial comments below

::: configure.ac
@@ +1920,3 @@
+  save_CPPFLAGS="$CPPFLAGS"
+  CPPFLAGS="$NVCUVID_CFLAGS $save_CPPFLAGS"
+  AC_CHECK_HEADER([nvcuvid.h], [HAVE_NVCUVID_H=yes],

Why don't you use pkg-config?

::: sys/nvdec/gstnvdec.c
@@ +120,3 @@
+  CUgraphicsResource *resources;
+  guint num_resources;
+} GstNvDecCudaGraphicsResourcesMeta;

Why is this a GstMeta instead of a custom GstMemory (e.g. based on GstGLBaseMemory)?

@@ +617,3 @@
+  else if (!g_strcmp0 (caps_name, "video/x-h265"))
+    parser_params.CodecType = cudaVideoCodec_HEVC;
+  else {

Add some {} above here

@@ +735,3 @@
+    pending_frame = pending_frames->data;
+    if (!pending_frame->system_frame_number)
+      break;

Why?

@@ +790,3 @@
+        decode_params = (CUVIDPICPARAMS *) item->data;
+        pending_frame = pending_frames->data;
+        pending_frame->system_frame_number = decode_params->CurrPicIdx + 1;

You should never override the system_frame_number, it's read-only from a subclass point of view
Comment 42 Per-Erik Brodin 2017-05-30 18:41:59 UTC
Created attachment 352899 [details] [review]
nvdec: New plugin for NVIDIA hardware video decode
Comment 43 Per-Erik Brodin 2017-05-30 18:44:46 UTC
(In reply to Sebastian Dröge (slomo) from comment #41)

Thank you for the quick review!

> Review of attachment 352853 [details] [review] [review]:
> 
> Just some initial comments below
> 
> ::: configure.ac
> @@ +1920,3 @@
> +  save_CPPFLAGS="$CPPFLAGS"
> +  CPPFLAGS="$NVCUVID_CFLAGS $save_CPPFLAGS"
> +  AC_CHECK_HEADER([nvcuvid.h], [HAVE_NVCUVID_H=yes],
> 
> Why don't you use pkg-config?

At least on Debian/Ubuntu there is no .pc file provided by nvidia-cuda-dev. nvcuvid.h is located directly under /usr/include so typically you don’t have to set the NVCUVID_CFLAGS.

> 
> ::: sys/nvdec/gstnvdec.c
> @@ +120,3 @@
> +  CUgraphicsResource *resources;
> +  guint num_resources;
> +} GstNvDecCudaGraphicsResourcesMeta;
> 
> Why is this a GstMeta instead of a custom GstMemory (e.g. based on
> GstGLBaseMemory)?

I considered that but then we would also need a custom allocator and maybe a custom buffer pool and then the downstream element won’t be able to do the allocation. I think using a GstMeta is the best temporary solution until we have a GstCUDA library.

> 
> @@ +617,3 @@
> +  else if (!g_strcmp0 (caps_name, "video/x-h265"))
> +    parser_params.CodecType = cudaVideoCodec_HEVC;
> +  else {
> 
> Add some {} above here

Done

> 
> @@ +735,3 @@
> +    pending_frame = pending_frames->data;
> +    if (!pending_frame->system_frame_number)
> +      break;
> 
> Why?

Added a comment to clarify that we are looking for the oldest unfinished frame that has not yet been passed to the decoder.

> 
> @@ +790,3 @@
> +        decode_params = (CUVIDPICPARAMS *) item->data;
> +        pending_frame = pending_frames->data;
> +        pending_frame->system_frame_number = decode_params->CurrPicIdx + 1;
> 
> You should never override the system_frame_number, it's read-only from a
> subclass point of view

Ok, fixed. I thought the purpose of the separate decode_frame_number was so that system_frame_number could be changed. At least currently there is nothing in the base class that breaks when changing it.
Comment 44 Per-Erik Brodin 2017-06-05 06:25:37 UTC
Created attachment 353164 [details] [review]
nvdec: New plugin for NVIDIA hardware video decode
Comment 45 Per-Erik Brodin 2017-06-06 06:28:12 UTC
Created attachment 353240 [details] [review]
nvdec: New plugin for NVIDIA hardware video decode

Fixed a problem that was introduced when starting to use the list of pending frames to look up the frame to finish.
Comment 46 Sebastian Dröge (slomo) 2017-06-24 07:58:09 UTC
(In reply to Per-Erik Brodin from comment #43)
> (In reply to Sebastian Dröge (slomo) from comment #41)
> 
> Thank you for the quick review!
> 
> > Review of attachment 352853 [details] [review] [review] [review]:
> > 
> > Just some initial comments below
> > 
> > ::: configure.ac
> > @@ +1920,3 @@
> > +  save_CPPFLAGS="$CPPFLAGS"
> > +  CPPFLAGS="$NVCUVID_CFLAGS $save_CPPFLAGS"
> > +  AC_CHECK_HEADER([nvcuvid.h], [HAVE_NVCUVID_H=yes],
> > 
> > Why don't you use pkg-config?
> 
> At least on Debian/Ubuntu there is no .pc file provided by nvidia-cuda-dev.
> nvcuvid.h is located directly under /usr/include so typically you don’t have
> to set the NVCUVID_CFLAGS.

There are checks for cuda/etc for nvenc already, which use the .pc files if available or $CUDA_PREFIX otherwise. Can you reuse those instead of adding your own, and just keep the nvenc/nvdec specific parts for those plugins?

> > ::: sys/nvdec/gstnvdec.c
> > @@ +120,3 @@
> > +  CUgraphicsResource *resources;
> > +  guint num_resources;
> > +} GstNvDecCudaGraphicsResourcesMeta;
> > 
> > Why is this a GstMeta instead of a custom GstMemory (e.g. based on
> > GstGLBaseMemory)?
> 
> I considered that but then we would also need a custom allocator and maybe a
> custom buffer pool and then the downstream element won’t be able to do the
> allocation. I think using a GstMeta is the best temporary solution until we
> have a GstCUDA library.

What do you mean here? Couldn't downstream allocate any kind of GL buffer (e.g. glimagesink) and nvdec could make use of that? The meta limits how useful this is in the end, as the meta could disappear and also requires elements to know about the meta (while GL memory is generally supported by various elements already, including nvenc).


Having an allocator, buffer pool, etc seems not a problem though.
Comment 47 Per-Erik Brodin 2017-06-27 00:03:03 UTC
Created attachment 354537 [details] [review]
nvdec: New plugin for NVIDIA hardware video decode
Comment 48 Per-Erik Brodin 2017-06-27 00:05:39 UTC
(In reply to Sebastian Dröge (slomo) from comment #46)
> (In reply to Per-Erik Brodin from comment #43)
> > (In reply to Sebastian Dröge (slomo) from comment #41)
> > 
> > Thank you for the quick review!
> > 
> > > Review of attachment 352853 [details] [review] [review] [review] [review]:
> > > 
> > > Just some initial comments below
> > > 
> > > ::: configure.ac
> > > @@ +1920,3 @@
> > > +  save_CPPFLAGS="$CPPFLAGS"
> > > +  CPPFLAGS="$NVCUVID_CFLAGS $save_CPPFLAGS"
> > > +  AC_CHECK_HEADER([nvcuvid.h], [HAVE_NVCUVID_H=yes],
> > > 
> > > Why don't you use pkg-config?
> > 
> > At least on Debian/Ubuntu there is no .pc file provided by nvidia-cuda-dev.
> > nvcuvid.h is located directly under /usr/include so typically you don’t have
> > to set the NVCUVID_CFLAGS.
> 
> There are checks for cuda/etc for nvenc already, which use the .pc files if
> available or $CUDA_PREFIX otherwise. Can you reuse those instead of adding
> your own, and just keep the nvenc/nvdec specific parts for those plugins?

Done

> 
> > > ::: sys/nvdec/gstnvdec.c
> > > @@ +120,3 @@
> > > +  CUgraphicsResource *resources;
> > > +  guint num_resources;
> > > +} GstNvDecCudaGraphicsResourcesMeta;
> > > 
> > > Why is this a GstMeta instead of a custom GstMemory (e.g. based on
> > > GstGLBaseMemory)?
> > 
> > I considered that but then we would also need a custom allocator and maybe a
> > custom buffer pool and then the downstream element won’t be able to do the
> > allocation. I think using a GstMeta is the best temporary solution until we
> > have a GstCUDA library.
> 
> What do you mean here? Couldn't downstream allocate any kind of GL buffer
> (e.g. glimagesink) and nvdec could make use of that? The meta limits how
> useful this is in the end, as the meta could disappear and also requires
> elements to know about the meta (while GL memory is generally supported by
> various elements already, including nvenc).
> 

The meta is only there so that we can call cuGraphicsGLRegisterImage once per image, since it’s a costly operation, and associate the returned handle with the buffer so that we can retrieve it when needed, and also so that cuGraphicsUnregisterResource can be called when the buffer is freed. There’s no need for other elements to know about it. I think a downstream element would have to know about a custom GLMemory in order to allocate it, or are you suggesting that nvdec adds additional memory to an already allocated buffer similarly to how the meta is added now? Another option if you want to get rid of the meta is to keep a table in nvdec with weak references to the GLMemory that has been registered with CUDA, but that can be a bit problematic since the buffers could potentially outlive the element.
Comment 49 Sebastian Dröge (slomo) 2017-06-27 06:02:00 UTC
I'll merge this for now so we can work on this incrementally, but a meta seems problematic as what could happen is that the contained memory of your buffer is transferred to a new buffer (which then does not have the meta), and the old buffer might get destroyed. qdata on the memory might work better.
Comment 50 Sebastian Dröge (slomo) 2017-06-27 06:02:23 UTC
Attachment 354537 [details] pushed as ab9d87f - nvdec: New plugin for NVIDIA hardware video decode
Comment 51 leigh123linux@googlemail.com 2017-12-17 14:27:02 UTC
cuda-9 no longer uses/provides nvcuvid.h , it uses dynlink_nvcuvid.h instead

https://cgit.freedesktop.org/gstreamer/gst-plugins-bad/tree/sys/nvdec/gstnvdec.h#n32


-#include <nvcuvid.h>
-
+#if CUDA_VERSION >= 9000
+    #include <dynlink_nvcuvid.h>
+#else
+    #include <nvcuvid.h>
+#endif
Comment 52 Tim-Philipp Müller 2017-12-17 14:48:08 UTC
Thanks, do you want to make a patch?
Comment 53 Snir 2017-12-18 09:09:19 UTC
(In reply to leigh123linux@googlemail.com from comment #51)
> cuda-9 no longer uses/provides nvcuvid.h , it uses dynlink_nvcuvid.h instead
> 
> https://cgit.freedesktop.org/gstreamer/gst-plugins-bad/tree/sys/nvdec/
> gstnvdec.h#n32
> 
> 
> -#include <nvcuvid.h>
> -
> +#if CUDA_VERSION >= 9000
> +    #include <dynlink_nvcuvid.h>
> +#else
> +    #include <nvcuvid.h>
> +#endif

Hi,

AFAIK this header file is built for run-time dynamic linking,
so it will require additional code changes in order to work.
Comment 54 Tim-Philipp Müller 2017-12-18 10:12:37 UTC
-> bug #791724