GNOME Bugzilla – Bug 781537
NVDEC - Nvidia Decoder plugin
Last modified: 2018-01-19 13:56:38 UTC
I see there is no plugin for Nvidia Decoder, It would be great to have this plugin for Gstreamer. as of right now if you try to transcode something with NVidia's GPU you must use CPU to decode, can't take advantage of GPU's decoder. I see FFMPEG has it called "h264_cuvid" maybe a wrapper can be done for it to save time and not develop it from scratch. Or if it has to be done based on NVdecoder API here is a guide. http://developer2.download.nvidia.com/designworks/video-codec-sdk/secure/7.1/01/NVDEC_VideoDecoder_API_ProgGuide.pdf?aSJOwwYMXlACrYcwv0LSRlqNRXumKJdn_iw_30ZHJefdBydvRZu-8fxL_Yhva5WNhkUQ-qQuf3XvmEJiVAmUONdZX71YLCo5oT-3o0nlRGb0lRDZXboo9GfURffv8YPBexyP7kgPBhiezRBHGBDNvwHdWK41xyCwquc9sHTt-EzF1ra_4r2Qd4iY2aTm
How is that different from using VDPAU to decode?
I believe vdpau is deprecated/dead?
It's the (only?) supported, and recommended way of doing video decoding by NVIDIA.
Yes correct VDPAU is dead and it only supports old chips of Nvidia Maxwell and Kepler, no support for Pascal and other newer chips. Sebastian that's correct NVDEC is the only way to use Nvidia Decoding.
VDPAU is still supported by radeon driver in mesa and by nouveau. But for Nvidia propietary driver, NVDEC is the way to go.
Hi, guys we started developing a plugin for NVDEC we got it to work we decode successfully and wanted to share what we have so far. It's our first plugin we are wiring so please don't judge if you see something funny in the code lol I'm not sure what it takes if or if this is ready to be posted as a plugin in Gstreamer side for other developers to review and contribute to make it better but if this is good enough to be part of Gstreamer we can take down our repo and continue to contribute in Gstreamer's official repo. https://github.com/bldsoft/gst-plugin-bad-nvdec/
That's great, thanks! Can you provide a patch (in "git format-patch" format) against gst-plugins-bad here, then we can review it and get it merged.
to do that we would probably need help we have done this plugin based on http://siilo.dyndns.org/wiki/index.php/Creating_a_Plugin_template and depends on: gstreamer-1.0 >= $GST_REQUIRED gstreamer-base-1.0 >= $GST_REQUIRED gstreamer-controller-1.0 >= $GST_REQUIRED gstreamer-video-1.0 >= $GST_REQUIRED so it was done more like an extension.
Basically you would move your files into gst-plugins-bad/sys/nvdec, add things to sys/Makefile.am like the existing stuff there already, and then checks to configure.ac (like you already have now). Just try it, and I'll help you with anything where you get stuck / review that patch then.
Hi Sebastian, i don't like old patch style and i created pull request https://github.com/GStreamer/gst-plugins-bad/pull/3 please review it.
We don't use github for patch review (or anything), it's just a mirror of our git repositories. Please check here: https://gstreamer.freedesktop.org/documentation/contribute/#how-to-submit-patches tl;dr is: "git format-patch" style patch attached to this Bugzilla ticket :) Sorry for the inconvenience.
Created attachment 351544 [details] [review] NVDEC - Nvidia Decoder plugin
Hi Sebastian, i created patch please review.
Great, thanks! I'll go through that some time this week or early next week, it's on my list :)
Review of attachment 351544 [details] [review]: Build system parts look good, didn't look at the code but: ::: sys/nvdec/gstnvh264plugin.c @@ +3,3 @@ + Rixjob is free software: you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation, either version 3 of the License, or Can you make this available under the LGPL 2.1+? We generally don't merge GPL code, even less GPL3.
Sure, we will update license notes.
I should add new patch file, or need update previous?
Just a new patch. Bugzilla can create diffs between them as needed.
Created attachment 351562 [details] [review] Change copyrights Updated copyrights
Please attach a full patch with all changes at once, not an incremental patch :) Looks good though, thanks!
Created attachment 351573 [details] [review] 2-commits
Created attachment 351650 [details] [review] Fix picture buffer leak
Created attachment 351772 [details] [review] FIX_INVALID_CUDA_CONTEXT
Review of attachment 351772 [details] [review]: Hi, - I have a m2000 and K620 here with me so I plan to help out with testing as much as possible. - Do you have your code in some git to make it easier to review/test new changes? - This bug has duplicated patches, when you attach a new patch please mark the older ones as obsolete :) I tried building this, I have installed cuda-8.0, I hope this is correct! Warnings: gstnvh264dec.c: In function ‘pop_cuda_ctx’: gstnvh264dec.c:160:24: error: passing argument 1 of ‘cuCtxPopCurrent_v2’ from incompatible pointer type [-Werror=incompatible-pointer-types] cuCtxPopCurrent (pnvh264dec->cuda_ctx)); gstnvh264dec.c: In function ‘gst_nvh264dec_handle_picture_decode’: gstnvh264dec.c:314:3: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement] gboolean is_ok = IS_CUDA_CALL_SUCCSESS (nvh264dec, ^~~~~~~~ gstnvh264dec.c:153:1: error: ‘pop_cuda_ctx’ defined but not used [-Werror=unused-function] pop_cuda_ctx (GstNvh264dec * nvh264dec) ^~~~~~~~~~~~ pop_cuda_ctx() seems to be commented out for some reason and seems that you can fix the first one by passing &pnvh264dec->cuda_ctx I hope to provide better feedback later this week.
Hi sure we have public repository, you can see here: https://github.com/bldsoft/gst-plugins-bad . I updated according your style.
Hi I was able to build and use the plugin on k620 and it works pretty well!! i had some small issues but maybe i did something wrong while running\building it.. -Sometimes it shows only the first frame and freezes, so i have to start it again (i'm using gst-launch-1.0 filesrc location=/tmp/file.h264 ! nvh264dec ! glimagesink (or ! videoconvert ! xvimagesink)) -I tried to play two movies one had 1365*1024 resolution and the other one 2560*1450, both were encoded same way using Nvidia encoder, but the first one gives: gstnvh264dec.c:763:gst_nvh264dec_send_decoded_frame:<nvh264dec0> Cuda buffer size: 2359296, not equal gstreamer buffer size: 2101248 -And one thing that i had to change is --- a/sys/nvdec/gstnvh264plugin.c +++ b/sys/nvdec/gstnvh264plugin.c @@ -35,7 +35,7 @@ plugin_init (GstPlugin * plugin) GST_PLUGIN_DEFINE (GST_VERSION_MAJOR, GST_VERSION_MINOR, - nvh264dec, + nvdec, "Nvidia cuda decoder plugin", otherwise the .so file is blacklisted Did you encounter these issues? Which Nvidia tools are needed in order to build it? currently my driver is 381.22 and i'm using cuda packages from non-official repo, but i would prefer to use only Nvidia proprietary tools (maybe that's why i had this issues? i'm not sure :/) (btw I'm building on fedora, the cuda pkgs were called differently so i had to change this too: - PKG_CHECK_MODULES([CUDA], [cuda-8.0 cudart-8.0],, [ + PKG_CHECK_MODULES([CUDA], [cuda cudart],, [ ) Thank you!
(In reply to Snir from comment #26) > -Sometimes it shows only the first frame and freezes, so i have to start it > again (i'm using gst-launch-1.0 filesrc location=/tmp/file.h264 ! nvh264dec > ! glimagesink (or ! videoconvert ! xvimagesink)) You should use h264parse after the filesrc > -I tried to play two movies one had 1365*1024 resolution and the other one > 2560*1450, both were encoded same way using Nvidia encoder, but the first > one gives: > gstnvh264dec.c:763:gst_nvh264dec_send_decoded_frame:<nvh264dec0> Cuda buffer > size: 2359296, not equal gstreamer buffer size: 2101248 This needs some changes: either use the GstVideoMeta to configure the correct plane offsets and strides, or copy internally to a newly allocated GStreamer buffer. > -And one thing that i had to change is > --- a/sys/nvdec/gstnvh264plugin.c > +++ b/sys/nvdec/gstnvh264plugin.c > @@ -35,7 +35,7 @@ plugin_init (GstPlugin * plugin) > > GST_PLUGIN_DEFINE (GST_VERSION_MAJOR, > GST_VERSION_MINOR, > - nvh264dec, > + nvdec, > "Nvidia cuda decoder plugin", > otherwise the .so file is blacklisted That's newly required since 1.13.0 and correct. > (btw I'm building on fedora, the cuda pkgs were called differently so i had > to change this too: > - PKG_CHECK_MODULES([CUDA], [cuda-8.0 cudart-8.0],, [ > + PKG_CHECK_MODULES([CUDA], [cuda cudart],, [ > ) Maybe we should check for all the different versions that are supported here and use whatever is available?
Created attachment 352563 [details] [review] nvdec: New plugin using NVIDIA nvcuvid decoder API
Created attachment 352564 [details] [review] nvdec: New plugin using NVIDIA nvcuvid decoder API
Updated the patch from the GIT repository with the latest version, and wrote a proper commit message, and rebased it against master. No other changes
Review of attachment 352564 [details] [review]: Sorry for the late review! Some comments below ::: configure.ac @@ +2036,3 @@ + PKG_CHECK_MODULES([CUDA], [cuda-7.5 cudart-7.5],, [ + PKG_CHECK_MODULES([CUDA], [cuda-7.0 cudart-7.0],, [ + PKG_CHECK_MODULES([CUDA], [cuda-6.5 cudart-6.5],, [ Maybe make this a for loop over all the versions, also check for unversioned (and then check the module's version number instead). And 8.0 ::: sys/nvdec/gstnvh264dec.c @@ +206,3 @@ + GST_DEBUG_FUNCPTR (gst_nvh264dec_set_format); + video_decoder_class->drain = GST_DEBUG_FUNCPTR (gst_nvh264dec_drain); + video_decoder_class->reset = GST_DEBUG_FUNCPTR (gst_nvh264dec_reset); reset() is deprecated, finish() / flush() / drain() replace it @@ +224,3 @@ +{ + GstNvh264decPrivate *pnvh264dec = GST_NVH264DEC_GET_PRIVATE (nvh264dec); + IS_CUDA_CALL_SUCCSESS (nvh264dec, cuInit (0)); Can this be done multiple times? @@ +233,3 @@ + pnvh264dec->host_data_size = 0; + pnvh264dec->width = 0; + pnvh264dec->height = 0; Usually you would just store the output GstVideoCodecState and use the width/height from the GstVideoInfo in there @@ +234,3 @@ + pnvh264dec->width = 0; + pnvh264dec->height = 0; + g_mutex_init (&pnvh264dec->queue_mutex); Need to clear() mutex (and probably other things) in GObject::finalize() @@ +317,3 @@ + pnvh264dec->is_frame_in_use[cuviddisp->picture_index] = TRUE; + // Wait until we have a free entry in the display queue (should never block if we have enough + // entries) Instead of a loop without waiting here, maybe use a GCond? Also maybe these queues should be of configurable size? You could use a GstQueueArray @@ +623,3 @@ + if (gst_structure_get_int (pad_struct, "width", &width)) { + } + if (gst_structure_get_int (pad_struct, "height", &height)) { These should already be in the GstVideoInfo in state @@ +626,3 @@ + } + + codec_data = gst_structure_get_string (pad_struct, "codec_data"); It's never a string but a GstBuffer, and will only exist for stream-format=avc @@ +753,3 @@ + } + + memcpy (omap_info.data, pnvh264dec->host_data, pnvh264dec->host_data_size); Why do you memcpy twice here? Once with cuMemcpyDtoH(), once with memcpy() Ideally (in a later version) we would somehow implement zerocopy here if downstream can support it @@ +822,3 @@ + && gst_nvh264dec_dequeue_frame (pnvh264dec, &cuviddisp)) { + GstFlowReturn ret = + gst_nvh264dec_send_decoded_frame (decoder, &cuviddisp, frame); This seems to assume that coding and display order are the same (which is not the case), and that the frame that is currently output is the same as the input frame (which is also not always the case). You need to get the correct GstVideoCodecFrame here that corresponds to the input frame that we're going to output now. There's various API in GstVideoDecoder for tracking these. ::: sys/nvdec/gstnvh264plugin.c @@ +36,3 @@ +GST_PLUGIN_DEFINE (GST_VERSION_MAJOR, + GST_VERSION_MINOR, + nvh264dec, Has to be nvdec as that's the plugin filename @@ +39,3 @@ + "Nvidia cuda decoder plugin", + plugin_init, + VERSION, "BSD", "Setplex GStreamer plugins", "http://www.setplex.com") VERSION, GST_LICENSE, GST_PACKAGE_NAME, GST_PACKAGE_ORIGIN
Review of attachment 352564 [details] [review]: ::: configure.ac @@ +2036,3 @@ + PKG_CHECK_MODULES([CUDA], [cuda-7.5 cudart-7.5],, [ + PKG_CHECK_MODULES([CUDA], [cuda-7.0 cudart-7.0],, [ + PKG_CHECK_MODULES([CUDA], [cuda-6.5 cudart-6.5],, [ agree ::: sys/nvdec/gstnvh264dec.c @@ +206,3 @@ + GST_DEBUG_FUNCPTR (gst_nvh264dec_set_format); + video_decoder_class->drain = GST_DEBUG_FUNCPTR (gst_nvh264dec_drain); + video_decoder_class->reset = GST_DEBUG_FUNCPTR (gst_nvh264dec_reset); I don't know about this, please do it. @@ +224,3 @@ +{ + GstNvh264decPrivate *pnvh264dec = GST_NVH264DEC_GET_PRIVATE (nvh264dec); + IS_CUDA_CALL_SUCCSESS (nvh264dec, cuInit (0)); Yes, we tested on many streams on one machine, cuInit can be called many times. @@ +234,3 @@ + pnvh264dec->width = 0; + pnvh264dec->height = 0; + g_mutex_init (&pnvh264dec->queue_mutex); agree @@ +317,3 @@ + pnvh264dec->is_frame_in_use[cuviddisp->picture_index] = TRUE; + // Wait until we have a free entry in the display queue (should never block if we have enough + // entries) agree @@ +623,3 @@ + if (gst_structure_get_int (pad_struct, "width", &width)) { + } + if (gst_structure_get_int (pad_struct, "height", &height)) { Please do it, i don't know about this struct. @@ +626,3 @@ + } + + codec_data = gst_structure_get_string (pad_struct, "codec_data"); agree @@ +753,3 @@ + } + + memcpy (omap_info.data, pnvh264dec->host_data, pnvh264dec->host_data_size); Yes, should allocate omap_info struct and copy from cuda memory. @@ +822,3 @@ + && gst_nvh264dec_dequeue_frame (pnvh264dec, &cuviddisp)) { + GstFlowReturn ret = + gst_nvh264dec_send_decoded_frame (decoder, &cuviddisp, frame); In mostly you are right, but i don't know how to get decoded frame in same thread where received input data, if play with push/pop context after some execution this calls will destroy cuda context(all cuda calls should be done in one thread). If you know how to do decoding(in mostly only coping from cuda memory) and displaying in same thread please do it. ::: sys/nvdec/gstnvh264plugin.c @@ +36,3 @@ +GST_PLUGIN_DEFINE (GST_VERSION_MAJOR, + GST_VERSION_MINOR, + nvh264dec, May be better rename target, because cuda cards can decode other stream types. @@ +39,3 @@ + "Nvidia cuda decoder plugin", + plugin_init, + VERSION, "BSD", "Setplex GStreamer plugins", "http://www.setplex.com") agree
Created attachment 352587 [details] [review] Fixes after review
Also updated in our public repository, i don't know how to fix this line: + VERSION, "BSD", "Setplex GStreamer plugins", "http://www.setplex.com")
(In reply to Snir from comment #26) > Hi > > I was able to build and use the plugin on k620 and it works pretty well!! > i had some small issues but maybe i did something wrong while > running\building it.. > > -Sometimes it shows only the first frame and freezes, so i have to start it > again (i'm using gst-launch-1.0 filesrc location=/tmp/file.h264 ! nvh264dec > ! glimagesink (or ! videoconvert ! xvimagesink)) > > -I tried to play two movies one had 1365*1024 resolution and the other one > 2560*1450, both were encoded same way using Nvidia encoder, but the first > one gives: > gstnvh264dec.c:763:gst_nvh264dec_send_decoded_frame:<nvh264dec0> Cuda buffer > size: 2359296, not equal gstreamer buffer size: 2101248 > > -And one thing that i had to change is > --- a/sys/nvdec/gstnvh264plugin.c > +++ b/sys/nvdec/gstnvh264plugin.c > @@ -35,7 +35,7 @@ plugin_init (GstPlugin * plugin) > > GST_PLUGIN_DEFINE (GST_VERSION_MAJOR, > GST_VERSION_MINOR, > - nvh264dec, > + nvdec, > "Nvidia cuda decoder plugin", > otherwise the .so file is blacklisted > > Did you encounter these issues? > > Which Nvidia tools are needed in order to build it? currently my driver is > 381.22 and i'm using cuda packages from non-official repo, but i would > prefer to use only Nvidia proprietary tools (maybe that's why i had this > issues? i'm not sure :/) > > (btw I'm building on fedora, the cuda pkgs were called differently so i had > to change this too: > - PKG_CHECK_MODULES([CUDA], [cuda-8.0 cudart-8.0],, [ > + PKG_CHECK_MODULES([CUDA], [cuda cudart],, [ > ) > > > Thank you! Hi, I removed check of different video buffer, this happens in ths lines pnvh264dec->width = dec_param.ulTargetWidth = PAD_ALIGN(width, 0x3F) * 4 / 3; pnvh264dec->height = dec_param.ulTargetHeight = PAD_ALIGN(height, 0x0F) * 4 / 3; you have non standard video size and round algorithm not working properly. Also in new version doing only 1 copying and after that spend less 2X memory.
(In reply to atopilski@gmail.com from comment #34) > Also updated in our public repository, i don't know how to fix this line: + > VERSION, "BSD", "Setplex GStreamer plugins", "http://www.setplex.com") Replace it with: VERSION, GST_LICENSE, GST_PACKAGE_NAME, GST_PACKAGE_ORIGIN
(In reply to atopilski@gmail.com from comment #32) > ::: sys/nvdec/gstnvh264dec.c > @@ +206,3 @@ > + GST_DEBUG_FUNCPTR (gst_nvh264dec_set_format); > + video_decoder_class->drain = GST_DEBUG_FUNCPTR (gst_nvh264dec_drain); > + video_decoder_class->reset = GST_DEBUG_FUNCPTR (gst_nvh264dec_reset); > > I don't know about this, please do it. I don't have the hardware to test this, so either you'll have to do it or someone else with the hardware. But basically you just replace reset() with flush() in your case, that's all. > @@ +623,3 @@ > + if (gst_structure_get_int (pad_struct, "width", &width)) { > + } > + if (gst_structure_get_int (pad_struct, "height", &height)) { > > Please do it, i don't know about this struct. Same as above, I can't do that. Just look at the "info" member of the GstVideoCodec state you get passed in here. You can e.g. do GST_VIDEO_INFO_WIDTH(&state->info) to get the width, it will be 0 if unset. > @@ +822,3 @@ > + && gst_nvh264dec_dequeue_frame (pnvh264dec, &cuviddisp)) { > + GstFlowReturn ret = > + gst_nvh264dec_send_decoded_frame (decoder, &cuviddisp, frame); > > In mostly you are right, but i don't know how to get decoded frame in same > thread where received input data, if play with push/pop context after some > execution this calls will destroy cuda context(all cuda calls should be done > in one thread). If you know how to do decoding(in mostly only coping from > cuda memory) and displaying in same thread please do it. I don't know, and I don't have the hardware to test anything. You could check the code in ffmpeg for example. There's probably a way to attach some opaque information to the frames you pass to the decoder, and you later get that same information back in the decoded frame. Like a frame number, or some pointer where you can store something. You could then use the system_frame_number field of the GstVideoCodecFrame. Store that in there, and get it back with gst_video_decoder_get_frame(). Without doing this correctly, at least playback of video streams with B-frames will be broken. > ::: sys/nvdec/gstnvh264plugin.c > @@ +36,3 @@ > +GST_PLUGIN_DEFINE (GST_VERSION_MAJOR, > + GST_VERSION_MINOR, > + nvh264dec, > > May be better rename target, because cuda cards can decode other stream > types. Yes
Ericsson has an implementation that we can contribute. It uses GstGL and CUDA-OpenGL interoperability so that decoded frames remain in GPU memory. I can provide a patch here if you are interested.
Sure, that would be great. Thanks! Sounds like your version is already much more advanced than this one.
Created attachment 352853 [details] [review] nvdec: New plugin for NVIDIA hardware video decode
Review of attachment 352853 [details] [review]: Just some initial comments below ::: configure.ac @@ +1920,3 @@ + save_CPPFLAGS="$CPPFLAGS" + CPPFLAGS="$NVCUVID_CFLAGS $save_CPPFLAGS" + AC_CHECK_HEADER([nvcuvid.h], [HAVE_NVCUVID_H=yes], Why don't you use pkg-config? ::: sys/nvdec/gstnvdec.c @@ +120,3 @@ + CUgraphicsResource *resources; + guint num_resources; +} GstNvDecCudaGraphicsResourcesMeta; Why is this a GstMeta instead of a custom GstMemory (e.g. based on GstGLBaseMemory)? @@ +617,3 @@ + else if (!g_strcmp0 (caps_name, "video/x-h265")) + parser_params.CodecType = cudaVideoCodec_HEVC; + else { Add some {} above here @@ +735,3 @@ + pending_frame = pending_frames->data; + if (!pending_frame->system_frame_number) + break; Why? @@ +790,3 @@ + decode_params = (CUVIDPICPARAMS *) item->data; + pending_frame = pending_frames->data; + pending_frame->system_frame_number = decode_params->CurrPicIdx + 1; You should never override the system_frame_number, it's read-only from a subclass point of view
Created attachment 352899 [details] [review] nvdec: New plugin for NVIDIA hardware video decode
(In reply to Sebastian Dröge (slomo) from comment #41) Thank you for the quick review! > Review of attachment 352853 [details] [review] [review]: > > Just some initial comments below > > ::: configure.ac > @@ +1920,3 @@ > + save_CPPFLAGS="$CPPFLAGS" > + CPPFLAGS="$NVCUVID_CFLAGS $save_CPPFLAGS" > + AC_CHECK_HEADER([nvcuvid.h], [HAVE_NVCUVID_H=yes], > > Why don't you use pkg-config? At least on Debian/Ubuntu there is no .pc file provided by nvidia-cuda-dev. nvcuvid.h is located directly under /usr/include so typically you don’t have to set the NVCUVID_CFLAGS. > > ::: sys/nvdec/gstnvdec.c > @@ +120,3 @@ > + CUgraphicsResource *resources; > + guint num_resources; > +} GstNvDecCudaGraphicsResourcesMeta; > > Why is this a GstMeta instead of a custom GstMemory (e.g. based on > GstGLBaseMemory)? I considered that but then we would also need a custom allocator and maybe a custom buffer pool and then the downstream element won’t be able to do the allocation. I think using a GstMeta is the best temporary solution until we have a GstCUDA library. > > @@ +617,3 @@ > + else if (!g_strcmp0 (caps_name, "video/x-h265")) > + parser_params.CodecType = cudaVideoCodec_HEVC; > + else { > > Add some {} above here Done > > @@ +735,3 @@ > + pending_frame = pending_frames->data; > + if (!pending_frame->system_frame_number) > + break; > > Why? Added a comment to clarify that we are looking for the oldest unfinished frame that has not yet been passed to the decoder. > > @@ +790,3 @@ > + decode_params = (CUVIDPICPARAMS *) item->data; > + pending_frame = pending_frames->data; > + pending_frame->system_frame_number = decode_params->CurrPicIdx + 1; > > You should never override the system_frame_number, it's read-only from a > subclass point of view Ok, fixed. I thought the purpose of the separate decode_frame_number was so that system_frame_number could be changed. At least currently there is nothing in the base class that breaks when changing it.
Created attachment 353164 [details] [review] nvdec: New plugin for NVIDIA hardware video decode
Created attachment 353240 [details] [review] nvdec: New plugin for NVIDIA hardware video decode Fixed a problem that was introduced when starting to use the list of pending frames to look up the frame to finish.
(In reply to Per-Erik Brodin from comment #43) > (In reply to Sebastian Dröge (slomo) from comment #41) > > Thank you for the quick review! > > > Review of attachment 352853 [details] [review] [review] [review]: > > > > Just some initial comments below > > > > ::: configure.ac > > @@ +1920,3 @@ > > + save_CPPFLAGS="$CPPFLAGS" > > + CPPFLAGS="$NVCUVID_CFLAGS $save_CPPFLAGS" > > + AC_CHECK_HEADER([nvcuvid.h], [HAVE_NVCUVID_H=yes], > > > > Why don't you use pkg-config? > > At least on Debian/Ubuntu there is no .pc file provided by nvidia-cuda-dev. > nvcuvid.h is located directly under /usr/include so typically you don’t have > to set the NVCUVID_CFLAGS. There are checks for cuda/etc for nvenc already, which use the .pc files if available or $CUDA_PREFIX otherwise. Can you reuse those instead of adding your own, and just keep the nvenc/nvdec specific parts for those plugins? > > ::: sys/nvdec/gstnvdec.c > > @@ +120,3 @@ > > + CUgraphicsResource *resources; > > + guint num_resources; > > +} GstNvDecCudaGraphicsResourcesMeta; > > > > Why is this a GstMeta instead of a custom GstMemory (e.g. based on > > GstGLBaseMemory)? > > I considered that but then we would also need a custom allocator and maybe a > custom buffer pool and then the downstream element won’t be able to do the > allocation. I think using a GstMeta is the best temporary solution until we > have a GstCUDA library. What do you mean here? Couldn't downstream allocate any kind of GL buffer (e.g. glimagesink) and nvdec could make use of that? The meta limits how useful this is in the end, as the meta could disappear and also requires elements to know about the meta (while GL memory is generally supported by various elements already, including nvenc). Having an allocator, buffer pool, etc seems not a problem though.
Created attachment 354537 [details] [review] nvdec: New plugin for NVIDIA hardware video decode
(In reply to Sebastian Dröge (slomo) from comment #46) > (In reply to Per-Erik Brodin from comment #43) > > (In reply to Sebastian Dröge (slomo) from comment #41) > > > > Thank you for the quick review! > > > > > Review of attachment 352853 [details] [review] [review] [review] [review]: > > > > > > Just some initial comments below > > > > > > ::: configure.ac > > > @@ +1920,3 @@ > > > + save_CPPFLAGS="$CPPFLAGS" > > > + CPPFLAGS="$NVCUVID_CFLAGS $save_CPPFLAGS" > > > + AC_CHECK_HEADER([nvcuvid.h], [HAVE_NVCUVID_H=yes], > > > > > > Why don't you use pkg-config? > > > > At least on Debian/Ubuntu there is no .pc file provided by nvidia-cuda-dev. > > nvcuvid.h is located directly under /usr/include so typically you don’t have > > to set the NVCUVID_CFLAGS. > > There are checks for cuda/etc for nvenc already, which use the .pc files if > available or $CUDA_PREFIX otherwise. Can you reuse those instead of adding > your own, and just keep the nvenc/nvdec specific parts for those plugins? Done > > > > ::: sys/nvdec/gstnvdec.c > > > @@ +120,3 @@ > > > + CUgraphicsResource *resources; > > > + guint num_resources; > > > +} GstNvDecCudaGraphicsResourcesMeta; > > > > > > Why is this a GstMeta instead of a custom GstMemory (e.g. based on > > > GstGLBaseMemory)? > > > > I considered that but then we would also need a custom allocator and maybe a > > custom buffer pool and then the downstream element won’t be able to do the > > allocation. I think using a GstMeta is the best temporary solution until we > > have a GstCUDA library. > > What do you mean here? Couldn't downstream allocate any kind of GL buffer > (e.g. glimagesink) and nvdec could make use of that? The meta limits how > useful this is in the end, as the meta could disappear and also requires > elements to know about the meta (while GL memory is generally supported by > various elements already, including nvenc). > The meta is only there so that we can call cuGraphicsGLRegisterImage once per image, since it’s a costly operation, and associate the returned handle with the buffer so that we can retrieve it when needed, and also so that cuGraphicsUnregisterResource can be called when the buffer is freed. There’s no need for other elements to know about it. I think a downstream element would have to know about a custom GLMemory in order to allocate it, or are you suggesting that nvdec adds additional memory to an already allocated buffer similarly to how the meta is added now? Another option if you want to get rid of the meta is to keep a table in nvdec with weak references to the GLMemory that has been registered with CUDA, but that can be a bit problematic since the buffers could potentially outlive the element.
I'll merge this for now so we can work on this incrementally, but a meta seems problematic as what could happen is that the contained memory of your buffer is transferred to a new buffer (which then does not have the meta), and the old buffer might get destroyed. qdata on the memory might work better.
Attachment 354537 [details] pushed as ab9d87f - nvdec: New plugin for NVIDIA hardware video decode
cuda-9 no longer uses/provides nvcuvid.h , it uses dynlink_nvcuvid.h instead https://cgit.freedesktop.org/gstreamer/gst-plugins-bad/tree/sys/nvdec/gstnvdec.h#n32 -#include <nvcuvid.h> - +#if CUDA_VERSION >= 9000 + #include <dynlink_nvcuvid.h> +#else + #include <nvcuvid.h> +#endif
Thanks, do you want to make a patch?
(In reply to leigh123linux@googlemail.com from comment #51) > cuda-9 no longer uses/provides nvcuvid.h , it uses dynlink_nvcuvid.h instead > > https://cgit.freedesktop.org/gstreamer/gst-plugins-bad/tree/sys/nvdec/ > gstnvdec.h#n32 > > > -#include <nvcuvid.h> > - > +#if CUDA_VERSION >= 9000 > + #include <dynlink_nvcuvid.h> > +#else > + #include <nvcuvid.h> > +#endif Hi, AFAIK this header file is built for run-time dynamic linking, so it will require additional code changes in order to work.
-> bug #791724