GNOME Bugzilla – Bug 765435
plugins: rework dmabuf import
Last modified: 2016-10-31 14:24:13 UTC
Right now the dmabuf import into a VA-API surface is quite hackish. We need to fix it following what gstglupload does.
This a heads up of this task. During the hackfest in Thessaloniki I've been working on this issue. So far, the changes in gstvaapipluginbase and gstvaapivideobufferpool are considerable. The logic 1) Set the allocator from the element and not from vaapivideopool through pool config 2) In the case of the dmabuf import, don't use the vaapivideopool, but create a custom buffer associated with the upstream buffer (GstParentBufferMeta). This is how GstGL works, which looks simpler and more efficient. But I ran across with bug 759533, and no packet formats are uploaded correctly into the sink, but adding vaapipostproc, it is workaround-ed. Nonetheless, in the case of multiplanar/semiplanar color formats, it is completely broken (not a regression since it never worked), since the gstvaapi API is flawed: only one plane can be passed to VA.
Another round of comments: 1) All my research done during the hackfest was using v4l2src io-mode=dmabuf since it is the use-case supported by glimagesink and kmsssink 2) In the hackfest I got a hackish support for this use case (comment #1) 3) Until [1] I came aware that the use case supported by gstreamer-vaapi is v4l2ssrc io-mode=dmabuf-import 1. https://bugzilla.gnome.org/show_bug.cgi?id=755072#c25 3.1) My hackish dmabuf support removes the support of dmabuf-import 3.2) The support of dmabuf-import is base on a ugly-hack: the function has_dmabuf_capable_peer() 4) According to the history (bug 759481) we support a icamerasrc[2] but I don't know these devices. Does it support io-mode=dmabuf too? 2. https://github.com/01org/icamerasrc 5) Still, we need the vaapi dmabuf allocator as a srcpad allocator if the sink support dmabuf (glimagesink or kmssink) So, my work plan is try to bring the io-mode=dmabuf support without remove the io-mode=dmabuf-import, but I would like to remove it since it depends on a hack. What do you think?
(In reply to Víctor Manuel Jáquez Leal from comment #2) > 4) According to the history (bug 759481) we support a icamerasrc[2] but I > don't know these devices. Does it support io-mode=dmabuf too? > > 2. https://github.com/01org/icamerasrc I think there are some binary dependency for icamerasrc :) > > 5) Still, we need the vaapi dmabuf allocator as a srcpad allocator if the > sink support dmabuf (glimagesink or kmssink) > > So, my work plan is try to bring the io-mode=dmabuf support without remove > the io-mode=dmabuf-import, but I would like to remove it since it depends on > a hack. > Please don't remove vaapi dmabuf export capability (io-mode=dmabuf-import) with out having another working solution. There are customers who using it in production environment. We remove it only when we agreed/implement a proper solution. BTW, so no plan to implement dmabuf-capsfeature ?? > What do you think?
(In reply to sreerenj from comment #3) > BTW, so no plan to implement dmabuf-capsfeature ?? The agreement is try to avoid it if we can unless we find a use-case were it is unavoidable. Just as memory:VASurface capsfeature, I feel it is "avoidable" (removable) too ... hehehe...
(In reply to Víctor Manuel Jáquez Leal from comment #4) > (In reply to sreerenj from comment #3) > > BTW, so no plan to implement dmabuf-capsfeature ?? > > The agreement is try to avoid it if we can unless we find a use-case were it > is unavoidable. > > Just as memory:VASurface capsfeature, I feel it is "avoidable" (removable) > too ... hehehe... Hm I thought the correct solution could be like this (considering v4l2src and vaapipostproc as example) -- vaapipostproc always provide dmabuf-capable bufferpool upstream if dmabuf capsfeatures get negotiated. -- v4l2src by deafult work in import mode (io-mode=dmabuf-import), which means it will try to use the pool supplied by downstream vaapipostproc. -- in case if v4l2src failed to use the downstream supplied pool, it will jump to export mode (io-mode=export) and push buffers from it's on pool which will work fine with vaapipostproc.
(In reply to sreerenj from comment #5) > > Hm I thought the correct solution could be like this (considering v4l2src > and vaapipostproc as example) > > -- vaapipostproc always provide dmabuf-capable bufferpool upstream if dmabuf > capsfeatures get negotiated. When v4l2src is io-mode=dmabuf, it pushes buffers with dmabuf memory: if (gst_is_dmabuf_memory()) { .. } Thus, downstream element (vaapipostproc, e.g.) can recognize those buffers and "import" them, bypassing its sinkpad bufferpool, which is an improvement. Neither, a capsfeature, is needed to negotiate. Simple and clean. But (!) if v4l2src is io-mode=dmabuf-import, downstream element (vaapipostproc, e.g.) needs to know it (for now, using the hackish function of has_dmabuf_capable_peer()) and share a dmabuf capable bufferpool to upstream. This case may need a capsfeature, but I would prefer the previous mechanism. > -- v4l2src by deafult work in import mode (io-mode=dmabuf-import), which > means it will try to use the pool supplied by downstream vaapipostproc. I don't think v4l2src uses io-mode as dmabuf-import as default. Neither I see it as the best dmabuf related io-mode. > -- in case if v4l2src failed to use the downstream supplied pool, it will > jump to export mode (io-mode=export) and push buffers from it's on pool > which will work fine with vaapipostproc. The other problem is when VAAPI needs to push dmabuf buffers (bug 755072). This case is becoming a needed use-case since transfering 4K buffers to glimagesink is broken. The simple solution would be that all raw caps negotiation should deliver dmabuf memory based buffers, so glimagesink would "import" them without mem copy. No caps negotiation, no complex buffer pool sharing to upstream.
If be pleased to help unhack that code path. Since buffer pool have an activated function, there is no need to do anything before that moment. So vaapi element could offer multiple pool. That is one step. The next step would be in v4l2src. Currently the default is auto. But only mmap and read/write is implemented, we ignore downstream pool (unless downstream does not support video meta). What we could do is delay the io-mode decision to decide_allocation. From there, we would need to test each allocator to see if the memory is DMA buf, and can be imported. If not, we'd go for mmap (dmabuf export whenever available), otherwise read/write for legacy drivers that don't do anything else. What do you think?
(In reply to Nicolas Dufresne (stormer) from comment #7) > If be pleased to help unhack that code path. Since buffer pool have an > activated function, there is no need to do anything before that moment. So > vaapi element could offer multiple pool. That is one step. Are there elements in upstream already offering multiple pools? My current approach is offering one pools but juggling with the allocators. > > The next step would be in v4l2src. Currently the default is auto. But only > mmap and read/write is implemented, we ignore downstream pool (unless > downstream does not support video meta). What we could do is delay the > io-mode decision to decide_allocation. From there, we would need to test > each allocator to see if the memory is DMA buf, and can be imported. If not, > we'd go for mmap (dmabuf export whenever available), otherwise read/write > for legacy drivers that don't do anything else. What do you think? Sounds like a good approach. Another thing to consider is bug 765600 (Add gst_memory_try_map or new GstMapFlags to silent GST_ERROR on some situation)
(In reply to Víctor Manuel Jáquez Leal from comment #8) > (In reply to Nicolas Dufresne (stormer) from comment #7) > > If be pleased to help unhack that code path. Since buffer pool have an > > activated function, there is no need to do anything before that moment. So > > vaapi element could offer multiple pool. That is one step. > > Are there elements in upstream already offering multiple pools? > > My current approach is offering one pools but juggling with the allocators. Offering an allocator requires that you can use it to allocate (see allocate virtual). So far we have tried to avoid having to break things by inforcing a check on the CUSTOM_ALLOC flag before using an allocator, as this would break many existing elements. At the same time, most, if not all uses the first allocator.
(In reply to Víctor Manuel Jáquez Leal from comment #4) > (In reply to sreerenj from comment #3) > > BTW, so no plan to implement dmabuf-capsfeature ?? > > The agreement is try to avoid it if we can unless we find a use-case were it > is unavoidable. > > Just as memory:VASurface capsfeature, I feel it is "avoidable" (removable) > too ... hehehe... Yes In theory :), we could have avoid "memory:VASurface" negotiation too and do like dmabuf strategy. Like dmabuf, VASurface is mapable unless the Surface is encrypted. Currently vaapidecode already checking whether the surface is mapable or not by doing vaDeriveImage. If we didn't get the surface format then it is encrypted so we are setting(are supposed to set) "memory:VASurface, format=GST_VIDEO_FORMAT_ENCODED". In future we can may be change like: -- always use raw format , SystemMemory caps features -- set "memory:VASurface" as capsfeatreus only if unmapable/underiveable (encrypted) But IIRC, the decodebin logic to autoplug the hw accelearated elements were based on the number of capsfeature count supported by the element, non-systmemroy-capsfeature etc.
Created attachment 328812 [details] [review] gstcompat: add gst_video_info_changed() helper This function is shared among different elements, so let factorized it.
Created attachment 328813 [details] [review] gstcompat: add gst_video_info_force_nv12_if_encoded() This lines repeat a couple times in the code, so it would be better to put it a helper function.
Created attachment 328814 [details] [review] vaapivideobufferpool: keep only current video info Instead of keeping old and new GstVideoInfo video structure, we only keep one, the current one, the negotiated. The old one is not needed at all.
Created attachment 328815 [details] [review] pluginbase negotiates allocator with bufferpool Originally vaapivideobufferpool instantiates its own allocator regardless the received configuration, and it relies in custom configuration options to choose which kind of allocator instantiate. This patch transfers the responsibility of the allocator instantiate to vaapipluginbase and pass it to the vaapivideobufferpool through its configuration. * gst/vaapi/gstvaapipluginbase.c + set_dmabuf_allocator(): inserts a dmabuf allocator in the bufferpool + ensure_sinkpad_buffer_pool(): set a normal vaapi video allocator in bufferpool configuration + gst_vaapi_plugin_base_propose_allocation(): call set_dmabuf_allocator() if needed. + gst_vaapi_plugin_base_decide_allocation(): set a normal vaapi video allocator in bufferpool configuration * gst/vaapi/gstvaapivideobufferpool.c + gst_vaapi_video_buffer_pool_set_config(): instead of instantiate the allocator, process the received one through its configuration. * gst/vaapi/gstvaapivideobufferpool.h: removed GST_BUFFER_POOL_OPTION_DMABUF_MEMORY since it is not used anymore. * gst/vaapi/gstvaapivideomemory.c + gst_vaapi_is_dmabuf_allocator(): new helper function to identify a dmabuf allocator with the vaapi qdata.
Created attachment 328816 [details] [review] plugins: add gst_vaapi_plugin_base_create_pool() This patch refactors the code in pluginbase in order to centralize the buffer pool instantiation. As the buffer pool config may have different options, these are gathered using a bitwise flag.
Created attachment 328817 [details] [review] vaapivideobufferpool: share options flag with pluginbase Instead of keeping a set of boolean variables to set the bufferpool options, a single bitwise is used, just as it is used in pluginbase. The enum was moved to vaapivideobufferpool header.
Created attachment 328818 [details] [review] plugins: use an unique allocator per pad Instead of instantiating an allocator per vaapivideobufferpool, only one allocator is instantiated per element's pad and shared among future pools. If the pad's caps changes, the allocator is reset.
Created attachment 328819 [details] [review] libs: change gst_vaapi_surface_new_with_dma_buf_handle() Instead of passing the data already in GstVideoInfo, let's just pass the GstVideoInfo structure.
Created attachment 328820 [details] [review] plugins: cache VASurfaces from dmabufs This patch avoids the creation of a VASurface each time a new input buffer is processed, caching them in the input buffer itself.
Created attachment 328821 [details] [review] plugins: use GstBufferParentMeta Instead of using the VASurface proxy's notify, which is internal gstvaapi API, use the GStreamer's GstBufferParentMeta.
Do you think it is better to move the helper functions gst_video_info_changed() and gst_video_info_force_nv12_if_encoded() to somewhere else other than gstcompat.h?? These helper utility methods are not any compatibility glue codes, so may be gstvaapiplugintuils.{h,c} could be a better place IMHO.
(In reply to sreerenj from comment #21) > Do you think it is better to move the helper functions > gst_video_info_changed() and > gst_video_info_force_nv12_if_encoded() to somewhere else other than > gstcompat.h?? > These helper utility methods are not any compatibility glue codes, so may be > gstvaapiplugintuils.{h,c} could be a better place IMHO. Agreed. I'll change it.
Review of attachment 328817 [details] [review]: ::: gst/vaapi/gstvaapivideobufferpool.h @@ +77,3 @@ + GST_VAAPI_VIDEO_BUFFER_POOL_OPTION_VIDEO_ALIGNMENT = (1u << 1), + GST_VAAPI_VIDEO_BUFFER_POOL_OPTION_GL_TEXTURE_UPLOAD = (1u << 2), +}; How about giving a typedef name for this enum , may be like GstVaapiVideoBufferPoolOption ? It could be more readable.. For eg: the gst_vaapi_plugin_base_create_pool() comment for options says "@options: #GstBufferPool options encoded as bit-wise flags" I would prefer "#GstVaapiVideoBufferPoolOption encoded as bit-wise flags"
Review of attachment 328817 [details] [review]: ::: gst/vaapi/gstvaapivideobufferpool.c @@ +187,2 @@ + if (gst_buffer_pool_config_has_option (config, + GST_BUFFER_POOL_OPTION_VIDEO_ALIGNMENT)) { priv->options |= GST_VAAPI_VIDEO_BUFFER_POOL_OPTION_VIDEO_ALIGNMEN; Missing ????
Review of attachment 328812 [details] [review]: https://bugzilla.gnome.org/show_bug.cgi?id=765435#c22
Review of attachment 328813 [details] [review]: https://bugzilla.gnome.org/show_bug.cgi?id=765435#c22
Review of attachment 328814 [details] [review]: lgtm..
Review of attachment 328819 [details] [review]: Could be Implemented like this with an intention of clean API with minimal gst dependency for future libgstvaapi, which is not the case anymore :) lgtm ...
Review of attachment 328814 [details] [review]: changing the patch status
Review of attachment 328816 [details] [review]: See https://bugzilla.gnome.org/show_bug.cgi?id=765435#c23 ,otherwise lgtm
Review of attachment 328815 [details] [review]: lgtm
Review of attachment 328818 [details] [review]: lgtm, ::: gst/vaapi/gstvaapipluginbase.c @@ +493,3 @@ + if (plugin->sinkpad_allocator) + gst_object_unref (plugin->sinkpad_allocator); + plugin->sinkpad_allocator = allocator; So set_dmabuf_allocator() is only for upstream elements?
Review of attachment 328820 [details] [review]: lgtm, hope it wont hit race case ;)
Review of attachment 328821 [details] [review]: %s/GstBufferParentMeta/GstParentBufferMeta otherwise lgtm.
(In reply to sreerenj from comment #33) > Review of attachment 328820 [details] [review] [review]: > > lgtm, hope it wont hit race case ;) That's why the further usage of GstParentBufferMeta: the input buffer won't be modified until the child buffer is released.
Review of attachment 328817 [details] [review]: ::: gst/vaapi/gstvaapivideobufferpool.c @@ +187,2 @@ + if (gst_buffer_pool_config_has_option (config, + GST_BUFFER_POOL_OPTION_VIDEO_ALIGNMENT)) { has_video_alignment is only used locally to the function. It should not be hold in the private structure. That's why there is no need to add it in the options.
Review of attachment 328818 [details] [review]: ::: gst/vaapi/gstvaapipluginbase.c @@ +493,3 @@ + if (plugin->sinkpad_allocator) + gst_object_unref (plugin->sinkpad_allocator); + plugin->sinkpad_allocator = allocator; Nope, :) In bug 755072, I changed this a bit in order to use it also for downstream element.
Created attachment 329251 [details] [review] pluginutil: add gst_video_info_changed() helper This function is shared among different elements, so let factorized it.
Created attachment 329252 [details] [review] pluginutil: add gst_video_info_force_nv12_if_encoded() This lines repeat a couple times in the code, so it would be better to put it a helper function.
Created attachment 329253 [details] [review] vaapivideobufferpool: keep only current video info Instead of keeping old and new GstVideoInfo video structure, we only keep one, the current one, the negotiated. The old one is not needed at all.
Created attachment 329254 [details] [review] pluginbase negotiates allocator with bufferpool Originally vaapivideobufferpool instantiates its own allocator regardless the received configuration, and it relies in custom configuration options to choose which kind of allocator instantiate. This patch transfers the responsibility of the allocator instantiate to vaapipluginbase and pass it to the vaapivideobufferpool through its configuration. * gst/vaapi/gstvaapipluginbase.c + set_dmabuf_allocator(): inserts a dmabuf allocator in the bufferpool + ensure_sinkpad_buffer_pool(): set a normal vaapi video allocator in bufferpool configuration + gst_vaapi_plugin_base_propose_allocation(): call set_dmabuf_allocator() if needed. + gst_vaapi_plugin_base_decide_allocation(): set a normal vaapi video allocator in bufferpool configuration * gst/vaapi/gstvaapivideobufferpool.c + gst_vaapi_video_buffer_pool_set_config(): instead of instantiate the allocator, process the received one through its configuration. * gst/vaapi/gstvaapivideobufferpool.h: removed GST_BUFFER_POOL_OPTION_DMABUF_MEMORY since it is not used anymore. * gst/vaapi/gstvaapivideomemory.c + gst_vaapi_is_dmabuf_allocator(): new helper function to identify a dmabuf allocator with the vaapi qdata.
Created attachment 329255 [details] [review] plugins: add gst_vaapi_plugin_base_create_pool() This patch refactors the code in pluginbase in order to centralize the buffer pool instantiation. As the buffer pool config may have different options, these are gathered using a bitwise flag.
Created attachment 329256 [details] [review] vaapivideobufferpool: share options flag with pluginbase Originally, vaapivideobufferpool has a set of boolean variables for the buffer configuration options. This pach changes these boolean variables for a single bitwise, just as it is used in pluginbase. Hence, the internal enum was moved to vaapivideobufferpool header.
Created attachment 329257 [details] [review] plugins: use an unique allocator per pad Instead of instantiating an allocator per vaapivideobufferpool, only one allocator is instantiated per element's pad and shared among future pools. If the pad's caps changes, the allocator is reset.
Created attachment 329258 [details] [review] libs: change gst_vaapi_surface_new_with_dma_buf_handle() Instead of passing the data already in GstVideoInfo, let's just pass the GstVideoInfo structure.
Created attachment 329259 [details] [review] plugins: cache VASurfaces from dmabufs This patch avoids the creation of a VASurface each time a new input buffer is processed, caching them in the input buffer itself.
Created attachment 329260 [details] [review] plugins: use GstBufferParentMeta Instead of using the VASurface proxy's notify, which is internal gstvaapi API, use the GStreamer's GstBufferParentMeta.
Created attachment 329261 [details] [review] plugins: use GstParentBufferMeta Instead of using the VASurface proxy's notify, which is internal gstvaapi API, use the GStreamer's GstParentBufferMeta.
Attachment 329251 [details] pushed as 2562cd5 - pluginutil: add gst_video_info_changed() helper Attachment 329252 [details] pushed as 3d9f8dd - pluginutil: add gst_video_info_force_nv12_if_encoded() Attachment 329253 [details] pushed as 2643ae9 - vaapivideobufferpool: keep only current video info Attachment 329254 [details] pushed as 50b2423 - pluginbase negotiates allocator with bufferpool Attachment 329255 [details] pushed as d0c7218 - plugins: add gst_vaapi_plugin_base_create_pool() Attachment 329256 [details] pushed as 6c5b623 - vaapivideobufferpool: share options flag with pluginbase Attachment 329257 [details] pushed as ad4c38b - plugins: use an unique allocator per pad Attachment 329258 [details] pushed as 73d1228 - libs: change gst_vaapi_surface_new_with_dma_buf_handle() Attachment 329259 [details] pushed as 8292acf - plugins: cache VASurfaces from dmabufs Attachment 329261 [details] pushed as 8d7a0ae - plugins: use GstParentBufferMeta