GNOME Bugzilla – Bug 783521
gl: Add "direct" dmabuf uploader
Last modified: 2018-11-02 13:19:49 UTC
Currently, the GStreamer OpenGL stack uses shaders to do the YUV->RGB conversion. For that, it needs to be able to set up EGLImages with R8 and R8G8 pixel format. Some GPUs/drivers might not support that (etnaviv in its current form for example). In addition, some GPUs like the Vivante ones can perform the YUV->RGB conversion internally, which means no custom conversion shaders are required. Currently, it is not possible to make use of such a feature. These set of patches introduce support for such internal converters. The patches are not finished - in particular, the gluploader changes are WIP - but they already work on i.MX6. The remaining issues are documented in the commit messages, and also in the code as TODOs.
Created attachment 353349 [details] [review] 0001 fix sanity check in glmemory There is a curious sanity check in gstglmemory.c that prevents the direct dmabuf uploader from working. It is unclear what the intent was, but it seems faulty to me. Here is a proposed fix.
Created attachment 353351 [details] [review] 0002 add new gst_egl_image_from_dmabuf variant For direct uploading, it is necessary to set up the EGLImage to use all planes, but the existing gst_egl_image_from_dmabuf() function is hardcoded to create one EGLImage per plane. Therefore, a new variant is needed, gst_egl_image_from_dmabuf_direct(). To keep consistency in the names, the existing gst_egl_image_from_dmabuf() is renamed to gst_egl_image_from_dmabuf_as_rgba_plane(). Note that there is one TODO in the code. It is unclear whether to pass RGB or RGBA to gst_egl_image_new_wrapped(). Both seem to work, but perhaps one of them is more efficient/more compatible with more GPUs?
Created attachment 353352 [details] [review] 0003 add direct dmabuf uploader The actual direct dmabuf uploader code. Note the remaining issues in the commit message of the patch. Here it is: - It is currently unknown how to determine if the GPU can handle these conversions - Currently it is not possible to probe for what EGLImage DMABUF input formats the GPU can handle - The EGLImage cache is there, but cannot be used, because some drivers like etnaviv keep a cached copy of the texture inside, and it is currently not clear how to let etnaviv know that the cache has to be flushed once the associated DMABUF gets a new video frame Interestingly, disabling the cache does not seem to impact performance much on the i.MX6 (CPU usage is <10% when using a v4l2videodec capture-io-mode=dmabuf ! glimagesink pipeline even with 1080p content), even though in kmscube, the EGLImage is similarly constantly created and destroyed, and there, we see large CPU% figures..
Review of attachment 353352 [details] [review]: Looks overall good, I don't think the probing of color format is a blocker, but the list is a bit small. We just need something for the caching, which I believe could be solved by using external texture target. That would slow down redraw, but it's minor thing for video streaming. ::: gst-libs/gst/gl/gstglupload.c @@ +36,3 @@ #if GST_GL_HAVE_DMABUF #include <gst/allocators/gstdmabuf.h> +#include <libdrm/drm_fourcc.h> Not used, at least not in this file. @@ +636,3 @@ + * block below. */ +#if 1 + dmabuf->eglimage = NULL; Comment out the entire thing, otherwise compilers may warn. We could also add a getenv hack here to disable that. My research this week led that assuming the texture will be refreshed is undefined with TEXTURE_2D, did you experiment with EXTERNAL texture ?
Review of attachment 353349 [details] [review]: Not quite. ::: gst-libs/gst/gl/gstglmemory.c @@ +1459,2 @@ g_return_val_if_fail (!wrapped_data + || n_mem >= n_wrapped_pointers, FALSE); This should be a == and should still include the views multiplication. The number of wrapped pointers must be equivalent to the number of memory blocks.
Review of attachment 353351 [details] [review]: The for loop seems to complicate the code flow a bit, it's probably best to just expand it out in full and avoid the indirections. What other code does for these GL attribute pairs is: guint i; EGLAttrib attribs[SOME_LARGE_ENOUGH_VALUE] = { 0, }; attribs[i++] = SOME_ATTRIBUTE; attribs[i++] = some_value; ... attribs[i++] = sentinel; g_assert (i < SOME_LARGE_ENOUGH_VALUE); Need to also use the GST_*_OBJECT debugging macros where possible instead of the non-OBJECT variants. I think the name of the two functions should be gst_egl_image_from_dmabuf_as_rgba() and gst_egl_image_from_dmabuf(). ::: gst-libs/gst/gl/egl/gsteglimage.c @@ +617,3 @@ + + /* Make sure we never set up more than MAX_NUM_DMA_BUF_PLANES planes */ + if (G_UNLIKELY (n_planes > MAX_NUM_DMA_BUF_PLANES)) This should be a g_return_val_if_fail() @@ +635,3 @@ + if (!gst_buffer_find_memory (buffer, in_info->offset[i], plane_size, + &mems_idx[i], &length, &mems_skip[i])) { + GST_DEBUG ("Couldn't find memory for plane %d", i); GST_WARNING_OBJECT at least @@ +642,3 @@ + if (length != 1) { + GST_DEBUG + ("There are multiple dmabufs for plane %d, which is unsupported", i); GST_FIXME_OBJECT or GST_WARNING_OBJECT. @@ +651,3 @@ + /* And all memory found must be dmabuf */ + if (!gst_is_dmabuf_memory (mems[i])) { + GST_DEBUG ("Plane %d memory isn't dmabuf", i); GST_FIXME_OBJECT or GST_WARNING_OBJECT @@ +667,3 @@ + attribs[6 + 6 * i + 5] = stride; + + GST_DEBUG ("Plane %d has FD %d offset %" G_GSIZE_FORMAT " stride %" GST_DEBUG_OBJECT @@ +676,3 @@ + img = _gst_egl_image_create (context, EGL_LINUX_DMA_BUF_EXT, NULL, attribs); + if (!img) { + GST_WARNING ("eglCreateImage failed: %s", GST_WARNING_OBJECT
Review of attachment 353352 [details] [review]: Looks mostly good. Probing is not a blocker as long as we fallback correctly. Same with determining if the GPU can handle the conversion. For clearing caches, the intent of the spec and general consensus has been that if a texture is updated externally, the texture needs to be rebound (i.e. glBindTexture()) in order to receive the new contents. The actual synchronisation happens inside the driver though. A question on this dmabuf, what happens when mapping the textures to sysmem? Are they YUV or RGBA? is it even possible? Are they read-only? You can test this out with a pipeline like so: dmabuf-producer ! glupload ! gldownload ! xvimagesink ::: gst-libs/gst/gl/gstglupload.c @@ +482,2 @@ #if GST_GL_HAVE_DMABUF + spurious newline @@ +652,3 @@ + } + + GST_TRACE ("Got GstEGLImage %p", (gpointer) (dmabuf->eglimage)); GST_TRACE_OBJECT @@ +680,3 @@ + ptr = (gpointer *) dmabuf->eglimage; + gst_gl_memory_setup_buffer (allocator, dmabuf->outbuf, dmabuf->params, NULL, + &ptr, 1); erm, I don't gst_gl_memory_setup_buffer() is required to be performed in the OpenGL thread so this entire function can be be merged with _direct_dma_buf_upload_perform(). @@ +951,3 @@ }; + spurious newline @@ +954,3 @@ #endif /* GST_GL_HAVE_DMABUF */ + spurious newline
Created attachment 357449 [details] [review] qt element GBM support patch Before I continue with the cleanup of this uploader, have a look at this patch that I added to make qmlglsink work with Qt5 EGLFS + GBM. The comments indicate that using nativeResourceForWindow() is probably not a good idea, but there is no other way of making it work with the GBM backend, since the EGL display Qt5 renders to is tied to a DRM plane that Qt5 itself created, so there is no "default display".
Review of attachment 357449 [details] [review]: The native platform interface is the only way to get at some Qt resources. as long as there's nothing better, using it is ok. ::: ext/qt/gstqtglutility.cc @@ +122,3 @@ + EGLDisplay egl_display = (EGLDisplay) + native->nativeResourceForWindow("egldisplay", NULL); + display = (GstGLDisplay *) gst_gl_display_egl_new_with_egl_display (egl_display); What if egl_display is invalid? Also, this code will always be run when libgstgl is GST_GL_HAVE_WINDOW_GBM enabled and realistically, could probably replace the next #else cause so you don't need to #elif GST_GL_HAVE_WINDOW_GBM.
I propose we continue to discuss the qmlglsink issues in bug 792378 .
Created attachment 369210 [details] [review] gstgl: Fix sanity check in gst_gl_memory_setup_buffer()
Created attachment 369211 [details] [review] gl/egl: Add gst_egl_image_from_dmabuf_direct() function
Created attachment 369212 [details] [review] glmemory: Fix n_wrapped_pointers usage gst_gl_memory_setup_buffer() was not properly using the number of pointers to wrapped. This also fixes the validation, as we only support 1 wrapper per view, or num_planes * views wrapper.
Created attachment 369213 [details] [review] WIP gldownload: Implement direct dmabuf uploader The idea is that some GPUs (like the Vivante series) can actually perform the YUV->RGB conversion internally, so no custom conversion shaders are needed. To make use of this feature, we need an additional uploader that can import DMABUF FDs and also directly pass the pixel format, relying on the GPU to do the conversion.
First pass going through review comment, combined with some input from Carlos. In this version, I've simplified the eglimage code, and tried and reused as much as possible the code between the two uploaders. This is barely tested, hence the WIP mark on the commit message, but I'm looking forward some input on the new form.
Review of attachment 369211 [details] [review]: ::: gst-libs/gst/gl/egl/gsteglimage.c @@ -473,0 +480,66 @@ +/* + * Variant of _drm_rgba_fourcc_from_info() that is used in case the GPU can + * handle YUV formats directly (by using internal shaders, or hardwired ... 63 more ... This obviously need to be fixed, as in "direct" mode, we should not use the same import format for BGRA, RGBA, ARGB, etc. otherwise the colors are going to be shifted.
Review of attachment 369213 [details] [review]: ::: gst-libs/gst/gl/gstglupload.c @@ +807,3 @@ + _set_caps_features_with_passthrough (caps, + GST_CAPS_FEATURE_MEMORY_DMABUF, passthrough); + gst_caps_set_simple (ret, "format", G_TYPE_STRING, "RGBA", NULL); I believe it's the wrong direction.
Created attachment 370478 [details] [review] WIP gldownload: Implement direct dmabuf uploader An updated version of the patch. It's far from finished. I'm not sure when I have some time to continue this. Hopefully soon. The fallback to software upload is ugly right now. It looses one buffer. The solution will be to move the accept to 'before_transform' and let the base transform class reconfigure with the selected upload only. We'll still need to handle a failing 'perform', but I think we can avoid it for the new upload method. Is there a known use-case where 'perform' actually fails, or is this 'just in case' right now?
Yes, on platforms without iommu, if you perform an upload from a dmabuf exported by UVC driver it will fail as the GPU will require physically contiguous memory. Note, you have a typo in the commit title, gldownload -> glupload.
Do you have any examples for such no-iommu platforms? The plan is to use eglQueryDmaBufModifiersEXT() to check if direct upload is supported, so any platforms that that don't implement the extension (probably non-mesa drivers) or don't support direct upload can be filtered out.
Does't the IMX.6 Vivante requires contiguous memory ? It's not just the backend allocation (vmalloc, scatter gather, cma) that can cause an importation to fail. Some HW will require 4 bytes stride alignment, other will require 16 bytes, or even 32 bytes. There is just so many reason why a DMABuf import can fail that we need a properly working and transparent (no drop) fallback. Most of this is implemented with trial and error in mind, this is inherited from DRM subsystem. Of course, for your use case you might control that, but it will fails for others if you are not careful enough, and this would cause regressions.
Created attachment 372959 [details] [review] gl/egl: Add gst_egl_image_from_dmabuf_direct() function The colorspace conversion happens during the upload so the necessary hints must be provided to ensure that the conversion works correctly. At least the Mesa Intel driver will create a texture without error but produces an incorrect result. Use eglQueryDmaBufModifiersEXT() to check if non-external upload is supported for the given format. Based on a patch from Carlos Rafael Giani <dv@pseudoterminal.org>.
Created attachment 372960 [details] [review] glupload: try to use the last method after reconfigure Reconfigure will trigger a set_caps which clears the upload method. Remember the method in this case and start with it. Wrap around once to try all methods if necessary.
Created attachment 372961 [details] [review] glupload: allow system memory for dmabuf in transform_caps This should not be necessary, but currently not all plugins that provide dmabuf memory announce this with caps features, e.g. v4l2. The static caps already contain the system memory. It didn't break before because other upload methods provide the necessary transformation.
Created attachment 372962 [details] [review] glupload: handle upload methods with different caps If a upload method is selected then use it exclusively in transform_caps(). Also, reconfigure if the current caps don't match the current upload method.
Created attachment 372963 [details] [review] gluploadelement: try to avoid dropping buffers Without this, a buffer is dropped if glupload indicates that it is necessary to reconfigure. Avoid this by explicitly reconfiguring immediately and uploading the buffer again.
Created attachment 372964 [details] [review] glupload: Implement direct dmabuf uploader The idea is that some GPUs (like the Vivante series) can actually perform the YUV->RGB conversion internally, so no custom conversion shaders are needed. To make use of this feature, we need an additional uploader that can import DMABUF FDs and also directly pass the pixel format, relying on the GPU to do the conversion. Based on patches from Nicolas Dufresne <nicolas.dufresne@collabora.com> and Carlos Rafael Giani <dv@pseudoterminal.org>.
I've attached my current set of patches. I've tested this on Intel and found no regressions. I've simulated failures in different places and it worked without any buffers lost. It works with Etnaviv on i.MX6 as well with our current patch stack. The Mesa patches for this should appear on the mailing list soon.
About the g_return_val_if_fail patches in gst_gl_memory_setup_buffer() (for some reason, reviewing patches does not work for me): For views == 1 i think n_mem == n_wrapped_pointers is the correct check. I don't know how multi-view should work here, however with this code: for (v = 0; v < views; v++) { for (i = 0; i < n_mem; i++) { [...] ... = wrapped_data[i]; [...] "n_mem * views == n_wrapped_pointers" seems wrong. Only n_mem wrapped pointers are used, regardless of the number of views. Is the code incorrect or the check?
Review of attachment 372959 [details] [review]: Apart from the type checks, this looks ok to me. ::: gst-libs/gst/gl/egl/gsteglimage.c @@ +592,3 @@ +{ + EGLDisplay egl_display = EGL_DEFAULT_DISPLAY; + EGLuint64KHR *modifiers; This will need configure/meson checks as old android versions don't have 64-bit integers in EGL. @@ +647,3 @@ + * + * Another notable difference to gst_egl_image_from_dmabuf() + * is that this function creates one EGL image for all planes, not just one. I assume this is meant to read as "one EGLImage for all planes, not one for a single plane." @@ +653,3 @@ +GstEGLImage * +gst_egl_image_from_dmabuf_direct (GstGLContext * context, + gint fd[GST_VIDEO_MAX_PLANES], gsize offset[GST_VIDEO_MAX_PLANES], Some documentation should be added to say that these 'arrays' have as many valid values as there are planes in @in_info. I'm also not a fan of arrays as function arguments. 'type *name' is exactly the same and the size (in []) is not useful to the compiler.
Review of attachment 372960 [details] [review]: Looks mostly ok. ::: gst-libs/gst/gl/gstglupload.c @@ +1771,3 @@ + if (upload->priv->method_i == 0) { + upload->priv->method_i = upload->priv->saved_method_i; + upload->priv->saved_method_i = 0; Maybe a comment here that this is only really used for reconfiguration and set_caps() will reset upload->priv->method_i to 0 so we need to reset it after a reconfiguration. @@ +1868,3 @@ + if (ret == GST_GL_UPLOAD_RECONFIGURE && upload->priv->method_i > 0) + upload->priv->saved_method_i = upload->priv->method_i - 1; Any particular reason why this wasn't put into the above if statement with all the other ret checking? Also, this assumes that the first method will never need a reconfigure.
Review of attachment 372962 [details] [review]: Looks ok. ::: gst-libs/gst/gl/gstglupload.c @@ +1871,3 @@ + if (!gst_caps_is_equal (upload->priv->out_caps, caps)) { + gst_buffer_replace (&outbuf, NULL); + ret = GST_GL_UPLOAD_RECONFIGURE; Add some debug logging to this effect?
Review of attachment 372963 [details] [review]: A comment as to why we can't use reconfigure_src() would be helpful :)
Review of attachment 372964 [details] [review]: Looks good.
(In reply to Michael Olbrich from comment #29) > About the g_return_val_if_fail patches in gst_gl_memory_setup_buffer() (for > some reason, reviewing patches does not work for me): > > For views == 1 i think n_mem == n_wrapped_pointers is the correct check. > I don't know how multi-view should work here, however with this code: > > for (v = 0; v < views; v++) { > for (i = 0; i < n_mem; i++) { > [...] > ... = wrapped_data[i]; > [...] > > "n_mem * views == n_wrapped_pointers" seems wrong. Only n_mem wrapped > pointers are used, regardless of the number of views. Is the code incorrect > or the check? hrm I think the code is wrong for separated mode. If separated, there are n_views * n_planes number of memories that need to be used. For non-separated mode, there are only ever n_planes number of memories. The index into wrapped_data should be wrapped_data[v * n_mem + i]. Evidently no-one's ever uploaded separated multiview buffers from external memory :).
(In reply to Matthew Waters (ystreet00) from comment #31) > Review of attachment 372960 [details] [review] [review]: > @@ +1868,3 @@ > > + if (ret == GST_GL_UPLOAD_RECONFIGURE && upload->priv->method_i > 0) > + upload->priv->saved_method_i = upload->priv->method_i - 1; > > Any particular reason why this wasn't put into the above if statement with > all the other ret checking? The other ret checking is there to determine if we should stop looping. This check should happen after the loop stopped. > Also, this assumes that the first method will never need a reconfigure. No, method_i points to the next method. I'll remove that check. method_i is never 0 here. I added this at the time to make sure that 'method_i - 1' is not negative.
(In reply to Matthew Waters (ystreet00) from comment #32) > Review of attachment 372962 [details] [review] [review]: > ::: gst-libs/gst/gl/gstglupload.c > @@ +1871,3 @@ > + if (!gst_caps_is_equal (upload->priv->out_caps, caps)) { > + gst_buffer_replace (&outbuf, NULL); > + ret = GST_GL_UPLOAD_RECONFIGURE; > > Add some debug logging to this effect? I'll add debugging to the next patch where this is used in the element. Most logging happens there already.
Created attachment 373216 [details] [review] gl/egl: Add gst_egl_image_from_dmabuf_direct() function The colorspace conversion happens during the upload so the necessary hints must be provided to ensure that the conversion works correctly. At least the Mesa Intel driver will create a texture without error but produces an incorrect result. Use eglQueryDmaBufModifiersEXT() to check if non-external upload is supported for the given format. Based on a patch from Carlos Rafael Giani <dv@pseudoterminal.org>. Updated as requested. Note: I've only tested the autoconf checks. I think the meson part is correct but I cannot test this here.
Created attachment 373217 [details] [review] glupload: try to use the last method after reconfigure Reconfigure will trigger a set_caps which clears the upload method. Remember the method in this case and start with it. Wrap around once to try all methods if necessary. Comment added and modified check when saved_method_i is set.
Created attachment 373218 [details] [review] gluploadelement: try to avoid dropping buffers Without this, a buffer is dropped if glupload indicates that it is necessary to reconfigure. Avoid this by explicitly reconfiguring immediately and uploading the buffer again. Comment and debug output added.
Created attachment 373594 [details] [review] glmemory: Fix n_wrapped_pointers usage gst_gl_memory_setup_buffer() was not properly using the number of pointers to wrapped. This also fixes the validation, as we only support 1 wrapper per view, or num_planes * views wrapper.
Created attachment 373595 [details] [review] glupload: allow system memory for dmabuf in transform_caps This should not be necessary, but currently not all plugins that provide dmabuf memory announce this with caps features, e.g. v4l2. The static caps already contain the system memory. It didn't break before because other upload methods provide the necessary transformation.
Created attachment 373596 [details] [review] glupload: handle upload methods with different caps If a upload method is selected then use it exclusively in transform_caps(). Also, reconfigure if the current caps don't match the current upload method.
Created attachment 373597 [details] [review] gl/egl: Add gst_egl_image_from_dmabuf_direct() function The colorspace conversion happens during the upload so the necessary hints must be provided to ensure that the conversion works correctly. At least the Mesa Intel driver will create a texture without error but produces an incorrect result. Use eglQueryDmaBufModifiersEXT() to check if non-external upload is supported for the given format. Based on a patch from Carlos Rafael Giani <dv@pseudoterminal.org>.
Created attachment 373598 [details] [review] glupload: Implement direct dmabuf uploader The idea is that some GPUs (like the Vivante series) can actually perform the YUV->RGB conversion internally, so no custom conversion shaders are needed. To make use of this feature, we need an additional uploader that can import DMABUF FDs and also directly pass the pixel format, relying on the GPU to do the conversion. Based on patches from Nicolas Dufresne <nicolas.dufresne@collabora.com> and Carlos Rafael Giani <dv@pseudoterminal.org>.
Created attachment 373599 [details] [review] glupload: try to use the last method after reconfigure Reconfigure will trigger a set_caps which clears the upload method. Remember the method in this case and start with it. Wrap around once to try all methods if necessary.
Created attachment 373600 [details] [review] gluploadelement: try to avoid dropping buffers Without this, a buffer is dropped if glupload indicates that it is necessary to reconfigure. Avoid this by explicitly reconfiguring immediately and uploading the buffer again.
We are pretty close, but not quite yet. There is a slightly annoying regression: gst-launch-1.0 videotestsrc ! glimagesink . . . ** (gst-launch-1.0:23629): WARNING **: 23:33:48.149: gluploadelement0: transform_caps returned caps which are not a real subset of the filter caps
Definitly not what I wanted, this patchset it not ready. I'll revert in a minute. Attachment 373594 [details] pushed as b1299c1 - glmemory: Fix n_wrapped_pointers usage Attachment 373595 [details] pushed as d7eb48c - glupload: allow system memory for dmabuf in transform_caps Attachment 373596 [details] pushed as 87336b1 - glupload: handle upload methods with different caps Attachment 373597 [details] pushed as 8f0d75d - gl/egl: Add gst_egl_image_from_dmabuf_direct() function Attachment 373598 [details] pushed as 3b1ae62 - glupload: Implement direct dmabuf uploader Attachment 373599 [details] pushed as c1053e1 - glupload: try to use the last method after reconfigure Attachment 373600 [details] pushed as 75f2532 - gluploadelement: try to avoid dropping buffers
Created attachment 374103 [details] [review] gl/egl: Add gst_egl_image_from_dmabuf_direct() function The colorspace conversion happens during the upload so the necessary hints must be provided to ensure that the conversion works correctly. At least the Mesa Intel driver will create a texture without error but produces an incorrect result. Use eglQueryDmaBufModifiersEXT() to check if non-external upload is supported for the given format. Based on a patch from Carlos Rafael Giani <dv@pseudoterminal.org>.
That's a new version that fixes some memory leaks. I'm not allowed to make the old version as obsolete.
Created attachment 374104 [details] [review] glupload: handle upload methods with different caps If a upload method is selected then use it exclusively in transform_caps(). Also, reconfigure if the current caps don't match the current upload method. -- Nicolas: This should fix the warning you got. The filter was not applied when only one tranform function is used. I also added some more fallback handling. I've seen some cases where the GLMemory method was selected, combinded with a system memory only filter. That filter is not always there. It seems racy (it's gone with more debugging enabled) and it's also gone on the next call to transform_caps. I'm not sure why this happens, but falling back to transform with all methods and then handling the fallout with the next buffer should work. That's what happens at the beginning anyways, so going back to this when transform_caps would return empty caps should work correctly. Same issue here: I cannot make the other patch as obsolete.
Created attachment 374105 [details] [review] glupload: calculate DRM fourcc once for direct dmabuf upload Calculate DRM fourcc and report to the DEBUG log about it only once instead of three times in gst_egl_image_from_dmabuf_direct().
Created attachment 374106 [details] [review] glupload: debug output from dmabuf and dmabuf_direct upload transform_caps
Created attachment 374107 [details] [review] glupload: dmabuf-direct: query formats before modifiers The EXT_image_dma_buf_import_modifiers extension [1] states regarding eglQueryDmaBufModifiersEXT: The format must be one of those returned by the eglQueryDmaBufFormatsEXT command. To comply with this requirement eglQueryDmaBufFormatsEXT must be called before eglQueryDmaBufModifiersEXT. [1] https://www.khronos.org/registry/EGL/extensions/EXT/EGL_EXT_image_dma_buf_import_modifiers.txt
Created attachment 374108 [details] [review] glupload: dmabuf-direct: report driver limitations to debug log Report in the DEBUG log if the driver does not support importing a given format with linear modifiers non-externally.
Can you link your branch, it's all mixed up and won't apply anymore.
git://git.pengutronix.de/mol/gst-plugins-base glupload
Attachment 373594 [details] pushed as 0ac646f - glmemory: Fix n_wrapped_pointers usage Attachment 373595 [details] pushed as f3292dc - glupload: allow system memory for dmabuf in transform_caps Attachment 373598 [details] pushed as 507e31d - glupload: Implement direct dmabuf uploader Attachment 373599 [details] pushed as 4d9abc6 - glupload: try to use the last method after reconfigure Attachment 373600 [details] pushed as b006ef9 - gluploadelement: try to avoid dropping buffers Attachment 374103 [details] pushed as 5d0e191 - gl/egl: Add gst_egl_image_from_dmabuf_direct() function Attachment 374104 [details] pushed as f349cdc - glupload: handle upload methods with different caps Attachment 374105 [details] pushed as 58399b2 - glupload: calculate DRM fourcc once for direct dmabuf upload Attachment 374106 [details] pushed as c4edd80 - glupload: debug output from dmabuf and dmabuf_direct upload transform_caps Attachment 374107 [details] pushed as eb20293 - glupload: dmabuf-direct: query formats before modifiers Attachment 374108 [details] pushed as 3c8ac7d - glupload: dmabuf-direct: report driver limitations to debug log
Thanks to all contributors for this effort. It was quite epic. We'll keep patching forward if we find more corner cases and as we do more tests then Etnaviv (e.g. using Mali driver).
CI is now failing, tests initially worked on my PC an now I got a similar failure. https://ci.gstreamer.net/job/GStreamer-master/11058/testReport/
Now a blocker since we need to fix or revert asap.
Created attachment 374117 [details] [review] glupload: prefer caps order provided by the upload methods Specifically this means, that RGBA is prefered over other formats. Uploading RGBA is often faster, so it should be used if possible. This also avoids problems when the upstream element can provide multiple formats: The during initial caps negotiation, it is unknown which upload methods will actually work. The upstream element might choose a format that cannot be converted. Specifically, without this change videotestsrc ! glupload ! glshader ! fakesink and similar pipelines will fail. videotestsrc picks I420 but DirectDmabuf, the only method that can upload this directly to RGBA, cannot upload the generated buffer.
This should fix most failures. test_upload_data is a bit more complicated. The test already not quite correct: It checks if the return value of gst_gl_upload_perform_with_buffer() is FALSE. But that function returns a GstGLUploadReturn. In this case it returns GST_GL_UPLOAD_RECONFIGURE. I can make the test work with this change, but I'm not sure if weakening the check like this is correct: diff --git a/gst-libs/gst/gl/gstglupload.c b/gst-libs/gst/gl/gstglupload.c index bd438d65d415..0ef3f1048a82 100644 --- a/gst-libs/gst/gl/gstglupload.c +++ b/gst-libs/gst/gl/gstglupload.c @@ -2013,7 +2013,7 @@ restart: if (last_impl != upload->priv->method_impl) { GstCaps *caps = gst_gl_upload_transform_caps (upload, upload->context, GST_PAD_SINK, upload->priv->in_caps, NULL); - if (!gst_caps_is_equal (upload->priv->out_caps, caps)) { + if (!gst_caps_is_subset (caps, upload->priv->out_caps)) { gst_buffer_replace (&outbuf, NULL); ret = GST_GL_UPLOAD_RECONFIGURE; }
Created attachment 374120 [details] [review] glupload-test: Don't use gboolean to store enums The unit test makes mixed usage of ret value. Sometimes its does stores an enum and at other moment a boolean. Also fix test using boolean instead of the correct enum value.
Created attachment 374121 [details] [review] glupload: Do prepend the preferred caps The direct dmabuf upload does color conversion, so when it transforms the caps, it replaces the format with all formats found through the format query. When this uploader can't be used, it makes the upstream source pick a unsupported format. To fix this, we only append the caps with a list of format. So the source will only pick one of these formats if the downstream preferred format is not supported. A negotiation failure after this would be normal. This fixes pipelines without a glcolorconvert element.
Created attachment 374122 [details] [review] glupload: Only renegotiate if the caps are incompatible There is new code that ensures that we renegotiate after an uploader transition if the negotiated caps have changed. The problem is that the raw uploader will not really try and fixate the input caps, but instead of return a subset with the only the supported target texture. This had two effect, raw uploads was always done renegotiated once and the raw upload unit test was now failing as it didn't expect a renegotiation. As it's a valid check, simply relax the gst_caps_is_equal() check and use a gst_caps_is_subset() instead.
Attachment 374120 [details] pushed as c2ec68f - glupload-test: Don't use gboolean to store enums Attachment 374121 [details] pushed as c8c7672 - glupload: Do prepend the preferred caps Attachment 374122 [details] pushed as e074eff - glupload: Only renegotiate if the caps are incompatible
Comment on attachment 374117 [details] [review] glupload: prefer caps order provided by the upload methods For some reason this one wasn't in the provided branch. It could have fixed the regression I believe, but I ended up solving this differently. Let me know if you think this should come back.
I noticed, that _direct_dma_buf_upload_transform_caps() is not quite correct. It should provide empty caps if the downstream format is anything other than RGBA. I'm not sure if that makes a difference in practice. In general, I think that glupload should always prefer RGBA on the sink pad. That way we can potentially avoid extra conversions. My patch would be a first step in that direction. However, I'm not sure it it's worth the effort. Are there any real use-cases where the input format is not fixed anyways?
First sorry for the confusion on my previous comment, I worked on this on the plane and didn't realize your comments. (In reply to Michael Olbrich from comment #70) > I noticed, that _direct_dma_buf_upload_transform_caps() is not quite correct. > It should provide empty caps if the downstream format is anything other than > RGBA. > > I'm not sure if that makes a difference in practice. > > In general, I think that glupload should always prefer RGBA on the sink pad. > That > way we can potentially avoid extra conversions. My patch would be a first > step in > that direction. > However, I'm not sure it it's worth the effort. Are there any real use-cases > where > the input format is not fixed anyways? Yes, note that I didn't ignore your patch, "glupload: Do prepend the preferred caps" implements what you described, but in the specific uploader instead of globally. So it's the same but narrowed to direct dmabuf upload. Prepending RGBA should be glcolorconvert jobs for the cases where glupload isn't doing color conversion. It's nothing new, videoconvert does that same to encourage upstream into going passthrough. And I also when for the subset approach like you also described, even though I agree that something isn't quite right. It fixes the regression that first data pointer get uploaded twice (not just that unit test failing). I'll need your help to confirm it also works for you use case, considering that my Intel GPU do very little formats in direct dmabuf upload.
(In reply to Nicolas Dufresne (ndufresne) from comment #71) > First sorry for the confusion on my previous comment, I worked on this on > the plane and didn't realize your comments. No problem. > Yes, note that I didn't ignore your patch, "glupload: Do prepend the > preferred caps" implements what you described, but in the specific uploader > instead of globally. So it's the same but narrowed to direct dmabuf upload. > Prepending RGBA should be glcolorconvert jobs for the cases where glupload > isn't doing color conversion. It's nothing new, videoconvert does that same > to encourage upstream into going passthrough. Hmm, downstream caps are not propagated properly right now. Try this for example: videotestsrc ! glupload ! \ "video/x-raw(memory:GLMemory),format={RGBA,I420}" ! fakesink Without my patch, I420 is chosen and I don't think that's correct. Your patch added an extra structure with RGBA and I420 but without my patch the order of the two is picked from the filter caps and there I420 comes first. In practice this probably wont matter. I don't think there is a realistic use-case like that. There might be issues when elements are plugged after the pipeline starts, but that's nothing new. > And I also when for the subset approach like you also described, even though > I agree that something isn't quite right. It fixes the regression that first > data pointer get uploaded twice (not just that unit test failing). I'll need > your help to confirm it also works for you use case, considering that my > Intel GPU do very little formats in direct dmabuf upload. I'm not seeing any problems. I've considered this some more. I've pretty sure what we do here is correct. The subset means that the caps produces by the upload method are compatible but possibly stricter than the caps negotiated with downstream. I don't think we can produce a buffer here that will not be accepted downstream.
To be honnest, I have always been told that upstream decide. In the example, I420 is before RGBA in videotestsrc. So yeah, I think this code should be fine.
Ok, this is mostly academic anyways. When can we choose the upstream format anyways. I thinks its good enough for now. Let's see what happens once other people start to test this...