GNOME Bugzilla – Bug 784369
gst-vaapi+Wayland = rendering corruption in totem (but not in gst-play-1.0)
Last modified: 2018-02-15 07:46:41 UTC
After successfully getting gst-vaapi to work in a Wayland session (either requires Weston, or you to unset $DISPLAY in Gnome Shell), totem produces rendering corruption. Strangely enough though, gst-play-1.0 has no such corruption.
Note: The technique of unsetting $DISPLAY to avoid Xwayland and use Wayland-proper only seems to work for totem, not gst-play-1.0. gst-play-1.0 only renders to Wayland successfully (without corruption) in Weston.
What kind of corruption are you seeing?
I get black screen which I think is the same bug as https://bugzilla.gnome.org/show_bug.cgi?id=781269 (Finally I can reproduce!)
Created attachment 354811 [details]
Here's a screenshot.
Yes, the problem seems to be in the clutter plugin. Also confirmed in:
env -uDISPLAY gst-launch-1.0 filesrc location=bbb_sunflower_1080p_60fps_normal.mp4 ! qtdemux ! vaapidecodebin ! clutterautovideosink
(which is both corrupt and uses high CPU seemingly software blitting via pixman)
so a workaround is to avoid clutter:
env -uDISPLAY gst-launch-1.0 filesrc location=bbb_sunflower_1080p_60fps_normal.mp4 ! qtdemux ! vaapidecodebin ! glimagesink
(no corruption and very low CPU)
I believe the video is in format NV12, which both the gl and clutter plugins have their own shaders for. The former seems to work and the latter might be the cause of this bug...
static ClutterGstRenderer nv12_glsl_renderer =
CLUTTER_GST_RENDERER_NEEDS_GLSL | CLUTTER_GST_RENDERER_NEEDS_TEXTURE_RG,
2, /* n_layers */
Ignore my previous message I spent time debugging this today and found out what the corruption really is...
gstreamer-vaapi is defaulting to 0 surface flags, which means that the linear flag is not set, and then VAAPI is free to use VA_SURFACE_EXTBUF_DESC_ENABLE_TILING. So the corruption we see in totem is just tiles being uploaded linearly by mistake. You can fix it by hacking gstreamer-vaapi where it has the (unused) option to disable VA_SURFACE_EXTBUF_DESC_ENABLE_TILING. But that's not all...
clutter-gst is using its rgb32 renderer, not nv12. The reason why that ends up looking corrupt (and possibly also why totem is generally slow and janky) is because it's using the software upload path. The GL upload path for rgb32 in clutter-gst is failing for some reason (I haven't looked that deep yet).
So you can trivially set the linear flag in gstreamer-vaapi to disable VA_SURFACE_EXTBUF_DESC_ENABLE_TILING and that does avoid the corruption of trying to upload tiled video linearly. But that might not be the best solution. If we can figure out why clutter-gst is failing to use GL uploads and falling back to software, then that might be a way to skip the linear/tiling confusion, as well as possibly fix the stuttery playback and high CPU that totem exhibits.
Reassigned to gstreamer-vaapi.
After digging deeper I found both the workaround and the solution lay in gstreamer-vaapi. The "proper" solution seemed to me to be to implement GstVideoGLTextureUploadMeta for Wayland or EGL, in gstreamer-vaapi. However now I look at gstreamer-vaapi's open bugs it's apparent there is a different plan already and my "proper" solution would have been rejected. See bug 779966 and bug 748634 instead.
I may yet propose some kind of automated workaround for this issue (intelligently choosing linear buffers). That would not solve the poor performance of the software upload path in clutter-gst, but it might be a nice way to separate the visible corruption (this bug) from the performance issue.
Maybe see bug 740753 for a potential solution. It seems not that gstreamer-vaapi on wayland is buggy as much as it was never finished (never worked?). So this is really more of a feature task than just a bug fix.
This problem is interesting because the same issue is reproduced in webkitgtk+
Somehow the VA surface is not copied correctly to the texture in egl (Wayland)
Actually it's not the same issue. You can tell by inspecting the screenshots in both bugs carefully.
This bug is about Intel's tiling layout, which is Intel-specific and you can tell because the distortions are square-like, go vertically and even adjacent pixels don't seem to match up. Not to mention disabling tiling in the code fixes this bug. :)
The webkit bug however has no such square-like or adjacent pixel issues. That one looks like a simple stride/pitch mistake with a linear (non-tiled) image.
Future patch work for this bug may actually be done in bug 773453 instead. Since that's the main issue my patch for this bug resolves :)
The patch we use in Ubuntu 17.10 is still awaiting review in bug 773453.
(In reply to Daniel van Vugt from comment #11)
> The patch we use in Ubuntu 17.10 is still awaiting review in bug 773453.
Thanks for the patch. It works and fixes an issue. The problem is that it focuses in a single use case (clutterautovideosink in wayland), but it breaks dmabuf handling, for example with glimagesink (egl either wayland and Xorg).
Nonetheless it is a bug in the code and patches in bug 788503 try to fix it in a proper way.
The other part is re-enabling the GLTextureUpload meta in for EGL. And you did spotted the problem of not using the renderer context. Still the patch looks to me also odd where a wrapped EGLDisplay is created, I'm not sure if there's a memory leak there, but it looks like.
Yesterday I was poking at that code and I feel there's a better way to set the current context when the texture is imported from the sink. I'm still working on it.
Anywya, Daniel, thank you very much for spotting the problems with GLUploadTexture meta in EGL.
No problem. Take the parts you need and modify as appropriate. I look forward to not having to carry patches in Ubuntu any more.
@Daniel can you wrap up the status of this bug? what patches are pending?
I think that question is answered by bug 773453.
It looks like you fixed it already. Although I have not tried your fixes.
Ok. Let's close this bug then. Please open a new one if the problem reappear. Thanks.