After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 784369 - gst-vaapi+Wayland = rendering corruption in totem (but not in gst-play-1.0)
gst-vaapi+Wayland = rendering corruption in totem (but not in gst-play-1.0)
Status: RESOLVED OBSOLETE
Product: GStreamer
Classification: Platform
Component: gstreamer-vaapi
1.12.2
Other Linux
: Normal normal
: git master
Assigned To: clutter-gst-maint
clutter-gst-maint
Depends on:
Blocks: 748634
 
 
Reported: 2017-06-30 07:22 UTC by Daniel van Vugt
Modified: 2018-02-15 07:46 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Screenshot (694.90 KB, image/png)
2017-07-03 02:51 UTC, Daniel van Vugt
Details

Description Daniel van Vugt 2017-06-30 07:22:52 UTC
After successfully getting gst-vaapi to work in a Wayland session (either requires Weston, or you to unset $DISPLAY in Gnome Shell), totem produces rendering corruption. Strangely enough though, gst-play-1.0 has no such corruption.

Note: The technique of unsetting $DISPLAY to avoid Xwayland and use Wayland-proper only seems to work for totem, not gst-play-1.0. gst-play-1.0 only renders to Wayland successfully (without corruption) in Weston.

Tested with:
http://distribution.bbb3d.renderfarming.net/video/mp4/bbb_sunflower_1080p_60fps_normal.mp4
Comment 1 Lionel Landwerlin 2017-06-30 10:16:53 UTC
What kind of corruption are you seeing?
I get black screen which I think is the same bug as https://bugzilla.gnome.org/show_bug.cgi?id=781269 (Finally I can reproduce!)
Comment 2 Daniel van Vugt 2017-07-03 02:51:45 UTC
Created attachment 354811 [details]
Screenshot

Here's a screenshot.
Comment 3 Daniel van Vugt 2017-07-03 06:27:09 UTC
Yes, the problem seems to be in the clutter plugin. Also confirmed in:

env -uDISPLAY gst-launch-1.0 filesrc location=bbb_sunflower_1080p_60fps_normal.mp4 ! qtdemux ! vaapidecodebin ! clutterautovideosink

(which is both corrupt and uses high CPU seemingly software blitting via pixman)

so a workaround is to avoid clutter:

env -uDISPLAY gst-launch-1.0 filesrc location=bbb_sunflower_1080p_60fps_normal.mp4 ! qtdemux ! vaapidecodebin ! glimagesink

(no corruption and very low CPU)
Comment 4 Daniel van Vugt 2017-07-06 10:18:13 UTC
I believe the video is in format NV12, which both the gl and clutter plugins have their own shaders for. The former seems to work and the latter might be the cause of this bug...

static ClutterGstRenderer nv12_glsl_renderer =
  {
    "NV12 glsl",
    CLUTTER_GST_NV12,
    CLUTTER_GST_RENDERER_NEEDS_GLSL | CLUTTER_GST_RENDERER_NEEDS_TEXTURE_RG,
    GST_STATIC_CAPS (MAKE_CAPS(GST_CAPS_FEATURE_MEMORY_SYSTEM_MEMORY,
                               "NV12")),
    2, /* n_layers */
    clutter_gst_nv12_glsl_setup_pipeline,
    clutter_gst_nv12_upload,
    clutter_gst_dummy_upload_gl,
    clutter_gst_dummy_shutdown,
  };
Comment 5 Daniel van Vugt 2017-07-14 11:01:38 UTC
Ignore my previous message I spent time debugging this today and found out what the corruption really is...

gstreamer-vaapi is defaulting to 0 surface flags, which means that the linear flag is not set, and then VAAPI is free to use VA_SURFACE_EXTBUF_DESC_ENABLE_TILING. So the corruption we see in totem is just tiles being uploaded linearly by mistake. You can fix it by hacking gstreamer-vaapi where it has the (unused) option to disable VA_SURFACE_EXTBUF_DESC_ENABLE_TILING. But that's not all...

clutter-gst is using its rgb32 renderer, not nv12. The reason why that ends up looking corrupt (and possibly also why totem is generally slow and janky) is because it's using the software upload path. The GL upload path for rgb32 in clutter-gst is failing for some reason (I haven't looked that deep yet).

So you can trivially set the linear flag in gstreamer-vaapi to disable VA_SURFACE_EXTBUF_DESC_ENABLE_TILING and that does avoid the corruption of trying to upload tiled video linearly. But that might not be the best solution. If we can figure out why clutter-gst is failing to use GL uploads and falling back to software, then that might be a way to skip the linear/tiling confusion, as well as possibly fix the stuttery playback and high CPU that totem exhibits.
Comment 6 Daniel van Vugt 2017-07-18 06:22:48 UTC
Reassigned to gstreamer-vaapi.

After digging deeper I found both the workaround and the solution lay in gstreamer-vaapi. The "proper" solution seemed to me to be to implement GstVideoGLTextureUploadMeta for Wayland or EGL, in gstreamer-vaapi. However now I look at gstreamer-vaapi's open bugs it's apparent there is a different plan already and my "proper" solution would have been rejected. See bug 779966 and bug 748634 instead.

I may yet propose some kind of automated workaround for this issue (intelligently choosing linear buffers). That would not solve the poor performance of the software upload path in clutter-gst, but it might be a nice way to separate the visible corruption (this bug) from the performance issue.
Comment 7 Daniel van Vugt 2017-07-24 04:06:19 UTC
Maybe see bug 740753 for a potential solution. It seems not that gstreamer-vaapi on wayland is buggy as much as it was never finished (never worked?). So this is really more of a feature task than just a bug fix.
Comment 8 Víctor Manuel Jáquez Leal 2017-08-02 14:01:10 UTC
This problem is interesting because the same issue is reproduced in webkitgtk+
 https://bugs.webkit.org/show_bug.cgi?id=173796

Somehow the VA surface is not copied correctly to the texture in egl (Wayland)
Comment 9 Daniel van Vugt 2017-08-04 03:40:06 UTC
Actually it's not the same issue. You can tell by inspecting the screenshots in both bugs carefully.

This bug is about Intel's tiling layout, which is Intel-specific and you can tell because the distortions are square-like, go vertically and even adjacent pixels don't seem to match up. Not to mention disabling tiling in the code fixes this bug. :)

The webkit bug however has no such square-like or adjacent pixel issues. That one looks like a simple stride/pitch mistake with a linear (non-tiled) image.
Comment 10 Daniel van Vugt 2017-08-04 03:43:58 UTC
Future patch work for this bug may actually be done in bug 773453 instead. Since that's the main issue my patch for this bug resolves :)
Comment 11 Daniel van Vugt 2017-10-06 05:15:19 UTC
The patch we use in Ubuntu 17.10 is still awaiting review in bug 773453.
Comment 12 Víctor Manuel Jáquez Leal 2017-10-06 08:08:09 UTC
(In reply to Daniel van Vugt from comment #11)
> The patch we use in Ubuntu 17.10 is still awaiting review in bug 773453.

Thanks for the patch. It works and fixes an issue. The problem is that it focuses in a single use case (clutterautovideosink in wayland), but it breaks dmabuf handling, for example with glimagesink (egl either wayland and Xorg).

Nonetheless it is a bug in the code and patches in bug 788503 try to fix it in a proper way.

The other part is re-enabling the GLTextureUpload meta in for EGL. And you did spotted the problem of not using the renderer context. Still the patch looks to me also odd where a wrapped EGLDisplay is created, I'm not sure if there's a memory leak there, but it looks like.

Yesterday I was poking at that code and I feel there's a better way to set the current context when the texture is imported from the sink. I'm still working on it.

Anywya, Daniel, thank you very much for spotting the problems with GLUploadTexture meta in EGL.
Comment 13 Daniel van Vugt 2017-10-06 08:10:21 UTC
No problem. Take the parts you need and modify as appropriate. I look forward to not having to carry patches in Ubuntu any more.
Comment 14 Víctor Manuel Jáquez Leal 2018-02-15 06:58:19 UTC
@Daniel  can you wrap up the status of this bug? what patches are pending?
Comment 15 Daniel van Vugt 2018-02-15 07:08:31 UTC
I think that question is answered by bug 773453.

It looks like you fixed it already. Although I have not tried your fixes.
Comment 16 Víctor Manuel Jáquez Leal 2018-02-15 07:46:41 UTC
Ok. Let's close this bug then. Please open a new one if the problem reappear. Thanks.