Bug 780681 - EGL wayland surfaces are freed too early (?)
Status: RESOLVED OBSOLETE
Product: gtk+
Classification: Platform
Component: Backend: Wayland
Version: 3.22.x
Hardware: Other Linux
Importance: Normal major
Target Milestone: ---
Assigned To: gtk-bugs
QA Contact: gtk-bugs
Depends on:
Blocks:
Reported: 2017-03-29 11:35 UTC by memeka
Modified: 2018-05-02 18:20 UTC
See Also:
GNOME target: ---
GNOME version: 3.21/3.22



Description memeka 2017-03-29 11:35:42 UTC
GTK+ EGL applications such as totem or gnome-maps on Wayland segfault on exit because they try to use surfaces that have already been freed. The issue seems to be in GDK: under GNOME they crash the entire session (gnome-shell also crashes), whereas under weston only the application segfaults on exit. I assume this is because weston does not use GTK+ but gnome-shell does.

This is an example trace from totem:

----------------------------------------------------------------------------

Core was generated by `totem bbb_720p.mov'.
Program terminated with signal SIGSEGV, Segmentation fault.
  • #0 get_next_argument
    at ../src/connection.c line 430
  • #1 wl_argument_from_va_list
    at ../src/connection.c line 493
  • #2 wl_proxy_marshal
    at ../src/wayland-client.c line 692
  • #3 window_surface_delete
    from /usr/lib/arm-linux-gnueabihf/egl-current/libwayland-egl.so.1
  • #4 eglp_window_surface_specific_deinitialization
    from /usr/lib/arm-linux-gnueabihf/egl-current/libwayland-egl.so.1
  • #5 eglp_delete_surface
    from /usr/lib/arm-linux-gnueabihf/egl-current/libwayland-egl.so.1
  • #6 eglp_destroy_all_non_current_surfaces
    from /usr/lib/arm-linux-gnueabihf/egl-current/libwayland-egl.so.1
  • #7 eglp_try_display_finish_terminating
    from /usr/lib/arm-linux-gnueabihf/egl-current/libwayland-egl.so.1
  • #8 eglTerminate
    from /usr/lib/arm-linux-gnueabihf/egl-current/libwayland-egl.so.1
  • #9 eglp_unload_callback
    from /usr/lib/arm-linux-gnueabihf/egl-current/libwayland-egl.so.1
  • #10 osup_term_unload_hooks
    from /usr/lib/arm-linux-gnueabihf/egl-current/libwayland-egl.so.1
  • #11 osup_c_unload_hook
    from /usr/lib/arm-linux-gnueabihf/egl-current/libwayland-egl.so.1
  • #12 ??
    from /lib/ld-linux-armhf.so.3

(gdb) print (struct wl_proxy) *0x7f6bedb0
$3 = {object = {interface = 0x7fe1bfc8, implementation = 0x7fb51c30, id = 44}, display = 0x7f660ec0, queue = 0x7f660f2c, flags = 2, refcount = 1, user_data = 0x0, dispatcher = 0x0, version = 3}

(gdb) print (struct wl_interface) *0x7fe1bfc8 # => this is proxy->interface - you can see the name is garbage already
$4 = {name = 0xa93e931d "iXh\377\367Һ\022KP!0\265{D\021L\205\260\025F\034Y#h\003\223\377\367\f\354\016IjF", version = 49, method_count = -2147421248, methods = 0x7f6beda8, event_count = 0, events = 0x0}

(gdb) print (struct wl_message) *0x7f6beda8 # => this is proxy->interface->methods => you can see the signature field cannot be accessed (0x31 is invalid) leading to the segmentation fault
$5 = {name = 0x0, signature = 0x31 <error: Cannot access memory at address 0x31>, types = 0x7fe1bfc8}

----------------------------------------------------------------------------

This is with gtk+ 3.22.8 (Debian stretch) on the armhf architecture, with a Mali T628 GPU using the ARM Wayland drivers, version r12p0. All files in the egl-current directory (including libwayland-egl.so) are symlinks to the binary Mali driver, libmali.so.

I first raised the issue with ARM (see https://community.arm.com/graphics/f/discussions/8146/r12p0-wayland-driver-odroid-xu3-frees-objects-too-early-leading-to-segm-fault), and after investigating, an ARM engineer told me that the issue is probably in GDK:

<quote>
This segfault can happen if the application frees the Wayland surface too early, specifically if the associated EGL surface is still current. If this is the case, the application is doing something like the following during clean up:

eglDestroySurface(egl_surface);
wl_egl_window_destroy(wl_egl_window_win);
wl_surface_destroy(wl_surface);

If egl_surface was either the draw or read argument in the previous call to eglMakeCurrent, egl_surface and wl_egl_window_win are only marked for deletion and are still in use. Destroying wl_surface results in the SEGFAULT when the driver subsequently needs to do something with the wl_surface (in this case, part of deletion). EGL spec 1.5 sections 3.5.5 and 3.2 cover the lifetime of EGL objects.

There are 2 possible application fixes you could consider:
 * Call eglMakeCurrent(display, EGL_NO_SURFACE, EGL_NO_SURFACE, EGL_NO_CONTEXT) before destroying the surface.
 * Call eglTerminate() instead of destroying the surfaces individually.

I'm reasonably confident that this is an issue in GDK (or how totem is calling GTK+) rather than the driver.
</quote>
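
To make the ordering problem described in the quote concrete, here is a toy model in C. It is only a sketch under stated assumptions: every toy_* name is hypothetical and stands in for driver behaviour, and none of it is the real EGL or Wayland API. The model assumes what the quote states: destroying a surface that is still current only marks it for deletion, and the driver touches the native Wayland surface once more when the deletion actually completes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins (hypothetical names, NOT the real wl_surface / EGLSurface). */
struct toy_native { bool alive; };

struct toy_surface {
    struct toy_native *native;   /* native surface the EGL surface wraps */
    bool is_current;             /* bound by the last "make current" call */
    bool destroy_pending;        /* marked for deletion while still current */
    bool destroyed;
    bool touched_dead_native;    /* in the real driver this is the segfault */
};

/* Final deletion: the modelled driver uses the native surface one last time. */
static void toy_finish_destroy(struct toy_surface *s)
{
    if (!s->native->alive)
        s->touched_dead_native = true;
    s->destroyed = true;
    s->destroy_pending = false;
}

/* Models eglDestroySurface: deferred while the surface is current. */
void toy_destroy_surface(struct toy_surface *s)
{
    assert(s != NULL);
    if (s->is_current)
        s->destroy_pending = true;
    else
        toy_finish_destroy(s);
}

/* Models releasing the current surface (or the driver doing so at exit). */
void toy_release_current(struct toy_surface *s)
{
    assert(s != NULL);
    s->is_current = false;
    if (s->destroy_pending)
        toy_finish_destroy(s);
}

/* Models wl_surface_destroy. */
void toy_destroy_native(struct toy_native *n)
{
    assert(n != NULL);
    n->alive = false;
}

/* Order from the quote: destroy while current, then free the native surface. */
bool toy_teardown_buggy(void)
{
    struct toy_native n = { true };
    struct toy_surface s = { &n, true, false, false, false };
    toy_destroy_surface(&s);     /* only marked for deletion */
    toy_destroy_native(&n);      /* native surface is gone */
    toy_release_current(&s);     /* deferred deletion runs too late */
    return s.touched_dead_native;
}

/* First suggested fix: release the current surface before destroying it. */
bool toy_teardown_fixed(void)
{
    struct toy_native n = { true };
    struct toy_surface s = { &n, true, false, false, false };
    toy_release_current(&s);
    toy_destroy_surface(&s);     /* completes while the native surface is alive */
    toy_destroy_native(&n);
    return s.touched_dead_native;
}
```

In this model toy_teardown_buggy() reports a use of the dead native surface and toy_teardown_fixed() does not. In a real application, the release step would correspond to the eglMakeCurrent(display, EGL_NO_SURFACE, EGL_NO_SURFACE, EGL_NO_CONTEXT) call suggested in the quote, made before eglDestroySurface, wl_egl_window_destroy and wl_surface_destroy.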
Comment 1 Olivier Fourdan 2017-03-30 06:53:54 UTC
Not necessarily gdk; it could be clutter as well, since both totem and gnome-maps use clutter (which uses a subsurface).

That leads to another question: which version of clutter do you use? Can you try with clutter from git master, which has a different implementation for subsurfaces (one that uses gdk rather than using wayland directly)?

(Also, I find it odd that it crashes the rest of the session. Even if a client crashes, why would that cause gnome-shell to crash as well? Those are completely different processes.)
Comment 2 GNOME Infrastructure Team 2018-05-02 18:20:08 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/gtk/issues/795.