GNOME Bugzilla – Bug 790932
[wayland] wl_seat capabilities is racy and can cause a crash in the clients
Last modified: 2021-07-05 13:44:38 UTC
Description The way wl_seat capabilities work, by notifying clients of capabilities changes, and clients consequently requesting the relevant interface objects (pointer, keyboard, touch) is inherently racy. On quick VT chnages for example, capabilities on the seat will be added and removed, an by the time the client receives the capability notification and requests the relevant keyboard, pointer or touch, another VT switch might have occured and the wl_pointer, wl_keyboard or wl_touch already destroyed, leading to a protocol error which kilsl the client (and when that client is Xwayland, it also kill gnome-shell/mutter). To avoid this, we need to return a wl_keyboard, wl_pointer or wl_touch when requested even when the capabilities are not present. This is what westons does. Steps to reproduce: 1. log on GNOME on Wayland 2. during the GNOME startup, switch VTs back and forth. References: This issues was initially reported downstream in https://bugzilla.redhat.com/show_bug.cgi?id=1516859 Pekka pointed out out to avoid this issue in a reply to this thread https://lists.x.org/archives/xorg-devel/2017-November/055340.html Relevant code in weston as pointed out by Pekka: https://cgit.freedesktop.org/wayland/weston/tree/libweston/input.c?id=3.0.0#n2342
Created attachment 364549 [details] [review] [PATCH] wayland: avoid a race in wl_seat capabilities The way wl_seat capabilities work, by notifying clients of capabilities changes, and clients consequently requesting the relevant interface objects (pointer, keyboard, touch) is inherently racy. On quick VT changes for example, capabilities on the seat will be added and removed, and by the time the client receives the capability change notification and requests the relevant keyboard, pointer or touch, another VT switch might have occurred and the wl_pointer, wl_keyboard or wl_touch already destroyed, leading to a protocol error which kills the client. To avoid this, we need to return a wl_keyboard, wl_pointer or wl_touch when requested even when the capabilities are not present.
Review of attachment 364549 [details] [review]: ::: src/wayland/meta-wayland-keyboard.c @@ +997,3 @@ + */ + if (keyboard->xkb_info.keymap_fd < 0) + return; Adding a _is_enabled() that checks this would make the early-out self explanatory IMHO. The comment is slightly missleading. Sure it's about the compositor sending a new capabilies mask, but it's really about quickly *removing* capabilities. I suggest something like /* * Create an inert object if the corresponding capability was removed since * the client created the resource. */ ::: src/wayland/meta-wayland-pointer.c @@ +484,3 @@ + pointer->pointer_clients = + g_hash_table_new_full (NULL, NULL, NULL, + (GDestroyNotify) meta_wayland_pointer_client_free); I don't think this is correct. The pointer client is used for managing focus of a clients pointer (including all the extensions) but resources created in the race this should not result in a pointer client being created (as that'd mean code incorrectly would think there is actually real resource(s) that should be focused). Instead, the table should be destroyed and created as before, but with the following other changes: 1. Only ensure the pointer client when creating an active wl_pointer resource 2. Add a meta_wayland_pointer_is_inert() 3. Handle the race in the extensions, and create inert resources the above function returns TRUE ::: src/wayland/meta-wayland-touch.c @@ +566,3 @@ + * well...), but rather create the resource which will be destroyed + * shortly after anyway as the client catches up on capabilities... + */ I think a comment like I suggested is clearer, and it should be turned into an early-out before adding the resource to the resource list.
(In reply to Jonas Ådahl from comment #2) > Review of attachment 364549 [details] [review] [review]: > [...] > > ::: src/wayland/meta-wayland-pointer.c > @@ +484,3 @@ > + pointer->pointer_clients = > + g_hash_table_new_full (NULL, NULL, NULL, > + (GDestroyNotify) > meta_wayland_pointer_client_free); > > I don't think this is correct. The pointer client is used for managing focus > of a clients pointer (including all the extensions) but resources created in > the race this should not result in a pointer client being created (as that'd > mean code incorrectly would think there is actually real resource(s) that > should be focused). > > Instead, the table should be destroyed and created as before, but with the > following other changes: > > 1. Only ensure the pointer client when creating an active wl_pointer resource > 2. Add a meta_wayland_pointer_is_inert() > 3. Handle the race in the extensions, and create inert resources the above > function returns TRUE Sorry I am a bit lost here, “pointer->pointer_clients” is a wayland only API, I don't see it being used anywhere outside of meta-wayland-pointer.[ch], therfore I don;t undersatnd how could gnome-shell extension make use of that? We can avoid creating an empty hash (i.e. leaving the code as it was) and check for the pointer->pointer_clients being non-NULL (as did commit b64b1591 before), but I do not understand what you mean in point #2 and #3 above...
(In reply to Olivier Fourdan from comment #3) > (In reply to Jonas Ådahl from comment #2) > > Review of attachment 364549 [details] [review] [review] [review]: > > [...] > > > > ::: src/wayland/meta-wayland-pointer.c > > @@ +484,3 @@ > > + pointer->pointer_clients = > > + g_hash_table_new_full (NULL, NULL, NULL, > > + (GDestroyNotify) > > meta_wayland_pointer_client_free); > > > > I don't think this is correct. The pointer client is used for managing focus > > of a clients pointer (including all the extensions) but resources created in > > the race this should not result in a pointer client being created (as that'd > > mean code incorrectly would think there is actually real resource(s) that > > should be focused). > > > > Instead, the table should be destroyed and created as before, but with the > > following other changes: > > > > 1. Only ensure the pointer client when creating an active wl_pointer resource > > 2. Add a meta_wayland_pointer_is_inert() > > 3. Handle the race in the extensions, and create inert resources the above > > function returns TRUE > > Sorry I am a bit lost here, “pointer->pointer_clients” is a wayland only > API, I don't see it being used anywhere outside of > meta-wayland-pointer.[ch], therfore I don;t undersatnd how could gnome-shell > extension make use of that? > > We can avoid creating an empty hash (i.e. leaving the code as it was) and > check for the pointer->pointer_clients being non-NULL (as did commit > b64b1591 before), but I do not understand what you mean in point #2 and #3 > above... Sorry, was not clear here. I meant wl_pointer Wayland extensions (such as relative pointer and pointer gestures).
Created attachment 365525 [details] [review] [PATCH] wayland: avoid a race in wl_seat capabilities Updated patch
Note, https://bugzilla.redhat.com/show_bug.cgi?id=1523952 is another similar downstream crash report. Not totally sure if it's the same, so didn't dupe it downstream yet.
Ping? Any movement on getting this merged?
It probably fell through the cracks with gitlab migration, I'll post a new MR in gitlab.
https://gitlab.gnome.org/GNOME/mutter/merge_requests/77
FWIW, I doubt bug 1523952 is the same, it's not the same interface (wl_display not wl_seat) - That other bug should be already addressed upstream though.
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org. As part of that, we are mass-closing older open tickets in bugzilla.gnome.org which have not seen updates for a longer time (resources are unfortunately quite limited so not every ticket can get handled). If you can still reproduce the situation described in this ticket in a recent and supported software version, then please follow https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines and create a new ticket at https://gitlab.gnome.org/GNOME/mutter/-/issues/ Thank you for your understanding and your help.