GNOME Bugzilla – Bug 759299
wayland: possible infinite loop in gdkwindow-wayland
Last modified: 2015-12-16 13:14:01 UTC
Created attachment 317111 [details] Reproducer program Summary: The Wayland backend will enter an infinite loop in case of a cycle in transient relationship (a transient for B and B transient for A). Steps to reproduce: 1. Save and build the attached program $ gcc -o cycling-transient cycling-transient.c `pkg-config --libs --cflags gtk+-3.0` 2. Execute the program under Wayland Actual result: The program wil lenter an infinite loop and take 100% CPU Expected result: The windows appear normally as in X11 Additional data: This is because of the following code construct in gdkwindow-wayland.c while (window) { GdkWindowImplWayland *impl; impl = GDK_WINDOW_IMPL_WAYLAND (window->impl); window = impl->transient_for; } As found in find_grab_input_seat(), get_popup_parent() or gdk_wayland_window_get_fake_root_coords()
WE should be robust against this sort of application error. We could maintain a 'transient depth' for each window, for example.
(In reply to Matthias Clasen from comment #1) > WE should be robust against this sort of application error. We could > maintain a 'transient depth' for each window, for example. Or we check for loops on xdg_surface.set_parent and just disconnect a client that tries to set a loop.
Created attachment 317380 [details] [review] wayland: Check transient depth Gdk Wayland backend traverses up the window tree looking for transiency parents, but does not check for cycles when doing so. As a result, if two or more windows are transient to each other, the Wayland gdk backend will enter an infinite loop. While this is clearly a bug in the application, gtk+/gdk should be more robust and handle such errors more gracefully. To avoid looping infinitely, add a transient depth and check that the parent window in on a lower transient depth and exit the loop otherwise.
Review of attachment 317380 [details] [review]: ::: gdk/wayland/gdkwindow-wayland.c @@ +1976,3 @@ + } + else + impl->transient_depth = 0; Sadly, I don't think this works - don't you have to update the transient_depth field for all transients of this window here ? I think we probably do have to walk up the tree, after all
Right. Actually I think it's even worse than that, because by walking up the transient tree, we may run into the cycle loop. And checking for transient depth here won't help, because the window could be made transient for a parent of lower depth or hight depth. E.g. [ 0 ] -> [ 1 ] -> [ 2 ] / / /----/ /----/ / / [ 0 ] -> [ 1 ] In this case, this is fine 0 < 1 But: [ 0 ] -> [ 1 ] -> [ 2 ] \ \ \------------\ \------------\ \ \ [ 0 ] -> [ 1 ]...[ 2 ] -> [ 3 ] -> [ 4 ] In this case, the next depth in the tree is actually lesser (or equal) than the current one, and yet this does not mean there is a loop, so testing the transient depth is meaningless here precisely because we are adjusting its value. Beside, if we can detect a loop in the set_transient() function and refuse to apply the transient relationship in such a case, we can guarantee that no loop is created and we don't need the transient depth at all. Means we do all the job in set_transient() (possibly using a more costly algorithm) but we don't have to do it again every time we walk up the transient list (and this is done very often, for example gdk_wayland_window_get_fake_root_coords() is invoked at each pointer move from pointer_handle_motion()).
(In reply to Olivier Fourdan from comment #5) > Right. > > Actually I think it's even worse than that, because by walking up the > transient tree, we may run into the cycle loop. Well, walking up the tree, you check for either hitting NULL (good) or yourself (bad).
(In reply to Matthias Clasen from comment #6) > Well, walking up the tree, you check for either hitting NULL (good) or > yourself (bad). Yes, exactly, we don't need a transient depth, we just need to be careful when setting the transient relationship at first. Unless I am missing something obvious, a trivial patch is to follow.
Created attachment 317426 [details] [review] wayland: Check transient loop Note, this patch tries to avoid the problem by setting the transient relationship before checking for a possible loop. If such a loop is found, the previous transient value is restored. Yet, because unmap_subsurface() is called first and also uses transient_for, we need to proceed till the end of the function gdk_wayland_window_set_transient_for() as if we modified the transient, even if the value is not changed so that the subsurface has a chance to be recreated if needed.
Created attachment 317427 [details] A slightly more elaborate reproducer This reproducer will hang mutter/gnome-shell with an unpatched gtk+, i.e. the infinite loop will occur in mutter this time. So do not run this reproducer if your your main session is GNOME on Wayland. Yet it means a broken client can easily cause a denial of service in mutter, means we also need to fix mutter...
Review of attachment 317426 [details] [review]: ::: gdk/wayland/gdkwindow-wayland.c @@ +1961,1 @@ impl->transient_for = parent; I don't see why you even need to set transient_for first. Just start walking up from the parent - if you hit the window, setting transient_for would create a cycle. No ?
I reckon you need to set the transient_for first to walk up the tree and detect all possibilities. Consider this case, two valid and distinct transient trees (where "->" means "is transient for"): [ A ] -> [ B ] -> [ C ] And [ D ] -> [ E ] -> [ A ] Now the program tries to set [ C ] -> [ D ] (ie C becomes transient for D) If you start walking up the tree starting from C, you won't hit D in the first tree, neither you will hit C walking up the second tree starting from D, and yet, if C becomes transient for D, you'll have a loop: [ A ] -> [ B ] -> [ C ] -> [ D ] -> [ E ] -> [ A ]
(In reply to Olivier Fourdan from comment #11) > I reckon you need to set the transient_for first to walk up the tree and > detect all possibilities. > > Consider this case, two valid and distinct transient trees (where "->" means > "is transient for"): > > [ A ] -> [ B ] -> [ C ] > > And > > [ D ] -> [ E ] -> [ A ] > How is this two trees, and not just d -> e -> a -> b -> c ?
Yeah, you're right of course, it's obvious... Let's blame that on a temporary (or so I hope ^_~) mental fog on my side :-)
Created attachment 317476 [details] [review] Updated patch
you could even call it transient mental fog
Review of attachment 317476 [details] [review]: looks good now
Review of attachment 317476 [details] [review]: attachment 317476 [details] [review] pushed as commit b456db8 wayland: Check transient loop
Closing.