After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 762468 - wayland: Switching between fullscreen and unfullscreen too fast may result in a fullscreen->unfullscreen feedback loop
wayland: Switching between fullscreen and unfullscreen too fast may result in...
Status: RESOLVED FIXED
Product: gtk+
Classification: Platform
Component: Backend: Wayland
3.19.x
Other Linux
: Normal normal
: ---
Assigned To: gtk-bugs
gtk-bugs
Depends on:
Blocks:
 
 
Reported: 2016-02-22 15:37 UTC by Olivier Fourdan
Modified: 2016-03-11 12:37 UTC
See Also:
GNOME target: 3.20
GNOME version: ---


Attachments
video capture (630.53 KB, video/ogg)
2016-02-22 17:34 UTC, Olivier Fourdan
  Details
wayland_debug capture (104.26 KB, application/x-bzip)
2016-02-22 17:35 UTC, Olivier Fourdan
  Details
GDK_DEBUG=events log (32.71 KB, text/plain)
2016-02-23 13:42 UTC, Olivier Fourdan
  Details
A set of backtraces from gtk+ (16.44 KB, text/plain)
2016-02-24 10:52 UTC, Jonas Ådahl
  Details
gdk: Compress window state events (3.09 KB, patch)
2016-02-26 06:24 UTC, Jonas Ådahl
committed Details | Review
gdk: Don't leak discarded window state event (953 bytes, patch)
2016-03-11 05:18 UTC, Jonas Ådahl
committed Details | Review

Description Olivier Fourdan 2016-02-22 15:37:49 UTC
Description

Using F11 to toggle fullscreen in Wayland cause mutter to hang, wpinning on the state change continuously.

Steps to reproduce

1. Open gnome-terminal
2. Press F11

Actual result

gnome-terminal blinks, gnome-shell/mutter hangs, session is dead spinning continuously between fullscreen/non-fullscreen state, apparently

Expected result

gnome-terminal switches to fullscreen

Additional data

The same can be achieved with Xwayland apps as well, e.g. google-chrome
Comment 1 Olivier Fourdan 2016-02-22 17:34:44 UTC
Created attachment 321870 [details]
video capture
Comment 2 Olivier Fourdan 2016-02-22 17:35:44 UTC
Created attachment 321871 [details]
wayland_debug capture
Comment 3 Jonas Ådahl 2016-02-23 03:44:34 UTC
I can't reproduce this on current master of gtk+ and mutter.

Given the debug output it looks like GTK+ goes back and forth for some reason. Could you check whether the state array sent by mutter in xdg_surface.configure contains the fullscreen enum value?

I wonder what happens if mutter denies a GTK+ client going fullscreen. FWIW GTK+ should not draw itself in fullscreen mode until it received a .configure event with the fullscreen enum. I suspect this is what happens although I have no idea why it would reject it.

Do you have other Xwayland apps it reproduces with? I tried with Firefox, but don't have google-chrome installed.
Comment 4 Olivier Fourdan 2016-02-23 07:26:23 UTC
A few more info, this happens on both F23 and rawhide (so I reckon not too far from git master), it's not Xwayland specific (gnome-terminal is Wayland native) and it's possibly a race condition as it's not 100% reproducible (seems to be easier to reproduce using multiple monitors).
Comment 5 Olivier Fourdan 2016-02-23 13:42:16 UTC
Created attachment 321957 [details]
GDK_DEBUG=events log

This is weird, from the gdk wayland backend point of view, it behaves as if mutter was switching continuously between fullscreen and non-fulscreen.

At first, gnome-terminal would switch back immediately to fullscreen all by itself (ie, it's fullscreen, press F11, it goes back to normal then switches back immediately to fullscreen),

Then, after some retries, it will enter that loop switching continuously between fullscreen and non-fullscreen all by itself.

I tried to come up with a simple reproducer (something a lot simpler than gnome-terminal) but failed to reproduce.
Comment 6 Jonas Ådahl 2016-02-23 13:51:14 UTC
Does mutter ever send xdg_surface.configure with XDG_SURFACE_STATE_FULLSCREEN in the state array?
Comment 7 Olivier Fourdan 2016-02-23 13:57:54 UTC
Yes, that message "Gdk-Message: configure, window... fullscreen" is logged in gdk/wayland/gdkwindow-wayland.c xdg_surface_configure() called from the xdg_surface_listener so I guess mutter sends us that, or is my logic flawed?
Comment 9 Jonas Ådahl 2016-02-23 14:09:51 UTC
Ah, right. No, your logic is sound :)

But then again I would assume mutter is just responding to the xdg_surface.set_fullscreen and xdg_surface.unset_fullscreen GTK seems to be sending. Can you see where they come from?

That would not explain why Xwayland clients do the same thing though.
Comment 10 Olivier Fourdan 2016-02-23 16:23:36 UTC
Could be a gtk+ bug or a gnome-terminal bug after all...

having both gnome-shell and gnome-terminal-server in gdb simultaneously, breaking on xdg_surface_set_fullscreen()/xdg_surface_unset_fullscreen() in mutter and gdk_window_fullscreen()/gdk_window_unfullscreen() in gnome-terminal-server shows that both seem to be playing some sort of ping pong game.

Interestingly, in gnome-terminal-server we hit this code path:

  • #0 gdk_window_fullscreen
    at gdkwindow.c line 10452
  • #4 <emit signal ??? on instance 0x557485ca9580 [GtkToggleAction]>
    at gsignal.c line 3441
  • #5 _gtk_action_emit_activate
    at deprecated/gtkaction.c line 909
  • #6 gtk_toggle_action_set_active
    at deprecated/gtktoggleaction.c line 299
  • #7 terminal_window_state_event
    at terminal-window.c line 2329
  • #12 <emit signal ??? on instance 0x557485cfa370 [TerminalWindow]>
    at gsignal.c line 3441
  • #13 gtk_widget_event_internal
    at gtkwidget.c line 7706
  • #14 gtk_main_do_event
    at gtkmain.c line 1800
  • #15 gdk_event_source_dispatch
    at gdkeventsource.c line 90
  • #16 g_main_context_dispatch
    at gmain.c line 3154
  • #17 g_main_context_dispatch
    at gmain.c line 3769
  • #18 g_main_context_iterate
    at gmain.c line 3840
  • #19 g_main_context_iteration
    at gmain.c line 3901
  • #20 g_application_run
    at gapplication.c line 2363
  • #21 main
    at server.c line 187
  • #7 terminal_window_state_event
    at terminal-window.c line 2329
  • #0 gdk_window_unfullscreen
    at gdkwindow.c line 10562
  • #4 <emit signal ??? on instance 0x557485ca9580 [GtkToggleAction]>
    at gsignal.c line 3441
  • #5 _gtk_action_emit_activate
    at deprecated/gtkaction.c line 909
  • #6 gtk_toggle_action_set_active
    at deprecated/gtktoggleaction.c line 299
  • #7 terminal_window_state_event
    at terminal-window.c line 2329
  • #12 <emit signal ??? on instance 0x557485cfa370 [TerminalWindow]>
    at gsignal.c line 3441
  • #13 gtk_widget_event_internal
    at gtkwidget.c line 7706
  • #14 gtk_main_do_event
    at gtkmain.c line 1800
  • #15 gdk_event_source_dispatch
    at gdkeventsource.c line 90
  • #16 g_main_context_dispatch
    at gmain.c line 3154
  • #17 g_main_context_dispatch
    at gmain.c line 3769
  • #18 g_main_context_iterate
    at gmain.c line 3840
  • #19 g_main_context_iteration
    at gmain.c line 3901
  • #20 g_application_run
    at gapplication.c line 2363
  • #21 main
    at server.c line 187
  • #7 terminal_window_state_event
    at terminal-window.c line 2329
$7 = 0

So I suspect both mutter and gnome-terminal are replying each other switching states back and forth. Or something along these lines...

Thread 1 "gnome-shell" hit Breakpoint 3, xdg_surface_set_fullscreen (client=0x55e64f807b00, resource=0x55e64f37fd90, output_resource=0x0)
    at wayland/meta-wayland-surface.c:1439

Thread 1 "gnome-terminal-" hit Breakpoint 2, gdk_window_unfullscreen (window=0x557485b90450 [GdkWaylandWindow]) at gdkwindow.c:10562

Thread 1 "gnome-shell" hit Breakpoint 4, xdg_surface_unset_fullscreen (client=0x55e64f807b00, resource=0x55e64f37fd90) at wayland/meta-wayland-surface.c:1447

Thread 1 "gnome-terminal-" hit Breakpoint 1, gdk_window_fullscreen (window=0x557485b90450 [GdkWaylandWindow]) at gdkwindow.c:10452

Thread 1 "gnome-shell" hit Breakpoint 3, xdg_surface_set_fullscreen (client=0x55e64f807b00, resource=0x55e64f37fd90, output_resource=0x0)
    at wayland/meta-wayland-surface.c:1439
Comment 11 Matthias Clasen 2016-02-24 03:31:34 UTC
Can't make this happen with gnome-terminal 3.19.2 and gtk+ 3.19.9
Comment 12 Olivier Fourdan 2016-02-24 08:03:02 UTC
(In reply to Matthias Clasen from comment #11)
> Can't make this happen with gnome-terminal 3.19.2 and gtk+ 3.19.9

It's on Fedora rawhide so the same versions as yours:

mutter-3.19.90-1.fc24.x86_64
gnome-shell-3.19.90-1.fc24.x86_64
gtk3-3.19.9-1.fc24.x86_64
gnome-terminal-3.19.2-1.fc24.x86_64

It's a race condition somewhere, sometimes not easy to reproduce.
Comment 13 Olivier Fourdan 2016-02-24 09:12:13 UTC
Note: reason I still suspect something dodgy in mutter is because nothing of the sort occurs when using weston, neither with gnome-terminal (Wayland native) nor with google-chrome (running through Xwayland), but as with any race condition, I can't be sure of anything until I can pinpoint the problem precisely...
Comment 14 Olivier Fourdan 2016-02-24 10:07:51 UTC
I think the best way to trigger this is by pressing F11 again while the window animation is still playing. Repeatedly.
Comment 15 Olivier Fourdan 2016-02-24 10:14:21 UTC
And I think google-chrome issue is different, gdb shows it doesn't switch back and forth between fullscreen/unfullscreen like gnome-terminal does (it's just that it ends up spending a lot of time in the same code path in mesa). So something else to investigate...
Comment 16 Jonas Ådahl 2016-02-24 10:41:18 UTC
I just managed to reproduce it by holding down F11 for a while. After some debugging I think the issue is with gtk.

What I see happening is that gtk is doing this:

 * = code run in gtk+
<- = gtk+ sends wayland request
-> = gtk+ receives wayland event

 * window-state event received (fullscreen)
<- xdg_surface.set_fullscreen
 * window-state event received (not fullscreen)
<- xdg_surface.unset_fullscreen
-> xdg_surface.configure(fullscreen)
 * queue window state (fullscreen)
-> xdg_surface.configure(unfullscreen)
 * queue window state (not fullscreen)
 * window-state event received (fullscreen)
<- xdg_surface.set_fullscreen
 * window-state event received (not fullscreen)
<- xdg_surface.unset_fullscreen
...
then it goes on.

In other words, there seems to happen when both a fullscreen and an unfullscreen window-state GDK event gets queued before the other one is dispatched, causing GDK to enter a feedback loop.
Comment 17 Olivier Fourdan 2016-02-24 10:52:07 UTC
Yes, a agree, even more now that I'm convinced the google-chrome is a different issue.
Comment 18 Jonas Ådahl 2016-02-24 10:52:24 UTC
Created attachment 322225 [details]
A set of backtraces from gtk+

I'm attaching a log of a series of breakpoint hits in gtk.

I set one breakpoint in _gdk_set_window_state, which, if I understand correctly, queues the window state event.

Another one in gdk_wayland_window_fullscreen, which sends the xdg_surface.set_fullscreen request

A third one in gdk_wayland_window_unfullscreen, which sends the xdg_surface.unset_fullscreen request.

Now, how to fix this I'm not so sure. Can we avoid doing these things asynchronously, or maybe peek if there is already another window state event in the queue and drop any earlier, when dispatching?
Comment 19 Jonas Ådahl 2016-02-24 11:35:49 UTC
Moving this to gtk+ then.
Comment 20 Matthias Clasen 2016-02-26 04:08:42 UTC
what you see in the stacktraces is that we get a window state event which changes the fullscreen state, then terminal-window updates its toggle action to match, which in turn triggers the callback to call gtk_window_[un]fullscreen. Seems fine so far, since we are just calling the function for the state we already changed to, but something ought to notice this and fizzle out, probably on the mutter side.
Comment 21 Jonas Ådahl 2016-02-26 04:14:26 UTC
(In reply to Matthias Clasen from comment #20)
> what you see in the stacktraces is that we get a window state event which
> changes the fullscreen state, then terminal-window updates its toggle action
> to match, which in turn triggers the callback to call
> gtk_window_[un]fullscreen. Seems fine so far, since we are just calling the
> function for the state we already changed to, but something ought to notice
> this and fizzle out, probably on the mutter side.

What could mutter do? When it receives the make-fullscreen request, it immediately responds with "ok, go fullscreen then". Then when it receives the unmake-fullscreen request, it immediately responds with "ok, go unfullscreen then".

I have a patch which "compresses" GDK_WINDOW_STATE events by dropping any previous event to the same window while updating the "changed_mask". That seems to fix the issue to me (though it triggers another bug, which I suspect is in mutter, causing the window to not display any more, and then display with the wrong size).

Another possible solution I intended to try is to only allow dispatching one wayland event at once before dispatching the internal GDK event queue, but this would have a lot larger impact on things.
Comment 22 Jonas Ådahl 2016-02-26 04:16:45 UTC
(In reply to Jonas Ådahl from comment #21)
> 
> I have a patch which "compresses" GDK_WINDOW_STATE events by dropping any
> previous event to the same window while updating the "changed_mask". That
> seems to fix the issue to me (though it triggers another bug, which I
> suspect is in mutter, causing the window to not display any more, and then
> display with the wrong size).

(It displays again, with the wrong size, when switching workspace back and forth once and other things)
Comment 23 Jonas Ådahl 2016-02-26 06:18:48 UTC
(In reply to Jonas Ådahl from comment #21)
> 
> Another possible solution I intended to try is to only allow dispatching one
> wayland event at once before dispatching the internal GDK event queue, but
> this would have a lot larger impact on things.

No, this is not possible with the current available libwayland-client API. It only exposes dispatch-all.
Comment 24 Jonas Ådahl 2016-02-26 06:24:11 UTC
Created attachment 322435 [details] [review]
gdk: Compress window state events

If there are already a window state event for a given window queued
when the window state is changed, drop that event and queue a new event
with a changed_mask based on the state before last event that was queue
without compression.
Comment 25 Jonas Ådahl 2016-02-26 10:16:23 UTC
The window-disappears is fixed by bug 762716. The incorrect window size I have yet to debug, so I don't know whether it is the same as bug 762713.
Comment 26 Matthias Clasen 2016-03-01 11:38:54 UTC
Review of attachment 322435 [details] [review]:

Looks fine with me.
Comment 27 Matthias Clasen 2016-03-01 11:45:20 UTC
Review of attachment 322435 [details] [review]:

Looks fine with me.
Comment 28 Matthias Clasen 2016-03-02 02:19:25 UTC
Jonas, I assume this is fine to commit ?
Comment 29 Jonas Ådahl 2016-03-02 02:30:47 UTC
Attachment 322435 [details] pushed as 9e2207b - gdk: Compress window state events
Comment 30 Alexander Larsson 2016-03-10 16:35:38 UTC
+      _gdk_event_queue_remove_link (display, pending_event_link);
+      g_list_free_1 (pending_event_link);

This is leaking the GdkEvent...
Comment 31 Alexander Larsson 2016-03-10 16:39:08 UTC
You need something like: (untested)

a/gdk/gdkevents.c b/gdk/gdkevents.c
index a094d4a..e14f9ab 100644
--- a/gdk/gdkevents.c
+++ b/gdk/gdkevents.c
@@ -2201,6 +2201,7 @@ _gdk_set_window_state (GdkWindow      *window,
     {
       old = window->old_state;
       _gdk_event_queue_remove_link (display, pending_event_link);
+      gdk_event_free ((GdkEvent *)pending_event_link->data);
       g_list_free_1 (pending_event_link);
     }
   else
Comment 32 Matthias Clasen 2016-03-10 16:59:12 UTC
Oops, thanks!
Comment 33 Jonas Ådahl 2016-03-11 05:18:39 UTC
Created attachment 323680 [details] [review]
gdk: Don't leak discarded window state event

When compressing window state events, we didn't free the discarded
event after removing it from the queue, causing us to leak it. This
commit makes sure to free the discarded event after unqueuing it.
Comment 34 Jonas Ådahl 2016-03-11 05:19:46 UTC
(In reply to Alexander Larsson from comment #30)
> +      _gdk_event_queue_remove_link (display, pending_event_link);
> +      g_list_free_1 (pending_event_link);
> 
> This is leaking the GdkEvent...

Oops indeed. Thanks for noticing!
Comment 35 Matthias Clasen 2016-03-11 12:37:04 UTC
Attachment 323680 [details] pushed as f8bbbbf - gdk: Don't leak discarded window state event