GNOME Bugzilla – Bug 768016
[Wayland] Submenus often get closed after ~2 seconds
Last modified: 2016-08-12 15:30:01 UTC
Steps to reproduce:
0. have a wayland compositor, e.g. weston or gnome-wayland session
1. open any Gtk+ 3.x application with "old" menus (see "Affected applications" below)
2. open any menu (e.g. "File" or "View") ("2nd level")
3. open any submenu ("3rd level") on a menu item in 2nd level menu
4. focus mouse (hover) on submenu

What happens:
After ~2 seconds, submenu hides. In menu, focus changes to some other item in 2nd level menu. The newly selected menu item in 2nd level menu will be a menu item above the previously selected item in 2nd level. If the previously selected item is the first one in 2nd level, any menu item could be selected. The menu item newly selected in 2nd level menu varies depending on the menu item previously focused in 3rd level menu.

What should happen:
submenu should not hide while having input focus. There is no reason to change focus without user interaction.

Affected applications:
* gnome-terminal
* evolution
* anjuta
* libreoffice (recent Gtk+ 3 builds, default on Fedora 24)
* probably others

This issue does not affect these types of applications:
* Qt 4 or 5 based applications
* Gtk+ 2 based applications
* firefox (drawing its own menu bar)
* Gtk+ 3 based applications with menus based on GtkPopover (e.g. epiphany, gedit)
* Gtk+ 3 based applications with X11 backend (even on gnome-wayland sessions)

This issue is also present when running Gtk+ 3 applications inside weston, but it is specific to the wayland backend.

This issue is present whether or not animations are enabled.

This issue cannot always be reproduced. It seems that moving the mouse quickly over a menu reduces the probability of it happening. In my "normal" usage it happens in ~80% of all cases when I open a menu.

There is nothing related in the logs, even with G_ENABLE_DIAGNOSTICS=1 added to the environment.

Valgrind doesn't spit out anything, even without suppression files. Looks like memory is fine.
Workaround: Start applications with GDK_BACKEND=x11 added to environment when running a gnome-wayland session.
With keyboard navigation (arrow keys) menus work fine and this issue is not present. Note that keyboard navigation is partially broken due to bug #768017 though.

With WAYLAND_DEBUG=1 added to environment, this issue is occurring less often, but still happens with ~20% probability.

This is an excerpt from `WAYLAND_DEBUG=1 anjuta` with the issue happening:

[2932029,985] wl_pointer@8.motion(19648858, 191,460938, 133,816406)
[2932030,047] wl_pointer@8.frame()
[2932044,831] wl_pointer@8.motion(19648874, 196,347656, 133,816406)
[2932044,901] wl_pointer@8.frame()
[2932063,620] wl_pointer@8.motion(19648890, 201,347656, 133,816406)
[2932063,678] wl_pointer@8.frame()
[2932095,967] wl_pointer@8.motion(19648922, 213,347656, 132,816406)
[2932096,057] wl_pointer@8.frame()
[2932113,540] wl_pointer@8.motion(19648938, 217,347656, 132,816406)
[2932113,655] wl_pointer@8.frame()
[2932130,739] wl_pointer@8.motion(19648952, 219,347656, 132,816406)
[2932130,823] wl_pointer@8.frame()

leaving 2nd level menu, entering 3rd level menu:

[2932147,990] wl_pointer@8.leave(29612, wl_surface@58)
[2932148,081] wl_pointer@8.frame()
[2932148,142] wl_pointer@8.enter(29613, wl_surface@47, 6,347656, 17,816406)
[2932148,192] wl_pointer@8.frame()
[2932148,242] -> wl_surface@19.attach(wl_buffer@34, 0, 0)
[2932148,270] -> wl_surface@19.set_buffer_scale(1)
[2932148,325] -> wl_surface@19.damage(0, 0, 24, 24)
[2932148,357] -> wl_surface@19.commit()
[2932148,367] -> wl_pointer@8.set_cursor(29613, wl_surface@19, 4, 4)
[2932148,397] wl_pointer@8.motion(19648974, 6,347656, 17,816406)
[2932148,435] wl_pointer@8.frame()
[2932162,347] wl_pointer@8.motion(19648988, 7,347656, 17,816406)
[2932162,440] wl_pointer@8.frame()
[2932163,656] -> wl_surface@47.attach(wl_buffer@37, 0, 0)
[2932163,681] -> wl_surface@47.set_buffer_scale(1)
[2932163,689] -> wl_surface@47.damage(6, 7, 347, 28)
[2932163,711] -> wl_surface@47.frame(new id wl_callback@54)
[2932163,721] -> wl_surface@47.commit()
[2932180,090] wl_display@1.delete_id(54)
[2932180,139] wl_buffer@37.release()
[2932180,153] wl_pointer@8.motion(19649002, 8,347656, 17,816406)
[2932180,187] wl_pointer@8.frame()
[2932180,251] wl_callback@54.done(19649010)
[2932181,089] -> wl_compositor@4.create_region(new id wl_region@54)
[2932181,119] -> wl_region@54.add(6, 5, 347, 229)
[2932181,143] -> wl_surface@47.set_opaque_region(wl_region@54)
[2932181,157] -> wl_region@54.destroy()
[2932181,201] -> wl_compositor@4.create_region(new id wl_region@51)
[2932181,221] -> wl_region@51.add(6, 5, 347, 229)
[2932181,243] -> wl_surface@47.set_input_region(wl_region@51)
[2932181,256] -> wl_region@51.destroy()
[2932182,876] -> wl_surface@47.attach(wl_buffer@37, 0, 0)
[2932182,896] -> wl_surface@47.set_buffer_scale(1)
[2932182,903] -> wl_surface@47.damage(6, 7, 347, 225)
[2932182,919] -> wl_surface@47.frame(new id wl_callback@27)
[2932182,928] -> wl_surface@47.commit()
[2932196,345] wl_display@1.delete_id(54)
[2932196,392] wl_display@1.delete_id(51)
[2932196,407] wl_display@1.delete_id(27)
[2932196,425] wl_buffer@37.release()
[2932196,440] wl_pointer@8.motion(19649022, 11,347656, 17,816406)
[2932196,475] wl_pointer@8.frame()
[2932196,533] wl_callback@27.done(19649026)

moving mouse around on 3rd level menu:

[2932228,950] wl_pointer@8.motion(19649054, 17,347656, 17,816406)
[2932229,002] wl_pointer@8.frame()
[2932244,853] wl_pointer@8.motion(19649074, 22,347656, 17,816406)
[2932244,899] wl_pointer@8.frame()
[2932262,971] wl_pointer@8.motion(19649090, 26,347656, 17,816406)
[2932263,041] wl_pointer@8.frame()
[2932296,229] wl_pointer@8.motion(19649124, 38,347656, 17,816406)
[2932296,329] wl_pointer@8.frame()
[2932312,781] wl_pointer@8.motion(19649140, 45,390625, 17,816406)
[2932312,830] wl_pointer@8.frame()
[2932329,316] wl_pointer@8.motion(19649156, 53,726562, 17,816406)
[2932329,399] wl_pointer@8.frame()
[2932346,216] wl_pointer@8.motion(19649174, 59,937500, 17,816406)
[2932346,297] wl_pointer@8.frame()
[2932363,320] wl_pointer@8.motion(19649182, 61,937500, 17,816406)
[2932363,382] wl_pointer@8.frame()

here the unintended action happens:

[2933149,803] -> xdg_popup@52.destroy()
[2933149,841] -> wl_surface@47.destroy()
[2933152,571] -> xdg_surface@31.set_window_geometry(26, 23, 960, 1053)
[2933152,640] -> wl_compositor@4.create_region(new id wl_region@27)
[2933152,661] -> wl_region@27.add(33, 23, 946, 7)
[2933152,763] -> wl_region@27.add(26, 30, 960, 1046)
[2933152,800] -> wl_surface@29.set_opaque_region(wl_region@27)
[2933152,826] -> wl_region@27.destroy()
[2933152,897] -> wl_compositor@4.create_region(new id wl_region@51)
[2933152,927] -> wl_region@51.add(16, 13, 980, 1073)
[2933152,954] -> wl_surface@29.set_input_region(wl_region@51)
[2933152,970] -> wl_region@51.destroy()
[2933153,535] -> wl_surface@29.attach(wl_buffer@25, 0, 0)
[2933153,577] -> wl_surface@29.set_buffer_scale(1)
[2933153,594] -> wl_surface@29.damage(191, 1046, 198, 20)
[2933153,635] -> wl_surface@29.frame(new id wl_callback@54)
[2933153,673] -> wl_surface@29.commit()
[2933154,221] -> wl_compositor@4.create_region(new id wl_region@56)
[2933154,256] -> wl_region@56.add(6, 5, 217, 342)
[2933154,303] -> wl_surface@58.set_opaque_region(wl_region@56)
[2933154,329] -> wl_region@56.destroy()
[2933154,367] -> wl_compositor@4.create_region(new id wl_region@57)
[2933154,394] -> wl_region@57.add(6, 5, 217, 342)
[2933154,419] -> wl_surface@58.set_input_region(wl_region@57)
[2933154,446] -> wl_region@57.destroy()
[2933158,316] -> wl_surface@58.attach(wl_buffer@60, 0, 0)
[2933158,368] -> wl_surface@58.set_buffer_scale(1)
[2933158,385] -> wl_surface@58.damage(6, 7, 217, 338)
[2933158,441] -> wl_surface@58.frame(new id wl_callback@46)
[2933158,466] -> wl_surface@58.commit()
[2933158,544] wl_display@1.delete_id(52)
[2933158,569] wl_display@1.delete_id(47)
[2933158,586] wl_pointer@8.leave(29614, nil)
[2933158,610] wl_pointer@8.frame()
[2933158,625] wl_surface@19.leave(wl_output@7)
[2933158,649] -> wl_surface@19.attach(wl_buffer@34, 0, 0)
[2933158,694] -> wl_surface@19.set_buffer_scale(1)
[2933158,715] -> wl_surface@19.damage(0, 0, 24, 24)
[2933158,758] -> wl_surface@19.commit()
[2933158,773] -> wl_pointer@8.set_cursor(29613, wl_surface@19, 4, 4)
[2933196,223] wl_display@1.delete_id(27)
[2933196,254] wl_display@1.delete_id(51)
[2933196,260] wl_display@1.delete_id(56)
[2933196,268] wl_display@1.delete_id(57)
[2933196,273] wl_display@1.delete_id(46)
[2933196,326] wl_display@1.delete_id(54)
[2933196,341] wl_buffer@25.release()
[2933196,356] wl_buffer@60.release()
[2933196,364] wl_callback@46.done(19650026)
[2933196,382] wl_callback@54.done(19650026)
[2933229,254] wl_pointer@8.enter(29615, wl_surface@29, 431,847656, 214,816406)
[2933229,356] -> wl_surface@19.attach(wl_buffer@34, 0, 0)
[2933229,376] -> wl_surface@19.set_buffer_scale(1)
[2933229,381] -> wl_surface@19.damage(0, 0, 24, 24)
[2933229,389] -> wl_surface@19.commit()
[2933229,393] -> wl_pointer@8.set_cursor(29615, wl_surface@19, 4, 4)
[2933229,404] wl_pointer@8.frame()
[2933229,461] wl_surface@19.enter(wl_output@7)
[2933229,469] -> wl_surface@19.attach(wl_buffer@34, 0, 0)
[2933229,480] -> wl_surface@19.set_buffer_scale(1)
[2933229,485] -> wl_surface@19.damage(0, 0, 24, 24)
[2933229,497] -> wl_surface@19.commit()
[2933229,501] -> wl_pointer@8.set_cursor(29615, wl_surface@19, 4, 4)
[2933229,516] wl_pointer@8.motion(19650046, 431,847656, 214,816406)
[2933229,525] wl_pointer@8.frame()
I can reproduce this
It's gtk_menu_stop_navigating_submenu_cb() being called, which explains the delay (MENU_POPDOWN_DELAY is 1000, i.e. 1 s).

For reference:
+ Trace 236393
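
The 1 s figure can be cross-checked against the WAYLAND_DEBUG excerpt in the description. The following is a minimal Python sketch, assuming the bracketed timestamps are milliseconds written with a decimal comma; the four log lines are copied from that excerpt, and the helper names (parse, first_time) are made up for illustration.

```python
import re
from decimal import Decimal

# Four lines copied from the WAYLAND_DEBUG excerpt in the description.
LOG = """\
[2932147,990] wl_pointer@8.leave(29612, wl_surface@58)
[2932148,142] wl_pointer@8.enter(29613, wl_surface@47, 6,347656, 17,816406)
[2932363,382] wl_pointer@8.frame()
[2933149,803] -> xdg_popup@52.destroy()
"""

# "[<int>,<frac>] [-> ]<object>.<message>(<args>)"
LINE_RE = re.compile(r"^\[(\d+),(\d+)\]\s+(?:->\s+)?(\S+?)\((.*)\)$")


def parse(log):
    """Return a list of (timestamp, message, args) tuples."""
    events = []
    for line in log.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            # Assumption: the comma is a locale decimal separator.
            stamp = Decimal(f"{match.group(1)}.{match.group(2)}")
            events.append((stamp, match.group(3), match.group(4)))
    return events


def first_time(events, message):
    """Timestamp of the first event with the given object.message name."""
    return next(stamp for stamp, name, _ in events if name == message)


events = parse(LOG)
left_parent_menu = first_time(events, "wl_pointer@8.leave")
popup_destroyed = first_time(events, "xdg_popup@52.destroy")
delay_ms = popup_destroyed - left_parent_menu
print(delay_ms)  # 1001.813
```

Under that assumption, roughly 1002 ms pass between the pointer leaving the 2nd level menu and the submenu popup being destroyed, which is consistent with a 1000 ms timeout armed at leave time. That reading is an inference from the log, not something the log itself states.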
I reckon it's a difference in the coordinates between X11 and Wayland.

If we look at gtk_menu_set_submenu_navigation_region():

https://git.gnome.org/browse/gtk+/tree/gtk/gtkmenu.c#n4185

which is called in both the X11 and Wayland cases as soon as the pointer leaves the first menu to enter the submenu, we see that the gtk_menu_stop_navigating_submenu_cb() timeout is set when event->x >= 0 && event->x < width, width being the width of the window the event is coming from.

In the X11 case, event->x is slightly (a couple of pixels) greater than the width of the window that was left (e.g. x = 450, width = 448), which sounds right: the pointer has left the previous item, so it is outside the window.

But in the Wayland case, event->x is slightly less than the width of the window that was left (e.g. x = 396, width = 408), and therefore we enter the case where the timeout is set up.

The Wayland case seems wrong: the event x should not be less than the width of the window that the pointer just left (otherwise it would not have left the window, would it?).
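
To make the comparison concrete, here is a small Python sketch of just the condition quoted above (event->x >= 0 && event->x < width), fed with the two example values from this comment. The function name is hypothetical, and the real gtk_menu_set_submenu_navigation_region() does more than this single test.

```python
def x_is_inside_left_window(event_x: float, window_width: int) -> bool:
    """Model of the quoted condition: event->x >= 0 && event->x < width."""
    return 0 <= event_x < window_width


# Example values quoted in the comment above.
samples = {
    "X11": (450, 448),      # x just past the window edge
    "Wayland": (396, 408),  # x still inside the window
}

for backend, (event_x, width) in samples.items():
    armed = x_is_inside_left_window(event_x, width)
    print(f"{backend}: x={event_x} width={width} timeout armed: {armed}")
```

With these values the condition is false on X11 and true on Wayland, which is the case where the popdown timeout gets armed.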
well, either the coordinate is wrong or the width is wrong.
(In reply to Olivier Fourdan from comment #4)
... snip ...
> The Wayland case seems wrong, the event x should not be less than the width
> of the window that the pointer just left (otherwise it would not have
> left the window, would it?).

I suppose the x/y when running under Wayland is the most recent known coordinate. wl_pointer_leave doesn't carry any coordinates, so we won't know where it might have gone.
(In reply to Jonas Ådahl from comment #6)
> I suppose the x/y when running under Wayland is the most recent known
> coordinate. wl_pointer_leave doesn't carry any coordinates, so we won't know
> where it might have gone.

Right, the coords used in the leave notify event are surface_x/surface_y, which are updated on enter and motion events. That means what we get on leave is basically what we had at the last motion event within the surface.

That's unfortunate, but it should be recoverable, because the callback routine gtk_menu_stop_navigating_submenu_cb() checks again for the child window where the pointer resides:

https://git.gnome.org/browse/gtk+/tree/gtk/gtkmenu.c#n4104

So we still have a chance to recover from our mistake, but it seems gdk_window_get_device_position() might return the wrong window.

Still digging...
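
A toy Python model of the behaviour described above, not GDK code: the position is cached on enter and motion, and a leave event (which carries only a serial and a surface in the Wayland protocol) can only reuse the cached values. The class and method names are hypothetical; the motion coordinates are the last ones logged on wl_surface@58 in the description, and the enter position is a placeholder because the excerpt does not show it.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PointerModel:
    """Hypothetical stand-in for the cached pointer state."""
    focus: Optional[str] = None
    surface_x: float = 0.0
    surface_y: float = 0.0

    def enter(self, surface: str, x: float, y: float) -> None:
        self.focus, self.surface_x, self.surface_y = surface, x, y

    def motion(self, x: float, y: float) -> None:
        self.surface_x, self.surface_y = x, y

    def leave(self, surface: str) -> Tuple[float, float]:
        # The leave event has no position, so the synthesized
        # leave-notify can only report the last cached coordinates.
        coords = (self.surface_x, self.surface_y)
        self.focus = None  # modelling choice, not a claim about GDK
        return coords


pointer = PointerModel()
pointer.enter("wl_surface@58", 0.0, 0.0)  # placeholder entry position
pointer.motion(217.347656, 132.816406)
pointer.motion(219.347656, 132.816406)    # last motion before the leave
leave_coords = pointer.leave("wl_surface@58")
print(leave_coords)
```

The coordinates reported at leave time are those of the last motion, which by construction lie inside the surface being left.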
Sorry in advance for this lengthy post. I find this issue rather complex; what follows is mostly a brain dump, as a reminder for myself and in case anyone else wishes to chime in with ideas...

I think the problem with the coords from the LeaveNotify event being within the previous window can happen even in X11, so the root of the problem is actually the wrong child window being returned by gdk_window_get_device_position() in gtk_menu_stop_navigating_submenu_cb():

https://git.gnome.org/browse/gtk+/tree/gtk/gtkmenu.c#n4116

Adding traces in gdk_window_get_device_position(), I can see that under Wayland it's indeed another menu item which is returned and therefore wrongly selected:

gdk_window_get_device_position - tmp (15.58,22.20) window 0x1a128a0 [408x160] child 0x1a12a10 [408x32]

On X11 this does not occur because get_device_state() returns FALSE:

https://git.gnome.org/browse/gtk+/tree/gdk/gdkwindow.c#n4945

So in gdk_window_get_device_position(), "normal_child" is FALSE on X11 and therefore gdk_window_get_device_position() returns NULL.

On Wayland, get_device_state() returns TRUE, leading to another window/widget being selected.

get_device_state() on both X11 and Wayland calls query_state() of the corresponding backend and returns TRUE if the child is not NULL.

gdkwindow-wayland impl:
https://git.gnome.org/browse/gtk+/tree/gdk/wayland/gdkwindow-wayland.c#n2053

gdkwindow-x11 impl:
https://git.gnome.org/browse/gtk+/tree/gdk/x11/gdkwindow-x11.c#n3306

So it all comes down to the child window returned by query_state().

On X11 (XI2 backend), it relies on the child as returned by XIQueryPointer():

https://git.gnome.org/browse/gtk+/tree/gdk/x11/gdkdevice-xi2.c#n317

Whereas on Wayland, it relies on the pointer's focus_window:

https://git.gnome.org/browse/gtk+/tree/gdk/wayland/gdkdevice-wayland.c#n498

I suspect there is a difference between the child window returned by XIQueryPointer() on X11 and the pointer->focus_window in Wayland.
FWIW, the value of the child window returned by XIQueryPointer() in our case is None whereas in the same case pointer->focus_window is whatever window the pointer last entered.
OK, I think I now have a much better understanding of what is happening here...

The key to the problem is the window used when invoking gdk_window_get_device_position() in gtk_menu_stop_navigating_submenu_cb():

https://git.gnome.org/browse/gtk+/tree/gtk/gtkmenu.c#n4116

It's the menu's priv->bin_window, not the actual window where the pointer resides.

On X11, if the pointer is not in a child of the given window, the child returned by XIQueryPointer() is None (as it is not a child of the given window).

On Wayland, the child is the pointer->focus window; whether or not it is a child of the given window is irrelevant.

So on Wayland we get a "child_window" whereas on X11 we don't. If that "child_window" were correct, that wouldn't be much of a problem, but it is not.

gdk_window_get_device_position() calls _gdk_window_find_child_at() using the device coordinates minus the window's abs_x, abs_y. But the device coords are relative to the device's pointer focus window, while abs_x/abs_y are relative to the given window, i.e. the menu's priv->bin_window, which is not at all at the same location.

As a result, the tmp_x/tmp_y location used to find the window is wrong, and the wrong window is returned, causing the sub-menu to be dismissed.
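
The coordinate mix-up can be illustrated with a toy Python model. The menu size (408x160) and item height (32) come from the trace earlier in this bug; everything else is an assumption made for illustration: that the menu items are stacked uniformly, that (15.58, 22.20) is the pointer position inside the submenu, and the submenu origin (408, 96) in the parent's coordinate space. find_item_at() is a hypothetical helper, not _gdk_window_find_child_at().

```python
from typing import Optional, Tuple

MENU_WIDTH, MENU_HEIGHT = 408, 160  # from the trace: window [408x160]
ITEM_HEIGHT = 32                    # from the trace: child [408x32]


def find_item_at(x: float, y: float) -> Optional[int]:
    """Index of the item under (x, y) in parent-menu coordinates, or None.

    Assumes uniformly stacked items; a toy stand-in for the child lookup.
    """
    if not (0 <= x < MENU_WIDTH and 0 <= y < MENU_HEIGHT):
        return None
    return int(y // ITEM_HEIGHT)


# Pointer position, assumed to be relative to the submenu surface.
pointer_in_submenu: Tuple[float, float] = (15.58, 22.20)

# Hypothetical origin of the submenu in the parent menu's coordinates.
submenu_origin: Tuple[int, int] = (408, 96)

# Bug: submenu-relative coordinates looked up in the parent's space.
wrong_item = find_item_at(*pointer_in_submenu)

# Same point after translating into the parent's coordinate space.
translated = (pointer_in_submenu[0] + submenu_origin[0],
              pointer_in_submenu[1] + submenu_origin[1])
right_item = find_item_at(*translated)

print(wrong_item, right_item)  # 0 None
```

In this model the untranslated lookup lands on the first item of the parent menu although the pointer is not over the parent at all. That would also be consistent with the reporter's observation that the newly selected item is above the previously selected one, though that link is a guess.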
Created attachment 330579
[PATCH] wayland: return child only in device_query_state()

On X11, device_query_state() uses XIQueryPointer() which will return a child window only if the pointer is within an actual child of the given window.

Wayland backend would return the pointer->focus window independently of the given window, but that breaks the logic in get_device_state() and later in gdk_window_get_device_position_double() because the window is searched based on coordinates from another window without sibling relationship, breaking gtkmenu sub-menus further down the line.

Fix the Wayland backend to mimic X11's XIQueryPointer() to return a child only if really a child of the given window.

That's the most sensible thing to do to fix the issue, but the API here seems to be modeled after the X11 implementation and the description of gdk_window_get_device_position_double() is not entirely accurate.
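
The idea behind the patch can be sketched in Python as follows. This is not the attached patch; the Window class and both function names are hypothetical, and the sketch only models the rule stated in the commit message: report the focus window as child only if it really is a child of the given window.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare windows by identity
class Window:
    name: str
    children: List["Window"] = field(default_factory=list)

    def add_child(self, name: str) -> "Window":
        child = Window(name)
        self.children.append(child)
        return child


def query_child_focus_only(window: Window,
                           pointer_focus: Optional[Window]) -> Optional[Window]:
    """Behaviour described for the Wayland backend before the patch."""
    return pointer_focus


def query_child_if_real_child(window: Window,
                              pointer_focus: Optional[Window]) -> Optional[Window]:
    """Behaviour described for the patch: mimic XIQueryPointer()."""
    if pointer_focus is not None and any(c is pointer_focus
                                         for c in window.children):
        return pointer_focus
    return None


parent_bin = Window("parent menu bin_window")
menu_item = parent_bin.add_child("menu item window")
submenu = Window("submenu surface")  # unrelated to parent_bin

# Pointer is over the submenu, query is made against the parent menu.
print(query_child_focus_only(parent_bin, submenu) is submenu)  # True
print(query_child_if_real_child(parent_bin, submenu))          # None
```

With the second variant, a query against the parent menu while the pointer is over the submenu yields no child, matching what the comments above report for X11.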
Created attachment 330581
[PATCH] wayland: remove unneeded statement

While at it... seat->pointer_info.focus is already set to NULL 2 lines above, no need to repeat it there.
Review of attachment 330579:

Olivier and I discussed this solution yesterday, and I think it makes sense. A second opinion wouldn't hurt, though.
Review of attachment 330581:

LGTM.
(In reply to Jonas Ådahl from comment #12)
> Review of attachment 330579:
>
> Me and Olivier discussed this solution yesterday, and I think it makes
> sense. Wouldn't hurt with another a second opinion though.

Makes sense to me too fwiw.
Comment on attachment 330581
[PATCH] wayland: remove unneeded statement

Attachment 330581 pushed as commit e032c83 - wayland: remove unneeded statement
Comment on attachment 330579
[PATCH] wayland: return child only in device_query_state()

Attachment 330579 pushed as commit 298221b - wayland: return child only in device_query_state()
Note: I pushed to master only, if we want it in gtk-3-20 please let me know (I reckon it's pretty safe for 3.20)
(In reply to Olivier Fourdan from comment #17)
> Note: I pushed to master only, if we want it in gtk-3-20 please let me know
> (I reckon it's pretty safe for 3.20)

Since it is a Wayland-only patch I'd prefer putting it into 3.20. Otherwise I think somebody else will report this issue in various bugzillas (including downstream).
(In reply to Olivier Fourdan from comment #17)
> Note: I pushed to master only, if we want it in gtk-3-20 please let me know
> (I reckon it's pretty safe for 3.20)

3.20 is fine, if you want to cherry-pick it.
(In reply to Matthias Clasen from comment #19)
> 3.20 is fine, if you want to cherry-pick it.

Thanks!

Attachment 330579 has been cherry-picked from commit 298221b in gtk-3-20 as commit ff99d89 - wayland: return child only in device_query_state()
Issue fixed with the latest libreoffice upgrade on archlinux.

I'm impressed by the responsiveness.

Thanks a lot, I'll reconsider the amount of my donation to the foundation this year.

Cheers
As I said, the issue is gone for the main menu (i.e. File, Edit...), but it still occurs on combo list elements (i.e. style, font selection...).

I'm using Archlinux (gtk3-3.20.8, libreoffice 5.2.0, wayland 1.11.0).
(In reply to Guillaume Tissier from comment #22)
> As I said issue is gone for main menu (ie File, Edit...), but it still
> occurs on combo list elements (ie style, font selection...).
> I'm using Archlinux (gtk3-3.20.8, libreoffice 5.2.0, wayland 1.11.0).

Libreoffice might have its own issues, I fail to reproduce using gtk3-demo's combo example (gtk3-demo --run=combobox), it works fine here.

Can you reproduce using gtk3-demo or other gtk sample code by any chance?
(In reply to Olivier Fourdan from comment #23)
> Libreoffice might have its own issues, I fail to reproduce using gtk3-demo's
> combo example (gtk3-demo --run=combobox), it works fine here.
>
> Can you reproduce using gtk3-demo or other gtk sample code by any chance?

Working fine with gtk3-demo.

LO font selection, style selection (and the other affected widgets) don't seem to be pure gtk3 combos; they look much like a menu IMHO.

Cheers
I suspect what you're seeing is a different LO bug, not related to this particular issue.
I am seeing a similar issue with LO 5.1.5, where the combo boxes appear to have problems with absolute vs. relative pointer positions.
*** Bug 769263 has been marked as a duplicate of this bug. ***