After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 784319 - Sometimes tooltip_popup_timeout() in gtk/gtktooltip.c causes crash after closing a window
Sometimes tooltip_popup_timeout() in gtk/gtktooltip.c causes crash after clos...
Status: RESOLVED NOTABUG
Product: gtk+
Classification: Platform
Component: Widget: Other
3.20.x
Other Linux
: Normal normal
: ---
Assigned To: gtk-bugs
gtk-bugs
Depends on:
Blocks:
 
 
Reported: 2017-06-29 04:15 UTC by Takuro Ashie
Modified: 2017-07-12 10:17 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
A patch to avoid the crash (436 bytes, patch)
2017-06-29 04:15 UTC, Takuro Ashie
none Details | Review
Stack trace of the crash (2.42 KB, text/plain)
2017-06-29 04:16 UTC, Takuro Ashie
  Details

Description Takuro Ashie 2017-06-29 04:15:06 UTC
Created attachment 354667 [details] [review]
A patch to avoid the crash

I'm now porting Firefox to Wayland/Weston on Renesas RZ/G1E.

* Renesas RZ/G1E:
  http://elinux.org/RZ-G/Boards/SK-RZG1E
* How to build BSP for it:
  http://elinux.org/RZ-G/Boards/Yocto_2.0
* Additional Yocto recipes to build Firefox:
  https://github.com/mozilla-japan/meta-browser/tree/firefox-52.1esr
  https://github.com/mozilla-japan/meta-gecko-embedded

The version of GTK+ is 3.20.9.

Sometimes tooltip_popup_timeout() in gtk/gtktooltip.c causes crash after closing a browser window.

Although I'm not sure whether it's Firefox's fault or GTK+'s fault, I think GTK+ should ensure to avoid the crash.  It seems that the code in gtk/gtktooltip.c doesn't observe destroying GdkWindow after registering a timeout function, it may be the cause of the bug.

The attached patch seems work fine for me, it can avoid crash on my environment.
Comment 1 Takuro Ashie 2017-06-29 04:16:53 UTC
Created attachment 354668 [details]
Stack trace of the crash
Comment 2 Olivier Fourdan 2017-06-29 07:28:06 UTC
Looks like a dupe of bug 782283, unlikely this will be fixed in gtk+-3.20 though...

Can  you try with a current gtk+-3.22 ?
Comment 3 Takuro Ashie 2017-06-29 07:41:25 UTC
> Can you try with a current gtk+-3.22 ?

Sorry, upgrading gtk+ for the board is hard due to the limitation of the GPU driver. gtk+-3.22 requires a later version of wayland although the driver is tied with the specific version of it (1.9 or 1.10). In addition I can't reproduce the bug on PC.

Instead of using gtk+-3.22, I'll try the patch in bug 782283.
Comment 4 Olivier Fourdan 2017-06-29 08:07:11 UTC
Or, alternatively, please provide a simple reproducer which exhibit the issue... 

Oddly, it's not obvious, based on the backtrace (attachment 354668 [details]), why this would affect Wayland alone, this occurs in gtk+ code, not gdk backend.
Comment 5 Takuro Ashie 2017-06-29 08:12:36 UTC
> Instead of using gtk+-3.22, I'll try the patch in bug 782283.

Hmm, it's not effective for this bug, still reproduces.

> Or, alternatively, please provide a simple reproducer which exhibit the issue... 

I'll try to find it...
Comment 6 Takuro Ashie 2017-07-03 09:06:50 UTC
It seems that the following issue seems same with this:

https://bugzilla.redhat.com/show_bug.cgi?id=1467104
Comment 7 Martin Stransky 2017-07-03 09:18:05 UTC
It may be a variant of https://bugzilla.gnome.org/show_bug.cgi?id=781391#c5 but for tooltips perhaps?

AFAIK Jonas said that Firefox draw to already deleted window...maybe we can call gdk_window_is_destroyed() from Firefox before wl_surface.commit?
Comment 8 Olivier Fourdan 2017-07-05 07:50:37 UTC
Yet nothing in either backtraces points toward gdk wayland backend, so how can we be sure this really is Wayland related? AFAICS, it looks like a race...
Comment 9 Olivier Fourdan 2017-07-05 08:21:17 UTC
Ok, found a way to reproduce easily and that's related to Dnd I reckon.

1. Run Firefox profile manager
2. Create a new profile
3. initiate a fake DnD on the "Finish" button, i.e. press, move, release...

-> firefox-wayland crashes with the exact same backtrace.

Oddly, I see no tooltip being involved in this from a UI standpoint, the button never shows a tooltip there.
Comment 10 Olivier Fourdan 2017-07-05 14:28:27 UTC
OK, I've just reproduced with the x11 backend so this is _not_ Wayland specific, but rather related to the use of CSD.

To reproduce with x11, run:

 $ GDK_BACKEND=x11 GTK_CSD=1 firefox-wayland -ProfileManager

And follow the reproducing steps from comment 9.
Comment 11 Olivier Fourdan 2017-07-07 13:42:21 UTC
I tried to reproduce this with a simple gtk+ program, but failed to reproduce, so it could be /something/ that firefox does.

From what I can see, we get an additional motion event for the gdk window which is about to be destroyed, so we end up in gtk_tooltip_handle_event_internal() with the event window being from the widget that has just been unrealized, so it will start a gtk_tooltip_start_delay() that will end up crashing.

Obviously, attachment 354667 [details] [review] will avoid the crash, but I cannot really tell whether or not this is correct unless I understand what mechanism leads to this issue (i.e. I have the feeling that the patch is hiding the problem more than solving it, and I'm not even sure the problem is in gtk+ rather than firefox in the first place).
Comment 12 Olivier Fourdan 2017-07-07 14:37:35 UTC
Ah! Using the gtk inspector shows the widget in question is the "MozContainer".

In the backtrace at step #7:

  • #7 gtk_widget_get_window
    at /home/ofourdan/src/gnome/gtk+/gtk/gtkwidget.c line 15937

That Widget there 0x7f4252a8fe50 is the MozContainer according to the gtk+ inspector.
Comment 13 Martin Stransky 2017-07-10 07:03:34 UTC
Olivier, thanks for looking at it. MozContainer is a custom Firefox widget, it's nested in a toplevel window and it holds live Firefox GdkWindow on Wayland or when toplevel window uses CSD (so it's complicated to draw to toplevel GdkWindow).

The code is here:
https://github.com/stransky/gecko-dev/blob/master/widget/gtk/mozcontainer.cpp
Comment 14 Olivier Fourdan 2017-07-10 07:17:19 UTC
(In reply to Martin Stransky from comment #13)
> Olivier, thanks for looking at it. MozContainer is a custom Firefox widget,
> it's nested in a toplevel window and it holds live Firefox GdkWindow on
> Wayland or when toplevel window uses CSD (so it's complicated to draw to
> toplevel GdkWindow).

Yeap, and this crash is with both Wayland and x11 with CSD.
Comment 15 Olivier Fourdan 2017-07-10 07:45:51 UTC
Looking at that MozContainer, I find it weird that there is a moz_container_unrealize() for Wayland alone, and even there, it doesn't seem to do much:

https://github.com/stransky/gecko-dev/blob/master/widget/gtk/mozcontainer.cpp#L383

Looking at gtk_window_unrealize() it will destroy and unregister its windows:

https://git.gnome.org/browse/gtk+/tree/gtk/gtkwindow.c?h=gtk-3-22#n7448

I could be wrong, but considering that the issue is caused by a motion event coming from a MozContainer widget while the toplevel is being destroyed, I wonder if there might be some code missing in MozContainer.
Comment 16 Takuro Ashie 2017-07-10 08:11:52 UTC
Hmm, mozcontainer doesn't chain up the unrealize func like this:

https://git.gnome.org/browse/gtk+/tree/gtk/gtkwindow.c?h=gtk-3-22#n7505
Comment 17 Takuro Ashie 2017-07-10 08:27:04 UTC
When the widget doesn't have its own GdkWindow (by not calling gtk_widget_set_has_window), a GdkWindow is created & destroyed by GtkWidget class:

https://git.gnome.org/browse/gtk+/tree/gtk/gtkwidget.c#n10594

But when the child class doesn't chain up the unrealize func, the above code doesn't called so the window isn't destroyed.

I think it's firefox's fault.
I'll close this bug after I confirm it on my environment.
Comment 18 Olivier Fourdan 2017-07-10 10:01:20 UTC
If fixing this in Firefox, please make sure the fix also works on x11/CSD as well.
Comment 19 Martin Stransky 2017-07-10 11:26:41 UTC
moz_container_realize() calls gtk_widget_set_window(), do we need to extra call gtk_widget_set_has_window()?

https://github.com/stransky/gecko-dev/blob/master/widget/gtk/mozcontainer.cpp#L373

Also the gtk_widget_set_has_window(container, FALSE) is called only when the toplevel GdkWindow is used for rendering:

https://dxr.mozilla.org/mozilla-central/source/widget/gtk/nsWindow.cpp#3750
Comment 20 Martin Stransky 2017-07-10 11:29:54 UTC
I see that the CSD/Wayland code path is under gtk_widget_get_has_window (widget) condition which looks like Gtk is aware about window owned by container:

https://github.com/stransky/gecko-dev/blob/master/widget/gtk/mozcontainer.cpp#L334
Comment 21 Takuro Ashie 2017-07-11 07:58:08 UTC
(In reply to Martin Stransky from comment #20)
> I see that the CSD/Wayland code path is under gtk_widget_get_has_window
> (widget) condition which looks like Gtk is aware about window owned by
> container:

Sorry, you are right.
My explanation about gtk_widget_get_has_window() is incorrect.
gtk_widget_get_has_window() returns TRUE by default:

  https://git.gnome.org/browse/gtk+/tree/gtk/gtkwidget.c#n8202
  (priv->no_window == 0 by default)

In addition, when gtk_widget_get_has_window() is FALSE, a parent GdkWindow is used for the widget's GdkWindow, it doesn't create its own GdkWindow.


On the other hand the explanation about "unrealize" is correct.
I confirmed that it's not called so that no one destroy the GdkWindow of MozContainer. When MOZ_WAYLAND isn't defined, it's destroyed by the following line:

  https://git.gnome.org/browse/gtk+/tree/gtk/gtkwidget.c#n10589

But when I address it, sometimes another similar crash occurs.
It's more rare than before...
We should continue it at https://bugzilla.redhat.com/show_bug.cgi?id=1467104.
Comment 22 Takuro Ashie 2017-07-12 05:08:48 UTC
Now I confirmed that it's firefox-wayland's issue, not GTK+.
(See https://bugzilla.redhat.com/show_bug.cgi?id=1467104)

(In reply to Olivier Fourdan from comment #18)
> If fixing this in Firefox, please make sure the fix also works on x11/CSD as
> well.

I confirmed it. Thank you for your effort.
Comment 23 Martin Stransky 2017-07-12 08:33:30 UTC
Well, and why is the "unrealize" handler call missing here? I think the issue here is that the unrealize handler does not remove the GdkWindow created in realize handler, right?

With this patch we'll leak the wayland surfaces here.
Comment 24 Martin Stransky 2017-07-12 09:48:40 UTC
I added the GdkWindow delete to unrealized handler - please check if that fixes for you. 

https://github.com/stransky/gecko-dev/commit/dba7baee43fd9f75ae70c76f5ec4850f392cf0b9
Comment 25 Olivier Fourdan 2017-07-12 09:55:14 UTC
moz_container_unrealize() is if defined(MOZ_WAYLAND) only, that won't work with x11 on CSD which is affected as well by this bug (comment 18).
Comment 26 Martin Stransky 2017-07-12 09:57:42 UTC
(In reply to Olivier Fourdan from comment #25)
> moz_container_unrealize() is if defined(MOZ_WAYLAND) only, that won't work
> with x11 on CSD which is affected as well by this bug (comment 18).

MOZ_WAYLAND is a build-time option - moz_container_unrealize() used when wayland enabled build is running on X11.
Comment 27 Takuro Ashie 2017-07-12 10:17:18 UTC
(In reply to Martin Stransky from comment #23)

https://bugzilla.redhat.com/show_bug.cgi?id=1467104#c10