After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 762994 - Race condition in GIO/AppFileChooser crashes Firefox/Gtk3
Race condition in GIO/AppFileChooser crashes Firefox/Gtk3
Status: RESOLVED FIXED
Product: glib
Classification: Platform
Component: gio
unspecified
Other Linux
: Normal normal
: ---
Assigned To: gtkdev
gtkdev
Depends on:
Blocks:
 
 
Reported: 2016-03-02 16:41 UTC by Martin Stransky
Modified: 2016-04-26 13:22 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
testcase (2.40 KB, text/x-csrc)
2016-03-02 16:41 UTC, Martin Stransky
  Details
testcase (1.58 KB, text/plain)
2016-03-02 16:44 UTC, Martin Stransky
  Details
GContextSpecificGroup: detach sources (1.60 KB, patch)
2016-04-26 08:49 UTC, Allison Karlitskaya (desrt)
committed Details | Review
GContextSpecificGroup: add testcase (1.49 KB, patch)
2016-04-26 09:18 UTC, Allison Karlitskaya (desrt)
committed Details | Review

Description Martin Stransky 2016-03-02 16:41:43 UTC
Created attachment 322880 [details]
testcase

Looks like a race condition and regression in GIO 2.46 (F23, F24, 
rawhide) which causes appchooser dialod to crash everytime in
Firefox.

There's a testcase. Detailed description and analysis is available at mozilla bugzilla: 

https://bugzilla.mozilla.org/show_bug.cgi?id=1244305#c28

Thanks!
Comment 1 Martin Stransky 2016-03-02 16:44:22 UTC
Created attachment 322881 [details]
testcase
Comment 2 Martin Stransky 2016-03-02 16:45:37 UTC
There are two warnings produced by this testcase:

(test2:22931): GLib-GObject-WARNING **: instance of invalid non-instantiatable type '(null)'

(test2:22931): GLib-GObject-CRITICAL **: g_signal_emit_valist: assertion 'G_TYPE_CHECK_INSTANCE (instance)' failed

the invalid instance which is actually freed memory causes the crash.
Comment 3 Karl Tomlinson 2016-03-02 22:22:51 UTC
Looking at the source, without reproducing to confirm:

g_context_specific_group_get() attaches a source pointing at a new
G_TYPE_APP_INFO_MONITOR to the context:
https://git.gnome.org/browse/glib/tree/gio/gcontextspecificgroup.c?h=2.46.2#n191

g_context_specific_source_dispatch() always returns G_SOURCE_CONTINUE:
https://git.gnome.org/browse/glib/tree/gio/gcontextspecificgroup.c?h=2.46.2#n56

g_context_specific_group_remove() is called when the G_TYPE_APP_INFO_MONITOR
is finalized, but does not remove the source from the context nor detach the
G_TYPE_APP_INFO_MONITOR instance from the source.  It merely unrefs the
source:

https://git.gnome.org/browse/glib/tree/gio/gcontextspecificgroup.c?h=2.46.2#n225

I don't see anything to detach the source from the context.
Seems we are missing a g_source_destroy() in remove.

g_source_attach increments the ref count so unref is not enough to finalize:
https://git.gnome.org/browse/glib/tree/glib/gmain.c?h=2.46.2#n1111

This code was in 2.44.1 so I don't know why it is not reproducing there.
Perhaps something in this changeset triggered:

https://git.gnome.org/browse/glib/commit/?id=2737ab3201163631be152801a859b3874a667f10
Comment 4 Allison Karlitskaya (desrt) 2016-04-26 08:39:38 UTC
Your analysis of this situation is pretty much completely correct.  Thanks for finding this and working it out.

Indeed, the changes in file monitors are likely to blame here.  These changes made is to that events are emitted much more rapidly, which causes the race to occur.  Before, changes were emitted only after a long delay, which would have nicely avoided the race.
Comment 5 Allison Karlitskaya (desrt) 2016-04-26 08:49:51 UTC
Created attachment 326734 [details] [review]
GContextSpecificGroup: detach sources

GContextSpecificGroup has been somewhat broken for a rather long time:
when we remove the last reference on an object held in the group, we try
to clean up the source, but fail to actually remove it from the
mainloop.

We will soon stop emitting signals on the source (due to it having been
removed from the hash table) but any "in flight" signals will still be
delivered on the source, which continues to exist.  This is a problem if
the event is being delivered just as the object is being destroyed.

This also means that we leave the source attached to the mainloop
forever (and next time will create a new one)...

This is demonstrated with the GtkAppChooser dialog which writes an
update to the mimeapps.list file just as it is closing, triggering the
app info monitor to fire just as it is being destroyed.

Karl Tomlinson correctly analysed the problem and proposed this fix.
Comment 6 Allison Karlitskaya (desrt) 2016-04-26 09:18:13 UTC
Created attachment 326739 [details] [review]
GContextSpecificGroup: add testcase

Add a test case for unreffing an object from a GContextSpecificGroup
immediately after firing a signal, before allowing the mainloop to run.
Comment 7 Allison Karlitskaya (desrt) 2016-04-26 13:22:18 UTC
<danw> desrt: seems plausible but i'm not familiar with GContextSpecificGroup
<alex> desrt: yeah, it looks good to me too, but i also don't know this code very well
<mclasen> I don't know that code at all, but it also looks reasonable to me :-)
— @mclasen adds to the noncommittal choir

Attachment 326734 [details] pushed as 62f320e - GContextSpecificGroup: detach sources
Attachment 326739 [details] pushed as 7558995 - GContextSpecificGroup: add testcase