GNOME Bugzilla – Bug 778862
Crash on exit when implementing Gio.Application.vfunc_dbus_unregister()
Last modified: 2018-01-27 12:04:00 UTC
Sample script to reproduce the problem: const Gio = imports.gi.Gio; const Lang = imports.lang; const App = new Lang.Class({ Name: 'App', Extends: Gio.Application, vfunc_activate: function () { this.parent(); }, vfunc_dbus_unregister: function () {}, }); new App({ application_id: 'com.example.Foo' }).run(['foo']); Output: ** Gjs:ERROR:/blah/gjs/gi/object.cpp:1383:void disassociate_js_gobject(GObject*): assertion failed: (!cancel_toggle_idle(gobj, TOGGLE_UP)) Aborted (core dumped) Looks like fallout from my recent patches.
I'm not sure this is a recent regression, I remember having similar problems in gnome-weather in old code too. The problem is that dbus_unregister is called after the last mainloop iteration, so we can't use the idle to clean up the toggle ref.
I'm pretty sure it's a new problem in the original code which I distilled the above minimal example from. But I guess that just means that it's a race between the toggle ref resolving and the end of the mainloop.
From the code: * Note that one would think that toggling up only happens * in the main thread (because toggling up is the result of * the JS object, previously visible only to JS code, becoming * visible to the refcounted C world), but because of weird * weak singletons like g_bus_get_sync() objects can see toggle-ups * from different threads too. * We could lock the keep alive structure and avoid the idle maybe, * but there aren't many peculiar objects like that and it's * not a big deal. I think we should try to avoid the idle function, at least, because otherwise we have to assume there's a main loop running at all times. This certainly won't work in gjs-console.
I realized that won't work; switching the GjsMaybeOwned wrapper to and from rooted mode, requires being in a request, and that can only happen from the thread that's associated with the JSRuntime. (CurrentThreadCanAccessRuntime() assertion in StartRequest().) I don't understand why this code in gjs_object_clear_toggles() while (g_main_context_pending(NULL) && g_atomic_int_get(&pending_idle_toggles) > 0) { g_main_context_iteration(NULL, false); } doesn't clean up the pending toggle refs, though. I guess after the main loop is shut down there will be nothing pending on the default GMainContext? Maybe a solution would be to put the pending toggles in an async queue, and have the main thread drain that queue in an idle function. That way, when shutting down the GjsContext, we could empty out the queue without depending on there being a main loop running.
Created attachment 346779 [details] [review] object: Downgrade severity of toggle pending on dissociate Since there may be idle toggle refs pending when a GObject is dissociated from its JS wrapper, during the normal course of GJS usage, we should not assert in this case. Instead log a critical warning with a diagnostic message explaining when this is / isn't a bug.
Since we've determined that this can happen in normal usage of GJS, at least until we get rid of the idle function, here's a patch to downgrade the assert to a critical.
Review of attachment 346779 [details] [review]: It's a bit ugly to spam the output with this message, because there's nothing the application can do to fix it, but better than crashing I guess... I'd be OK with this for now, if we add a FIXME to indicate that it's temporary.
OK, I'll push it with a FIXME comment referring to this bug.
Comment on attachment 346779 [details] [review] object: Downgrade severity of toggle pending on dissociate Attachment 346779 [details] pushed as e290481 - object: Downgrade severity of toggle pending on dissociate
Created attachment 347592 [details] [review] object: Fix toggle check in dissociate_js_gobject() If both a toggle up and toggle down were queued, we don't want to assert here, since the object would be scheduled to get freed anyway.
Created attachment 347593 [details] [review] object: Queue object toggles and drain them in idle Previously, object toggle notifications that happened on a non-owner thread would cause the toggles to happen in an idle function. That was not good if no main loop was running, because then the toggles would just queue up indefinitely, eventually causing a crash when cleaning up objects at context shutdown time. Instead of doing that, we make a threadsafe FIFO queue of toggles, and drain it in an idle function. This has just the same effect, except that we now also have the option of draining it outside of an idle function. So when preparing to shut down the context, we drain it synchronously.
Created attachment 347594 [details] [review] object: Fix comments about toggle refs These comments were lagging behind the code since we switched from KeepAlive to GjsMaybeOwned.
Created attachment 347595 [details] [review] context: Drain toggle queue at start of GC In order to minimize the chances of a JS object being garbage collected while it is queued to toggle up, we handle all pending object toggles when garbage collection starts.
Created attachment 347596 [details] [review] WIP - object: Clear toggles in weak pointer callback This is not supposed to work, but it does fix the crash from this bug.
Well, this is still a mystery to me. I refactored it so that the toggles don't _have_ to be handled in an idle function (though they still normally are.) However, the only thing that really fixes it is clearing the toggles inside the weak pointer callback. That seems fishy to me. The crash meant that a JS object wrapper was being GC'd while its GObject had already been toggled up on a different thread. So, draining the toggle queue there _should_ mean that we are rooting a JS object that is already marked for collection. And in that case, anytime we access the rooted object, we'll crash on a use-after-free.
Review of attachment 347592 [details] [review]: Looks good
Review of attachment 347593 [details] [review]: Nice, I only have one small comment. ::: gi/toggle.cpp @@ +104,3 @@ + + if (item.needs_unref) + g_object_unref(item.gobj); It doesn't matter in practice but you could set item.needs_unref = false here
Review of attachment 347594 [details] [review]: ++
Review of attachment 347595 [details] [review]: ++
Review of attachment 347596 [details] [review]: To be honest I'm not sure if it's worth trading off the critical with code that may have unintended effects... Maybe we could have a third opinion from Giovanni or the release team.
I'll push the comment fix, in any case.
Comment on attachment 347594 [details] [review] object: Fix comments about toggle refs Attachment 347594 [details] pushed as d55d10a - object: Fix comments about toggle refs
Created attachment 349125 [details] [review] WIP - object: Clear toggles in weak pointer callback This is not supposed to work, but it does fix the crash from this bug.
Attachment 347592 [details] pushed as 7735dd2 - object: Fix toggle check in dissociate_js_gobject() Attachment 347593 [details] pushed as 0cf7f83 - object: Queue object toggles and drain them in idle Attachment 347595 [details] pushed as 404e705 - context: Drain toggle queue at start of GC
Review of attachment 349125 [details] [review]: Marking this "needs work" because we still don't understand why it works, when it actually should not.
-- GitLab Migration Automatic Message -- This bug has been migrated to GNOME's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/gjs/issues/103.