After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 724798 - Random gnome-shell crashes. Mainly after searching in gnome-shell.
Random gnome-shell crashes. Mainly after searching in gnome-shell.
Status: RESOLVED FIXED
Product: gnome-shell
Classification: Core
Component: general
3.11.x
Other Linux
: Normal normal
: ---
Assigned To: gnome-shell-maint
gnome-shell-maint
Depends on:
Blocks:
 
 
Reported: 2014-02-20 12:59 UTC by sangu
Modified: 2014-03-12 18:30 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
search: destroy result actors before forgetting about them (1.36 KB, patch)
2014-02-23 23:26 UTC, Giovanni Campagna
committed Details | Review

Description sangu 2014-02-20 12:59:54 UTC
(gdb) bt
  • #0 js::ObjectClass
    from /lib64/libmozjs-24.so
  • #1 FinalizeArenas(js::FreeOp*, js::gc::ArenaHeader**, js::gc::ArenaList&, js::gc::AllocKind, js::SliceBudget&)
    from /lib64/libmozjs-24.so
  • #2 SweepBackgroundThings(JSRuntime*, bool)
    from /lib64/libmozjs-24.so
  • #3 js::GCHelperThread::doSweep()
    from /lib64/libmozjs-24.so
  • #4 js::GCHelperThread::threadLoop()
    from /lib64/libmozjs-24.so
  • #5 _pt_root
    from /lib64/libnspr4.so
  • #6 start_thread
    from /lib64/libpthread.so.0
  • #7 clone
    from /lib64/libc.so.6

$ dmesg
[...]
JS GC Helper[1694]: segfault at 200000049 ip 00007fdfc79b728e sp 00007fdfa5839960 error 4 in libmozjs-24.so[7fdfc77a5000+562000]
JS GC Helper[9586]: segfault at 7f6431a05d00 ip 00007f6431a05d00 sp 00007f640effc958 error 15 in libmozjs-24.so[7f6431a00000+a000]
Comment 1 Giovanni Campagna 2014-02-20 13:02:57 UTC
Just reproduced this 5 minutes ago, with a debug build of libmozjs you get

Assertion failure: !rt->isHeapCollecting(), at /home/giovanni/gnome/js24-24.2.0/js/src/jsapi.cpp:206

because StBin.grid-search-result is being destroyed for the first time, so it destroys its child, causing a 'destroy' signal that calls back into JS code. And that's not allowed during GC sweep.
Comment 2 Giovanni Campagna 2014-02-23 23:26:33 UTC
Created attachment 270079 [details] [review]
search: destroy result actors before forgetting about them

We can't let live (ie, never destroyed) actors undergo GC, because
they will emit :destroy signals during finalization and assert/crash
libmozjs. Properly destroy all actors before letting the GC
free them.
Comment 3 Jasper St. Pierre (not reading bugmail) 2014-02-23 23:38:53 UTC
Review of attachment 270079 [details] [review]:

This seems like a super terrible bug. We can't just let GC crash the Shell and hack around it everywhere. Plenty of extensions will likely crash the Shell as well. I know you have a gjs bug open for this.

We need to dispose GObjects in the main thread. Ryan has long wanted a "GCleanup" that would make it safe to do g_object_unref from any thread and have it get disposed on the thread the object was created in. It might be possible to do something like that here: through toggle refs, we know if we're holding onto the last reference of an object, so we know if the object will be finalized in the main thread or not.

We should queue finalization of objects to be handled in the main thread, since we're quite certain that it was created it in the main thread, at least until we get GCleanup.

::: js/ui/search.js
@@ +337,3 @@
     _ensureResultActors: function(results, callback) {
         let metasNeeded = results.filter(Lang.bind(this, function(resultId) {
+            return !(resultId in this._resultDisplays);

No, the code before should work fine.
Comment 4 Giovanni Campagna 2014-02-23 23:48:42 UTC
(In reply to comment #3)
> Review of attachment 270079 [details] [review]:
> 
> This seems like a super terrible bug. We can't just let GC crash the Shell and
> hack around it everywhere. Plenty of extensions will likely crash the Shell as
> well. I know you have a gjs bug open for this.
> 
> We need to dispose GObjects in the main thread. Ryan has long wanted a
> "GCleanup" that would make it safe to do g_object_unref from any thread and
> have it get disposed on the thread the object was created in. It might be
> possible to do something like that here: through toggle refs, we know if we're
> holding onto the last reference of an object, so we know if the object will be
> finalized in the main thread or not.
> 
> We should queue finalization of objects to be handled in the main thread, since
> we're quite certain that it was created it in the main thread, at least until
> we get GCleanup.

You didn't read the gjs bug, did you? (Or the stack trace, fwiw)
This is totally running off the main thread. It's deep inside a JS_GC call.
It's the only proper place to finalize GObject stuff, and unless you insist with JSCLASS flags, all JS object finalization happens in the main thread.
You can queue all that you want, the JS object is dead and you get a crash when you invoke the queued idle next time. Or the JS function handling the signal is dead, and you still get a crash. Or you get a new proxy, and the JS function, now bound to something completely different than it expects, fails to do anything useful because the signal IDs set on the expando properties are undefined.

> ::: js/ui/search.js
> @@ +337,3 @@
>      _ensureResultActors: function(results, callback) {
>          let metasNeeded = results.filter(Lang.bind(this, function(resultId) {
> +            return !(resultId in this._resultDisplays);
> 
> No, the code before should work fine.

It's a little different in case of non-enumerable properties (say, a nice search provider returning "watch" as a result id), that's why I changed it to match the for..in in clear()
Comment 5 Giovanni Campagna 2014-02-23 23:54:50 UTC
(In reply to comment #4)
> (In reply to comment #3)
> > Review of attachment 270079 [details] [review] [details]:
> 
> You didn't read the gjs bug, did you? (Or the stack trace, fwiw)

Ops, I just noticed the original stack trace does not show the bug is not threading related (because without JS_DEBUG, the bug manifests as a crash when the resurrected object is double-freed in the background thread)

With a "simple" test case such as:
#!/usr/bin/env gjs

const GLib = imports.gi.GLib;
const GObject = imports.gi.GObject;
const Gtk = imports.gi.Gtk;
const System = imports.system;

function test() {
    let w = new Gtk.Button();
    w._foo = true;
    w.connect('destroy', function() {
        log('destroy invoked');
        log('w is ' + w);
    });

    let c = new Gtk.Box();
    c.add(w);
}

function main() {
    Gtk.init(null);

    // Create an object
    // Make sure the object uses toggle references
    // Create a reference cycle
    // Install a callback that will be invoked
    // when the object is finalized
    // Get the object out of scope
    // Run the mainloop for a while, to make sure gjs
    // deferred keep-alive callbacks are run
    // Force a GC
    //
    // The test passes if there is a critical telling
    // you that the JS callback is not invoked;
    // it fails if gjs aborts in libmozjs or if
    // the callback is invoked

    test();

    GLib.timeout_add(GLib.PRIORITY_DEFAULT, 500, Gtk.main_quit);
    Gtk.main();

    System.gc();
    return 0;
}
main();

you get a more clear picture:

  • #0 AssertHeapIsIdleOrIterating
    at /home/giovanni/gnome/js24-24.2.0/js/src/jsapi.cpp line 206
  • #1 AssertHeapIsIdleOrIterating
    at /home/giovanni/gnome/js24-24.2.0/js/src/jsapi.cpp line 212
  • #2 JSAutoCompartment::JSAutoCompartment
    at /home/giovanni/gnome/js24-24.2.0/js/src/jsapi.cpp line 1459
  • #3 closure_marshal
    at gi/value.cpp line 79
  • #4 g_closure_invoke
    at gclosure.c line 768
  • #5 signal_emit_unlocked_R
    at gsignal.c line 3551
  • #6 g_signal_emit_valist
    at gsignal.c line 3307
  • #7 g_signal_emit
    at gsignal.c line 3363
  • #8 gtk_widget_dispose
    at gtkwidget.c line 11334
  • #9 g_object_run_dispose
    at gobject.c line 1073
  • #10 gtk_box_forall
    at gtkbox.c line 2111
  • #11 gtk_container_destroy
    at gtkcontainer.c line 1409
  • #12 g_closure_invoke
    at gclosure.c line 768
  • #13 signal_emit_unlocked_R
    at gsignal.c line 3667
  • #14 g_signal_emit_valist
    at gsignal.c line 3307
  • #15 g_signal_emit
    at gsignal.c line 3363
  • #16 gtk_widget_dispose
    at gtkwidget.c line 11334
  • #17 g_object_unref
    at gobject.c line 3075
  • #18 release_native_object
    at gi/object.cpp line 1082
  • #19 object_instance_finalize
    at gi/object.cpp line 1378
  • #20 finalize
    at /home/giovanni/gnome/js24-24.2.0/js/src/jsobjinlines.h line 213
  • #21 js::gc::Arena::finalize<JSObject>
    at /home/giovanni/gnome/js24-24.2.0/js/src/jsgc.cpp line 331
  • #22 FinalizeTypedArenas<JSObject>
    at /home/giovanni/gnome/js24-24.2.0/js/src/jsgc.cpp line 395
  • #23 FinalizeArenas
    at /home/giovanni/gnome/js24-24.2.0/js/src/jsgc.cpp line 432
  • #24 finalizeNow
    at /home/giovanni/gnome/js24-24.2.0/js/src/jsgc.cpp line 1306
  • #25 js::gc::ArenaLists::queueObjectsForSweep
    at /home/giovanni/gnome/js24-24.2.0/js/src/jsgc.cpp line 1402
  • #26 BeginSweepingZoneGroup
    at /home/giovanni/gnome/js24-24.2.0/js/src/jsgc.cpp line 3677
  • #27 BeginSweepPhase
    at /home/giovanni/gnome/js24-24.2.0/js/src/jsgc.cpp line 3761
  • #28 IncrementalCollectSlice
    at /home/giovanni/gnome/js24-24.2.0/js/src/jsgc.cpp line 4289
  • #29 GCCycle
    at /home/giovanni/gnome/js24-24.2.0/js/src/jsgc.cpp line 4422
  • #30 Collect
    at /home/giovanni/gnome/js24-24.2.0/js/src/jsgc.cpp line 4558
  • #31 js::GC
    at /home/giovanni/gnome/js24-24.2.0/js/src/jsgc.cpp line 4580
  • #32 JS_GC
    at /home/giovanni/gnome/js24-24.2.0/js/src/jsapi.cpp line 2709
  • #33 gjs_gc
    at modules/system.cpp line 105
  • #34 js::CallJSNative
  • #35 js::Invoke
    at /home/giovanni/gnome/js24-24.2.0/js/src/vm/Interpreter.cpp line 481
  • #36 Interpret
    at /home/giovanni/gnome/js24-24.2.0/js/src/vm/Interpreter.cpp line 2298
  • #37 js::RunScript
    at /home/giovanni/gnome/js24-24.2.0/js/src/vm/Interpreter.cpp line 438
  • #38 js::ExecuteKernel
    at /home/giovanni/gnome/js24-24.2.0/js/src/vm/Interpreter.cpp line 622
  • #39 js::Execute
    at /home/giovanni/gnome/js24-24.2.0/js/src/vm/Interpreter.cpp line 659
  • #40 JS::Evaluate
    at /home/giovanni/gnome/js24-24.2.0/js/src/jsapi.cpp line 5443
  • #41 JS::Evaluate
  • #42 gjs_eval_with_scope
  • #43 gjs_context_eval
  • #44 main
    at gjs/console.cpp line 140

Comment 6 Jasper St. Pierre (not reading bugmail) 2014-02-24 00:05:08 UTC
(In reply to comment #4)

Ah, thanks for the explanation. I guess I got this confused with another finalization bug. It seems the only thing we can truly do in this case is to prevent objects with signals from being finalized, which is extremely poor. Or we can just prevent signals from being emitted after being finalized, which is also poor. Hm.

> It's a little different in case of non-enumerable properties (say, a nice
> search provider returning "watch" as a result id), that's why I changed it to
> match the for..in in clear()

js> "watch" in {}
true

The "in" operator doesn't care about enumerability at all. You probably want foo.hasOwnProperty("watch");. But that's a separate fix.
Comment 7 Giovanni Campagna 2014-03-12 18:30:21 UTC
Pushed after Jasper acknowledged it was a real bug in the last comment,
with the problematic part (in vs === undefined) removed.