Bug 781799 – gnome-shell 3.24.1 crash on wayland

Summary:	gnome-shell 3.24.1 crash on wayland


Status:	RESOLVED FIXED

Product:	gjs
Classification:	Bindings
Component:	general
Version:	1.48.x
Hardware:	Other Linux

Importance:	Normal critical
Target Milestone:	---
Assigned To:	Philip Chimento
QA Contact:	gjs-maint

URL:
Whiteboard:

Duplicates:	782058 782060 782589 782827 782851 783699 783723 783769 783813 783904 (view as bug list)
Depends on:
Blocks:

Reported:	2017-04-26 20:40 UTC by Domenico Iezzi
Modified:	2017-06-26 19:10 UTC

See Also:	https://bugzilla.redhat.com/show_bug.cgi?id=1451914
GNOME target:	---
GNOME version:	---

Attachments
output of journalctl -xe (7.82 KB, text/plain) 2017-04-26 20:40 UTC, Domenico Iezzi		Details
Stack trace (8.59 KB, text/plain) 2017-05-03 20:28 UTC, Vladimir Stoyakin		Details
Stack trace - full (22.94 KB, text/plain) 2017-05-03 20:29 UTC, Vladimir Stoyakin		Details
Another journalctl stacktrace (just in case it is helpful) (19.33 KB, text/plain) 2017-05-05 00:05 UTC, Mike Javorski		Details
Stack trace (29.17 KB, text/plain) 2017-05-19 10:26 UTC, Baptiste Mille-Mathias		Details
Just another log with stacktrace (21.36 KB, text/x-log) 2017-05-25 21:28 UTC, Cengiz Can		Details
Crash of gnome-shell by running sushi (76.12 KB, text/x-log) 2017-05-27 23:59 UTC, Vladimir Stoyakin		Details
Valgrind-Memcheck report (99.22 KB, text/plain) 2017-06-04 20:07 UTC, Vladimir Stoyakin		Details
object: Prevent use-after-free in signal connections (8.05 KB, patch) 2017-06-11 20:04 UTC, Philip Chimento	committed	Details \| Review
util-root: Allow GjsMaybeOwned::DestroyNotify to free (5.29 KB, patch) 2017-06-11 20:04 UTC, Philip Chimento	rejected	Details \| Review
gnome-shell crash backtrace (12.90 KB, text/plain) 2017-06-13 21:32 UTC, John		Details
util-root: Require GjsMaybeOwned callback to reset (3.13 KB, patch) 2017-06-15 06:54 UTC, Philip Chimento	committed	Details \| Review
Still gnome-shell crash on Wayland!! Version 3.24.2 (37.03 KB, image/png) 2017-06-26 16:04 UTC, jeyhunn		Details
Systemd CoreDump Info for Gnome-Shell crash (8.81 KB, text/plain) 2017-06-26 18:49 UTC, jeyhunn		Details

Description Domenico Iezzi 2017-04-26 20:40:06 UTC

Created attachment 350505 [details]
output of journalctl -xe

OS: Archlinux
Linux 4.10.11
Gnome 3.24.1
Wayland 1.13.0

With my dual-monitor setup (laptop + 21" hp monitor), gnome-shell randomly crashes with the following error message:

Apr 26 22:15:00 plugsuite python3[7044]: Error reading events from display: Connection reset by peer
Apr 26 22:15:00 plugsuite unknown[6667]: Error reading events from display: Broken pipe
Apr 26 22:15:00 plugsuite org.gnome.Shell.desktop[914]: (EE)
Apr 26 22:15:00 plugsuite org.gnome.Shell.desktop[914]: Fatal server error:
Apr 26 22:15:00 plugsuite org.gnome.Shell.desktop[914]: (EE) failed to read Wayland events: Broken pipe
Apr 26 22:15:00 plugsuite org.gnome.Shell.desktop[914]: (EE)

After that it brings me to the GDM login screen.

The only way I have to reproduce this issue is to start a wayland session from gdm and wait a random amount of time. This does not happen on Xorg. I have provided as an attachment the journalctl log.

Comment 1 Jonas Ådahl 2017-04-28 02:44:55 UTC

Seems to be gjs trying to free something it shouldn't, possibly during GC.

By "random amount of time", do you mean a random number of seconds, minutes, hours or days?

Philip, does this look similar to the other gjs related issues?

Comment 2 Philip Chimento 2017-04-28 05:03:21 UTC

If it's GNOME 3.24.1, then it's likely GJS 1.48.1 which still suffered from bug 781194. The stack trace looks like it could be the same, but it's hard to tell for sure without debug symbols.

If this is GJS 1.48.2, then it's a new issue. A stack trace with debug symbols would be most helpful in that case.

Comment 3 Vladimir Stoyakin 2017-04-30 11:08:25 UTC

I use only one monitor, but I think it is the same bug.
This is the stack trace from my machine:

Stack trace of thread 369:
#0  0x00007f548cd62a10 raise (libc.so.6)
#1  0x00007f548cd6413a abort (libc.so.6)
#2  0x00007f548cda12b0 __libc_message (libc.so.6)
#3  0x00007f548cda790e malloc_printerr (libc.so.6)
#4  0x00007f548cda811e _int_free (libc.so.6)
#5  0x00007f548f22d9a3 _ZN13GjsMaybeOwnedIN2JS5ValueEE16teardown_rootingEv (libgjs.so.0)
#6  0x00007f5488456183 _ZN8JSObject8finalizeEPN2js6FreeOpE (libmozjs-38.so)
#7  0x00007f54884b0d7c _ZN2js2gc10ArenaLists16forceFinalizeNowEPNS_6FreeOpENS0_9AllocKindENS1_14KeepArenasEnumEPPNS0_11ArenaHeaderE (libmozjs-38.so)
#8  0x00007f5488457489 _ZN2js2gc10ArenaLists11finalizeNowEPNS_6FreeOpENS0_9AllocKindENS1_14KeepArenasEnumEPPNS0_11ArenaHeaderE (libmozjs-38.so)
#9  0x00007f548846cbe3 _ZN2js2gc9GCRuntime22beginSweepingZoneGroupEv (libmozjs-38.so)
#10 0x00007f548846d622 _ZN2js2gc9GCRuntime15beginSweepPhaseEb (libmozjs-38.so)
#11 0x00007f548846f398 _ZN2js2gc9GCRuntime23incrementalCollectSliceERNS_11SliceBudgetEN2JS8gcreason6ReasonE (libmozjs-38.so)
#12 0x00007f548846fd40 _ZN2js2gc9GCRuntime7gcCycleEbRNS_11SliceBudgetEN2JS8gcreason6ReasonE (libmozjs-38.so)
#13 0x00007f548846ff8d _ZN2js2gc9GCRuntime7collectEbNS_11SliceBudgetEN2JS8gcreason6ReasonE (libmozjs-38.so)
#14 0x00007f5488470354 _ZN2js2gc9GCRuntime7startGCE18JSGCInvocationKindN2JS8gcreason6ReasonEl (libmozjs-38.so)
#15 0x00007f548f241df9 gjs_schedule_gc_if_needed (libgjs.so.0)
#16 0x00007f548f241e64 gjs_call_function_value (libgjs.so.0)
#17 0x00007f548f21cfe5 gjs_closure_invoke (libgjs.so.0)
#18 0x00007f548f234cdc closure_marshal (libgjs.so.0)
#19 0x00007f548d614f75 g_closure_invoke (libgobject-2.0.so.0)
#20 0x00007f548d626f82 n/a (libgobject-2.0.so.0)
#21 0x00007f548d62fbdc g_signal_emit_valist (libgobject-2.0.so.0)
#22 0x00007f548d62ffbf g_signal_emit (libgobject-2.0.so.0)
#23 0x00007f548d6193a4 n/a (libgobject-2.0.so.0)
#24 0x00007f548d618c46 n/a (libgobject-2.0.so.0)
#25 0x00007f548d61d130 g_object_set_property (libgobject-2.0.so.0)
#26 0x00007f548f229557 set_g_param_from_prop (libgjs.so.0)
#27 0x00007f5488181972 _ZN2js22CallJSPropertyOpSetterEP9JSContextPFbS1_N2JS6HandleIP8JSObjectEENS3_I4jsidEEbNS2_13MutableHandleINS2_5ValueEEEES6_S8_bSB_ (libmozjs-38.so)
#28 0x00007f548815c0c9 NativeSet (libmozjs-38.so)
#29 0x00007f548815e8db SetExistingProperty (libmozjs-38.so)
#30 0x00007f54882ae07f _ZN2js11SetPropertyEP9JSContextN2JS6HandleIP8JSObjectEES6_NS3_I4jsidEENS2_13MutableHandleINS2_5ValueEEEb (libmozjs-38.so)
#31 0x00007f548fbbd186 n/a (n/a)

OS: ArchLinux
linux 4.10.11-1
gnome-shell 3.24.1+2+g45c2627d4-1
gjs 1.48.2-1
js38 38.8.0-3

Comment 4 Philip Chimento 2017-04-30 19:11:38 UTC

With GJS 1.48.2, this is likely not the same as bug 781194. It looks like a new one; I'll reassign this to GJS.

Any information on how often this happens or any common circumstances where it is triggered?

Is it possible to get a stack trace with full debug info? I would need to know the output of `p *this` at frame 5 in any case, and getting demangled names with line numbers in the stack trace would also be very helpful.
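
For anyone following along: in the trace from comment 3, frame 5 is the first libgjs frame. A minimal sketch for picking library frames out of a trace in that `#N address symbol (library)` layout follows. The helper names `parse_frame` and `frames_in` are made up for this illustration, and the regular expression assumes symbols contain no spaces, which holds for the mangled names and `n/a` entries shown here but not for demangled output.

```python
import re

# Matches lines such as:
#   #5  0x00007f548f22d9a3 _ZN13Gjs...Ev (libgjs.so.0)
# Assumption: the symbol field contains no spaces.
FRAME_RE = re.compile(r"^#(\d+)\s+(0x[0-9a-f]+)\s+(\S+)\s+\((.+)\)\s*$")


def parse_frame(line):
    """Return (index, address, symbol, library), or None if line is not a frame."""
    m = FRAME_RE.match(line.strip())
    if m is None:
        return None
    index, address, symbol, library = m.groups()
    return int(index), int(address, 16), symbol, library


def frames_in(trace, library_prefix):
    """Return the parsed frames whose library name starts with library_prefix."""
    frames = (parse_frame(line) for line in trace.splitlines())
    return [f for f in frames if f is not None and f[3].startswith(library_prefix)]


# A few frames copied from the trace in comment 3.
TRACE = """\
Stack trace of thread 369:
#4  0x00007f548cda811e _int_free (libc.so.6)
#5  0x00007f548f22d9a3 _ZN13GjsMaybeOwnedIN2JS5ValueEE16teardown_rootingEv (libgjs.so.0)
#6  0x00007f5488456183 _ZN8JSObject8finalizeEPN2js6FreeOpE (libmozjs-38.so)
#15 0x00007f548f241df9 gjs_schedule_gc_if_needed (libgjs.so.0)
"""

gjs_frames = frames_in(TRACE, "libgjs")
# Print the index and symbol of the innermost libgjs frame.
print(gjs_frames[0][0], gjs_frames[0][2])
```

Running the mangled name of that frame through a demangler such as c++filt should give `GjsMaybeOwned<JS::Value>::teardown_rooting()`, which matches the demangled traces posted later in this report.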

Comment 5 Vladimir Stoyakin 2017-05-01 08:08:52 UTC

The crashes are random (once after four hours without any user action), so I can't reproduce them intentionally.
It happened three or four times last week during everyday use.
I will report back if I get more information.

Comment 6 Florian Müllner 2017-05-02 12:41:18 UTC

*** Bug 782058 has been marked as a duplicate of this bug. ***

Comment 7 Vladimir Stoyakin 2017-05-03 20:28:11 UTC

Created attachment 351009 [details]
Stack trace

Comment 8 Vladimir Stoyakin 2017-05-03 20:29:56 UTC

Created attachment 351010 [details]
Stack trace - full

Comment 9 Vladimir Stoyakin 2017-05-03 20:45:34 UTC

Today gnome-shell crashed with message:

08:49:59 kernel: gnome-shell[364]: segfault at 51 ip 00007f2e9a98b98b sp 00007fff2e1d3020 error 6 in libgjs.so.0.0.0[7f2e9a950000+bc000]

I have attached the stack trace. Is there anything I should look for in the core dump?
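
The kernel line above already carries a few hints. Below is a minimal sketch of decoding it, assuming the usual x86 layout of the message (fault address, instruction pointer, stack pointer, page-fault error code, then `name[base+size]`) and the standard x86 page-fault error bits (bit 0: page was present, bit 1: write access, bit 2: user mode). `decode_segfault` is a hypothetical helper, and the computed offset is relative to the mapping start printed by the kernel, which is only assumed to match the library's load address.

```python
import re

# All numeric fields are treated as hexadecimal.
SEGV_RE = re.compile(
    r"segfault at ([0-9a-f]+) ip ([0-9a-f]+) sp ([0-9a-f]+) "
    r"error ([0-9a-f]+) in (\S+)\[([0-9a-f]+)\+([0-9a-f]+)\]"
)


def decode_segfault(line):
    """Decode a kernel 'segfault at ...' message into a small dict."""
    m = SEGV_RE.search(line)
    if m is None:
        raise ValueError("not a segfault line")
    addr, ip, _sp, err, name, base, size = m.groups()
    ip, base, size, err = int(ip, 16), int(base, 16), int(size, 16), int(err, 16)
    return {
        "fault_address": int(addr, 16),
        "library": name,
        "offset_in_mapping": ip - base,
        "inside_mapping": 0 <= ip - base < size,
        "page_present": bool(err & 1),  # bit 0
        "write": bool(err & 2),         # bit 1
        "user_mode": bool(err & 4),     # bit 2
    }


LINE = ("08:49:59 kernel: gnome-shell[364]: segfault at 51 ip 00007f2e9a98b98b "
        "sp 00007fff2e1d3020 error 6 in libgjs.so.0.0.0[7f2e9a950000+bc000]")

info = decode_segfault(LINE)
print(hex(info["fault_address"]), hex(info["offset_in_mapping"]), info["write"])
```

For the line above this gives a user-mode write to a non-present page at address 0x51, from an instruction at offset 0x3b98b into the libgjs mapping. An address that small suggests a write through a near-null or corrupted pointer. With debug symbols installed, an offset like this can usually be resolved to a source line with a tool such as addr2line.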

Comment 10 Philip Chimento 2017-05-04 04:46:06 UTC

Is it possible to find out the values of the fields of *this (of type GjsMaybeOwned) in frame 3 and/or 4?

Comment 11 Vladimir Stoyakin 2017-05-04 21:54:29 UTC

(gdb) p *((GjsMaybeOwned<JS::Value>*) 0x5f76770)

$5 = {m_rooted = 40, m_has_weakref = 229, m_cx = 0x5f35a50, 
  m_heap = {<js::HeapBase<JS::Value>> = {<js::ValueOperations<JS::Heap<JS::Value> >> = {<No data fields>}, <No data fields>}, ptr = {data = {
        asBits = 99834448, debugView = {payload47 = 99834448, tag = 0}, s = {payload = {i32 = 99834448, u32 = 99834448, why = 99834448}}, 
        asDouble = 4.9324771028324344e-316, asPtr = 0x5f35a50, asWord = 99834448, asUIntPtr = 99834448}}}, m_root = 0x5f35c50, m_notify = 0x771e528, 
  m_data = 0x0}

Comment 12 Mike Javorski 2017-05-05 00:04:37 UTC

I have also experienced this issue. I can report that it is not specific to Wayland, though as people note, when running in Wayland you lose everything because your entire session is closed. If you are running X11 (which I have just switched back to since the problem started), gnome-shell is restarted and I don't lose all my work in progress (since X11 stays up).

I've had 4 crashes today. 2 in Wayland, and 2 since I switched to Gnome on X11.

Arch Package versions:
* gjs 1.48.2-1
* js38 38.8.0-3
* gnome-shell 3.24.1+2+g45c2627d4-1

Comment 13 Mike Javorski 2017-05-05 00:05:34 UTC

Created attachment 351144 [details]
Another journalctl stacktrace (just in case it is helpful)

Comment 14 Philip Chimento 2017-05-05 03:38:09 UTC

Mike, your stack trace is different again from the existing two. Any chance you could install debug symbols and wait for another crash to see if it's the same stack trace? The frame of interest is the one on the very top, inside libgjs after g_main_context_dispatch().

Vladimir, thanks, that's very helpful. Looks like the object consists of garbage. That might point to use-after-free. If you are familiar with the RR debugger (or want to become so), you could try running gnome-shell under RR, and stepping backwards to see at what point the object becomes garbage.
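
One way to see why the dump in comment 11 reads as garbage is to check the printed fields for impossible values. The sketch below assumes `m_rooted` and `m_has_weakref` are boolean flags (as their names suggest), so anything other than 0 or 1 is suspect. The field values are copied from the gdb output, and `suspicious_fields` is a name invented for this example.

```python
# Values copied from the gdb output in comment 11.
DUMP = {
    "m_rooted": 40,             # assumed to be a boolean flag
    "m_has_weakref": 229,       # assumed to be a boolean flag
    "m_cx": 0x5F35A50,
    "m_heap_asBits": 99834448,  # raw bits of the stored JS::Value
    "m_root": 0x5F35C50,
    "m_notify": 0x771E528,
    "m_data": 0x0,
}


def suspicious_fields(dump, bool_fields=("m_rooted", "m_has_weakref")):
    """List reasons why the dumped object looks like overwritten memory."""
    reasons = []
    for name in bool_fields:
        if dump[name] not in (0, 1):
            reasons.append(f"{name} = {dump[name]} is not a valid boolean")
    if dump["m_heap_asBits"] == dump["m_cx"]:
        reasons.append("m_heap holds the same bits as the m_cx pointer")
    return reasons


for reason in suspicious_fields(DUMP):
    print(reason)
```

Here both flags hold non-boolean values, and the bits stored in `m_heap` (99834448) are exactly the `m_cx` pointer value 0x5f35a50, which is what one might expect if the memory had been reused for something else.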

CC Georges, since you suffered from the previous crashes on Arch. Have you seen this crash at all?

Comment 15 Mike Javorski 2017-05-05 05:57:30 UTC

(In reply to Philip Chimento from comment #14)
> Mike, your stack trace is different again from the existing two. Any chance
> you could install debug symbols and wait for another crash to see if it's
> the same stack trace? The frame of interest is the one on the very top,
> inside libgjs after g_main_context_dispatch().

Apologies, I didn't really examine the stack traces posted previously, I just read the comments and they seemed to indicate a match. I will try to work out how to install the debug symbols when I am back at the problematic computer tomorrow.

Comment 16 Philip Chimento 2017-05-05 06:01:30 UTC

(In reply to Mike Javorski from comment #15)
> (In reply to Philip Chimento from comment #14)
> > Mike, your stack trace is different again from the existing two. Any chance
> > you could install debug symbols and wait for another crash to see if it's
> > the same stack trace? The frame of interest is the one on the very top,
> > inside libgjs after g_main_context_dispatch().
> 
> Apologies, I didn't really examine the stack traces posted previously, I
> just read the comments and they seemed to indicate a match. I will try to
> work out how to install the debug symbols when I am back at the problematic
> computer tomorrow.

I didn't mean that as a criticism! The more different stack traces, the more we (hopefully) know about this bug...

Comment 17 Mike Javorski 2017-05-05 06:07:22 UTC

No problem Philip. In my world (ruby and golang mostly) different stacktrace generally means a different problem. I will see what I can do about the debug symbols in the morning and see if I can produce a crash or two with more info.

- mike


(In reply to Philip Chimento from comment #16)
> (In reply to Mike Javorski from comment #15)
> > (In reply to Philip Chimento from comment #14)
> > > Mike, your stack trace is different again from the existing two. Any chance
> > > you could install debug symbols and wait for another crash to see if it's
> > > the same stack trace? The frame of interest is the one on the very top,
> > > inside libgjs after g_main_context_dispatch().
> > 
> > Apologies, I didn't really examine the stack traces posted previously, I
> > just read the comments and they seemed to indicate a match. I will try to
> > work out how to install the debug symbols when I am back at the problematic
> > computer tomorrow.
> 
> I didn't mean that as a criticism! The more different stack traces, the more
> we (hopefully) know about this bug...

Comment 18 Georges Basile Stavracas Neto 2017-05-05 12:27:03 UTC

(In reply to Philip Chimento from comment #14)

> CC Georges, since you suffered from the previous crashes on Arch. Have you
> seen this crash at all?

I had this happening on my machine twice, but I also couldn't reliably reproduce it. And it didn't happen frequently enough to bother me, so I didn't dig deeper.

I'll try and see what's happening during this weekend, thanks for pointing me to this bug.

Comment 19 Nicolai Syvertsen 2017-05-05 15:50:55 UTC

Is this for collecting all segfaults in libgjs since the 3.24 update?

Ok here is mine:

+ Trace 237422

#0 js::GCMethods<JSObject*>::needsPostBarrier
at /usr/include/mozjs-38/js/RootingAPI.h line 663
#0 js::GCMethods<JSObject*>::needsPostBarrier(JSObject*)
at /usr/include/mozjs-38/js/RootingAPI.h line 663
#1 JS::Heap<JSObject*>::set(JSObject*)
at /usr/include/mozjs-38/js/RootingAPI.h line 296
#2 JS::Heap<JSObject*>::operator=(JSObject* const&)
at /usr/include/mozjs-38/js/RootingAPI.h line 266
#3 GjsMaybeOwned<JSObject*>::reset()
at ./gjs/jsapi-util-root.h line 266
#4 closure_clear_idle(void*)
at gi/closure.cpp line 132
#5 g_main_context_dispatch
#6 0x00007f559db32a20 in
#7 g_main_loop_run
#8 meta_run
#9 main
at main.c line 454



Distro: Arch Linux
Package gjs 1.48.2-1

Comment 20 Philip Chimento 2017-05-06 06:39:15 UTC

*** Bug 782060 has been marked as a duplicate of this bug. ***

Comment 21 Vladimir Stoyakin 2017-05-07 08:11:53 UTC

(In reply to Philip Chimento from comment #14)

> Vladimir, thanks, that's very helpful. Looks like the object consists of
> garbage. That might point to use-after-free. If you are familiar with the RR
> debugger (or want to become so), you could try running gnome-shell under RR,
> and stepping backwards to see at what point the object becomes garbage.

Unfortunately, my old Core 2 Duo isn't supported by RR. If I can help with something else, I am ready.

Comment 22 Jonas Ådahl 2017-05-11 01:16:53 UTC

Another trace that seems somewhat related. It crashes during garbage collection. It was originally posted in bug 781975.

+ Trace 237467

Thread 1 (Thread 0x7f3e18816f80 (LWP 29811))

#0 g_type_check_instance_cast
at gtype.c line 4052
#1 GjsMaybeOwned<JS::Value>::teardown_rooting()
at gjs/jsapi-util-root.h line 156
#2 GjsMaybeOwned<JS::Value>::reset()
at gjs/jsapi-util-root.h line 270
#3 object_instance_finalize(JSFreeOp*, JSObject*)
at gi/object.cpp line 1626
#4 JSObject::finalize(js::FreeOp*)
at /usr/src/debug/mozilla-esr38/js/src/jsobjinlines.h line 42
#5 js::gc::Arena::finalize<JSObject>(js::FreeOp*, js::gc::AllocKind, unsigned long)
at /usr/src/debug/mozilla-esr38/js/src/jsgc.cpp line 497
#6 FinalizeTypedArenas<JSObject>
at /usr/src/debug/mozilla-esr38/js/src/jsgc.cpp line 557
#7 FinalizeArenas(js::FreeOp *, js::gc::ArenaHeader **, js::gc::SortedArenaList &, enum AllocKind, struct SliceBudget &, js::gc::ArenaLists::KeepArenasEnum)
at /usr/src/debug/mozilla-esr38/js/src/jsgc.cpp line 600
#8 js::gc::ArenaLists::forceFinalizeNow(js::FreeOp*, js::gc::AllocKind, js::gc::ArenaLists::KeepArenasEnum, js::gc::ArenaHeader**)
at /usr/src/debug/mozilla-esr38/js/src/jsgc.cpp line 2758
#9 js::gc::ArenaLists::finalizeNow(js::FreeOp*, js::gc::AllocKind, js::gc::ArenaLists::KeepArenasEnum, js::gc::ArenaHeader**)
at /usr/src/debug/mozilla-esr38/js/src/jsgc.cpp line 2741
#10 js::gc::ArenaLists::queueForegroundObjectsForSweep(js::FreeOp*)
at /usr/src/debug/mozilla-esr38/js/src/jsgc.cpp line 2876
#11 js::gc::GCRuntime::beginSweepingZoneGroup()
at /usr/src/debug/mozilla-esr38/js/src/jsgc.cpp line 5069
#12 js::gc::GCRuntime::beginSweepPhase(bool)
at /usr/src/debug/mozilla-esr38/js/src/jsgc.cpp line 5164
#13 js::gc::GCRuntime::incrementalCollectSlice(js::SliceBudget&, JS::gcreason::Reason)
at /usr/src/debug/mozilla-esr38/js/src/jsgc.cpp line 5889
#14 js::gc::GCRuntime::gcCycle(bool, js::SliceBudget&, JS::gcreason::Reason)
at /usr/src/debug/mozilla-esr38/js/src/jsgc.cpp line 6076
#15 js::gc::GCRuntime::collect(bool, js::SliceBudget, JS::gcreason::Reason)
at /usr/src/debug/mozilla-esr38/js/src/jsgc.cpp line 6190
#16 js::gc::GCRuntime::startGC(JSGCInvocationKind, JS::gcreason::Reason, long)
at /usr/src/debug/mozilla-esr38/js/src/jsgc.cpp line 6259
#17 js::gc::GCRuntime::maybePeriodicFullGC()
at /usr/src/debug/mozilla-esr38/js/src/jsgc.cpp line 3246
#18 JS_MaybeGC(JSContext*)
at /usr/src/debug/mozilla-esr38/js/src/jsapi.cpp line 1341
#19 gjs_schedule_gc_if_needed(JSContext*)
at gjs/jsapi-util.cpp line 844
#20 gjs_call_function_value(JSContext*, JS::HandleObject, JS::HandleValue, JS::HandleValueArray const&, JS::MutableHandleValue)
at gjs/jsapi-util.cpp line 719
#21 gjs_closure_invoke(GClosure*, JS::HandleValueArray const&, JS::MutableHandleValue)
at gi/closure.cpp line 229
#22 closure_marshal(GClosure*, GValue*, guint, GValue const*, gpointer, gpointer)
at gi/value.cpp line 273
#26 <emit signal ??? on instance 0x556510f60490 [IBusPanelService]>
at gsignal.c line 3447
#27 ibus_panel_service_service_method_call
at ibuspanelservice.c line 1064
#28 call_in_idle_cb
at gdbusconnection.c line 4850
#29 g_idle_dispatch
at gmain.c line 5554
#30 g_main_dispatch
at gmain.c line 3212
#31 g_main_context_dispatch
at gmain.c line 3865
#32 g_main_context_iterate
at gmain.c line 3938
#33 g_main_loop_run
at gmain.c line 4134
#34 meta_run
at core/main.c line 646
#35 main
at main.c line 454

Comment 23 Florian Müllner 2017-05-13 13:39:00 UTC

*** Bug 782589 has been marked as a duplicate of this bug. ***

Comment 24 Christian Stadelmann 2017-05-14 01:44:57 UTC

I can confirm the reproducer in bug #782060, which also causes the backtrace in attachment 351010 [details]. This crash also happens on other occasions and causes data loss.

Comment 25 Philip Chimento 2017-05-14 23:26:55 UTC

Unfortunately I have not observed this yet. Anyone who comments to confirm this bug - please provide at least the following information:

- Linux distribution
- gnome-shell version
- mozjs38 version
- gjs version
- Wayland or X or both?
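
A minimal sketch for putting the items above into one block of text. It assumes the distribution name can be read from an os-release style file (the `PRETTY_NAME` key) and that the session type is exposed through the `XDG_SESSION_TYPE` environment variable. Package versions are passed in by hand, and all function names here are hypothetical.

```python
def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict, stripping quotes."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"').strip("'")
    return info


def session_type(environ):
    """Return the value of XDG_SESSION_TYPE, or 'unknown' if it is not set."""
    return environ.get("XDG_SESSION_TYPE", "unknown")


def format_report(os_release_text, environ, versions):
    """Build the report text from os-release content, environment and versions."""
    distro = parse_os_release(os_release_text).get("PRETTY_NAME", "unknown")
    lines = [f"Linux distribution: {distro}"]
    lines += [f"{name} version: {ver}" for name, ver in versions.items()]
    lines.append(f"Session: {session_type(environ)}")
    return "\n".join(lines)


# Sample input only: the versions are the ones reported in comment 12.
sample = format_report(
    'NAME="Arch Linux"\nPRETTY_NAME="Arch Linux"\nID=arch\n',
    {"XDG_SESSION_TYPE": "wayland"},
    {"gnome-shell": "3.24.1+2+g45c2627d4-1", "js38": "38.8.0-3", "gjs": "1.48.2-1"},
)
print(sample)
```

On a real system the first two arguments would typically be the contents of `/etc/os-release` and `os.environ`.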

If anyone who can reproduce the crash reliably would like to help, the best thing you could do would be investigate with RR and find the point at which the GjsMaybeOwned<JS::Value> object becomes garbage, as described in comment 14.

Comment 26 Christian Stadelmann 2017-05-15 10:06:11 UTC

(In reply to Philip Chimento from comment #25)
> Unfortunately I have not observed this yet.

Have you tried the reproducer in bug #782060?

> Anyone who comments to confirm
> this bug - please provide at least the following information:

Fedora 26
gnome-shell-3.24.2-1.fc26.x86_64
mutter-3.24.2-1.fc26.x86_64
gjs-1.48.3-1.fc26.x86_64
mozjs38-38.8.0-4.fc26.x86_64
libwayland-server-1.13.0-1.fc26.x86_64
GNOME/Wayland session.

> If anyone who can reproduce the crash reliably would like to help, the best
> thing you could do would be investigate with RR and find the point at which
> the GjsMaybeOwned<JS::Value> object becomes garbage, as described in comment
> 14.

What is this RR tool, how do I get it and how do I use it? Do you have any commands I can run there? I don't think it is even packaged for my distro.

On Fedora 26, there is a "gnome-valgrind-session" which should – in theory – help debugging this. If someone here uses Fedora, maybe he/she can fix it: https://apps.fedoraproject.org/packages/gnome-valgrind-session/bugs/

Comment 27 Bastien Nocera 2017-05-18 23:39:06 UTC

(In reply to Philip Chimento from comment #25)
> Unfortunately I have not observed this yet. Anyone who comments to confirm
> this bug - please provide at least the following information:
> 
> - Linux distribution

Fedora 26 (beta I guess)

> - gnome-shell version

gnome-shell-3.24.2-1.fc26.x86_64

> - mozjs38 version

mozjs38-38.8.0-4.fc26.x86_64

> - gjs version

gjs-1.48.3-1.fc26.x86_64

> - Wayland or X or both?

Using Wayland.

Just like in one of the duplicates, I reproduced this twice in a row with the same assertion by launching a clutter-gst based video player (it was sushi in the dupe, totem for me).

org.gnome.Shell.desktop[18358]: Gjs:ERROR:./gjs/jsapi-util-root.h:317:void GjsMaybeOwned<T>::trace(JSTracer*, const char*) [with T = JS::Value]: assertion failed: (!m_rooted)

#0  0x00007f66b021b7bb __GI_raise (libc.so.6)
#1  0x00007f66b021d5d1 __GI_abort (libc.so.6)
#2  0x00007f66b1def75d g_assertion_message (libglib-2.0.so.0)
#3  0x00007f66b1def7ea g_assertion_message_expr (libglib-2.0.so.0)
#4  0x00007f66b9594724 n/a (libgjs.so.0)
#5  0x00007f66ae96a8cf _ZN2js8GCMarker19processMarkStackTopERNS_11SliceBudgetE (libmozjs-38.so)
#6  0x00007f66ae945f5d _ZN2js8GCMarker14drainMarkStackERNS_11SliceBudgetE (libmozjs-38.so)
#7  0x00007f66aec95e94 _ZN2js2gc9GCRuntime14drainMarkStackERNS_11SliceBudgetENS_7gcstats5PhaseE (libmozjs-38.so)
#8  0x00007f66aecc4ba1 _ZN2js2gc9GCRuntime23incrementalCollectSliceERNS_11SliceBudgetEN2JS8gcreason6ReasonE (libmozjs-38.so)
#9  0x00007f66aecc55d2 _ZN2js2gc9GCRuntime7gcCycleEbRNS_11SliceBudgetEN2JS8gcreason6ReasonE (libmozjs-38.so)
#10 0x00007f66aecc582d _ZN2js2gc9GCRuntime7collectEbNS_11SliceBudgetEN2JS8gcreason6ReasonE (libmozjs-38.so)
#11 0x00007f66b95ad739 gjs_schedule_gc_if_needed (libgjs.so.0)
#12 0x00007f66b95ad7a4 gjs_call_function_value (libgjs.so.0)
#13 0x00007f66b9588895 gjs_closure_invoke (libgjs.so.0)
#14 0x00007f66b95a056e n/a (libgjs.so.0)
#15 0x00007f66b20a130d g_closure_invoke (libgobject-2.0.so.0)
#16 0x00007f66b20b398e signal_emit_unlocked_R (libgobject-2.0.so.0)
#17 0x00007f66b20bc1a5 g_signal_emit_valist (libgobject-2.0.so.0)
#18 0x00007f66b20bcb0f g_signal_emit (libgobject-2.0.so.0)
#19 0x00007f66b20a5594 g_object_dispatch_properties_changed (libgobject-2.0.so.0)
#20 0x00007f66b20a4f3e g_object_notify_queue_thaw (libgobject-2.0.so.0) 
#21 0x00007f66b20a94de g_object_set_property (libgobject-2.0.so.0)
#22 0x00007f66b95951b6 n/a (libgjs.so.0)
#23 0x00007f66ae9db442 _ZN2js5Shape3setEP9JSContextN2JS6HandleIPNS_12NativeObjectEEENS4_IP8JSObjectEEbNS3_13MutableHandleINS3_5ValueEEE (libmozjs-38.so)
#24 0x00007f66ae9b5499 _ZL9NativeSetP9JSContextN2JS6HandleIPN2js12NativeObjectEEENS2_IP8JSObjectEENS2_IPNS3_5ShapeEEEbNS1_13MutableHandleINS1_5ValueEEE (libmozjs-38.so)
#25 0x00007f66ae9b7c28 _ZN2js17NativeSetPropertyEP9JSContextN2JS6HandleIPNS_12NativeObjectEEENS3_IP8JSObjectEENS3_I4jsidEENS_13QualifiedBoolENS2_13MutableHandleINS2_5ValueEEEb (libmozjs-38.so)
#26 0x00007f66ae9b8a06 _ZL17SetObjectPropertyP9JSContext4JSOpN2JS6HandleINS2_5ValueEEENS3_I4jsidEENS2_13MutableHandleIS4_EE (libmozjs-38.so)
#27 0x00007f66ae9ab58f _ZL9InterpretP9JSContextRN2js8RunStateE (libmozjs-38.so)
#28 0x00007f66ae9b4324 _ZN2js9RunScriptEP9JSContextRNS_8RunStateE (libmozjs-38.so)
#29 0x00007f66ae9b4614 _ZN2js6InvokeEP9JSContextN2JS8CallArgsENS_14MaybeConstructE (libmozjs-38.so)
#30 0x00007f66ae9b5243 _ZN2js6InvokeEP9JSContextRKN2JS5ValueES5_jPS4_NS2_13MutableHandleIS3_EE (libmozjs-38.so)
#31 0x00007f66aec0dbfb _ZN2js3jit14InvokeFunctionEP9JSContextN2JS6HandleIP8JSObjectEEjPNS3_5ValueES9_ (libmozjs-38.so)
#32 0x00007f66bab2a134 n/a (n/a)
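
The assertion message above says that `trace()` was called on a `GjsMaybeOwned` whose `m_rooted` flag was set. As a rough mental model only, and not the gjs implementation, the wrapper can be pictured as having two modes: rooted, where it keeps the value alive by itself, and unrooted, where the owner must report the value to the garbage collector through `trace()`. The toy class below is written under that assumption; `ToyMaybeOwned` and its methods are invented for illustration.

```python
class ToyMaybeOwned:
    """Toy two-mode holder; a simplified stand-in, not the real GjsMaybeOwned."""

    def __init__(self):
        self.rooted = False
        self.value = None

    def root(self, value):
        """Hold value in rooted mode: the holder keeps it alive by itself."""
        assert self.value is None, "already holds a value"
        self.rooted = True
        self.value = value

    def set_unrooted(self, value):
        """Hold value in unrooted mode: the owner must call trace()."""
        assert self.value is None, "already holds a value"
        self.rooted = False
        self.value = value

    def trace(self, tracer):
        """Report the value to the tracer; only valid in unrooted mode."""
        # Mirrors the check that failed in the log.
        assert not self.rooted, "assertion failed: (!m_rooted)"
        tracer.append(self.value)

    def reset(self):
        """Drop the value and return to the empty, unrooted state."""
        self.rooted = False
        self.value = None


seen = []
unrooted = ToyMaybeOwned()
unrooted.set_unrooted("object A")
unrooted.trace(seen)  # fine: unrooted values are traced by their owner

rooted = ToyMaybeOwned()
rooted.root("object B")
try:
    rooted.trace(seen)  # tracing a rooted holder trips the assertion
    tripped = False
except AssertionError:
    tripped = True
print(seen, tripped)
```

In this model the assertion fires whenever the flag reads as set at trace time, whether that is because the holder really was rooted or because its memory no longer contains what the tracer expects.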

Comment 28 Bastien Nocera 2017-05-18 23:44:02 UTC

(In reply to Bastien Nocera from comment #27)
<snip>
> Just like in one of the duplicates, I reproduced this twice in a row with
> the same assertion by launching a clutter-gst based video player (it was
> sushi in the dupe, totem for me).

1. In nautilus, open a folder with a video
2. Press Enter to launch the video
3. Press 'q' in totem to exit
4. Go back to 2.

I can make it crash like that in under a minute.

Comment 29 tirifto 2017-05-19 00:35:15 UTC

(In reply to Bastien Nocera from comment #28)
> 1. In nautilus, open a folder with a video
> 2. Press Enter to launch the video
> 3. Press 'q' in totem to exit
> 4. Go back to 2.
> 
> I can make it crash like that in under a minute.

This works (breaks) very well on my system, too!

Parabola GNU+Linux-libre (Arch Linux derivative)
gnome-shell 3.24.2-1
js38 38.8.0-3
gjs 1.48.3-1

Reproducible under both X and Wayland.

Comment 30 Rui Matos 2017-05-19 09:44:09 UTC

*** Bug 782827 has been marked as a duplicate of this bug. ***

Comment 31 Baptiste Mille-Mathias 2017-05-19 10:26:53 UTC

Created attachment 352149 [details]
Stack trace

Comment 32 Eduardo Medina 2017-05-19 11:32:18 UTC

Hi, I think I have the same bug. This is my system configuration:

-Manjaro 17.0.
-GNOME-Shell: 3.24.2-1
-js38: 38.8.0-3
-gjs: 1.48.3-1
-glibc: 2.25-1

I have two computers. One of them is my main computer, a desktop equipped with an ASUS P5K motherboard, an NVIDIA GTX 1050 GPU and an Intel Core 2 Quad Q8300 CPU. I use the proprietary blob driver, version 375.66-1 from the Manjaro repos, with Linux 4.9 and Intel Microcode.

The second computer is an old Toshiba Satellite Pro P200 laptop, with an Intel Core 2 Duo T7300 and an ATI Mobility Radeon HD 2600 GPU, identified as AMD® Rv630 by the free driver stack (Linux 4.9 LTS and Mesa 17.0.5). I use Intel Microcode here too.

The environment crashes randomly. It recovers perfectly after crashing, but the bug is very annoying when it occurs.

I couldn't capture any log or message showing the exact error. The only thing I am sure of is that the bug always occurs when I have a number of windows spread across several virtual desktops.

Yes, the bug occurs on Mesa too, if it is the same problem I am replying to. Wayland looks much more stable than Xorg, but the bug happens on both servers.

This is what I get from journalctl from my desktop:
may 19 12:57:54 manjarog-p5k polkitd[577]: Registered Authentication Agent for unix-session:c2 (system bus name :1.117 [/usr/bin/gnome-shell], object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale es_ES.UTF-8)
may 19 12:57:54 manjarog-p5k systemd-coredump[7094]: Process 3002 (gnome-shell) of user 1000 dumped core.
                                                     
Stack trace of thread 3002:
#0  0x00007f28aa658ed5 n/a (libgjs.so.0)
#1  0x00007f28a877666a g_main_context_dispatch (libglib-2.0.so.0)
#2  0x00007f28a8776a20 n/a (libglib-2.0.so.0)
#3  0x00007f28a8776d42 g_main_loop_run (libglib-2.0.so.0)
#4  0x00007f28a9f36d0c meta_run (libmutter-0.so.0)
#5  0x0000000000401ff7 main (gnome-shell)
#6  0x00007f28a818a511 __libc_start_main (libc.so.6)
#7  0x000000000040212a n/a (gnome-shell)
                                                     
Stack trace of thread 3055:
#0  0x00007f28a851b756 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
#1  0x00007f289af93500 PR_WaitCondVar (libnspr4.so)
#2  0x00007f28a3578811 n/a (libmozjs-38.so)
#3  0x00007f289af98d8c n/a (libnspr4.so)
#4  0x00007f28a85152e7 start_thread (libpthread.so.0)
#5  0x00007f28a825654f __clone (libc.so.6)
                                                     
Stack trace of thread 3060:
#0  0x00007f28a851b756 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
#1  0x00007f289af93500 PR_WaitCondVar (libnspr4.so)
#2  0x00007f28a3578811 n/a (libmozjs-38.so)
#3  0x00007f289af98d8c n/a (libnspr4.so)
#4  0x00007f28a85152e7 start_thread (libpthread.so.0)
#5  0x00007f28a825654f __clone (libc.so.6)
                                                     
Stack trace of thread 3003:
#0  0x00007f28a824c67d poll (libc.so.6)
#1  0x00007f28a87769b6 n/a (libglib-2.0.so.0)
#2  0x00007f28a8776acc g_main_context_iteration (libglib-2.0.so.0)
#3  0x00007f28a8776b11 n/a (libglib-2.0.so.0)
#4  0x00007f28a879e295 n/a (libglib-2.0.so.0)
#5  0x00007f28a85152e7 start_thread (libpthread.so.0)
#6  0x00007f28a825654f __clone (libc.so.6)
                                                     
Stack trace of thread 3004:
#0  0x00007f28a824c67d poll (libc.so.6)
#1  0x00007f28a87769b6 n/a (libglib-2.0.so.0)
#2  0x00007f28a8776d42 g_main_loop_run (libglib-2.0.so.0)
#3  0x00007f28a8d5dff6 n/a (libgio-2.0.so.0)
#4  0x00007f28a879e295 n/a (libglib-2.0.so.0)
#5  0x00007f28a85152e7 start_thread (libpthread.so.0)
#6  0x00007f28a825654f __clone (libc.so.6)
                                                     
Stack trace of thread 3061:
#0  0x00007f28a851b756 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
#1  0x00007f289af93500 PR_WaitCondVar (libnspr4.so)
#2  0x00007f28a3578811 n/a (libmozjs-38.so)
#3  0x00007f289af98d8c n/a (libnspr4.so)
#4  0x00007f28a85152e7 start_thread (libpthread.so.0)
#5  0x00007f28a825654f __clone (libc.so.6)
                                                     
Stack trace of thread 3058:
#0  0x00007f28a851b756 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
#1  0x00007f289af93500 PR_WaitCondVar (libnspr4.so)
#2  0x00007f28a3578811 n/a (libmozjs-38.so)
#3  0x00007f289af98d8c n/a (libnspr4.so)
#4  0x00007f28a85152e7 start_thread (libpthread.so.0)
#5  0x00007f28a825654f __clone (libc.so.6)
                                                     
Stack trace of thread 3026:
#0  0x00007f28a824c67d poll (libc.so.6)
#1  0x00007f28a459bee1 n/a (libpulse.so.0)
#2  0x00007f28a458d6f1 pa_mainloop_poll (libpulse.so.0)
#3  0x00007f28a458dd8e pa_mainloop_iterate (libpulse.so.0)
#4  0x00007f28a458de40 pa_mainloop_run (libpulse.so.0)
#5  0x00007f28a459be29 n/a (libpulse.so.0)
#6  0x00007f2899849fe8 n/a (libpulsecommon-10.0.so)
#7  0x00007f28a85152e7 start_thread (libpthread.so.0)
#8  0x00007f28a825654f __clone (libc.so.6)
                                                     
Stack trace of thread 3059:
#0  0x00007f28a851b756 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
#1  0x00007f289af93500 PR_WaitCondVar (libnspr4.so)
#2  0x00007f28a3578811 n/a (libmozjs-38.so)
#3  0x00007f289af98d8c n/a (libnspr4.so)
#4  0x00007f28a85152e7 start_thread (libpthread.so.0)
#5  0x00007f28a825654f __clone (libc.so.6)
                                                     
Stack trace of thread 3056:
#0  0x00007f28a851b756 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
#1  0x00007f289af93500 PR_WaitCondVar (libnspr4.so)
#2  0x00007f28a3578811 n/a (libmozjs-38.so)
#3  0x00007f289af98d8c n/a (libnspr4.so)
#4  0x00007f28a85152e7 start_thread (libpthread.so.0)
#5  0x00007f28a825654f __clone (libc.so.6)
                                                     
Stack trace of thread 3025:
#0  0x00007f28a824c67d poll (libc.so.6)
#1  0x00007f28a87769b6 n/a (libglib-2.0.so.0)
#2  0x00007f28a8776acc g_main_context_iteration (libglib-2.0.so.0)
#3  0x00007f289053e55d n/a (libdconfsettings.so)
#4  0x00007f28a879e295 n/a (libglib-2.0.so.0)
#5  0x00007f28a85152e7 start_thread (libpthread.so.0)
#6  0x00007f28a825654f __clone (libc.so.6)
                                                     
Stack trace of thread 3462:
#0  0x00007f28a8251889 syscall (libc.so.6)
#1  0x00007f28a87bc32f g_cond_wait (libglib-2.0.so.0)
#2  0x00007f28a5c3568d n/a (libmutter-cogl-0.so)
#3  0x00007f28a879e295 n/a (libglib-2.0.so.0)
#4  0x00007f28a85152e7 start_thread (libpthread.so.0)
#5  0x00007f28a825654f __clone (libc.so.6)
                                                     
Stack trace of thread 3057:
#0  0x00007f28a851b756 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
#1  0x00007f289af93500 PR_WaitCondVar (libnspr4.so)
#2  0x00007f28a3578811 n/a (libmozjs-38.so)
#3  0x00007f289af98d8c n/a (libnspr4.so)
#4  0x00007f28a85152e7 start_thread (libpthread.so.0)
#5  0x00007f28a825654f __clone (libc.so.6)
                                                     
Stack trace of thread 3054:
#0  0x00007f28a851b756 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
#1  0x00007f289af93500 PR_WaitCondVar (libnspr4.so)
#2  0x00007f28a3578811 n/a (libmozjs-38.so)
#3  0x00007f289af98d8c n/a (libnspr4.so)
#4  0x00007f28a85152e7 start_thread (libpthread.so.0)
#5  0x00007f28a825654f __clone (libc.so.6)
may 19 12:57:54 manjarog-p5k gnome-shell[7104]: JS WARNING: [resource:///org/gnome/gjs/modules/tweener/tweener.js 538]: reference to undefined property properties[istr].arrayIndex
may 19 12:57:54 manjarog-p5k gnome-shell[7104]: JS WARNING: [resource:///org/gnome/shell/ui/search.js 436]: reference to undefined property provider.isRemoteProvider
may 19 12:57:54 manjarog-p5k gnome-shell[7104]: No permission to trigger offline updates: Polkit.Error: GDBus.Error:org.freedesktop.PolicyKit1.Error.Failed: Action org.freedesktop.packagekit.trigger-offline-update is not registered
may 19 12:57:54 manjarog-p5k /usr/lib/gdm/gdm-x-session[2956]: (--) NVIDIA(GPU-0): DFP-0: disconnected
may 19 12:57:54 manjarog-p5k /usr/lib/gdm/gdm-x-session[2956]: (--) NVIDIA(GPU-0): DFP-0: Internal TMDS
may 19 12:57:54 manjarog-p5k /usr/lib/gdm/gdm-x-session[2956]: (--) NVIDIA(GPU-0): DFP-0: 330.0 MHz maximum pixel clock
may 19 12:57:54 manjarog-p5k /usr/lib/gdm/gdm-x-session[2956]: (--) NVIDIA(GPU-0):
may 19 12:57:54 manjarog-p5k /usr/lib/gdm/gdm-x-session[2956]: (--) NVIDIA(GPU-0): Ancor Communications Inc VX229 (DFP-1): connected
may 19 12:57:54 manjarog-p5k /usr/lib/gdm/gdm-x-session[2956]: (--) NVIDIA(GPU-0): Ancor Communications Inc VX229 (DFP-1): Internal TMDS
may 19 12:57:54 manjarog-p5k /usr/lib/gdm/gdm-x-session[2956]: (--) NVIDIA(GPU-0): Ancor Communications Inc VX229 (DFP-1): 600.0 MHz maximum pixel clock
may 19 12:57:54 manjarog-p5k /usr/lib/gdm/gdm-x-session[2956]: (--) NVIDIA(GPU-0):
may 19 12:57:54 manjarog-p5k /usr/lib/gdm/gdm-x-session[2956]: (--) NVIDIA(GPU-0): DFP-2: disconnected
may 19 12:57:54 manjarog-p5k /usr/lib/gdm/gdm-x-session[2956]: (--) NVIDIA(GPU-0): DFP-2: Internal DisplayPort
may 19 12:57:54 manjarog-p5k /usr/lib/gdm/gdm-x-session[2956]: (--) NVIDIA(GPU-0): DFP-2: 1440.0 MHz maximum pixel clock
may 19 12:57:54 manjarog-p5k /usr/lib/gdm/gdm-x-session[2956]: (--) NVIDIA(GPU-0):
may 19 12:57:54 manjarog-p5k /usr/lib/gdm/gdm-x-session[2956]: (--) NVIDIA(GPU-0): DFP-3: disconnected
may 19 12:57:54 manjarog-p5k /usr/lib/gdm/gdm-x-session[2956]: (--) NVIDIA(GPU-0): DFP-3: Internal TMDS
may 19 12:57:54 manjarog-p5k /usr/lib/gdm/gdm-x-session[2956]: (--) NVIDIA(GPU-0): DFP-3: 330.0 MHz maximum pixel clock
may 19 12:57:54 manjarog-p5k /usr/lib/gdm/gdm-x-session[2956]: (--) NVIDIA(GPU-0):

I don't know whether this is enough, or whether it would be better to post the journalctl output from my laptop too.

Comment 33 Florian Müllner 2017-05-20 01:34:30 UTC

*** Bug 782851 has been marked as a duplicate of this bug. ***

Comment 34 Philip Chimento 2017-05-20 04:28:22 UTC

Has anyone observed this crash on mozjs38-38.2.1rc0 rather than 38.8.0? (Or in other words, does the crash occur when using jhbuild and the gnome-3.24 modulesets?) It seems like these crashes were only reported on Arch after they upgraded their mozjs38, but I'd be happy to be proven wrong.

> What is this RR tool, how do I get it and how do I use it? Do you have any
> commands I can run there? I don't think it is even packaged for my distro.

http://rr-project.org/

Instructions for building from source, and Fedora and Ubuntu binary packages, are provided there. Read comment 14 for the information that would help here.

Comment 35 Christian Stadelmann 2017-05-20 09:57:45 UTC

(In reply to Philip Chimento from comment #34)
> Has anyone observed this crash on mozjs38-38.2.1rc0 rather than 38.8.0? (Or
> in other words, does the crash occur when using jhbuild and the gnome-3.24
> modulesets?) It seems like these crashes were only reported on Arch after
> they upgraded their mozjs38, but I'd be happy to be proven wrong.

I'm seeing the crash on Fedora (as do some other people), with mozjs38-38.8.0-4.fc26.x86_64, which is extracted from the https://ftp.mozilla.org/pub/firefox/releases/38.8.0esr/source/firefox-38.8.0esr.source.tar.bz2 sources. 38.8.0 has been in Fedora repositories for 8 months, so anyone using Gnome 3.24 on Fedora has the 38.8.0 builds.

These are a few downstream bug reports, don't know whether that helps or not:
https://bugzilla.redhat.com/show_bug.cgi?id=1451805
https://bugzilla.redhat.com/show_bug.cgi?id=1451914
https://bugzilla.redhat.com/show_bug.cgi?id=1452453
https://bugzilla.redhat.com/show_bug.cgi?id=1452901
https://bugzilla.redhat.com/show_bug.cgi?id=1451919 (missing debug symbols)

> http://rr-project.org/
> 
> Instructions for building from source, and Fedora and Ubuntu binary
> packages, are provided there. Read comment 14 for the information that would
> help here.

Thanks!

Comment 36 Eduardo Medina 2017-05-22 09:59:28 UTC

When will be available the patch for this? I work with GNOME Shell and my patience is ending.

Comment 37 Georges Basile Stavracas Neto 2017-05-22 12:18:09 UTC

(In reply to Eduardo Medina from comment #36)
> When will be available the patch for this? I work with GNOME Shell and my
> patience is ending.

When someone fixes it. Please, refrain from posting this kind of comment in Bugzilla. It just creates noise and doesn't add to the technical discussion.

Comment 38 Emmanuele Bassi (:ebassi) 2017-05-22 12:19:03 UTC

(In reply to Eduardo Medina from comment #36)
> When will be available the patch for this? I work with GNOME Shell and my
> patience is ending.

Before you end your patience, my suggestion would be to use the X.org session, so a compositor crash does not mean the session terminates.

Comment 39 Eduardo Medina 2017-05-22 14:55:10 UTC

Xorg is less stable than Wayland on my computer.

I'm a Linux gamer, so I need Xorg more than Wayland.

Comment 40 Ruud van Asseldonk 2017-05-22 20:49:47 UTC

I spent this evening trying to run gnome-shell under rr, so far without success. The problem is that gnome-shell makes a DRM ioctl syscall which is currently not supported by rr, on both of my machines. See also this comment: https://github.com/mozilla/rr/issues/1596#issuecomment-303191420. I might try to fix this some time, but for now I am unable to use rr for this.

It might be possible to sidestep the issue by not using DRM, but so far I have not been able to figure out how to force gdm/gnome-session to use the Gallium llvmpipe driver.

I have been able to consistently reproduce this issue within a few seconds under Wayland with the steps from comment 28. I have not been able to reproduce this issue at all under X.

If anybody wants to attempt to debug this, the way to start gnome-session under rr is to change the Exec= line in /usr/share/applications/org.gnome.Shell.desktop to invoke 'rr record'.

Comment 41 Philip Chimento 2017-05-23 04:16:08 UTC

(In reply to Ruud van Asseldonk from comment #40)
> I spent this evening trying to run gnome-shell under rr, so far without
> success.

Thanks so much for trying!

> The problem is that gnome-shell makes a DRM ioctl syscall which is
> currently not supported by rr, on both of my machines. See also this
> comment: https://github.com/mozilla/rr/issues/1596#issuecomment-303191420. I
> might try to fix this some time, but for now I am unable to use rr for this.

I'll be trying to repro it under X for now.

Comment 42 Philip Chimento 2017-05-23 04:19:40 UTC

(In reply to Eduardo Medina from comment #36)
> When will be available the patch for this? I work with GNOME Shell and my
> patience is ending.

Eduardo, I'd like to ask you to assume that we are all acting in good faith here. This bug has so far proven difficult for me to reproduce, and at the moment I'm also struggling to find free time to spend on tracking it down.

I know that the crashes must be frustrating, but with comments like that you are demotivating the very people who can help fix the problem.

Comment 43 Eduardo Medina 2017-05-23 09:08:19 UTC

(In reply to Philip Chimento from comment #42)
> (In reply to Eduardo Medina from comment #36)
> > When will be available the patch for this? I work with GNOME Shell and my
> > patience is ending.
> 
> Eduardo, I'd like to ask you to assume that we are all acting in good faith
> here. This bug has so far proven difficult for me to reproduce, and at the
> moment I'm also struggling to find free time to spend on tracking it down.
> 
> I know that the crashes must be frustrating, but with comments like that you
> are demotivating the very people who can help fix the problem.

Sorry if one of my previous comments sounded harsh.

The crash occurs every 15-45 minutes (on both Xorg and Wayland) on my computer. I use GNOME Shell for my work, so I am frustrated that this problem keeps breaking my workflow.

These are the applications, and their distribution across virtual desktops, that I use to crash the desktop environment:

-Desktop 1: LibreOffice, Firefox, Chromium and Opera.
-Desktop 2: Gedit, Files/Nautilus and Gimp and/or Krita.
-Desktop 3: Telegram and Audacious.

The bug doesn't happen if you use only one virtual desktop; you have to use more than one.

Comment 44 Sri Ramkrishna 2017-05-24 15:42:55 UTC

@Eduardo - I suggest you downgrade your gnome-shell packages to the previous version.  That might work much better for you for the short term.

Comment 45 Michael Catanzaro 2017-05-24 16:00:27 UTC

Is there a Fedora bug report for this? A crash that brings down the entire desktop is a likely candidate to be a Fedora blocker.

Comment 46 Christian Stadelmann 2017-05-24 17:10:56 UTC

(In reply to Michael Catanzaro from comment #45)
> Is there a Fedora bug report for this? A crash that brings down the entire
> desktop is a likely candidate to be a Fedora blocker.

Looks like you found the answer already ;)

https://bugzilla.redhat.com/show_bug.cgi?id=1451914

Comment 47 Philip Chimento 2017-05-24 18:29:12 UTC

*** Status as of May 24 - Please Read Before Posting ***

I am trying to reproduce this on my Fedora box but instead managed to permanently screw up my video drivers while trying to boot in VESA mode. That may take a while to fix. I will be trying to reproduce it with a VM instead.

Here's what you can do to help:

- Post your OS distro/version, versions of gnome-shell, gjs, mozjs38, and whether you are using Xorg or Wayland.

- Prove or disprove that this crash doesn't happen using mozjs38-38.2.1.rc0 (the version built by jhbuild.)

I suspect that might be the case because I did a reasonable amount of
testing on that version, and it was never reported by Arch users when
Arch still used that version (whereas other, now fixed, crashes were
reported on Arch.) It's a long shot but it would narrow down the
search quite a bit, so it's worth it.

- Use RR to find the point at which the offending GjsMaybeOwned<JS::Value> object becomes garbage.

* As Ruud found out, RR will likely not work on Wayland so you need
an Xorg session. (And you need to be able to reproduce the crash under X)

* For more info, see
https://github.com/mozilla/rr/issues/1596#issuecomment-303191420

* Ctrl+Alt+F2 to a VT, run
`DISPLAY=:1 XDG_SESSION_TYPE=x11 rr gnome-shell --replace`, replacing
the DISPLAY value with the correct display, and Ctrl+Alt+F1 (or F7
depending on distro) back to the X server. Wait for the crash to
happen. Run `rr replay`.

- Build mozjs38 with --enable-debug and reproduce the crash. This might give more useful information.

Note, this is different from debug symbols! It enables a bunch of
extra assertions and sanity checks at runtime. This also requires
rebuilding GJS against the rebuilt mozjs38.

Comment 48 Cengiz Can 2017-05-25 21:28:37 UTC

Created attachment 352595 [details]
Just another log with stacktrace

I've bumped into this with gnome-shell 3.24.2 and gjs 1.48.3

Comment 49 Bram Neijt 2017-05-27 09:36:34 UTC

Out of desperation at losing my work every time this randomly happens, I tried upgrading to gjs 1.49.2 (commit=d74c0ab5968449c4d790e24cad694d9ad022ef7e)

No use, problem still remains.

mei 27 11:19:54 roo systemd-coredump[13071]: Process 10052 (gnome-shell) of user 1000 dumped core.
Stack trace of thread 10052:
#0  0x00007fe5da376a10 raise (libc.so.6)
#1  0x00007fe5da37813a abort (libc.so.6)
#2  0x00007fe5da3b52b0 __libc_message (libc.so.6)
#3  0x00007fe5da3bb90e malloc_printerr (libc.so.6)
#4  0x00007fe5da3bc11e _int_free (libc.so.6)
#5  0x00007fe5dc84514b _ZN13GjsMaybeOwnedIN2JS5ValueEE16teardown_rootingEv (libgjs.so.0)
#6  0x00007fe5d5a73c73 n/a (libmozjs-38.so)
#7  0x00007fe5d5acea3c n/a (libmozjs-38.so)
#8  0x00007fe5d5a74f61 n/a (libmozjs-38.so)
#9  0x00007fe5d5a8a733 n/a (libmozjs-38.so)
#10 0x00007fe5d5a8b172 n/a (libmozjs-38.so)
#11 0x00007fe5d5a8cf18 n/a (libmozjs-38.so)
#12 0x00007fe5d5a8d8c0 n/a (libmozjs-38.so)
#13 0x00007fe5d5a8db0d n/a (libmozjs-38.so)
#14 0x00007fe5d5a8de64 n/a (libmozjs-38.so)
#15 0x00007fe5dc85b016 gjs_gc_if_needed (libgjs.so.0)
#16 0x00007fe5dc850b44 trigger_gc_if_needed (libgjs.so.0)
#17 0x00007fe5da94f66a g_main_context_dispatch (libglib-2.0.so.0)
#18 0x00007fe5da94fa20 n/a (libglib-2.0.so.0)
#19 0x00007fe5da94fd42 g_main_loop_run (libglib-2.0.so.0)
#20 0x00007fe5dc110d0c meta_run (libmutter-0.so.0)
#21 0x0000000000401ff7 main (gnome-shell)
#22 0x00007fe5da363511 __libc_start_main (libc.so.6)
#23 0x000000000040212a n/a (gnome-shell)


libraries involved are:
glibc 2.25-1
js38 38.8.0-3
gjs 1.49.2-1
glib2 2.52.2+1+gb8bd46bc8-1
mutter 3.24.2-1
gnome-shell 3.24.2-1

Comment 50 Strangiato 2017-05-27 17:01:41 UTC

I can reproduce Wayland session crash following the steps from comment 28 https://bugzilla.gnome.org/show_bug.cgi?id=781799#c28

I use Antergos (Arch-based).

Comment 51 Vladimir Stoyakin 2017-05-27 23:59:37 UTC

Created attachment 352698 [details]
Crash of gnome-shell by running sushi

Reproduced with:
Archlinux
js38 38.8.0-3 with --enable-debug flag
gjs 1.48.3-1
gnome-shell 3.24.2-1
sushi-3.24.0-1

It contains some messages about assertion failures.
Sushi crashed at closing because I didn't see it.

Comment 52 Vladimir Stoyakin 2017-06-04 20:07:51 UTC

Created attachment 353154 [details]
Valgrind-Memcheck report

Valgrind reports about "use-after-free" at ./gi/object.cpp:1747 before crash.
Corresponding code:

static gboolean
signal_connection_invalidate_idle(void *user_data)
{
    ConnectData *connect_data = (ConnectData *) user_data;

    // referenced line
    connect_data->obj->signals = g_list_delete_link(connect_data->obj->signals,
                                                    connect_data->link);
    g_slice_free(ConnectData, connect_data);
    return G_SOURCE_REMOVE;
}

I am not sure that this is connected with damaging of GjsMaybeOwned<JS::Value>, but it may be useful.

Comment 53 John 2017-06-09 16:13:23 UTC

I also am experience this relatively frequently.  I managed to capture this crash with a non-stripped version of libmozjs.  Let me know if there's more info that would be helpful.

Arch Linux
js38 38.8.0-3
gjs 1.48.3-1
gnome-shell 3.24.2-1

Comment 54 John 2017-06-09 16:15:10 UTC

(In reply to John from comment #53)
> I also am experience this relatively frequently.  I managed to capture this
> crash with a non-stripped version of libmozjs.  Let me know if there's more
> info that would be helpful.
> 
> Arch Linux
> js38 38.8.0-3
> gjs 1.48.3-1
> gnome-shell 3.24.2-1

          PID: 4587 (gnome-shell)
           UID: 1000 (john)
           GID: 1000 (john)
        Signal: 6 (ABRT)
     Timestamp: Fri 2017-06-09 11:56:35 EDT (17min ago)
  Command Line: /usr/bin/gnome-shell
    Executable: /usr/bin/gnome-shell
 Control Group: /user.slice/user-1000.slice/session-c2.scope
          Unit: session-c2.scope
         Slice: user-1000.slice
       Session: c2
     Owner UID: 1000 (john)
       Boot ID: d7fcb0f81a6b4b96ac9dbf5e7a0808b6
    Machine ID: dceef662384b4f94ac526646ffcae8b9
      Hostname: oberon
       Storage: /var/lib/systemd/coredump/core.gnome-shell.1000.d7fcb0f81a6b4b96ac9dbf5e7a0808b6.4587.1497023795000000000000.lz4
       Message: Process 4587 (gnome-shell) of user 1000 dumped core.
                
                Stack trace of thread 4587:
                #0  0x00007f68b8a54670 raise (libc.so.6)
                #1  0x00007f68b8a55d00 abort (libc.so.6)
                #2  0x00007f68b8a93551 __libc_message (libc.so.6)
                #3  0x00007f68b8a99bfb malloc_printerr (libc.so.6)
                #4  0x00007f68b8a9afd1 _int_free (libc.so.6)
                #5  0x00007f68baf23ab3 n/a (libgjs.so.0)
                #6  0x00007f68b4153af3 _ZN8JSObject8finalizeEPN2js6FreeOpE (libmozjs-38.so)
                #7  0x00007f68b41ae643 _ZN2js2gc10ArenaLists16forceFinalizeNowEPNS_6FreeOpENS0_9AllocKindENS1_14KeepArenasEnumEPPNS0_11ArenaHeaderE (libmozjs-38.so)
                #8  0x00007f68b4154dd1 _ZN2js2gc10ArenaLists11finalizeNowEPNS_6FreeOpENS0_9AllocKindENS1_14KeepArenasEnumEPPNS0_11ArenaHeaderE (libmozjs-38.so)
                #9  0x00007f68b416b696 _ZN2js2gc9GCRuntime22beginSweepingZoneGroupEv (libmozjs-38.so)
                #10 0x00007f68b416bdd8 _ZN2js2gc9GCRuntime15beginSweepPhaseEb (libmozjs-38.so)
                #11 0x00007f68b416df93 _ZN2js2gc9GCRuntime23incrementalCollectSliceERNS_11SliceBudgetEN2JS8gcreason6ReasonE (libmozjs-38.so)
                #12 0x00007f68b416e959 _ZN2js2gc9GCRuntime7gcCycleEbRNS_11SliceBudgetEN2JS8gcreason6ReasonE (libmozjs-38.so)
                #13 0x00007f68b416eba5 _ZN2js2gc9GCRuntime7collectEbNS_11SliceBudgetEN2JS8gcreason6ReasonE (libmozjs-38.so)
                #14 0x00007f68b416f102 _ZN2js2gc9GCRuntime13gcIfRequestedEP9JSContext (libmozjs-38.so)
                #15 0x00007f68b3e9cad7 InvokeInterruptCallback (libmozjs-38.so)
                #16 0x00007f68bb905930 n/a (n/a)
                #17 0x0000000005ce15c0 n/a (n/a)
                #18 0x00007f68a32a7cb2 n/a (n/a)

Comment 55 Christian Stadelmann 2017-06-09 16:17:18 UTC

(In reply to John from comment #53)
> I also am experience this relatively frequently.  I managed to capture this
> crash with a non-stripped version of libmozjs.  Let me know if there's more
> info that would be helpful.

As written above, catching the crash with RR would be useful, in a way that you can roll back and find the moment where the data was free()d first.

Comment 56 John 2017-06-09 19:11:27 UTC

(In reply to Christian Stadelmann from comment #55)
> (In reply to John from comment #53)
> > I also am experience this relatively frequently.  I managed to capture this
> > crash with a non-stripped version of libmozjs.  Let me know if there's more
> > info that would be helpful.
> 
> As written above, catching the crash with RR would be useful, in a way that
> you can roll back and find the moment where the data was free()d first.

Sadly I can't seem to reproduce gnome-shell crashing under X11.  Under X11 I see that Sushi crashes when previewing a file, whereas under Wayland Sushi crashes, but eventually also the whole gnome-shell crashes.

I did manage to get rr setup and running, though, so I'll continue to try when I have some spare time.

Comment 57 Philip Chimento 2017-06-11 17:52:03 UTC

(In reply to Vladimir Stoyakin from comment #51)
> js38 38.8.0-3 with --enable-debug flag
> 
> It contains some messages about assertion failures.
> Sushi crashed at closing because I didn't see it.

Sushi fails a debug-mode assertion at closing because it doesn't properly clean up its GjsContext (it needs a g_object_unref(gjs_context) at the end of main.c). This can safely be ignored for now.

(In reply to Vladimir Stoyakin from comment #52)
> Valgrind reports about "use-after-free" at ./gi/object.cpp:1747 before crash.
> 
> I am not sure that this is connected with damaging of
> GjsMaybeOwned<JS::Value>, but it may be useful.

This is absolutely fantastic, thank you. I'm not certain it is connected either, but I would not be surprised if this is the cause. I will attach a patch shortly.

(In reply to John from comment #56)
> (In reply to Christian Stadelmann from comment #55)
> > (In reply to John from comment #53)
> > > I also am experience this relatively frequently.  I managed to capture this
> > > crash with a non-stripped version of libmozjs.  Let me know if there's more
> > > info that would be helpful.
> > 
> > As written above, catching the crash with RR would be useful, in a way that
> > you can roll back and find the moment where the data was free()d first.
> 
> Sadly I can't seem to reproduce gnome-shell crashing under X11.  Under X11 I
> see that Sushi crashes when previewing a file, whereas under Wayland Sushi
> crashes, but eventually also the whole gnome-shell crashes.

It may well be crashing under X anyway; it just has a different effect. Under Wayland a gnome-shell crash is more catastrophic, because it takes out all your running applications too.

---

Other news - I have reproduced the crash using Bastien's instructions:

(In reply to Bastien Nocera from comment #28)
> 1. In nautilus, open a folder with a video
> 2. Press Enter to launch the video
> 3. Press 'q' in totem to exit
> 4. Go back to 2.
> 
> I can make it crash like that in under a minute.

For me it's more like 10 minutes of frenetically mashing those keys, but I have managed to observe the crash on a VM running Fedora 27 Alpha. Unfortunately RR doesn't work under VirtualBox. I have almost got my Fedora box back up and running, so I'll concentrate my effort there. First of all to double check that the patches that I'm about to attach fix the problem, and if that doesn't work then to run RR.

Comment 58 Philip Chimento 2017-06-11 20:04:13 UTC

Created attachment 353579 [details] [review]
object: Prevent use-after-free in signal connections

Objects trace their signal connections in order to keep the closures
alive during garbage collection. When invalidating a signal connection,
we must do so in an idle function, since it is illegal to stop tracing a
GC-thing in the middle of GC.

However, this caused a possible use-after-free if the signal connection
was invalidated, and then the object itself was finalized before the idle
function could be run.

This refactor avoids the use-after-free by cancelling any pending idle
invalidations in the object's finalizer, and invalidating any remaining
signal connections in such a way that no more idle functions are
scheduled.
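
The lifetime problem described in this commit message can be sketched in plain C++ without GLib or SpiderMonkey. This is only an illustration of the pattern, not the code from the patch: `IdleQueue`, `Object`, `ConnectData` and `pending_after_finalize()` are hypothetical stand-ins, and the idle queue is a toy replacement for the GLib main loop. Invalidation defers removal to an idle callback, and the finalizer cancels whatever is still pending so that no callback can run against a freed object.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <set>
#include <utility>

// Hypothetical stand-in for idle sources on a main loop.
class IdleQueue {
public:
    using Id = std::uint32_t;

    Id add(std::function<void()> fn) {
        Id id = next_id_++;
        pending_.emplace(id, std::move(fn));
        return id;
    }

    // Cancel a scheduled callback; unknown ids are ignored.
    void remove(Id id) { pending_.erase(id); }

    // Run callbacks one at a time so each may add or remove others.
    void run_all() {
        while (!pending_.empty()) {
            auto it = pending_.begin();
            std::function<void()> fn = std::move(it->second);
            pending_.erase(it);
            fn();
        }
    }

    std::size_t size() const { return pending_.size(); }

private:
    Id next_id_ = 1;
    std::map<Id, std::function<void()>> pending_;
};

struct Object;

struct ConnectData {
    explicit ConnectData(Object* o) : obj(o) {}
    Object* obj;
    IdleQueue::Id idle_id = 0;  // 0 means no idle invalidation is pending
};

struct Object {
    explicit Object(IdleQueue& q) : idle(q) {}

    ConnectData* connect() {
        auto* cd = new ConnectData(this);
        signals.insert(cd);
        return cd;
    }

    // May be called in the middle of GC, so removal is deferred to idle time.
    void invalidate(ConnectData* cd) {
        if (cd->idle_id != 0)
            return;  // already scheduled
        cd->idle_id = idle.add([this, cd] {
            signals.erase(cd);
            delete cd;
        });
    }

    // Finalizer: cancel pending idle callbacks, then free what remains
    // directly, without scheduling anything new.
    ~Object() {
        for (ConnectData* cd : signals) {
            if (cd->idle_id != 0)
                idle.remove(cd->idle_id);
            delete cd;
        }
    }

    IdleQueue& idle;
    std::set<ConnectData*> signals;
};

// Finalize the object while an idle invalidation is still queued and
// report how many callbacks are left pending afterwards.
std::size_t pending_after_finalize() {
    IdleQueue idle;
    {
        Object obj(idle);
        ConnectData* cd = obj.connect();
        obj.invalidate(cd);  // schedules one idle removal
    }                        // obj is finalized before the idle callback runs
    std::size_t pending = idle.size();
    idle.run_all();          // nothing left that could touch the freed object
    return pending;
}
```

Without the cancellation in the destructor, the queued lambda would still hold a pointer to the destroyed `Object`, which is the same shape of use-after-free that Valgrind flagged in comment 52.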

Comment 59 Philip Chimento 2017-06-11 20:04:19 UTC

Created attachment 353580 [details] [review]
util-root: Allow GjsMaybeOwned::DestroyNotify to free

In the case of a closure, the GjsMaybeOwned object is embedded as part of
struct Closure. The context destroy notify callback will invalidate the
closure, which frees the GjsMaybeOwned object, causing a use-after-free
when the callback returns.

This patch gives the callback a boolean return value; it should return
true if it has freed the GjsMaybeOwned object and false if it does not.
If the callback returns true, then the GjsMaybeOwned object will be
considered invalid from then on.
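
A minimal sketch of the callback contract proposed here, under stated assumptions: `MaybeOwned`, `Closure`, `free_closure` and `keep_alive` are simplified, hypothetical stand-ins and do not reproduce the real GjsMaybeOwned API. The point is only that the notifier's boolean return value tells the caller whether it may still touch the object after the callback returns.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for GjsMaybeOwned.
class MaybeOwned {
public:
    // Returns true if the callback freed this object (or its container).
    using DestroyNotify = bool (*)(void* data);

    void set_notify(DestroyNotify notify, void* data) {
        notify_ = notify;
        data_ = data;
        rooted_ = true;
    }

    // Called when the owning context goes away.
    static void on_context_destroy(MaybeOwned* self) {
        // Copy what is needed before the callback can free `self`.
        DestroyNotify notify = self->notify_;
        void* data = self->data_;
        if (notify && notify(data))
            return;             // `self` is gone: do not touch it again
        self->rooted_ = false;  // still alive, so resetting state is safe
    }

    bool rooted() const { return rooted_; }

private:
    DestroyNotify notify_ = nullptr;
    void* data_ = nullptr;
    bool rooted_ = false;
};

// The value is embedded, so freeing the closure frees the MaybeOwned too.
struct Closure {
    MaybeOwned value;
};

static int closures_freed = 0;

bool free_closure(void* data) {
    delete static_cast<Closure*>(data);
    ++closures_freed;
    return true;  // the MaybeOwned no longer exists
}

bool keep_alive(void*) {
    return false;  // nothing was freed
}

// Callback frees the container: returns how many closures were freed.
int freed_by_callback() {
    int before = closures_freed;
    auto* closure = new Closure;
    closure->value.set_notify(free_closure, closure);
    MaybeOwned::on_context_destroy(&closure->value);
    return closures_freed - before;
}

// Callback frees nothing: the object survives and is reset to unrooted.
bool still_rooted_after_nonfreeing_callback() {
    MaybeOwned standalone;
    standalone.set_notify(keep_alive, nullptr);
    MaybeOwned::on_context_destroy(&standalone);
    return standalone.rooted();
}
```

If `on_context_destroy` wrote to `self` unconditionally after the callback, the `free_closure` case would write into freed memory, which is the failure mode this patch description is concerned with.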

Comment 60 Philip Chimento 2017-06-11 20:45:36 UTC

These patches apply to both master and gnome-3-24. If you are able, please check and let me know if the crash still occurs with these patches applied.

Comment 61 Florian Müllner 2017-06-12 17:36:53 UTC

*** Bug 783699 has been marked as a duplicate of this bug. ***

Comment 62 Cosimo Cecchi 2017-06-12 19:51:04 UTC

Review of attachment 353579 [details] [review]:

Good catch! I wonder if it's possible to encode this in a testcase?
I also have a hypothetical comment below, but feel free to push if the answer is no.

::: gi/object.cpp
@@ +1402,3 @@
+{
+    auto cd = static_cast<ConnectData *>(data);
+    cd->obj->signals.erase(cd);

Is it possible for this code to get called before signal_connection_invalidate_idle() had a chance to fire? If not, then all good; otherwise you would need to only schedule the idle timeout when it hasn't been scheduled already.

Comment 63 Cosimo Cecchi 2017-06-12 19:56:32 UTC

Review of attachment 353580 [details] [review]:

I can see how this solves the bug, but it feels to me that this could be prevented by either e.g. ref-counting GjsMaybeOwned so that it can't get freed by the callback, or moving the code that frees the object outside of the callback.
I am not against this solution though, so feel free to push it if you have already considered alternatives.

Comment 64 Philip Chimento 2017-06-13 05:01:11 UTC

*** Bug 783723 has been marked as a duplicate of this bug. ***

Comment 65 Philip Chimento 2017-06-13 06:00:01 UTC

(In reply to Cosimo Cecchi from comment #62)
> Review of attachment 353579 [details] [review]:
> 
> Good catch! I wonder if it's possible to encode this in a testcase?

I tried a few things; definitely not possible to write a test case directly for it, since it won't crash reliably, but depends on the freed memory's contents. I tried to write something that would at least crash when run under -fsanitize=address as in bug 783220, but no luck, since I'm still not sure what caused this.

> I also have a hypothetical comment below, but feel free to push if the
> answer is no.
> 
> ::: gi/object.cpp
> @@ +1402,3 @@
> +{
> +    auto cd = static_cast<ConnectData *>(data);
> +    cd->obj->signals.erase(cd);
> 
> Is it possible for this code to get called before
> signal_connection_invalidate_idle() had a chance to fire? If not, then all
> good; otherwise you would need to only schedule the idle timeout when it
> hasn't been scheduled already.

The documentation seemed to imply that a closure's invalidate notifier can only ever be called once, and I double checked in the source:
https://git.gnome.org/browse/glib/tree/gobject/gclosure.c#n572

Comment 66 Philip Chimento 2017-06-13 06:33:07 UTC

Comment on attachment 353579 [details] [review]
object: Prevent use-after-free in signal connections

Attachment 353579 [details] pushed as 2593d3d - object: Prevent use-after-free in signal connections

Comment 67 Philip Chimento 2017-06-13 06:52:29 UTC

(In reply to Cosimo Cecchi from comment #63)
> Review of attachment 353580 [details] [review]:
> 
> I can see how this solves the bug, but it feels to me that this could be
> prevented by either e.g. ref-counting GjsMaybeOwned so that it can't get
> freed by the callback, or moving the code that frees the object outside of
> the callback.
> I am not against this solution though, so feel free to push it if you have
> already considered alternatives.

I had not considered those alternatives. I think the problem is that the lifetimes of the GjsMaybeOwned and the closure struct are tied together. My guess is that uncoupling those lifetimes would make things more complicated since if the callback doesn't free the closure, then the GjsMaybeOwned is responsible for freeing it. (Or the GjsMaybeOwned owns the last reference to itself.)

I'll think about it a bit longer before pushing this patch. One alternative might be that if you provide a callback, then the callback is required to free the GjsMaybeOwned.

Comment 68 Philip Chimento 2017-06-13 07:06:53 UTC

*** Status as of June 12 - Please Read Before Posting ***

My Fedora box died entirely, and it will be a while before I can open it up to repair it. I will be relying on a virtual machine for the time being, which unfortunately means I can't run RR on the crash.

Here's what you can do to help:

- Confirm or disprove that the patches attached here solve the crashes.

  If you are on Fedora, the easiest way might be to rebuild RPMs with the patches. Take the instructions here [1] as a starting point. After installing the source RPM, download the two patches from this bug and save them in the rpmbuild/SOURCES directory. Add
    Patch0: name-of-first.patch
    Patch1: name-of-second.patch
  to the rpmbuild/SPECS/gjs.spec file under `Source0`. Under `%setup` add these lines:
    %global _default_patch_fuzz 2
    %patch0 -p1
    %patch1 -p1
  Finally, increment the integer in `Release` and add `debug`, e.g. `5debug` in order to distinguish the built packages from the system-provided version.
  Then when you build the RPM use the -ba option instead of -bp.
  Use `dnf install gjs-1.48.3-whatever-your-old-version-was` to go back to the prior situation.

- Any of the other things listed in comment 47.

[1] https://ask.fedoraproject.org/en/question/87205/how-do-i-install-a-src-rpm-with-dnf/

Comment 69 Christian Stadelmann 2017-06-13 07:55:28 UTC

(In reply to Philip Chimento from comment #65)
> (In reply to Cosimo Cecchi from comment #62)
> > Review of attachment 353579 [details] [review]:
> > 
> > Good catch! I wonder if it's possible to encode this in a testcase?
> 
> I tried a few things; definitely not possible to write a test case directly
> for it, since it won't crash reliably, but depends on the freed memory's
> contents. I tried to write something that would at least crash when run
> under -fsanitize=address as in bug 783220, but no luck, since I'm still not
> sure what caused this.

You could use valgrind in that test case and make it abort (=fail) at the first memory access bug.

Comment 70 John 2017-06-13 13:11:20 UTC

(In reply to Philip Chimento from comment #68)
> *** Status as of June 12 - Please Read Before Posting ***
> 
> My Fedora box died entirely, and it will be a while before I can open it up to
> repair it. I will be relying on a virtual machine for the time being, which
> unfortunately means I can't run RR on the crash.
> 
> Here's what you can do to help:
> 
> - Confirm or disprove that the patches attached here solve the crashes.
> 
>   If you are on Fedora, the easiest way might be to rebuild RPMs with the
> patches. Take the instructions here [1] as a starting point. After
> installing the source RPM, download the two patches from this bug and save
> them in the rpmbuild/SOURCES directory. Add
>     Patch0: name-of-first.patch
>     Patch1: name-of-second.patch
>   to the rpmbuild/SPECS/gjs.spec file under `Source0`. Under `%setup` add
> these lines:
>     %global _default_patch_fuzz 2
>     %patch0 -p1
>     %patch1 -p1
>   Finally, increment the integer in `Release` and add `debug`, e.g. `5debug`
> in order to distinguish the built packages from the system-provided version.
>   Then when you build the RPM use the -ba option instead of -bp.
>   Use `dnf install gjs-1.48.3-whatever-your-old-version-was` to go back to
> the prior situation.
> 
> - Any of the other things listed in comment 47.
> 
> [1]
> https://ask.fedoraproject.org/en/question/87205/how-do-i-install-a-src-rpm-
> with-dnf/

I built gjs 1.48.3 with the patches from attachment 353579 [details] [review] and 353580 on ArchLinux.  Prior to applying I was able to trigger crashes relatively easily using the preview functionality as mentioned above;  after applying both patches I can no longer reproduce it.

Cheers!

Comment 71 Christian Stadelmann 2017-06-13 14:03:43 UTC

(In reply to Philip Chimento from comment #68)
> *** Status as of June 12 - Please Read Before Posting ***
> […]
> 
> Here's what you can do to help:
> 
> - Confirm or disprove that the patches attached here solve the crashes. […]

Done that, thanks for the links by the way!

With these patches applied I cannot reproduce the crash of attachment 351010 [details] from bug #782060 any more (as opposed to the situation before that, see comment #24).

Fedora 26
gnome-shell-3.24.2-1.fc26.x86_64
mutter-3.24.2-1.fc26.x86_64
gjs-1.48.3-2.fc26.x86_64 with patches from attachment 353579 [details] [review] and attachment 353580 [details] [review] applied
mozjs38-38.8.0-4.fc26.x86_64
libwayland-server-1.13.0-1.fc26.x86_64
GNOME/Wayland session.

I did not run into any of the related crashes on a GNOME/Xorg session, but GNOME/Wayland crashed regularly, about once per hour plus on every 3rd login.

Comment 72 Bastien Nocera 2017-06-13 18:11:25 UTC

Tested with both patches applied to the F26 RPM, and I can't make it crash as easily as I used to in comment 28.

Comment 73 Maxim 2017-06-13 19:58:05 UTC

Applied the two patches yesterday, and today got a crash while watching YouTube in Chrome:

gnome-shell[461]: segfault at 7fac98dfffe8 ip 00007faccdde2cad sp 00007ffe1f01ba70 error 4 in libgjs.so.0.0.0[7faccddb3000+c8000]

It is not informative, just a note.
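
For anyone reading a kernel segfault line like the one above: the value after `ip` is the faulting instruction pointer, and the bracketed `[base+size]` is where libgjs.so.0.0.0 was mapped. Subtracting the base gives an offset inside the library, which a symbolizer can only resolve against the exact same build with debug info. `error 4` means a user-mode read of an unmapped page. A minimal sketch of the arithmetic, using a hypothetical helper name `offset_in_module` that is not part of GJS or any tool:

```cpp
#include <cstdint>

// Hypothetical helper (illustration only): given the instruction pointer and
// the "[base+size]" mapping from a kernel segfault line, compute the offset of
// the faulting instruction inside the mapped module. Returns false if the
// instruction pointer does not fall inside that mapping.
bool offset_in_module(std::uint64_t ip, std::uint64_t base,
                      std::uint64_t size, std::uint64_t *offset) {
    if (ip < base || ip - base >= size)
        return false;
    *offset = ip - base;
    return true;
}
```

With the values from the log line (ip 0x7faccdde2cad, base 0x7faccddb3000, size 0xc8000) the offset comes out as 0x2fcad. A full backtrace with debug symbols is still far more useful than this single address.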

Comment 74 Philip Chimento 2017-06-13 20:28:05 UTC

*** Bug 783769 has been marked as a duplicate of this bug. ***

Comment 75 Kalev Lember 2017-06-13 20:34:17 UTC

With these two patches applied the armv7hl builds in the Fedora build system fail with the following:

gi/arg.cpp: In function 'bool gjs_g_argument_release_in_array(JSContext*, GITransfer, GITypeInfo*, guint, GArgument*)':
gi/arg.cpp:3544:35: error: cast from 'void**' to 'GValue* {aka _GValue*}' increases required alignment of target type [-Werror=cast-align]
             GValue *v = ((GValue*)array) + i;
                                   ^~~~~

Other arches we're building build just fine. https://kojipkgs.fedoraproject.org//work/tasks/2827/20012827/build.log has the full build log (short lived link)
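
As background on why this diagnostic only shows up on some architectures: -Wcast-align fires when a pointer cast raises the required alignment of the pointee. On 32-bit ARM a `void*` is 4-byte aligned while a struct carrying 64-bit members is 8-byte aligned; on x86_64 both are 8, so the same cast is silent there. The sketch below uses a made-up `FakeValue` as a stand-in for GValue and shows one common way to express such a cast through `void*`. It is an illustration under those assumptions, not the change that was made in GJS.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified stand-in for GValue: a type tag followed by 64-bit
// payload slots, which gives it 8-byte alignment on most ABIs.
struct FakeValue {
    std::size_t type;
    std::uint64_t data[2];
};

// A direct cast from void** to FakeValue* is what the compiler flagged.
// Going through void* tells the compiler that the caller guarantees the
// storage really holds suitably aligned FakeValue objects.
inline FakeValue *value_at(void **array, std::size_t i) {
    return static_cast<FakeValue *>(static_cast<void *>(array)) + i;
}

// True on targets where the direct cast would raise the required alignment,
// i.e. where -Wcast-align would have something to report.
constexpr bool cast_increases_alignment() {
    return alignof(FakeValue) > alignof(void *);
}
```

The cast through `void*` only silences the diagnostic; it is still the caller's job to make sure the array really contains aligned elements of that type.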

Comment 76 Philip Chimento 2017-06-13 21:25:56 UTC

(In reply to John from comment #70)
> I built gjs 1.48.3 with the patches from attachment 353579 [details] [review]
> and 353580 on ArchLinux.  Prior to applying I was able to trigger
> crashes relatively easily using the preview functionality as mentioned
> above;  after applying both patches I can no longer reproduce it.

(In reply to Christian Stadelmann from comment #71)
> With these patches applied I cannot reproduce the crash of attachment 351010 [details]
> from bug #782060 any more (as opposed to the situation before
> that, see comment #24).

Thanks, both of you. So this seems to have fixed _something_, at least. I'll release a GJS 1.48.4 as soon as the second patch is merged.

> I did not run into any of the related crashes on a GNOME/Xorg session, but
> GNOME/Wayland crashed regularly, about once per hour plus on every 3rd login.

Just to be clear, you mean with the patches applied? Or that was the situation before the patches and you are talking about the crash from bug #782060?

(In reply to Bastien Nocera from comment #72)
> Tested with both patches applied to the F26 RPM, and I can't make it crash
> as easily as I used to in comment 28.

Implying that you still get the crashes but less easily? Or you can't make it crash easily but don't want to assume that it never happens? :-)

(In reply to Maxim from comment #73)
> Applied two patches yesterday and today when watching youtube in Chrome.
> 
> gnome-shell[461]: segfault at 7fac98dfffe8 ip 00007faccdde2cad sp
> 00007ffe1f01ba70 error 4 in libgjs.so.0.0.0[7faccddb3000+c8000]
> 
> It is not informative, just a note.

Can you reproduce the crash? Do you happen to have the whole backtrace, preferably with debug symbols? It's not impossible that we have been chasing two different crashes; same concern as with Christian's comment above.

(In reply to Kalev Lember from comment #75)
> With these two patches applied the armv7hl builds in the Fedora build system
> fail with the following:
> 
> gi/arg.cpp: In function 'bool gjs_g_argument_release_in_array(JSContext*,
> GITransfer, GITypeInfo*, guint, GArgument*)':
> gi/arg.cpp:3544:35: error: cast from 'void**' to 'GValue* {aka _GValue*}'
> increases required alignment of target type [-Werror=cast-align]
>              GValue *v = ((GValue*)array) + i;
>                                    ^~~~~

Bizarre! That code wasn't touched by these patches. Are you certain that the failure wasn't there before, or it might have been caused by a change in the compiler flags?

Comment 77 Bastien Nocera 2017-06-13 21:29:19 UTC

(In reply to Philip Chimento from comment #76)
> (In reply to Bastien Nocera from comment #72)
> > Tested with both patches applied to the F26 RPM, and I can't make it crash
> > as easily as I used to in comment 28.
> 
> Implying that you still get the crashes but less easily? Or you can't make
> it crash easily but don't want to assume that it never happens? :-)

The latter :)

> (In reply to Kalev Lember from comment #75)
> > With these two patches applied the armv7hl builds in the Fedora build system
> > fail with the following:
> > 
> > gi/arg.cpp: In function 'bool gjs_g_argument_release_in_array(JSContext*,
> > GITransfer, GITypeInfo*, guint, GArgument*)':
> > gi/arg.cpp:3544:35: error: cast from 'void**' to 'GValue* {aka _GValue*}'
> > increases required alignment of target type [-Werror=cast-align]
> >              GValue *v = ((GValue*)array) + i;
> >                                    ^~~~~
> 
> Bizarre! That code wasn't touched by these patches. Are you certain that the
> failure wasn't there before, or it might have been caused by a change in the
> compiler flags?

It's a separate problem, likely caused by GCC changes on that platform. Best filed as a separate bug.

Comment 78 John 2017-06-13 21:32:22 UTC

Created attachment 353710 [details]
gnome-shell crash backtrace

I did experience another crash after applying the above two patches, when closing a Firefox window playing a flash video (cnn.com):

       Message: Process 1177 (gnome-shell) of user 1000 dumped core.
                
                Stack trace of thread 1177:
                #0  0x00007f4a555cf8b0 _ZNK8JSObject12lastPropertyEv (libmozjs-38.so)
                #1  0x00007f4a556aaf65 _ZNK8JSObject11compartmentEv (libmozjs-38.so)
                #2  0x00007f4a55b9dfba _ZN2js18CompartmentChecker5checkIN2JS5ValueEEEvNS2_6HandleIT_EE (libmozjs-38.so)
                #3  0x00007f4a5ca9aa16 gjs_call_function_value (libgjs.so.0)
                #4  0x00007f4a5ca657a6 gjs_closure_invoke (libgjs.so.0)
                #5  0x00007f4a5ca87e9e closure_marshal (libgjs.so.0)
                #6  0x00007f4a5ae50ead g_closure_invoke (libgobject-2.0.so.0)
                #7  0x00007f4a5ae634ee n/a (libgobject-2.0.so.0)
                #8  0x00007f4a5ae6bcd5 g_signal_emit_valist (libgobject-2.0.so.0)
                #9  0x00007f4a5ae6c6ef g_signal_emit (libgobject-2.0.so.0)
                #10 0x00007f4a5b69653c n/a (libmutter-clutter-0.so)
                #11 0x00007f4a5ae573d8 g_object_run_dispose (libgobject-2.0.so.0)
                #12 0x00007f4a5b68a2b6 clutter_actor_destroy (libmutter-clutter-0.so)
                #13 0x00007f4a560971c8 ffi_call_unix64 (libffi.so.6)
                #14 0x00007f4a56096c2a ffi_call (libffi.so.6)
                #15 0x00007f4a5ca6b7fe gjs_invoke_c_function (libgjs.so.0)
                #16 0x00007f4a5ca6d87a function_call (libgjs.so.0)
                #17 0x00007f4a5d442726 n/a (n/a)
                #18 0x0000000001d64768 n/a (n/a)
                #19 0x00007f49f6a54503 n/a (n/a)


Attached is the full backtrace.
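
The `_ZNK8JSObject...` entries in traces like this are mangled C++ symbol names. They can be made readable with the demangler that ships with GCC and Clang. A small sketch, assuming a toolchain that provides `<cxxabi.h>`; the wrapper name `demangle` is made up for this example:

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Demangle an Itanium-ABI C++ symbol name as it appears in a coredump trace.
// Returns the input unchanged if it cannot be demangled.
std::string demangle(const char *mangled) {
    int status = 0;
    // abi::__cxa_demangle allocates the result with malloc(); we must free() it.
    char *readable = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr)
        return mangled;
    std::string result(readable);
    std::free(readable);
    return result;
}
```

For frame #0 above, `demangle("_ZNK8JSObject12lastPropertyEv")` yields `JSObject::lastProperty() const`. Piping the whole trace through the `c++filt` command-line tool does the same job.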

Comment 79 Philip Chimento 2017-06-13 21:47:29 UTC

(In reply to John from comment #78)
> Created attachment 353710 [details]
> gnome-shell crash backtrace
> 
> I did experience another crash after applying the above two patches, when
> closing a Firefox window playing a flash video (cnn.com):

OK, sounds like the same problem Maxim reported, but that's definitely a different crash, because the backtrace has nothing to do with garbage collection - which is good :-)

I opened bug 783771 for this other crash. Please follow up there if you have more info.

Comment 80 Christian Stadelmann 2017-06-13 21:49:08 UTC

(In reply to Philip Chimento from comment #76)
[…]
> (In reply to Christian Stadelmann from comment #71)
> > With these patches applied I cannot reproduce the crash of attachment 351010 [details]
> > from bug #782060 any more (as opposed to the situation before
> > that, see comment #24).
> […]
> > I did not run into any of the related crashes on a GNOME/Xorg session, but
> > GNOME/Wayland crashed regularly, about once per hour plus on every 3rd login.
> 
> Just to be clear, you mean with the patches applied? Or that was the
> situation before the patches and you are talking about the crash from bug
> #782060?

Oh, sorry, that wasn't clear.
I meant that I could not reproduce the crashes with GNOME/Xorg with or without the patches applied. I could reproduce the crashes with GNOME/Wayland without the patches applied. I cannot reproduce the crashes with GNOME/Wayland with the patches applied.

> (In reply to Bastien Nocera from comment #72)
> > Tested with both patches applied to the F26 RPM, and I can't make it crash
> > as easily as I used to in comment 28.
> 
> Implying that you still get the crashes but less easily? Or you can't make
> it crash easily but don't want to assume that it never happens? :-)

The second one for me too. That's why I only wrote that I cannot reproduce the sushi crash any more. In general, gnome-shell seems to run more stably now, so I guess the other bugs are gone too, but I cannot be sure because they happened randomly.

Comment 81 Kalev Lember 2017-06-14 04:55:52 UTC

(In reply to Bastien Nocera from comment #77)
> > (In reply to Kalev Lember from comment #75)
> > > With these two patches applied the armv7hl builds in the Fedora build system
> > > fail with the following:
> > > 
> > > gi/arg.cpp: In function 'bool gjs_g_argument_release_in_array(JSContext*,
> > > GITransfer, GITypeInfo*, guint, GArgument*)':
> > > gi/arg.cpp:3544:35: error: cast from 'void**' to 'GValue* {aka _GValue*}'
> > > increases required alignment of target type [-Werror=cast-align]
> > >              GValue *v = ((GValue*)array) + i;
> > >                                    ^~~~~
> > 
> > Bizarre! That code wasn't touched by these patches. Are you certain that the
> > failure wasn't there before, or it might have been caused by a change in the
> > compiler flags?
> 
> It's a separate problem, likely caused by GCC changes on that platform. Best
> filed as a separate bug.

OK, tracked this one down and it was indeed a compiler flag change. Sorry for the noise. :) Bastien used git to apply the two patches and this made AX_IS_RELEASE([git-directory]) think it's working from a developer git checkout and it added a -Werror which tripped up the build on arm.

In any case, packages with the two patches applied are now on their way to Fedora 26 updates-testing (gjs-1.48.3-3.fc26)

Comment 82 Christian Stadelmann 2017-06-14 08:03:50 UTC

After installing my own build (as described in comment #71 and comment #80), I am getting hundreds of these warnings in syslog. I had never seen them before, at least not since updating to GNOME 3.24:

gnome-shell[23838]: gclosure.c:724: unable to remove uninstalled invalidation notifier: 0x7f4970e47b50 (0x55ce3be324c0)
gnome-shell[23838]: gclosure.c:724: unable to remove uninstalled invalidation notifier: 0x7f4970e47b50 (0x55ce39e450a0)
gnome-shell[23838]: gclosure.c:724: unable to remove uninstalled invalidation notifier: 0x7f4970e47b50 (0x55ce39e45180)
gnome-shell[23838]: gclosure.c:724: unable to remove uninstalled invalidation notifier: 0x7f4970e47b50 (0x55ce3be32240)
gnome-shell[23838]: gclosure.c:724: unable to remove uninstalled invalidation notifier: 0x7f4970e47b50 (0x55ce3be3ea40)

The first address does not change, the second one does. I don't know whether that is related. Can anyone confirm or deny?

Comment 83 Vladimir Stoyakin 2017-06-14 17:08:07 UTC

(In reply to Christian Stadelmann from comment #82)
> the first address does not change, the second one does. I don't know
> whether that is related. Can anyone confirm or deny?

Patches work without drawbacks for me. I don't see any new messages in log.

Comment 84 Philip Chimento 2017-06-15 06:54:39 UTC

Created attachment 353799 [details] [review]
util-root: Require GjsMaybeOwned callback to reset

In the case of a closure, the GjsMaybeOwned object is embedded as part of
struct Closure. The context destroy notify callback will invalidate the
closure, which frees the GjsMaybeOwned object, causing a use-after-free
when the callback returns and calls reset().

In practice we did not need to call reset() after the callback returns;
all existing callbacks already call reset(). This patch adds a
requirement that the callback *must* call reset(), and only calls it
internally if there was no callback set.
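
To illustrate the contract described above, here is a heavily simplified sketch. `MaybeOwned`, `Holder`, `invalidate()` and `holder_destroy_notify` are hypothetical names for this example only; the real GjsMaybeOwned lives in gjs/jsapi-util-root.h and wraps a JS GC root rather than a plain pointer.

```cpp
// Hypothetical stand-in for GjsMaybeOwned, reduced to the part that matters.
class MaybeOwned {
public:
    using DestroyNotify = void (*)(MaybeOwned *self, void *data);

    void root(int *thing, DestroyNotify notify, void *data) {
        m_thing = thing;
        m_notify = notify;
        m_data = data;
    }
    bool rooted() const { return m_thing != nullptr; }
    void reset() {
        m_thing = nullptr;
        m_notify = nullptr;
        m_data = nullptr;
    }

    // Called when the owning context goes away.
    void invalidate() {
        if (m_notify) {
            // Contract: the callback must call reset() itself, and it may free
            // the object that embeds *this. No member access after this call.
            m_notify(this, m_data);
            return;
        }
        reset();  // only done internally when no callback was set
    }

private:
    int *m_thing = nullptr;
    DestroyNotify m_notify = nullptr;
    void *m_data = nullptr;
};

// Embeds a MaybeOwned, in the way struct Closure does in the description.
struct Holder {
    MaybeOwned root;
    bool *freed_flag;
};

void holder_destroy_notify(MaybeOwned *self, void *data) {
    Holder *holder = static_cast<Holder *>(data);
    self->reset();               // required by the contract
    *holder->freed_flag = true;
    delete holder;               // frees the embedded MaybeOwned as well
}
```

Had `invalidate()` called `reset()` unconditionally after the callback returned, it would have written to members of an object that `holder_destroy_notify` had just deleted, which is the use-after-free the patch removes.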

Comment 85 Philip Chimento 2017-06-15 07:01:40 UTC

The above patch is a possibly better replacement for "util-root: Allow GjsMaybeOwned::DestroyNotify to free".

Comment 86 Jonas Ådahl 2017-06-15 09:49:46 UTC

*** Bug 783813 has been marked as a duplicate of this bug. ***

Comment 87 Cosimo Cecchi 2017-06-15 14:52:50 UTC

Review of attachment 353799 [details] [review]:

Thanks, I think I prefer this approach.

Comment 88 Christian Stadelmann 2017-06-15 15:53:18 UTC

(In reply to Cosimo Cecchi from comment #87)
> Review of attachment 353799 [details] [review]:
> 
> Thanks, I think I prefer this approach.

I can confirm that it works. I rebuilt my gjs package, installed it, and have been using it for a while now. Since I found no reliable way to reproduce the behavior described in comment #82, I can neither confirm nor deny that it is fixed.

Comment 89 Philip Chimento 2017-06-16 00:17:40 UTC

I can catch this one in the existing tests with -fsanitize=address
though, so I think we can consider that it solves this particular
problem.

Thanks everyone for the help! I will be releasing a new version of GJS
with these fixes from the stable branch shortly, so that distros can
update.

Attachment 353799 [details] pushed as 53e0c86 - util-root: Require GjsMaybeOwned callback to reset

Comment 90 Philip Chimento 2017-06-16 01:00:44 UTC

This is now released in GJS 1.48.4.

Comment 91 fedor 2017-06-16 16:30:42 UTC

Still getting crashes after updating gjs to 1.48.4 unfortunately

Arch Linux
gjs 1.48.4-1
gnome-shell 3.24.2-1
wayland 1.13.0-1
js38 38.8.0-3

           PID: 4396 (gnome-shell)
        Signal: 11 (SEGV)
  Command Line: /usr/bin/gnome-shell
    Executable: /usr/bin/gnome-shell
 Control Group: /user.slice/user-1000.slice/session-c4.scope
          Unit: session-c4.scope
         Slice: user-1000.slice
       Session: c4
       Message: Process 4396 (gnome-shell) of user 1000 dumped core.
                
                Stack trace of thread 4396:
                #0  0x00007fcbab260735 n/a (libgjs.so.0)
                #1  0x00007fcba937a8b5 g_main_context_dispatch (libglib-2.0.so.0)
                #2  0x00007fcba937ac78 n/a (libglib-2.0.so.0)
                #3  0x00007fcba937af92 g_main_loop_run (libglib-2.0.so.0)
                #4  0x00007fcbaab3dfdc meta_run (libmutter-0.so.0)
                #5  0x0000000000401ff7 main (gnome-shell)
                #6  0x00007fcba8d8d43a __libc_start_main (libc.so.6)
                #7  0x000000000040212a n/a (gnome-shell)
                
                Stack trace of thread 4435:
                #0  0x00007fcba911f39d pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
                #1  0x00007fcb9bb90d10 PR_WaitCondVar (libnspr4.so)
                #2  0x00007fcba417b811 n/a (libmozjs-38.so)
                #3  0x00007fcb9bb9688b n/a (libnspr4.so)
                #4  0x00007fcba9119297 start_thread (libpthread.so.0)
                #5  0x00007fcba8e5a25f __clone (libc.so.6)
                
                Stack trace of thread 4436:
                #0  0x00007fcba911f39d pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
                #1  0x00007fcb9bb90d10 PR_WaitCondVar (libnspr4.so)
                #2  0x00007fcba417b811 n/a (libmozjs-38.so)
                #3  0x00007fcb9bb9688b n/a (libnspr4.so)
                #4  0x00007fcba9119297 start_thread (libpthread.so.0)
                #5  0x00007fcba8e5a25f __clone (libc.so.6)
                
                Stack trace of thread 4434:
                #0  0x00007fcba911f39d pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
                #1  0x00007fcb9bb90d10 PR_WaitCondVar (libnspr4.so)
                #2  0x00007fcba417b811 n/a (libmozjs-38.so)
                #3  0x00007fcb9bb9688b n/a (libnspr4.so)
                #4  0x00007fcba9119297 start_thread (libpthread.so.0)
                #5  0x00007fcba8e5a25f __clone (libc.so.6)
                
                Stack trace of thread 4397:
                #0  0x00007fcba8e502bd poll (libc.so.6)
                #1  0x00007fcba937abf9 n/a (libglib-2.0.so.0)
                #2  0x00007fcba937ad0c g_main_context_iteration (libglib-2.0.so.0)
                #3  0x00007fcba937ad51 n/a (libglib-2.0.so.0)
                #4  0x00007fcba93a1ac5 n/a (libglib-2.0.so.0)
                #5  0x00007fcba9119297 start_thread (libpthread.so.0)
                #6  0x00007fcba8e5a25f __clone (libc.so.6)
                
                Stack trace of thread 4430:
                #0  0x00007fcba8e502bd poll (libc.so.6)
                #1  0x00007fcba519eee1 n/a (libpulse.so.0)
                #2  0x00007fcba51906f1 pa_mainloop_poll (libpulse.so.0)
                #3  0x00007fcba5190d8e pa_mainloop_iterate (libpulse.so.0)
                #4  0x00007fcba5190e40 pa_mainloop_run (libpulse.so.0)
                #5  0x00007fcba519ee29 n/a (libpulse.so.0)
                #6  0x00007fcb9a447fe8 n/a (libpulsecommon-10.0.so)
                #7  0x00007fcba9119297 start_thread (libpthread.so.0)
                #8  0x00007fcba8e5a25f __clone (libc.so.6)
                
                Stack trace of thread 4438:
                #0  0x00007fcba911f39d pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
                #1  0x00007fcb9bb90d10 PR_WaitCondVar (libnspr4.so)
                #2  0x00007fcba417b811 n/a (libmozjs-38.so)
                #3  0x00007fcb9bb9688b n/a (libnspr4.so)
                #4  0x00007fcba9119297 start_thread (libpthread.so.0)
                #5  0x00007fcba8e5a25f __clone (libc.so.6)
                
                Stack trace of thread 8496:
                #0  0x00007fcba8e553b9 syscall (libc.so.6)
                #1  0x00007fcba93bfc7a g_cond_wait_until (libglib-2.0.so.0)
                #2  0x00007fcba934f121 n/a (libglib-2.0.so.0)
                #3  0x00007fcba93a2464 n/a (libglib-2.0.so.0)
                #4  0x00007fcba93a1ac5 n/a (libglib-2.0.so.0)
                #5  0x00007fcba9119297 start_thread (libpthread.so.0)
                #6  0x00007fcba8e5a25f __clone (libc.so.6)
                
                Stack trace of thread 4400:
                #0  0x00007fcba8e502bd poll (libc.so.6)
                #1  0x00007fcba937abf9 n/a (libglib-2.0.so.0)
                #2  0x00007fcba937ad0c g_main_context_iteration (libglib-2.0.so.0)
                #3  0x00007fcb8cbfc55d n/a (libdconfsettings.so)
                #4  0x00007fcba93a1ac5 n/a (libglib-2.0.so.0)
                #5  0x00007fcba9119297 start_thread (libpthread.so.0)
                #6  0x00007fcba8e5a25f __clone (libc.so.6)
                
                Stack trace of thread 8509:
                #0  0x00007fcba8e553b9 syscall (libc.so.6)
                #1  0x00007fcba93bfc7a g_cond_wait_until (libglib-2.0.so.0)
                #2  0x00007fcba934f121 n/a (libglib-2.0.so.0)
                #3  0x00007fcba93a2464 n/a (libglib-2.0.so.0)
                #4  0x00007fcba93a1ac5 n/a (libglib-2.0.so.0)
                #5  0x00007fcba9119297 start_thread (libpthread.so.0)
                #6  0x00007fcba8e5a25f __clone (libc.so.6)
                
                Stack trace of thread 4437:
                #0  0x00007fcba911f39d pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
                #1  0x00007fcb9bb90d10 PR_WaitCondVar (libnspr4.so)
                #2  0x00007fcba417b811 n/a (libmozjs-38.so)
                #3  0x00007fcb9bb9688b n/a (libnspr4.so)
                #4  0x00007fcba9119297 start_thread (libpthread.so.0)
                #5  0x00007fcba8e5a25f __clone (libc.so.6)
                
                Stack trace of thread 4439:
                #0  0x00007fcba911f39d pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
                #1  0x00007fcb9bb90d10 PR_WaitCondVar (libnspr4.so)
                #2  0x00007fcba417b811 n/a (libmozjs-38.so)
                #3  0x00007fcb9bb9688b n/a (libnspr4.so)
                #4  0x00007fcba9119297 start_thread (libpthread.so.0)
                #5  0x00007fcba8e5a25f __clone (libc.so.6)
                
                Stack trace of thread 8507:
                #0  0x00007fcba8e553b9 syscall (libc.so.6)
                #1  0x00007fcba93bfc7a g_cond_wait_until (libglib-2.0.so.0)
                #2  0x00007fcba934f121 n/a (libglib-2.0.so.0)
                #3  0x00007fcba93a2464 n/a (libglib-2.0.so.0)
                #4  0x00007fcba93a1ac5 n/a (libglib-2.0.so.0)
                #5  0x00007fcba9119297 start_thread (libpthread.so.0)
                #6  0x00007fcba8e5a25f __clone (libc.so.6)
                
                Stack trace of thread 8508:
                #0  0x00007fcba8e553b9 syscall (libc.so.6)
                #1  0x00007fcba93bfc7a g_cond_wait_until (libglib-2.0.so.0)
                #2  0x00007fcba934f121 n/a (libglib-2.0.so.0)
                #3  0x00007fcba93a2464 n/a (libglib-2.0.so.0)
                #4  0x00007fcba93a1ac5 n/a (libglib-2.0.so.0)
                #5  0x00007fcba9119297 start_thread (libpthread.so.0)
                #6  0x00007fcba8e5a25f __clone (libc.so.6)
                
                Stack trace of thread 4441:
                #0  0x00007fcba911f39d pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
                #1  0x00007fcb9bb90d10 PR_WaitCondVar (libnspr4.so)
                #2  0x00007fcba417b811 n/a (libmozjs-38.so)
                #3  0x00007fcb9bb9688b n/a (libnspr4.so)
                #4  0x00007fcba9119297 start_thread (libpthread.so.0)
                #5  0x00007fcba8e5a25f __clone (libc.so.6)
                
                Stack trace of thread 4440:
                #0  0x00007fcba911f39d pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
                #1  0x00007fcb9bb90d10 PR_WaitCondVar (libnspr4.so)
                #2  0x00007fcba417b811 n/a (libmozjs-38.so)
                #3  0x00007fcb9bb9688b n/a (libnspr4.so)
                #4  0x00007fcba9119297 start_thread (libpthread.so.0)
                #5  0x00007fcba8e5a25f __clone (libc.so.6)
                
                Stack trace of thread 4398:
                #0  0x00007fcba8e502bd poll (libc.so.6)
                #1  0x00007fcba937abf9 n/a (libglib-2.0.so.0)
                #2  0x00007fcba937af92 g_main_loop_run (libglib-2.0.so.0)
                #3  0x00007fcba9962426 n/a (libgio-2.0.so.0)
                #4  0x00007fcba93a1ac5 n/a (libglib-2.0.so.0)
                #5  0x00007fcba9119297 start_thread (libpthread.so.0)
                #6  0x00007fcba8e5a25f __clone (libc.so.6)

Comment 92 Christian Stadelmann 2017-06-16 21:01:14 UTC

(In reply to fedor from comment #91)
> Still getting crashes after updating gjs to 1.48.4 unfortunately
> 
> Arch Linux

Your backtrace is missing debug symbols. Can you please install them and retry? Are there any steps to reproduce this crash?

Still, I think you are running into a different bug, because the one in comment #3 has a deep stack inside libmozjs, and yours does not.

Comment 93 Mark Blakeney 2017-06-16 23:45:06 UTC

I'm on Arch with the same latest versions as fedor listed above and I got an assertion crash yesterday:

Jun 16 19:00:27 pc org.gnome.Shell.desktop[553]: Gjs:ERROR:./gjs/jsapi-util-root.h:317:void GjsMaybeOwned<T>::trace(JSTracer*, const char*) [with T = JS::Value]: assertion failed: (!m_rooted)
Jun 16 19:00:27 pc systemd[1]: Started Process Core Dump (PID 2741/UID 0).


Stack trace of thread 553:
                #0  0x00007fc60d49a670 raise (libc.so.6)
                #1  0x00007fc60d49bd00 abort (libc.so.6)
                #2  0x00007fc60da9ac9d g_assertion_message (libglib-2.0.so.0)
                #3  0x00007fc60da9ad2a g_assertion_message_expr (libglib-2.0.so.0)
                #4  0x00007fc60f965c54 n/a (libgjs.so.0)
                #5  0x00007fc608851f6d n/a (libmozjs-38.so)
                #6  0x00007fc60882e5cd n/a (libmozjs-38.so)
                #7  0x00007fc608b85e90 n/a (libmozjs-38.so)
                #8  0x00007fc608bb1ede n/a (libmozjs-38.so)
                #9  0x00007fc608bb28c0 n/a (libmozjs-38.so)
                #10 0x00007fc608bb2b0d n/a (libmozjs-38.so)
                #11 0x00007fc608bb2ed4 n/a (libmozjs-38.so)
                #12 0x00007fc60f97ef49 gjs_schedule_gc_if_needed (libgjs.so.0)
                #13 0x00007fc60f97efb4 gjs_call_function_value (libgjs.so.0)
                #14 0x00007fc60f95a0b5 gjs_closure_invoke (libgjs.so.0)
                #15 0x00007fc60f971e1c n/a (libgjs.so.0)
                #16 0x00007fc60dd4cead g_closure_invoke (libgobject-2.0.so.0)
                #17 0x00007fc60dd68f1c n/a (libgobject-2.0.so.0)
                #18 0x00007fc60da75333 n/a (libglib-2.0.so.0)
                #19 0x00007fc60da748b5 g_main_context_dispatch (libglib-2.0.so.0)
                #20 0x00007fc60da74c78 n/a (libglib-2.0.so.0)
                #21 0x00007fc60da74f92 g_main_loop_run (libglib-2.0.so.0)
                #22 0x00007fc60f237fdc meta_run (libmutter-0.so.0)
                #23 0x0000000000401ff7 main (gnome-shell)
                #24 0x00007fc60d48743a __libc_start_main (libc.so.6)
                #25 0x000000000040212a n/a (gnome-shell)

Comment 94 fedor 2017-06-17 04:15:38 UTC

(In reply to Christian Stadelmann from comment #92)
> (In reply to fedor from comment #91)
> > Still getting crashes after updating gjs to 1.48.4 unfortunately
> > 
> > Arch Linux
> 
> Your backtrace is missing debug symbols. Can you please install them and
> retry? Are there any steps to reproduce this crash?
> 
> Still, I think you are running into a different bug, because the one in
> comment #3 has a deep stack inside libmozjs, and yours does not.

So I built gjs 1.48.4 with debug flags and got this:

       Message: Process 852 (gnome-shell) of user 1000 dumped core.
                
                Stack trace of thread 852:
                #0  0x00007f4f051f5735 _ZN2js9GCMethodsIP8JSObjectE16needsPostBarrierES2_ (libgjs.so.0)
                #1  0x00007f4f0330f8b5 g_main_context_dispatch (libglib-2.0.so.0)
                #2  0x00007f4f0330fc78 n/a (libglib-2.0.so.0)
                #3  0x00007f4f0330ff92 g_main_loop_run (libglib-2.0.so.0)
                #4  0x00007f4f04ad2fdc meta_run (libmutter-0.so.0)
                #5  0x0000000000401ff7 main (gnome-shell)
                #6  0x00007f4f02d2243a __libc_start_main (libc.so.6)
                #7  0x000000000040212a n/a (gnome-shell)

I got this crash approximately 5-10 minutes after start of gnome session while surfing the web in google-chrome. Is there anything else needed to be built with debug flags on?

Comment 95 Maxim 2017-06-17 10:11:19 UTC

>Is there anything else needed to be built with debug flags on?

https://wiki.archlinux.org/index.php/Debug_-_Getting_Traces

Comment 96 Philip Chimento 2017-06-18 20:14:26 UTC

Fedor: I have opened bug 783935 for this, please follow up there.

Mark Blakeney: Would it be possible to get a stack trace with debug symbols and also execute `call gjs_dumpstack()` in GDB? If so, please open a new bug. Reproducer instructions would also be very helpful.

All readers:
============
The problems originally described by the stack traces on this bug report have supposedly been fixed now. Please do not post new stack traces on this bug unless you are *sure* that they describe the original problem, and that the fix in 1.48.4 was faulty.

Of course, there may be one or even several problems still in existence that cause gnome-shell to crash for you! Here's what you can do instead.

- Check if your stack trace matches one of these bugs. These are the gnome-shell crashes currently open (or opened but later closed due to lack of information):
* bug 782464
* bug 782692
* bug 783771
* bug 783935
Post reproducer info there, stack traces with debug symbols, and output of `call gjs_dumpstack()` from GDB. If the bug was closed as INCOMPLETE but you can provide the missing information, fantastic! Feel free to reopen it.

- If none of the above bugs match your stack trace, and no-one else has reported a similar stack trace to yours in the meantime, then please open a new bug.

The reason I ask this is not to be bureaucratic or to deny that crashes are happening, but to keep the information manageable for myself as I fix these bugs. If all of the stack traces from unrelated problems are posted here, then I will lose track of which ones are fixed and which ones are not.

Thank you.

Comment 97 Cosimo Cecchi 2017-06-19 01:18:51 UTC

*** Bug 783904 has been marked as a duplicate of this bug. ***

Comment 98 jeyhunn 2017-06-26 16:02:13 UTC

gnome-shell still crashes on Wayland! Version 3.24.2.

Comment 99 jeyhunn 2017-06-26 16:04:29 UTC

Created attachment 354522 [details]
Still gnome-shell crash on Wayland!! Version 3.24.2

Comment 100 Philip Chimento 2017-06-26 17:57:30 UTC

Jeyhunn: Thanks for the report, but I'm afraid it is not very helpful in that form. Can you please read comment 96 and check if your stack trace matches one of the crasher bugs listed there?

Comment 101 jeyhunn 2017-06-26 18:47:45 UTC

(In reply to Philip Chimento from comment #100)
> Jeyhunn: Thanks for the report, but I'm afraid it is not very helpful in
> that form. Can you please read comment 96 and check if your stack trace
> matches one of the crasher bugs listed there?

Hi Philip Chimento,
Seems like bug 783771, but gnome-shell suddenly crashed after playing HTML5 video (YouTube) in Chrome.

Comment 102 jeyhunn 2017-06-26 18:49:18 UTC

Created attachment 354530 [details]
Systemd CoreDump Info for Gnome-Shell crash

Comment 103 Philip Chimento 2017-06-26 19:10:41 UTC

It is almost certainly bug 783771, please follow up there. If you can provide the missing information it would be helpful.