GNOME Bugzilla – Bug 781799
gnome-shell 3.24.1 crash on wayland
Last modified: 2017-06-26 19:10:41 UTC
Created attachment 350505 [details] output of journalctl -xe

OS: Archlinux
Linux 4.10.11
Gnome 3.24.1
Wayland 1.13.0

With my dual-monitor setup (laptop + 21" hp monitor), gnome-shell randomly crashes with the following error message:

Apr 26 22:15:00 plugsuite python3[7044]: Error reading events from display: Connection reset by peer
Apr 26 22:15:00 plugsuite unknown[6667]: Error reading events from display: Broken pipe
Apr 26 22:15:00 plugsuite org.gnome.Shell.desktop[914]: (EE)
Apr 26 22:15:00 plugsuite org.gnome.Shell.desktop[914]: Fatal server error:
Apr 26 22:15:00 plugsuite org.gnome.Shell.desktop[914]: (EE) failed to read Wayland events: Broken pipe
Apr 26 22:15:00 plugsuite org.gnome.Shell.desktop[914]: (EE)

After that it brings me to the GDM login screen. The only way I have to reproduce this issue is to start a wayland session from gdm and wait a random amount of time. This does not happen on Xorg. I have provided as an attachment the journalctl log.
Seems to be gjs trying to free something it shouldn't, possibly during GC. By "random amount of time", do you mean a random number of seconds, minutes, hours or days? Philip, does this look similar to the other gjs-related issues?
If it's GNOME 3.24.1, then it's likely GJS 1.48.1 which still suffered from bug 781194. The stack trace looks like it could be the same, but it's hard to tell for sure without debug symbols. If this is GJS 1.48.2, then it's a new issue. A stack trace with debug symbols would be most helpful in that case.
I use only one monitor, but I think it is the same bug. This is the stack trace from my machine:

Stack trace of thread 369:
#0 0x00007f548cd62a10 raise (libc.so.6)
#1 0x00007f548cd6413a abort (libc.so.6)
#2 0x00007f548cda12b0 __libc_message (libc.so.6)
#3 0x00007f548cda790e malloc_printerr (libc.so.6)
#4 0x00007f548cda811e _int_free (libc.so.6)
#5 0x00007f548f22d9a3 _ZN13GjsMaybeOwnedIN2JS5ValueEE16teardown_rootingEv (libgjs.so.0)
#6 0x00007f5488456183 _ZN8JSObject8finalizeEPN2js6FreeOpE (libmozjs-38.so)
#7 0x00007f54884b0d7c _ZN2js2gc10ArenaLists16forceFinalizeNowEPNS_6FreeOpENS0_9AllocKindENS1_14KeepArenasEnumEPPNS0_11ArenaHeaderE (libmozjs-38.so)
#8 0x00007f5488457489 _ZN2js2gc10ArenaLists11finalizeNowEPNS_6FreeOpENS0_9AllocKindENS1_14KeepArenasEnumEPPNS0_11ArenaHeaderE (libmozjs-38.so)
#9 0x00007f548846cbe3 _ZN2js2gc9GCRuntime22beginSweepingZoneGroupEv (libmozjs-38.so)
#10 0x00007f548846d622 _ZN2js2gc9GCRuntime15beginSweepPhaseEb (libmozjs-38.so)
#11 0x00007f548846f398 _ZN2js2gc9GCRuntime23incrementalCollectSliceERNS_11SliceBudgetEN2JS8gcreason6ReasonE (libmozjs-38.so)
#12 0x00007f548846fd40 _ZN2js2gc9GCRuntime7gcCycleEbRNS_11SliceBudgetEN2JS8gcreason6ReasonE (libmozjs-38.so)
#13 0x00007f548846ff8d _ZN2js2gc9GCRuntime7collectEbNS_11SliceBudgetEN2JS8gcreason6ReasonE (libmozjs-38.so)
#14 0x00007f5488470354 _ZN2js2gc9GCRuntime7startGCE18JSGCInvocationKindN2JS8gcreason6ReasonEl (libmozjs-38.so)
#15 0x00007f548f241df9 gjs_schedule_gc_if_needed (libgjs.so.0)
#16 0x00007f548f241e64 gjs_call_function_value (libgjs.so.0)
#17 0x00007f548f21cfe5 gjs_closure_invoke (libgjs.so.0)
#18 0x00007f548f234cdc closure_marshal (libgjs.so.0)
#19 0x00007f548d614f75 g_closure_invoke (libgobject-2.0.so.0)
#20 0x00007f548d626f82 n/a (libgobject-2.0.so.0)
#21 0x00007f548d62fbdc g_signal_emit_valist (libgobject-2.0.so.0)
#22 0x00007f548d62ffbf g_signal_emit (libgobject-2.0.so.0)
#23 0x00007f548d6193a4 n/a (libgobject-2.0.so.0)
#24 0x00007f548d618c46 n/a (libgobject-2.0.so.0)
#25 0x00007f548d61d130 g_object_set_property (libgobject-2.0.so.0)
#26 0x00007f548f229557 set_g_param_from_prop (libgjs.so.0)
#27 0x00007f5488181972 _ZN2js22CallJSPropertyOpSetterEP9JSContextPFbS1_N2JS6HandleIP8JSObjectEENS3_I4jsidEEbNS2_13MutableHandleINS2_5ValueEEEES6_S8_bSB_ (libmozjs-38.so)
#28 0x00007f548815c0c9 NativeSet (libmozjs-38.so)
#29 0x00007f548815e8db SetExistingProperty (libmozjs-38.so)
#30 0x00007f54882ae07f _ZN2js11SetPropertyEP9JSContextN2JS6HandleIP8JSObjectEES6_NS3_I4jsidEENS2_13MutableHandleINS2_5ValueEEEb (libmozjs-38.so)
#31 0x00007f548fbbd186 n/a (n/a)

OS: ArchLinux
linux 4.10.11-1
gnome-shell 3.24.1+2+g45c2627d4-1
gjs 1.48.2-1
js38 38.8.0-3
With GJS 1.48.2, this is likely not the same as bug 781194. It looks like a new one; I'll reassign this to GJS. Is there any information on how often this happens, or any common circumstances in which it is triggered? Is it possible to get a stack trace with full debug info? I would need to know the output of `p *this` at frame 5 in any case, and getting demangled names with line numbers in the stack trace would also be very helpful.
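For anyone collecting traces: the flattened systemd-coredump output pasted in the previous comment can be split back into frames mechanically. What follows is a minimal Python sketch and not part of GJS or any GNOME tooling; the regular expression and the helper names `parse_frames` and `frames_in` are assumptions made for illustration, and the sample string reuses frames #4 to #6 from that trace.

```python
import re

# One frame as printed by systemd-coredump: "#5 0x00007f... symbol (library)".
# The pattern is an assumption based on the traces pasted in this report.
FRAME_RE = re.compile(r"#(\d+)\s+(0x[0-9a-f]+)\s+(\S+)\s+\(([^)]+)\)")


def parse_frames(text):
    """Return (index, address, symbol, library) tuples found in a flattened trace."""
    return [(int(n), int(addr, 16), sym, lib)
            for n, addr, sym, lib in FRAME_RE.findall(text)]


def frames_in(frames, library_prefix):
    """Keep only the frames whose library name starts with library_prefix."""
    return [frame for frame in frames if frame[3].startswith(library_prefix)]


# Frames #4 to #6 of the trace from the previous comment, on one line as pasted.
sample = (
    "#4 0x00007f548cda811e _int_free (libc.so.6) "
    "#5 0x00007f548f22d9a3 _ZN13GjsMaybeOwnedIN2JS5ValueEE16teardown_rootingEv (libgjs.so.0) "
    "#6 0x00007f5488456183 _ZN8JSObject8finalizeEPN2js6FreeOpE (libmozjs-38.so)"
)

frames = parse_frames(sample)
gjs_frames = frames_in(frames, "libgjs")
for index, address, symbol, library in gjs_frames:
    print(f"#{index} {address:#x} {symbol} ({library})")
```

Passing a mangled symbol through `c++filt` from binutils should give the readable form; the libgjs frame above corresponds to `GjsMaybeOwned<JS::Value>::teardown_rooting()`. Line numbers still require debug symbols, which this sketch cannot supply.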
The crashes are random (once it happened after four hours without any user action), so I can't reproduce them intentionally. It happened three or four times last week for me in everyday use. I will report back if I get more information.
*** Bug 782058 has been marked as a duplicate of this bug. ***
Created attachment 351009 [details] Stack trace
Created attachment 351010 [details] Stack trace - full
Today gnome-shell crashed with the message:

08:49:59 kernel: gnome-shell[364]: segfault at 51 ip 00007f2e9a98b98b sp 00007fff2e1d3020 error 6 in libgjs.so.0.0.0[7f2e9a950000+bc000]

I have attached a stack trace. Is there anything I should look for in the coredump?
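The kernel line above already carries a little information by itself. As a rough sketch, assuming the usual x86 page-fault error-code bits (bit 0: protection violation rather than a missing page, bit 1: write access, bit 2: user mode) and assuming the bracketed value is the start of the mapping the fault fell into, it can be decoded like this. The function name `decode_segfault` is made up for this example.

```python
import re

# Layout of the kernel's segfault message as it appears in this report.
# All numeric fields are treated as hexadecimal, which is an assumption
# about the kernel's format string.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Split a kernel segfault line into its fields and decode the error bits."""
    match = SEGFAULT_RE.search(line)
    if match is None:
        raise ValueError("not a kernel segfault line")
    err = int(match["err"], 16)
    ip = int(match["ip"], 16)
    base = int(match["base"], 16)
    return {
        "object": match["obj"],
        "fault_address": int(match["addr"], 16),
        # Offset of the faulting instruction inside the reported mapping.
        "offset": ip - base,
        "page_present": bool(err & 1),  # 0 means the page was not mapped
        "write": bool(err & 2),         # 1 means the access was a write
        "user_mode": bool(err & 4),     # 1 means the fault came from user space
    }


line = ("08:49:59 kernel: gnome-shell[364]: segfault at 51 ip 00007f2e9a98b98b "
        "sp 00007fff2e1d3020 error 6 in libgjs.so.0.0.0[7f2e9a950000+bc000]")
info = decode_segfault(line)
print(info["object"], hex(info["offset"]), info)
```

Under those assumptions, this crash was a user-mode write to an unmapped page at the very low address 0x51, with the faulting instruction 0x3b98b bytes into the libgjs mapping. That would be consistent with writing through a null or garbage pointer, though it does not prove it. Whether the offset can be passed directly to a tool such as `addr2line` depends on how the library's segments are mapped, so treat it as a hint only.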
Is it possible to find out the values of the fields of *this (of type GjsMaybeOwned) in frame 3 and/or 4?
(gdb) p *((GjsMaybeOwned<JS::Value>*) 0x5f76770) $5 = {m_rooted = 40, m_has_weakref = 229, m_cx = 0x5f35a50, m_heap = {<js::HeapBase<JS::Value>> = {<js::ValueOperations<JS::Heap<JS::Value> >> = {<No data fields>}, <No data fields>}, ptr = {data = { asBits = 99834448, debugView = {payload47 = 99834448, tag = 0}, s = {payload = {i32 = 99834448, u32 = 99834448, why = 99834448}}, asDouble = 4.9324771028324344e-316, asPtr = 0x5f35a50, asWord = 99834448, asUIntPtr = 99834448}}}, m_root = 0x5f35c50, m_notify = 0x771e528, m_data = 0x0}
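A quick sanity check of the values in that gdb output, written as a small Python sketch. It assumes that `m_rooted` and `m_has_weakref` are declared as `bool` in GjsMaybeOwned (not confirmed in this report, though the `(!m_rooted)` assertion seen elsewhere suggests a flag), and the function name `suspicious_fields` and the dictionary key `m_heap_asBits` are made up for the example. The numbers are copied from the dump above.

```python
# Values copied from the gdb output above.
dumped = {
    "m_rooted": 40,
    "m_has_weakref": 229,
    "m_cx": 0x5F35A50,
    "m_heap_asBits": 99834448,
}

# Assumption: these two members are bool flags in GjsMaybeOwned.
ASSUMED_BOOL_FIELDS = ("m_rooted", "m_has_weakref")


def suspicious_fields(obj):
    """List the reasons why the dumped object does not look like a live one."""
    findings = []
    for name in ASSUMED_BOOL_FIELDS:
        if obj[name] not in (0, 1):
            findings.append(f"{name} = {obj[name]} is not a valid bool")
    if obj["m_heap_asBits"] == obj["m_cx"]:
        findings.append("m_heap holds the same bits as the m_cx pointer")
    return findings


for finding in suspicious_fields(dumped):
    print(finding)
```

Both flag fields hold values other than 0 and 1, and the raw bits of `m_heap` (99834448) are exactly the `m_cx` pointer value 0x5f35a50. That pattern is what one might expect if the memory had been freed and reused, but the sketch only flags the inconsistency; it cannot say when the object was overwritten.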
I have also experienced this issue. I can report that it is not specific to Wayland, though as people note, when running in Wayland you lose everything because your entire session is closed. If you are running X11 (which I have just switched back to since the problem started), gnome-shell is restarted and I don't lose all my work in progress (since X11 stays up). I've had 4 crashes today: 2 in Wayland, and 2 since I switched to Gnome on X11.

Arch package versions:
* gjs 1.48.2-1
* js38 38.8.0-3
* gnome-shell 3.24.1+2+g45c2627d4-1
Created attachment 351144 [details] Another journalctl stacktrace (just in case it is helpful)
Mike, your stack trace is different again from the existing two. Any chance you could install debug symbols and wait for another crash to see if it's the same stack trace? The frame of interest is the one on the very top, inside libgjs after g_main_context_dispatch(). Vladimir, thanks, that's very helpful. Looks like the object consists of garbage. That might point to use-after-free. If you are familiar with the RR debugger (or want to become so), you could try running gnome-shell under RR, and stepping backwards to see at what point the object becomes garbage. CC Georges, since you suffered from the previous crashes on Arch. Have you seen this crash at all?
(In reply to Philip Chimento from comment #14) > Mike, your stack trace is different again from the existing two. Any chance > you could install debug symbols and wait for another crash to see if it's > the same stack trace? The frame of interest is the one on the very top, > inside libgjs after g_main_context_dispatch(). Apologies, I didn't really examine the stack traces posted previously, I just read the comments and they seemed to indicate a match. I will try to work out how to install the debug symbols when I am back at the problematic computer tomorrow.
(In reply to Mike Javorski from comment #15) > (In reply to Philip Chimento from comment #14) > > Mike, your stack trace is different again from the existing two. Any chance > > you could install debug symbols and wait for another crash to see if it's > > the same stack trace? The frame of interest is the one on the very top, > > inside libgjs after g_main_context_dispatch(). > > Apologies, I didn't really examine the stack traces posted previously, I > just read the comments and they seemed to indicate a match. I will try to > work out how to install the debug symbols when I am back at the problematic > computer tomorrow. I didn't mean that as a criticism! The more different stack traces, the more we (hopefully) know about this bug...
No problem Philip. In my world (ruby and golang mostly) different stacktrace generally means a different problem. I will see what I can do about the debug symbols in the morning and see if I can produce a crash or two with more info. - mike
(In reply to Philip Chimento from comment #14) > CC Georges, since you suffered from the previous crashes on Arch. Have you > seen this crash at all? I had this happening on my machine twice, but I also couldn't reliably reproduce it. And it didn't happen frequently enough to bother me, so I didn't dig deeper. I'll try and see what's happening during this weekend, thanks for pointing me to this bug.
Is this for collecting all segfaults in libgjs since 3.24 update? Ok here is mine:
+ Trace 237422
Distro: Arch Linux Package gjs 1.48.2-1
*** Bug 782060 has been marked as a duplicate of this bug. ***
(In reply to Philip Chimento from comment #14) > Vladimir, thanks, that's very helpful. Looks like the object consists of > garbage. That might point to use-after-free. If you are familiar with the RR > debugger (or want to become so), you could try running gnome-shell under RR, > and stepping backwards to see at what point the object becomes garbage. Unfortunately, my old Core 2 Duo isn't supported by RR. If I can help with something else, I am ready.
Another trace that seems somewhat related. It crashes during garbage collection. It was originally posted in bug 781975.
+ Trace 237467
Thread 1 (Thread 0x7f3e18816f80 (LWP 29811))
*** Bug 782589 has been marked as a duplicate of this bug. ***
I can confirm the reproducer in bug #782060, which also causes the backtrace in attachment 351010 [details]. This crash also happens on other occasions and causes data loss.
Unfortunately I have not observed this yet. Anyone who comments to confirm this bug - please provide at least the following information:

- Linux distribution
- gnome-shell version
- mozjs38 version
- gjs version
- Wayland or X or both?

If anyone who can reproduce the crash reliably would like to help, the best thing you could do would be investigate with RR and find the point at which the GjsMaybeOwned<JS::Value> object becomes garbage, as described in comment 14.
(In reply to Philip Chimento from comment #25) > Unfortunately I have not observed this yet. Have you tried the reproducer in bug #782060? > Anyone who comments to confirm > this bug - please provide at least the following information: Fedora 26 gnome-shell-3.24.2-1.fc26.x86_64 mutter-3.24.2-1.fc26.x86_64 gjs-1.48.3-1.fc26.x86_64 mozjs38-38.8.0-4.fc26.x86_64 libwayland-server-1.13.0-1.fc26.x86_64 GNOME/Wayland session. > If anyone who can reproduce the crash reliably would like to help, the best > thing you could do would be investigate with RR and find the point at which > the GjsMaybeOwned<JS::Value> object becomes garbage, as described in comment > 14. What is this RR tool, how do I get it and how do I use it? Do you have any commands I can run there? I don't think it is even packaged for my distro. On Fedora 26, there is a "gnome-valgrind-session" which should – in theory – help debugging this. If someone here uses Fedora, maybe he/she can fix it: https://apps.fedoraproject.org/packages/gnome-valgrind-session/bugs/
(In reply to Philip Chimento from comment #25)
> Unfortunately I have not observed this yet. Anyone who comments to confirm
> this bug - please provide at least the following information:
>
> - Linux distribution

Fedora 26 (beta I guess)

> - gnome-shell version

gnome-shell-3.24.2-1.fc26.x86_64

> - mozjs38 version

mozjs38-38.8.0-4.fc26.x86_64

> - gjs version

gjs-1.48.3-1.fc26.x86_64

> - Wayland or X or both?

Using Wayland.

Just like in one of the duplicates, I reproduced this twice in a row with the same assertion by launching a clutter-gst based video player (it was sushi in the dupe, totem for me).

org.gnome.Shell.desktop[18358]: Gjs:ERROR:./gjs/jsapi-util-root.h:317:void GjsMaybeOwned<T>::trace(JSTracer*, const char*) [with T = JS::Value]: assertion failed: (!m_rooted)

#0 0x00007f66b021b7bb __GI_raise (libc.so.6)
#1 0x00007f66b021d5d1 __GI_abort (libc.so.6)
#2 0x00007f66b1def75d g_assertion_message (libglib-2.0.so.0)
#3 0x00007f66b1def7ea g_assertion_message_expr (libglib-2.0.so.0)
#4 0x00007f66b9594724 n/a (libgjs.so.0)
#5 0x00007f66ae96a8cf _ZN2js8GCMarker19processMarkStackTopERNS_11SliceBudgetE (libmozjs-38.so)
#6 0x00007f66ae945f5d _ZN2js8GCMarker14drainMarkStackERNS_11SliceBudgetE (libmozjs-38.so)
#7 0x00007f66aec95e94 _ZN2js2gc9GCRuntime14drainMarkStackERNS_11SliceBudgetENS_7gcstats5PhaseE (libmozjs-38.so)
#8 0x00007f66aecc4ba1 _ZN2js2gc9GCRuntime23incrementalCollectSliceERNS_11SliceBudgetEN2JS8gcreason6ReasonE (libmozjs-38.so)
#9 0x00007f66aecc55d2 _ZN2js2gc9GCRuntime7gcCycleEbRNS_11SliceBudgetEN2JS8gcreason6ReasonE (libmozjs-38.so)
#10 0x00007f66aecc582d _ZN2js2gc9GCRuntime7collectEbNS_11SliceBudgetEN2JS8gcreason6ReasonE (libmozjs-38.so)
#11 0x00007f66b95ad739 gjs_schedule_gc_if_needed (libgjs.so.0)
#12 0x00007f66b95ad7a4 gjs_call_function_value (libgjs.so.0)
#13 0x00007f66b9588895 gjs_closure_invoke (libgjs.so.0)
#14 0x00007f66b95a056e n/a (libgjs.so.0)
#15 0x00007f66b20a130d g_closure_invoke (libgobject-2.0.so.0)
#16 0x00007f66b20b398e signal_emit_unlocked_R (libgobject-2.0.so.0)
#17 0x00007f66b20bc1a5 g_signal_emit_valist (libgobject-2.0.so.0)
#18 0x00007f66b20bcb0f g_signal_emit (libgobject-2.0.so.0)
#19 0x00007f66b20a5594 g_object_dispatch_properties_changed (libgobject-2.0.so.0)
#20 0x00007f66b20a4f3e g_object_notify_queue_thaw (libgobject-2.0.so.0)
#21 0x00007f66b20a94de g_object_set_property (libgobject-2.0.so.0)
#22 0x00007f66b95951b6 n/a (libgjs.so.0)
#23 0x00007f66ae9db442 _ZN2js5Shape3setEP9JSContextN2JS6HandleIPNS_12NativeObjectEEENS4_IP8JSObjectEEbNS3_13MutableHandleINS3_5ValueEEE (libmozjs-38.so)
#24 0x00007f66ae9b5499 _ZL9NativeSetP9JSContextN2JS6HandleIPN2js12NativeObjectEEENS2_IP8JSObjectEENS2_IPNS3_5ShapeEEEbNS1_13MutableHandleINS1_5ValueEEE (libmozjs-38.so)
#25 0x00007f66ae9b7c28 _ZN2js17NativeSetPropertyEP9JSContextN2JS6HandleIPNS_12NativeObjectEEENS3_IP8JSObjectEENS3_I4jsidEENS_13QualifiedBoolENS2_13MutableHandleINS2_5ValueEEEb (libmozjs-38.so)
#26 0x00007f66ae9b8a06 _ZL17SetObjectPropertyP9JSContext4JSOpN2JS6HandleINS2_5ValueEEENS3_I4jsidEENS2_13MutableHandleIS4_EE (libmozjs-38.so)
#27 0x00007f66ae9ab58f _ZL9InterpretP9JSContextRN2js8RunStateE (libmozjs-38.so)
#28 0x00007f66ae9b4324 _ZN2js9RunScriptEP9JSContextRNS_8RunStateE (libmozjs-38.so)
#29 0x00007f66ae9b4614 _ZN2js6InvokeEP9JSContextN2JS8CallArgsENS_14MaybeConstructE (libmozjs-38.so)
#30 0x00007f66ae9b5243 _ZN2js6InvokeEP9JSContextRKN2JS5ValueES5_jPS4_NS2_13MutableHandleIS3_EE (libmozjs-38.so)
#31 0x00007f66aec0dbfb _ZN2js3jit14InvokeFunctionEP9JSContextN2JS6HandleIP8JSObjectEEjPNS3_5ValueES9_ (libmozjs-38.so)
#32 0x00007f66bab2a134 n/a (n/a)
(In reply to Bastien Nocera from comment #27)
<snip>
> Just like in one of the duplicates, I reproduced this twice in a row with
> the same assertion by launching a clutter-gst based video player (it was
> sushi in the dupe, totem for me).

1. In nautilus, open a folder with a video
2. Press Enter to launch the video
3. Press 'q' in totem to exit
4. Go back to 2.

I can make it crash like that in under a minute.
(In reply to Bastien Nocera from comment #28)
> 1. In nautilus, open a folder with a video
> 2. Press Enter to launch the video
> 3. Press 'q' in totem to exit
> 4. Go back to 2.
>
> I can make it crash like that in under a minute.

This works (breaks) very well on my system, too!

Parabola GNU+Linux-libre (Arch Linux derivative)
gnome-shell 3.24.2-1
js38 38.8.0-3
gjs 1.48.3-1

Reproducible under both X and Wayland.
*** Bug 782827 has been marked as a duplicate of this bug. ***
Created attachment 352149 [details] Stack trace
Hi, I think I have the same bug. This is my system configuration:

-Manjaro 17.0.
-GNOME-Shell: 3.24.2-1
-js38: 38.8.0-3
-gjs: 1.48.3-1
-glibc: 2.25-1

I have two computers. One of them is my main computer, a desktop equipped with an ASUS P5K motherboard, an NVIDIA GTX 1050 GPU and an Intel Core 2 Quad Q8300 CPU. On it I use the proprietary blob driver, version 375.66-1 from the Manjaro repos, Linux 4.9 and Intel Microcode. The second computer is an old Toshiba Satellite Pro P200 laptop, with an Intel Core 2 Duo T7300 and an ATI Mobility Radeon HD 2600 GPU, identified as AMD® Rv630 by the free driver stack (Linux 4.9 LTS and Mesa 17.0.5). I use Intel Microcode there too.

The environment crashes randomly. It recovers perfectly after crashing, but it is very annoying when the bug occurs. I couldn't capture any log or message that shows the exact error; the only thing I am sure of is that the bug always occurs when I have a number of windows spread across several virtual desktops. Yes, the bug occurs on Mesa too, assuming it is the same problem as the one discussed here. Wayland looks much more stable than Xorg, but the bug happens on both servers.

This is what I get from journalctl on my desktop:

may 19 12:57:54 manjarog-p5k polkitd[577]: Registered Authentication Agent for unix-session:c2 (system bus name :1.117 [/usr/bin/gnome-shell], object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale es_ES.UTF-8)
may 19 12:57:54 manjarog-p5k systemd-coredump[7094]: Process 3002 (gnome-shell) of user 1000 dumped core.
Stack trace of thread 3002:
#0 0x00007f28aa658ed5 n/a (libgjs.so.0)
#1 0x00007f28a877666a g_main_context_dispatch (libglib-2.0.so.0)
#2 0x00007f28a8776a20 n/a (libglib-2.0.so.0)
#3 0x00007f28a8776d42 g_main_loop_run (libglib-2.0.so.0)
#4 0x00007f28a9f36d0c meta_run (libmutter-0.so.0)
#5 0x0000000000401ff7 main (gnome-shell)
#6 0x00007f28a818a511 __libc_start_main (libc.so.6)
#7 0x000000000040212a n/a (gnome-shell)

Stack trace of thread 3055:
#0 0x00007f28a851b756 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
#1 0x00007f289af93500 PR_WaitCondVar (libnspr4.so)
#2 0x00007f28a3578811 n/a (libmozjs-38.so)
#3 0x00007f289af98d8c n/a (libnspr4.so)
#4 0x00007f28a85152e7 start_thread (libpthread.so.0)
#5 0x00007f28a825654f __clone (libc.so.6)

Stack trace of thread 3060:
#0 0x00007f28a851b756 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
#1 0x00007f289af93500 PR_WaitCondVar (libnspr4.so)
#2 0x00007f28a3578811 n/a (libmozjs-38.so)
#3 0x00007f289af98d8c n/a (libnspr4.so)
#4 0x00007f28a85152e7 start_thread (libpthread.so.0)
#5 0x00007f28a825654f __clone (libc.so.6)

Stack trace of thread 3003:
#0 0x00007f28a824c67d poll (libc.so.6)
#1 0x00007f28a87769b6 n/a (libglib-2.0.so.0)
#2 0x00007f28a8776acc g_main_context_iteration (libglib-2.0.so.0)
#3 0x00007f28a8776b11 n/a (libglib-2.0.so.0)
#4 0x00007f28a879e295 n/a (libglib-2.0.so.0)
#5 0x00007f28a85152e7 start_thread (libpthread.so.0)
#6 0x00007f28a825654f __clone (libc.so.6)

Stack trace of thread 3004:
#0 0x00007f28a824c67d poll (libc.so.6)
#1 0x00007f28a87769b6 n/a (libglib-2.0.so.0)
#2 0x00007f28a8776d42 g_main_loop_run (libglib-2.0.so.0)
#3 0x00007f28a8d5dff6 n/a (libgio-2.0.so.0)
#4 0x00007f28a879e295 n/a (libglib-2.0.so.0)
#5 0x00007f28a85152e7 start_thread (libpthread.so.0)
#6 0x00007f28a825654f __clone (libc.so.6)

Stack trace of thread 3061:
#0 0x00007f28a851b756 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
#1 0x00007f289af93500 PR_WaitCondVar (libnspr4.so)
#2 0x00007f28a3578811 n/a (libmozjs-38.so)
#3 0x00007f289af98d8c n/a (libnspr4.so)
#4 0x00007f28a85152e7 start_thread (libpthread.so.0)
#5 0x00007f28a825654f __clone (libc.so.6)

Stack trace of thread 3058:
#0 0x00007f28a851b756 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
#1 0x00007f289af93500 PR_WaitCondVar (libnspr4.so)
#2 0x00007f28a3578811 n/a (libmozjs-38.so)
#3 0x00007f289af98d8c n/a (libnspr4.so)
#4 0x00007f28a85152e7 start_thread (libpthread.so.0)
#5 0x00007f28a825654f __clone (libc.so.6)

Stack trace of thread 3026:
#0 0x00007f28a824c67d poll (libc.so.6)
#1 0x00007f28a459bee1 n/a (libpulse.so.0)
#2 0x00007f28a458d6f1 pa_mainloop_poll (libpulse.so.0)
#3 0x00007f28a458dd8e pa_mainloop_iterate (libpulse.so.0)
#4 0x00007f28a458de40 pa_mainloop_run (libpulse.so.0)
#5 0x00007f28a459be29 n/a (libpulse.so.0)
#6 0x00007f2899849fe8 n/a (libpulsecommon-10.0.so)
#7 0x00007f28a85152e7 start_thread (libpthread.so.0)
#8 0x00007f28a825654f __clone (libc.so.6)

Stack trace of thread 3059:
#0 0x00007f28a851b756 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
#1 0x00007f289af93500 PR_WaitCondVar (libnspr4.so)
#2 0x00007f28a3578811 n/a (libmozjs-38.so)
#3 0x00007f289af98d8c n/a (libnspr4.so)
#4 0x00007f28a85152e7 start_thread (libpthread.so.0)
#5 0x00007f28a825654f __clone (libc.so.6)

Stack trace of thread 3056:
#0 0x00007f28a851b756 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
#1 0x00007f289af93500 PR_WaitCondVar (libnspr4.so)
#2 0x00007f28a3578811 n/a (libmozjs-38.so)
#3 0x00007f289af98d8c n/a (libnspr4.so)
#4 0x00007f28a85152e7 start_thread (libpthread.so.0)
#5 0x00007f28a825654f __clone (libc.so.6)

Stack trace of thread 3025:
#0 0x00007f28a824c67d poll (libc.so.6)
#1 0x00007f28a87769b6 n/a (libglib-2.0.so.0)
#2 0x00007f28a8776acc g_main_context_iteration (libglib-2.0.so.0)
#3 0x00007f289053e55d n/a (libdconfsettings.so)
#4 0x00007f28a879e295 n/a (libglib-2.0.so.0)
#5 0x00007f28a85152e7 start_thread (libpthread.so.0)
#6 0x00007f28a825654f __clone (libc.so.6)

Stack trace of thread 3462:
#0 0x00007f28a8251889 syscall (libc.so.6)
#1 0x00007f28a87bc32f g_cond_wait (libglib-2.0.so.0)
#2 0x00007f28a5c3568d n/a (libmutter-cogl-0.so)
#3 0x00007f28a879e295 n/a (libglib-2.0.so.0)
#4 0x00007f28a85152e7 start_thread (libpthread.so.0)
#5 0x00007f28a825654f __clone (libc.so.6)

Stack trace of thread 3057:
#0 0x00007f28a851b756 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
#1 0x00007f289af93500 PR_WaitCondVar (libnspr4.so)
#2 0x00007f28a3578811 n/a (libmozjs-38.so)
#3 0x00007f289af98d8c n/a (libnspr4.so)
#4 0x00007f28a85152e7 start_thread (libpthread.so.0)
#5 0x00007f28a825654f __clone (libc.so.6)

Stack trace of thread 3054:
#0 0x00007f28a851b756 pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
#1 0x00007f289af93500 PR_WaitCondVar (libnspr4.so)
#2 0x00007f28a3578811 n/a (libmozjs-38.so)
#3 0x00007f289af98d8c n/a (libnspr4.so)
#4 0x00007f28a85152e7 start_thread (libpthread.so.0)
#5 0x00007f28a825654f __clone (libc.so.6)

may 19 12:57:54 manjarog-p5k gnome-shell[7104]: JS WARNING: [resource:///org/gnome/gjs/modules/tweener/tweener.js 538]: reference to undefined property properties[istr].arrayIndex
may 19 12:57:54 manjarog-p5k gnome-shell[7104]: JS WARNING: [resource:///org/gnome/shell/ui/search.js 436]: reference to undefined property provider.isRemoteProvider
may 19 12:57:54 manjarog-p5k gnome-shell[7104]: No permission to trigger offline updates: Polkit.Error: GDBus.Error:org.freedesktop.PolicyKit1.Error.Failed: Action org.freedesktop.packagekit.trigger-offline-update is not registered
may 19 12:57:54 manjarog-p5k /usr/lib/gdm/gdm-x-session[2956]: (--) NVIDIA(GPU-0): DFP-0: disconnected
may 19 12:57:54 manjarog-p5k /usr/lib/gdm/gdm-x-session[2956]: (--) NVIDIA(GPU-0): DFP-0: Internal TMDS
may 19 12:57:54 manjarog-p5k /usr/lib/gdm/gdm-x-session[2956]: (--) NVIDIA(GPU-0): DFP-0: 330.0 MHz maximum pixel clock
may 19 12:57:54 manjarog-p5k /usr/lib/gdm/gdm-x-session[2956]: (--) NVIDIA(GPU-0):
may 19 12:57:54 manjarog-p5k /usr/lib/gdm/gdm-x-session[2956]: (--) NVIDIA(GPU-0): Ancor Communications Inc VX229 (DFP-1): connected
may 19 12:57:54 manjarog-p5k /usr/lib/gdm/gdm-x-session[2956]: (--) NVIDIA(GPU-0): Ancor Communications Inc VX229 (DFP-1): Internal TMDS
may 19 12:57:54 manjarog-p5k /usr/lib/gdm/gdm-x-session[2956]: (--) NVIDIA(GPU-0): Ancor Communications Inc VX229 (DFP-1): 600.0 MHz maximum pixel clock
may 19 12:57:54 manjarog-p5k /usr/lib/gdm/gdm-x-session[2956]: (--) NVIDIA(GPU-0):
may 19 12:57:54 manjarog-p5k /usr/lib/gdm/gdm-x-session[2956]: (--) NVIDIA(GPU-0): DFP-2: disconnected
may 19 12:57:54 manjarog-p5k /usr/lib/gdm/gdm-x-session[2956]: (--) NVIDIA(GPU-0): DFP-2: Internal DisplayPort
may 19 12:57:54 manjarog-p5k /usr/lib/gdm/gdm-x-session[2956]: (--) NVIDIA(GPU-0): DFP-2: 1440.0 MHz maximum pixel clock
may 19 12:57:54 manjarog-p5k /usr/lib/gdm/gdm-x-session[2956]: (--) NVIDIA(GPU-0):
may 19 12:57:54 manjarog-p5k /usr/lib/gdm/gdm-x-session[2956]: (--) NVIDIA(GPU-0): DFP-3: disconnected
may 19 12:57:54 manjarog-p5k /usr/lib/gdm/gdm-x-session[2956]: (--) NVIDIA(GPU-0): DFP-3: Internal TMDS
may 19 12:57:54 manjarog-p5k /usr/lib/gdm/gdm-x-session[2956]: (--) NVIDIA(GPU-0): DFP-3: 330.0 MHz maximum pixel clock
may 19 12:57:54 manjarog-p5k /usr/lib/gdm/gdm-x-session[2956]: (--) NVIDIA(GPU-0):

I don't know if it's enough or it would be better to publish the journalctl from my laptop too.
*** Bug 782851 has been marked as a duplicate of this bug. ***
Has anyone observed this crash on mozjs38-38.2.1rc0 rather than 38.8.0? (Or in other words, does the crash occur when using jhbuild and the gnome-3.24 modulesets?) It seems like these crashes were only reported on Arch after they upgraded their mozjs38, but I'd be happy to be proven wrong. > What is this RR tool, how do I get it and how do I use it? Do you have any > commands I can run there? I don't think it is even packaged for my distro. http://rr-project.org/ Instructions for building from source, and Fedora and Ubuntu binary packages, are provided there. Read comment 14 for the information that would help here.
(In reply to Philip Chimento from comment #34) > Has anyone observed this crash on mozjs38-38.2.1rc0 rather than 38.8.0? (Or > in other words, does the crash occur when using jhbuild and the gnome-3.24 > modulesets?) It seems like these crashes were only reported on Arch after > they upgraded their mozjs38, but I'd be happy to be proven wrong. I'm seeing the crash on Fedora (as do some other people), with mozjs38-38.8.0-4.fc26.x86_64, which is extracted from the https://ftp.mozilla.org/pub/firefox/releases/38.8.0esr/source/firefox-38.8.0esr.source.tar.bz2 sources. 38.8.0 has been in Fedora repositories for 8 months, so anyone using Gnome 3.24 on Fedora has the 38.8.0 builds. These are a few downstream bug reports, don't know whether that helps or not: https://bugzilla.redhat.com/show_bug.cgi?id=1451805 https://bugzilla.redhat.com/show_bug.cgi?id=1451914 https://bugzilla.redhat.com/show_bug.cgi?id=1452453 https://bugzilla.redhat.com/show_bug.cgi?id=1452901 https://bugzilla.redhat.com/show_bug.cgi?id=1451919 (missing debug symbols) > http://rr-project.org/ > > Instructions for building from source, and Fedora and Ubuntu binary > packages, are provided there. Read comment 14 for the information that would > help here. Thanks!
When will be available the patch for this? I work with GNOME Shell and my patience is ending.
(In reply to Eduardo Medina from comment #36) > When will be available the patch for this? I work with GNOME Shell and my > patience is ending. When someone fixes it. Please, refrain from posting this kind of comment in Bugzilla. It just creates noise and doesn't add to the technical discussion.
(In reply to Eduardo Medina from comment #36) > When will be available the patch for this? I work with GNOME Shell and my > patience is ending. Before you end your patience, my suggestion would be to use the X.org session, so a compositor crash does not mean the session terminates.
Xorg is less stable than Wayland on my computer. I'm a Linux gamer, so I need Xorg more than Wayland.
I spent this evening trying to run gnome-shell under rr, so far without success. The problem is that gnome-shell makes a DRM ioctl syscall which is currently not supported by rr, on both of my machines. See also this comment: https://github.com/mozilla/rr/issues/1596#issuecomment-303191420. I might try to fix this some time, but for now I am unable to use rr for this. It might be possible to sidestep the issue by not using DRM, but so far I have not been able to figure out how to force gdm/gnome-session to use the Gallium llvmpipe driver. I have been able to consistently reproduce this issue within a few seconds under Wayland with the steps from comment 28. I have not been able to reproduce this issue at all under X. If anybody wants to attempt to debug this, the way to start gnome-session under rr is to change the Exec= line in /usr/share/applications/org.gnome.Shell.desktop to invoke 'rr record'.
(In reply to Ruud van Asseldonk from comment #40) > I spent this evening trying to run gnome-shell under rr, so far without > success. Thanks so much for trying! > The problem is that gnome-shell makes a DRM ioctl syscall which is > currently not supported by rr, on both of my machines. See also this > comment: https://github.com/mozilla/rr/issues/1596#issuecomment-303191420. I > might try to fix this some time, but for now I am unable to use rr for this. I'll be trying to repro it under X for now.
(In reply to Eduardo Medina from comment #36) > When will be available the patch for this? I work with GNOME Shell and my > patience is ending. Eduardo, I'd like to ask you to assume that we are all acting in good faith here. This bug has so far proven difficult for me to reproduce, and at the moment I'm also struggling to find free time to spend on tracking it down. I know that the crashes must be frustrating, but with comments like that you are demotivating the very people who can help fix the problem.
(In reply to Philip Chimento from comment #42)
> (In reply to Eduardo Medina from comment #36)
> > When will be available the patch for this? I work with GNOME Shell and my
> > patience is ending.
>
> Eduardo, I'd like to ask you to assume that we are all acting in good faith
> here. This bug has so far proven difficult for me to reproduce, and at the
> moment I'm also struggling to find free time to spend on tracking it down.
>
> I know that the crashes must be frustrating, but with comments like that you
> are demotivating the very people who can help fix the problem.

Sorry if one of my previous comments sounded harsh. The crash occurs every 15-45 minutes (on both Xorg and Wayland) on my computer, and I use GNOME Shell for my work, so my frustration comes from this problem breaking my workflow.

These are the applications I have open, and how they are distributed across virtual desktops, when the desktop environment crashes:

-Desktop 1: LibreOffice, Firefox, Chromium and Opera.
-Desktop 2: Gedit, Files/Nautilus and Gimp and/or Krita.
-Desktop 3: Telegram and Audacious.

The bug doesn't happen if you use only one virtual desktop; you have to use more than one.
@Eduardo - I suggest you downgrade your gnome-shell packages to the previous version. That might work much better for you for the short term.
Is there a Fedora bug report for this? A crash that brings down the entire desktop is a likely candidate to be a Fedora blocker.
(In reply to Michael Catanzaro from comment #45) > Is there a Fedora bug report for this? A crash that brings down the entire > desktop is a likely candidate to be a Fedora blocker. Looks like you found the answer already ;) https://bugzilla.redhat.com/show_bug.cgi?id=1451914
*** Status as of May 24 - Please Read Before Posting ***

I am trying to reproduce this on my Fedora box but instead managed to permanently screw up my video drivers while trying to boot in VESA mode. That may take a while to fix. I will be trying to reproduce it with a VM instead.

Here's what you can do to help:

- Post your OS distro/version, versions of gnome-shell, gjs, mozjs38, and whether you are using Xorg or Wayland.

- Prove or disprove that this crash doesn't happen using mozjs38-38.2.1.rc0 (the version built by jhbuild.) I suspect that might be the case because I did a reasonable amount of testing on that version, and it was never reported by Arch users when Arch still used that version (whereas other, now fixed, crashes were reported on Arch.) It's a long shot but it would narrow down the search quite a bit, so it's worth it.

- Use RR to find the point at which the offending GjsMaybeOwned<JS::Value> object becomes garbage.
  * As Ruud found out, RR will likely not work on Wayland so you need an Xorg session. (And you need to be able to reproduce the crash under X)
  * See for more info https://github.com/mozilla/rr/issues/1596#issuecomment-303191420
  * Ctrl+Alt+F2 to a VT, run `DISPLAY=:1 XDG_SESSION_TYPE=x11 rr gnome-shell --replace`, replacing the DISPLAY value with the correct display, and Ctrl+Alt+F1 (or F7 depending on distro) back to the X server. Wait for the crash to happen. Run `rr replay`.

- Build mozjs38 with --enable-debug and reproduce the crash. This might give more useful information. Note, this is different from debug symbols! It enables a bunch of extra assertions and sanity checks at runtime. This also requires rebuilding GJS against the rebuilt mozjs38.
Created attachment 352595 [details] Just another log with stacktrace I've bumped into this with gnome-shell 3.24.2 and gjs 1.48.3
Out of desperation at losing my work every time this randomly happens, I tried upgrading to gjs 1.49.2 (commit=d74c0ab5968449c4d790e24cad694d9ad022ef7e). No use; the problem still remains.

mei 27 11:19:54 roo systemd-coredump[13071]: Process 10052 (gnome-shell) of user 1000 dumped core.

Stack trace of thread 10052:
#0 0x00007fe5da376a10 raise (libc.so.6)
#1 0x00007fe5da37813a abort (libc.so.6)
#2 0x00007fe5da3b52b0 __libc_message (libc.so.6)
#3 0x00007fe5da3bb90e malloc_printerr (libc.so.6)
#4 0x00007fe5da3bc11e _int_free (libc.so.6)
#5 0x00007fe5dc84514b _ZN13GjsMaybeOwnedIN2JS5ValueEE16teardown_rootingEv (libgjs.so.0)
#6 0x00007fe5d5a73c73 n/a (libmozjs-38.so)
#7 0x00007fe5d5acea3c n/a (libmozjs-38.so)
#8 0x00007fe5d5a74f61 n/a (libmozjs-38.so)
#9 0x00007fe5d5a8a733 n/a (libmozjs-38.so)
#10 0x00007fe5d5a8b172 n/a (libmozjs-38.so)
#11 0x00007fe5d5a8cf18 n/a (libmozjs-38.so)
#12 0x00007fe5d5a8d8c0 n/a (libmozjs-38.so)
#13 0x00007fe5d5a8db0d n/a (libmozjs-38.so)
#14 0x00007fe5d5a8de64 n/a (libmozjs-38.so)
#15 0x00007fe5dc85b016 gjs_gc_if_needed (libgjs.so.0)
#16 0x00007fe5dc850b44 trigger_gc_if_needed (libgjs.so.0)
#17 0x00007fe5da94f66a g_main_context_dispatch (libglib-2.0.so.0)
#18 0x00007fe5da94fa20 n/a (libglib-2.0.so.0)
#19 0x00007fe5da94fd42 g_main_loop_run (libglib-2.0.so.0)
#20 0x00007fe5dc110d0c meta_run (libmutter-0.so.0)
#21 0x0000000000401ff7 main (gnome-shell)
#22 0x00007fe5da363511 __libc_start_main (libc.so.6)
#23 0x000000000040212a n/a (gnome-shell)

The libraries involved are:
glibc 2.25-1
js38 38.8.0-3
gjs 1.49.2-1
glib2 2.52.2+1+gb8bd46bc8-1
mutter 3.24.2-1
gnome-shell 3.24.2-1
I can reproduce Wayland session crash following the steps from comment 28 https://bugzilla.gnome.org/show_bug.cgi?id=781799#c28 I use Antergos (Arch-based).
Created attachment 352698 [details] Crash of gnome-shell by running sushi Reproduced with: Archlinux js38 38.8.0-3 with --enable-debug flag gjs 1.48.3-1 gnome-shell 3.24.2-1 sushi-3.24.0-1 It contains some messages about assertion failures. Sushi crashed at closing because I didn't see it.
Created attachment 353154 [details] Valgrind-Memcheck report

Valgrind reports a use-after-free at ./gi/object.cpp:1747 before the crash. Corresponding code:

    static gboolean
    signal_connection_invalidate_idle(void *user_data)
    {
        ConnectData *connect_data = (ConnectData *) user_data;

        // referenced line
        connect_data->obj->signals = g_list_delete_link(connect_data->obj->signals, connect_data->link);
        g_slice_free(ConnectData, connect_data);

        return G_SOURCE_REMOVE;
    }

I am not sure that this is connected with the corruption of GjsMaybeOwned<JS::Value>, but it may be useful.
I am also experiencing this relatively frequently. I managed to capture this crash with a non-stripped version of libmozjs. Let me know if there's more info that would be helpful.

Arch Linux
js38 38.8.0-3
gjs 1.48.3-1
gnome-shell 3.24.2-1
(In reply to John from comment #53)
> I also am experience this relatively frequently. I managed to capture this
> crash with a non-stripped version of libmozjs. Let me know if there's more
> info that would be helpful.
>
> Arch Linux
> js38 38.8.0-3
> gjs 1.48.3-1
> gnome-shell 3.24.2-1

PID: 4587 (gnome-shell)
UID: 1000 (john)
GID: 1000 (john)
Signal: 6 (ABRT)
Timestamp: Fri 2017-06-09 11:56:35 EDT (17min ago)
Command Line: /usr/bin/gnome-shell
Executable: /usr/bin/gnome-shell
Control Group: /user.slice/user-1000.slice/session-c2.scope
Unit: session-c2.scope
Slice: user-1000.slice
Session: c2
Owner UID: 1000 (john)
Boot ID: d7fcb0f81a6b4b96ac9dbf5e7a0808b6
Machine ID: dceef662384b4f94ac526646ffcae8b9
Hostname: oberon
Storage: /var/lib/systemd/coredump/core.gnome-shell.1000.d7fcb0f81a6b4b96ac9dbf5e7a0808b6.4587.1497023795000000000000.lz4
Message: Process 4587 (gnome-shell) of user 1000 dumped core.

Stack trace of thread 4587:
#0 0x00007f68b8a54670 raise (libc.so.6)
#1 0x00007f68b8a55d00 abort (libc.so.6)
#2 0x00007f68b8a93551 __libc_message (libc.so.6)
#3 0x00007f68b8a99bfb malloc_printerr (libc.so.6)
#4 0x00007f68b8a9afd1 _int_free (libc.so.6)
#5 0x00007f68baf23ab3 n/a (libgjs.so.0)
#6 0x00007f68b4153af3 _ZN8JSObject8finalizeEPN2js6FreeOpE (libmozjs-38.so)
#7 0x00007f68b41ae643 _ZN2js2gc10ArenaLists16forceFinalizeNowEPNS_6FreeOpENS0_9AllocKindENS1_14KeepArenasEnumEPPNS0_11ArenaHeaderE (libmozjs-38.so)
#8 0x00007f68b4154dd1 _ZN2js2gc10ArenaLists11finalizeNowEPNS_6FreeOpENS0_9AllocKindENS1_14KeepArenasEnumEPPNS0_11ArenaHeaderE (libmozjs-38.so)
#9 0x00007f68b416b696 _ZN2js2gc9GCRuntime22beginSweepingZoneGroupEv (libmozjs-38.so)
#10 0x00007f68b416bdd8 _ZN2js2gc9GCRuntime15beginSweepPhaseEb (libmozjs-38.so)
#11 0x00007f68b416df93 _ZN2js2gc9GCRuntime23incrementalCollectSliceERNS_11SliceBudgetEN2JS8gcreason6ReasonE (libmozjs-38.so)
#12 0x00007f68b416e959 _ZN2js2gc9GCRuntime7gcCycleEbRNS_11SliceBudgetEN2JS8gcreason6ReasonE (libmozjs-38.so)
#13 0x00007f68b416eba5 _ZN2js2gc9GCRuntime7collectEbNS_11SliceBudgetEN2JS8gcreason6ReasonE (libmozjs-38.so)
#14 0x00007f68b416f102 _ZN2js2gc9GCRuntime13gcIfRequestedEP9JSContext (libmozjs-38.so)
#15 0x00007f68b3e9cad7 InvokeInterruptCallback (libmozjs-38.so)
#16 0x00007f68bb905930 n/a (n/a)
#17 0x0000000005ce15c0 n/a (n/a)
#18 0x00007f68a32a7cb2 n/a (n/a)
(In reply to John from comment #53) > I also am experience this relatively frequently. I managed to capture this > crash with a non-stripped version of libmozjs. Let me know if there's more > info that would be helpful. As written above, catching the crash with RR would be useful, in a way that you can roll back and find the moment where the data was free()d first.
(In reply to Christian Stadelmann from comment #55) > (In reply to John from comment #53) > > I also am experience this relatively frequently. I managed to capture this > > crash with a non-stripped version of libmozjs. Let me know if there's more > > info that would be helpful. > > As written above, catching the crash with RR would be useful, in a way that > you can roll back and find the moment where the data was free()d first. Sadly I can't seem to reproduce gnome-shell crashing under X11. Under X11 I see that Sushi crashes when previewing a file, whereas under Wayland Sushi crashes, but eventually also the whole gnome-shell crashes. I did manage to get rr setup and running, though, so I'll continue to try when I have some spare time.
(In reply to Vladimir Stoyakin from comment #51) > js38 38.8.0-3 with --enable-debug flag > > It contains some messages about assertion failures. > Sushi crashed at closing because I didn't see it. Sushi fails a debug-mode assertion at closing because it doesn't properly clean up its GjsContext (it needs a g_object_unref(gjs_context) at the end of main.c). This can safely be ignored for now. (In reply to Vladimir Stoyakin from comment #52) > Valgrind reports about "use-after-free" at ./gi/object.cpp:1747 before crash. > > I am not sure that this is connected with damaging of > GjsMaybeOwned<JS::Value>, but it may be useful. This is absolutely fantastic, thank you. I'm not certain it is connected either, but I would not be surprised if this is the cause. I will attach a patch shortly. (In reply to John from comment #56) > (In reply to Christian Stadelmann from comment #55) > > (In reply to John from comment #53) > > > I also am experience this relatively frequently. I managed to capture this > > > crash with a non-stripped version of libmozjs. Let me know if there's more > > > info that would be helpful. > > > > As written above, catching the crash with RR would be useful, in a way that > > you can roll back and find the moment where the data was free()d first. > > Sadly I can't seem to reproduce gnome-shell crashing under X11. Under X11 I > see that Sushi crashes when previewing a file, whereas under Wayland Sushi > crashes, but eventually also the whole gnome-shell crashes. It may well be crashing under X anyway, it just has a different effect. Under Wayland a gnome-shell crash is more catastrophic, because it takes out all your running applications too. --- Other news - I have reproduced the crash using Bastien's instructions: (In reply to Bastien Nocera from comment #28) > 1. In nautilus, open a folder with a video > 2. Press Enter to launch the video > 3. Press 'q' in totem to exit > 4. Go back to 2. > > I can make it crash like that in under a minute. 
For me it's more like 10 minutes of frenetically mashing those keys, but I have managed to observe the crash on a VM running Fedora 27 Alpha. Unfortunately RR doesn't work under VirtualBox. I have almost got my Fedora box back up and running, so I'll concentrate my effort there. First of all to double check that the patches that I'm about to attach fix the problem, and if that doesn't work then to run RR.
Created attachment 353579 [details] [review] object: Prevent use-after-free in signal connections Objects trace their signal connections in order to keep the closures alive during garbage collection. When invalidating a signal connection, we must do so in an idle function, since it is illegal to stop tracing a GC-thing in the middle of GC. However, this caused a possible use-after-free if the signal connection was invalidated, and then the object itself was finalized before the idle function could be run. This refactor avoids the use-after-free by cancelling any pending idle invalidations in the object's finalizer, and invalidating any remaining signal connections in such a way that no more idle functions are scheduled.
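To make the idea in this patch description easier to follow, here is a minimal sketch of the pattern it describes. It is a simplified model and not the actual GJS code: `IdleQueue` stands in for GLib idle sources, and `TracedObject`, `connect()`, `invalidate()` and the two demo functions are hypothetical names made up for illustration. The point it tries to show is that the object remembers which idle invalidations are still pending and cancels them in its destructor, so no idle callback can run against a freed object.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <set>
#include <utility>

// Hypothetical stand-in for main-loop idle sources (not the GLib API).
class IdleQueue {
public:
    using Id = unsigned;

    Id add(std::function<void()> fn) {
        Id id = m_next_id++;
        m_sources.emplace(id, std::move(fn));
        return id;
    }
    void cancel(Id id) { m_sources.erase(id); }
    std::size_t size() const { return m_sources.size(); }

    // Run pending callbacks one at a time, so a callback may cancel others.
    void run_all() {
        while (!m_sources.empty()) {
            auto it = m_sources.begin();
            std::function<void()> fn = std::move(it->second);
            m_sources.erase(it);
            fn();
        }
    }

private:
    Id m_next_id = 1;
    std::map<Id, std::function<void()>> m_sources;
};

// Hypothetical object that tracks its signal connections.
class TracedObject {
public:
    explicit TracedObject(IdleQueue& idle) : m_idle(idle) {}

    // "Finalizer": cancel every idle invalidation that has not run yet,
    // so none of them can touch this object after it is gone.
    ~TracedObject() {
        for (const auto& entry : m_pending)
            m_idle.cancel(entry.second);
    }

    int connect() {
        int handle = m_next_handle++;
        m_signals.insert(handle);
        return handle;
    }

    // Removal is deferred to an idle callback, mirroring the constraint
    // that tracing must not stop in the middle of garbage collection.
    void invalidate(int handle) {
        if (m_pending.count(handle))
            return;  // already scheduled
        IdleQueue::Id id = m_idle.add([this, handle] {
            m_signals.erase(handle);
            m_pending.erase(handle);
        });
        m_pending.emplace(handle, id);
    }

    std::size_t connection_count() const { return m_signals.size(); }

private:
    IdleQueue& m_idle;
    int m_next_handle = 1;
    std::set<int> m_signals;
    std::map<int, IdleQueue::Id> m_pending;  // handle -> idle source id
};

// The object is destroyed before the idle callback gets a chance to run.
std::size_t pending_after_finalize() {
    IdleQueue idle;
    {
        TracedObject obj(idle);
        obj.invalidate(obj.connect());
        assert(idle.size() == 1);
    }
    std::size_t left = idle.size();
    idle.run_all();  // nothing left to run, so no dangling pointer is used
    return left;
}

// The normal path: the idle callback runs while the object is alive.
std::size_t connections_after_idle() {
    IdleQueue idle;
    TracedObject obj(idle);
    int first = obj.connect();
    obj.connect();
    obj.invalidate(first);
    idle.run_all();
    return obj.connection_count();
}
```

Calling `pending_after_finalize()` and `connections_after_idle()` from a small `main()` exercises both paths. Without the loop in the destructor, the first scenario would leave a callback holding a dangling `this`, which is the kind of access Valgrind flagged in comment 52.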
Created attachment 353580 [details] [review] util-root: Allow GjsMaybeOwned::DestroyNotify to free In the case of a closure, the GjsMaybeOwned object is embedded as part of struct Closure. The context destroy notify callback will invalidate the closure, which frees the GjsMaybeOwned object, causing a use-after-free when the callback returns. This patch gives the callback a boolean return value; it should return true if it has freed the GjsMaybeOwned object and false if it does not. If the callback returns true, then the GjsMaybeOwned object will be considered invalid from then on.
These patches apply to both master and gnome-3-24. If you are able, please check and let me know if the crash still occurs with these patches applied.
*** Bug 783699 has been marked as a duplicate of this bug. ***
Review of attachment 353579 [details] [review]: Good catch! I wonder if it's possible to encode this in a testcase? I also have a hypothetical comment below, but feel free to push if the answer is no. ::: gi/object.cpp @@ +1402,3 @@ +{ + auto cd = static_cast<ConnectData *>(data); + cd->obj->signals.erase(cd); Is it possible for this code to get called before signal_connection_invalidate_idle() had a chance to fire? If not, then all good; otherwise you would need to only schedule the idle timeout when it hasn't been scheduled already.
Review of attachment 353580 [details] [review]: I can see how this solves the bug, but it feels to me that this could be prevented by either e.g. ref-counting GjsMaybeOwned so that it can't get freed by the callback, or moving the code that frees the object outside of the callback. I am not against this solution though, so feel free to push it if you have already considered alternatives.
*** Bug 783723 has been marked as a duplicate of this bug. ***
(In reply to Cosimo Cecchi from comment #62) > Review of attachment 353579 [details] [review] [review]: > > Good catch! I wonder if it's possible to encode this in a testcase? I tried a few things; definitely not possible to write a test case directly for it, since it won't crash reliably, but depends on the freed memory's contents. I tried to write something that would at least crash when run under -fsanitize=address as in bug 783220, but no luck, since I'm still not sure what caused this. > I also have a hypothetical comment below, but feel free to push if the > answer is no. > > ::: gi/object.cpp > @@ +1402,3 @@ > +{ > + auto cd = static_cast<ConnectData *>(data); > + cd->obj->signals.erase(cd); > > Is it possible for this code to get called before > signal_connection_invalidate_idle() had a chance to fire? If not, then all > good; otherwise you would need to only schedule the idle timeout when it > hasn't been scheduled already. The documentation seemed to imply that a closure's invalidate notifier can only ever be called once, and I double checked in the source: https://git.gnome.org/browse/glib/tree/gobject/gclosure.c#n572
Comment on attachment 353579 [details] [review] object: Prevent use-after-free in signal connections Attachment 353579 [details] pushed as 2593d3d - object: Prevent use-after-free in signal connections
(In reply to Cosimo Cecchi from comment #63) > Review of attachment 353580 [details] [review] [review]: > > I can see how this solves the bug, but it feels to me that this could be > prevented by either e.g. ref-counting GjsMaybeOwned so that it can't get > freed by the callback, or moving the code that frees the object outside of > the callback. > I am not against this solution though, so feel free to push it if you have > already considered alternatives. I had not considered those alternatives. I think the problem is that the lifetimes of the GjsMaybeOwned and the closure struct are tied together. My guess is that uncoupling those lifetimes would make things more complicated since if the callback doesn't free the closure, then the GjsMaybeOwned is responsible for freeing it. (Or the GjsMaybeOwned owns the last reference to itself.) I'll think about it a bit longer before pushing this patch. One alternative might be that if you provide a callback, then the callback is required to free the GjsMaybeOwned.
*** Status as of June 12 - Please Read Before Posting ***

My Fedora box died entirely, and it will be a while before I can open it up to repair it. I will be relying on a virtual machine for the time being, which unfortunately means I can't run RR on the crash.

Here's what you can do to help:

- Confirm or disprove that the patches attached here solve the crashes.

  If you are on Fedora, the easiest way might be to rebuild RPMs with the patches. Take the instructions here [1] as a starting point. After installing the source RPM, download the two patches from this bug and save them in the rpmbuild/SOURCES directory. Add
    Patch0: name-of-first.patch
    Patch1: name-of-second.patch
  to the rpmbuild/SPECS/gjs.spec file under `Source0`. Under `%setup` add these lines:
    %global _default_patch_fuzz 2
    %patch0 -p1
    %patch1 -p1
  Finally, increment the integer in `Release` and add `debug`, e.g. `5debug`, in order to distinguish the built packages from the system-provided version. Then when you build the RPM use the -ba option instead of -bp.
  Use `dnf install gjs-1.48.3-whatever-your-old-version-was` to go back to the prior situation.

- Any of the other things listed in comment 47.

[1] https://ask.fedoraproject.org/en/question/87205/how-do-i-install-a-src-rpm-with-dnf/
(In reply to Philip Chimento from comment #65) > (In reply to Cosimo Cecchi from comment #62) > > Review of attachment 353579 [details] [review] [review] [review]: > > > > Good catch! I wonder if it's possible to encode this in a testcase? > > I tried a few things; definitely not possible to write a test case directly > for it, since it won't crash reliably, but depends on the freed memory's > contents. I tried to write something that would at least crash when run > under -fsanitize=address as in bug 783220, but no luck, since I'm still not > sure what caused this. You could use valgrind in that test case and make it abort (=fail) at the first memory access bug.
(In reply to Philip Chimento from comment #68) > *** Status as of June 12 - Please Read Before Posting *** > > My Fedora box died entirely, and will be a while before I can open it up to > repair it. I will be relying on a virtual machine for the time being, which > unfortunately means I can't run RR on the crash. > > Here's what you can do to help: > > - Confirm or disprove that the patches attached here solve the crashes. > > If you are on Fedora, the easiest way might be to rebuild RPMs with the > patches. Take the instructions here [1] as a starting point. After > installing the source RPM, download the two patches from this bug and save > them in the rpmbuild/SOURCES directory. Add > Patch0: name-of-first.patch > Patch1: name-of-second.patch > to the rpmbuild/SPECS/gjs.spec file under `Source0`. Under `%setup` add > these lines: > %global _default_patch_fuzz 2 > %patch0 -p1 > %patch1 -p1 > Finally, increment the integer in `Release` and add `debug`, e.g. `5debug` > in order to distinguish the built packages from the system-provided version. > Then when you build the RPM use the -ba option instead of -bp. > Use `dnf install gjs-1.48.3-whatever-your-old-version-was` to go back to > the prior situation. > > - Any of the other things listed in comment 47. > > [1] > https://ask.fedoraproject.org/en/question/87205/how-do-i-install-a-src-rpm- > with-dnf/ I built gjs 1.48.3 with the patches from attachment 353579 [details] [review] and 353580 on ArchLinux. Prior to applying I was able to trigger crashes relatively easily using the preview functionality as mentioned above; after applying both patches I can no longer reproduce it. Cheers!
(In reply to Philip Chimento from comment #68) > *** Status as of June 12 - Please Read Before Posting *** > […] > > Here's what you can do to help: > > - Confirm or disprove that the patches attached here solve the crashes. […] Done that, thanks for the links by the way! With these patches applied I cannot reproduce the crash of attachment 351010 [details] from bug #782060 any more (as opposed to the situation before that, see comment #24). Fedora 26 gnome-shell-3.24.2-1.fc26.x86_64 mutter-3.24.2-1.fc26.x86_64 gjs-1.48.3-2.fc26.x86_64 with patches from attachment 353579 [details] [review] and attachment 353580 [details] [review] applied mozjs38-38.8.0-4.fc26.x86_64 libwayland-server-1.13.0-1.fc26.x86_64 GNOME/Wayland session. I did not run into any of the related crashes on a GNOME/Xorg session, but GNOME/Wayland crashed regularly, about once per hour plus on every 3rd login.
Tested with both patches applied to the F26 RPM, and I can't make it crash as easily as I used to in comment 28.
I applied the two patches yesterday, and today gnome-shell crashed while I was watching YouTube in Chrome: gnome-shell[461]: segfault at 7fac98dfffe8 ip 00007faccdde2cad sp 00007ffe1f01ba70 error 4 in libgjs.so.0.0.0[7faccddb3000+c8000] This is not very informative, just a note.
*** Bug 783769 has been marked as a duplicate of this bug. ***
With these two patches applied the armv7hl builds in the Fedora build system fail with the following: gi/arg.cpp: In function 'bool gjs_g_argument_release_in_array(JSContext*, GITransfer, GITypeInfo*, guint, GArgument*)': gi/arg.cpp:3544:35: error: cast from 'void**' to 'GValue* {aka _GValue*}' increases required alignment of target type [-Werror=cast-align] GValue *v = ((GValue*)array) + i; ^~~~~ Other arches we're building build just fine. https://kojipkgs.fedoraproject.org//work/tasks/2827/20012827/build.log has the full build log (short lived link)
(In reply to John from comment #70) > I built gjs 1.48.3 with the patches from attachment 353579 [details] [review] > [review] and 353580 on ArchLinux. Prior to applying I was able to trigger > crashes relatively easily using the preview functionality as mentioned > above; after applying both patches I can no longer reproduce it. (In reply to Christian Stadelmann from comment #71) > With these patches applied I cannot reproduce the crash of attachment 351010 [details] > [details] from bug #782060 any more (as opposed to the situation before > that, see comment #24). Thanks, both of you. So this seems to have fixed _something_, at least. I'll release a GJS 1.48.4 as soon as the second patch is merged. > I did not run into any of the related crashes on a GNOME/Xorg session, but > GNOME/Wayland crashed regularly, about once per hour plus on every 3rd login. Just to be clear, you mean with the patches applied? Or that was the situation before the patches and you are talking about the crash from bug #782060? (In reply to Bastien Nocera from comment #72) > Tested with both patches applied to the F26 RPM, and I can't make it crash > as easily as I used to in comment 28. Implying that you still get the crashes but less easily? Or you can't make it crash easily but don't want to assume that it never happens? :-) (In reply to Maxim from comment #73) > Applied two patches yesterday and today when watching youtube in Chrome. > > gnome-shell[461]: segfault at 7fac98dfffe8 ip 00007faccdde2cad sp > 00007ffe1f01ba70 error 4 in libgjs.so.0.0.0[7faccddb3000+c8000] > > It is not informative, just a note. Can you reproduce the crash? Do you happen to have the whole backtrace, preferably with debug symbols? It's not impossible that we have been chasing two different crashes; same concern as with Christian's comment above. 
(In reply to Kalev Lember from comment #75) > With these two patches applied the armv7hl builds in the Fedora build system > fail with the following: > > gi/arg.cpp: In function 'bool gjs_g_argument_release_in_array(JSContext*, > GITransfer, GITypeInfo*, guint, GArgument*)': > gi/arg.cpp:3544:35: error: cast from 'void**' to 'GValue* {aka _GValue*}' > increases required alignment of target type [-Werror=cast-align] > GValue *v = ((GValue*)array) + i; > ^~~~~ Bizarre! That code wasn't touched by these patches. Are you certain that the failure wasn't there before, or it might have been caused by a change in the compiler flags?
(In reply to Philip Chimento from comment #76) > (In reply to Bastien Nocera from comment #72) > > Tested with both patches applied to the F26 RPM, and I can't make it crash > > as easily as I used to in comment 28. > > Implying that you still get the crashes but less easily? Or you can't make > it crash easily but don't want to assume that it never happens? :-) The latter :) > (In reply to Kalev Lember from comment #75) > > With these two patches applied the armv7hl builds in the Fedora build system > > fail with the following: > > > > gi/arg.cpp: In function 'bool gjs_g_argument_release_in_array(JSContext*, > > GITransfer, GITypeInfo*, guint, GArgument*)': > > gi/arg.cpp:3544:35: error: cast from 'void**' to 'GValue* {aka _GValue*}' > > increases required alignment of target type [-Werror=cast-align] > > GValue *v = ((GValue*)array) + i; > > ^~~~~ > > Bizarre! That code wasn't touched by these patches. Are you certain that the > failure wasn't there before, or it might have been caused by a change in the > compiler flags? It's a separate problem, likely caused by GCC changes on that platform. Best filed as a separate bug.
Created attachment 353710 [details] gnome-shell crash backtrace I did experience another crash after applying the above two patches, when closing a Firefox window playing a flash video (cnn.com): Message: Process 1177 (gnome-shell) of user 1000 dumped core. Stack trace of thread 1177: #0 0x00007f4a555cf8b0 _ZNK8JSObject12lastPropertyEv (libmozjs-38.so) #1 0x00007f4a556aaf65 _ZNK8JSObject11compartmentEv (libmozjs-38.so) #2 0x00007f4a55b9dfba _ZN2js18CompartmentChecker5checkIN2JS5ValueEEEvNS2_6HandleIT_EE (libmozjs-38.so) #3 0x00007f4a5ca9aa16 gjs_call_function_value (libgjs.so.0) #4 0x00007f4a5ca657a6 gjs_closure_invoke (libgjs.so.0) #5 0x00007f4a5ca87e9e closure_marshal (libgjs.so.0) #6 0x00007f4a5ae50ead g_closure_invoke (libgobject-2.0.so.0) #7 0x00007f4a5ae634ee n/a (libgobject-2.0.so.0) #8 0x00007f4a5ae6bcd5 g_signal_emit_valist (libgobject-2.0.so.0) #9 0x00007f4a5ae6c6ef g_signal_emit (libgobject-2.0.so.0) #10 0x00007f4a5b69653c n/a (libmutter-clutter-0.so) #11 0x00007f4a5ae573d8 g_object_run_dispose (libgobject-2.0.so.0) #12 0x00007f4a5b68a2b6 clutter_actor_destroy (libmutter-clutter-0.so) #13 0x00007f4a560971c8 ffi_call_unix64 (libffi.so.6) #14 0x00007f4a56096c2a ffi_call (libffi.so.6) #15 0x00007f4a5ca6b7fe gjs_invoke_c_function (libgjs.so.0) #16 0x00007f4a5ca6d87a function_call (libgjs.so.0) #17 0x00007f4a5d442726 n/a (n/a) #18 0x0000000001d64768 n/a (n/a) #19 0x00007f49f6a54503 n/a (n/a) Attached is the full backtrace.
(In reply to John from comment #78) > Created attachment 353710 [details] > gnome-shell crash backtrace > > I did experience another crash after applying the above two patches, when > closing a Firefox window playing a flash video (cnn.com): OK, sounds like the same problem Maxim reported, but that's definitely a different crash, because the backtrace has nothing to do with garbage collection - which is good :-) I opened bug 783771 for this other crash. Please follow up there if you have more info.
(In reply to Philip Chimento from comment #76) […] > (In reply to Christian Stadelmann from comment #71) > > With these patches applied I cannot reproduce the crash of attachment 351010 [details] > > [details] from bug #782060 any more (as opposed to the situation before > > that, see comment #24). > […] > > I did not run into any of the related crashes on a GNOME/Xorg session, but > > GNOME/Wayland crashed regularly, about once per hour plus on every 3rd login. > > Just to be clear, you mean with the patches applied? Or that was the > situation before the patches and you are talking about the crash from bug > #782060? Oh, sorry, that wasn't clear. I meant that I could not reproduce the crashes with GNOME/Xorg with or without the patches applied. I could reproduce the crashes with GNOME/Wayland without the patches applied. I cannot reproduce the crashes with GNOME/Wayland with the patches applied. > (In reply to Bastien Nocera from comment #72) > > Tested with both patches applied to the F26 RPM, and I can't make it crash > > as easily as I used to in comment 28. > > Implying that you still get the crashes but less easily? Or you can't make > it crash easily but don't want to assume that it never happens? :-) The second one for me too. That's why I only wrote I cannot reproduce the sushi crash any more. In general, gnome-shell seems to run more stable anyway, so I guess the other bugs are gone too, but I am not sure either. And I cannot be sure about that because they happened randomly.
(In reply to Bastien Nocera from comment #77) > > (In reply to Kalev Lember from comment #75) > > > With these two patches applied the armv7hl builds in the Fedora build system > > > fail with the following: > > > > > > gi/arg.cpp: In function 'bool gjs_g_argument_release_in_array(JSContext*, > > > GITransfer, GITypeInfo*, guint, GArgument*)': > > > gi/arg.cpp:3544:35: error: cast from 'void**' to 'GValue* {aka _GValue*}' > > > increases required alignment of target type [-Werror=cast-align] > > > GValue *v = ((GValue*)array) + i; > > > ^~~~~ > > > > Bizarre! That code wasn't touched by these patches. Are you certain that the > > failure wasn't there before, or it might have been caused by a change in the > > compiler flags? > > It's a separate problem, likely caused by GCC changes on that platform. Best > filed as a separate bug. OK, tracked this one down and it was indeed a compiler flag change. Sorry for the noise. :) Bastien used git to apply the two patches and this made AX_IS_RELEASE([git-directory]) think it's working from a developer git checkout and it added a -Werror which tripped up the build on arm. In any case, packages with the two patches applied are now on their way to Fedora 26 updates-testing (gjs-1.48.3-3.fc26)
After installing my own build (as described in comment #71, comment #80), I am getting hundreds of these warnings logged to syslog, which had never happened before since I updated to GNOME 3.24:

gnome-shell[23838]: gclosure.c:724: unable to remove uninstalled invalidation notifier: 0x7f4970e47b50 (0x55ce3be324c0)
gnome-shell[23838]: gclosure.c:724: unable to remove uninstalled invalidation notifier: 0x7f4970e47b50 (0x55ce39e450a0)
gnome-shell[23838]: gclosure.c:724: unable to remove uninstalled invalidation notifier: 0x7f4970e47b50 (0x55ce39e45180)
gnome-shell[23838]: gclosure.c:724: unable to remove uninstalled invalidation notifier: 0x7f4970e47b50 (0x55ce3be32240)
gnome-shell[23838]: gclosure.c:724: unable to remove uninstalled invalidation notifier: 0x7f4970e47b50 (0x55ce3be3ea40)

The first address does not change; the second one does. I don't know whether that is related. Can anyone confirm or deny?
(In reply to Christian Stadelmann from comment #82) > the first address does not changes, the second one does. I don't know > whether that is related. Can anyone confirm or deny? Patches work without drawbacks for me. I don't see any new messages in log.
Created attachment 353799 [details] [review] util-root: Require GjsMaybeOwned callback to reset In the case of a closure, the GjsMaybeOwned object is embedded as part of struct Closure. The context destroy notify callback will invalidate the closure, which frees the GjsMaybeOwned object, causing a use-after-free when the callback returns and calls reset(). In practice we did not need to call reset() after the callback returns; all existing callbacks already call reset(). This patch adds a requirement that the callback *must* call reset(), and only calls it internally if there was no callback set.
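The contract described in this patch can be sketched roughly as follows. This is an illustrative model under stated assumptions and not the real GjsMaybeOwned implementation: `MaybeOwned`, `Closure`, `closure_invalidated`, `on_context_destroy()` and the live-object counter are hypothetical names. What it tries to show is that when a callback is set, the holder calls it and then touches none of its own members, because the callback may already have freed the struct that embeds the holder. Only when no callback is set does the holder reset itself.

```cpp
#include <cassert>

// Hypothetical stand-in for a rooted-value holder.
class MaybeOwned {
public:
    using DestroyNotify = void (*)(void* data);

    void root(int value, DestroyNotify notify, void* data) {
        m_value = value;
        m_rooted = true;
        m_notify = notify;
        m_data = data;
    }

    void reset() {
        m_rooted = false;
        m_notify = nullptr;
        m_data = nullptr;
    }

    bool rooted() const { return m_rooted; }
    int value() const { return m_value; }

    // Called when the owning context goes away.
    void on_context_destroy() {
        DestroyNotify notify = m_notify;
        void* data = m_data;
        if (notify) {
            // Contract: the callback must call reset() itself. It may also
            // free the memory that contains *this, so no member is read or
            // written after this call returns.
            notify(data);
            return;
        }
        reset();  // no callback set: reset internally
    }

private:
    int m_value = 0;
    bool m_rooted = false;
    DestroyNotify m_notify = nullptr;
    void* m_data = nullptr;
};

static int g_live_closures = 0;  // counts Closure objects for the demo

// The holder is embedded in the closure, so their lifetimes are tied.
struct Closure {
    MaybeOwned func;
    Closure() { ++g_live_closures; }
    ~Closure() { --g_live_closures; }
};

// Callback: resets the holder first, then frees the whole closure,
// which frees the embedded holder along with it.
void closure_invalidated(void* data) {
    auto* closure = static_cast<Closure*>(data);
    closure->func.reset();
    delete closure;
}

int live_closures_after_destroy() {
    auto* closure = new Closure;
    closure->func.root(42, closure_invalidated, closure);
    closure->func.on_context_destroy();  // closure is deleted in the callback
    return g_live_closures;
}

bool rooted_after_destroy_without_callback() {
    MaybeOwned holder;
    holder.root(7, nullptr, nullptr);
    holder.on_context_destroy();
    return holder.rooted();
}
```

In this model, calling `reset()` unconditionally after `notify(data)` returns would write into the freed `Closure`, which corresponds to the use-after-free the patch description mentions. Requiring the callback to reset avoids the boolean return value of the earlier patch while keeping the two lifetimes coupled.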
The above patch is a possibly better replacement for "util-root: Allow GjsMaybeOwned::DestroyNotify to free".
*** Bug 783813 has been marked as a duplicate of this bug. ***
Review of attachment 353799 [details] [review]: Thanks, I think I prefer this approach.
(In reply to Cosimo Cecchi from comment #87) > Review of attachment 353799 [details] [review] [review]: > > Thanks, I think I prefer this approach. I can confirm that it works. I rebuilt my gjs package, installed it, and have been using it for a while now. Since I found no reliable way to reproduce the behavior described in comment #82, I can neither confirm nor deny that it is fixed.
I can catch this one in the existing tests with -fsanitize=address though, so I think we can consider that it solves this particular problem. Thanks everyone for the help! I will be releasing a new version of GJS with these fixes from the stable branch shortly, so that distros can update. Attachment 353799 [details] pushed as 53e0c86 - util-root: Require GjsMaybeOwned callback to reset
This is now released in GJS 1.48.4.
Still getting crashes after updating gjs to 1.48.4 unfortunately

Arch Linux
gjs 1.48.4-1
gnome-shell 3.24.2-1
wayland 1.13.0-1
js38 38.8.0-3

PID: 4396 (gnome-shell)
Signal: 11 (SEGV)
Command Line: /usr/bin/gnome-shell
Executable: /usr/bin/gnome-shell
Control Group: /user.slice/user-1000.slice/session-c4.scope
Unit: session-c4.scope
Slice: user-1000.slice
Session: c4
Message: Process 4396 (gnome-shell) of user 1000 dumped core.

Stack trace of thread 4396:
    #0 0x00007fcbab260735 n/a (libgjs.so.0)
    #1 0x00007fcba937a8b5 g_main_context_dispatch (libglib-2.0.so.0)
    #2 0x00007fcba937ac78 n/a (libglib-2.0.so.0)
    #3 0x00007fcba937af92 g_main_loop_run (libglib-2.0.so.0)
    #4 0x00007fcbaab3dfdc meta_run (libmutter-0.so.0)
    #5 0x0000000000401ff7 main (gnome-shell)
    #6 0x00007fcba8d8d43a __libc_start_main (libc.so.6)
    #7 0x000000000040212a n/a (gnome-shell)

Stack trace of thread 4435:
    #0 0x00007fcba911f39d pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
    #1 0x00007fcb9bb90d10 PR_WaitCondVar (libnspr4.so)
    #2 0x00007fcba417b811 n/a (libmozjs-38.so)
    #3 0x00007fcb9bb9688b n/a (libnspr4.so)
    #4 0x00007fcba9119297 start_thread (libpthread.so.0)
    #5 0x00007fcba8e5a25f __clone (libc.so.6)

Stack trace of thread 4436:
    #0 0x00007fcba911f39d pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
    #1 0x00007fcb9bb90d10 PR_WaitCondVar (libnspr4.so)
    #2 0x00007fcba417b811 n/a (libmozjs-38.so)
    #3 0x00007fcb9bb9688b n/a (libnspr4.so)
    #4 0x00007fcba9119297 start_thread (libpthread.so.0)
    #5 0x00007fcba8e5a25f __clone (libc.so.6)

Stack trace of thread 4434:
    #0 0x00007fcba911f39d pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
    #1 0x00007fcb9bb90d10 PR_WaitCondVar (libnspr4.so)
    #2 0x00007fcba417b811 n/a (libmozjs-38.so)
    #3 0x00007fcb9bb9688b n/a (libnspr4.so)
    #4 0x00007fcba9119297 start_thread (libpthread.so.0)
    #5 0x00007fcba8e5a25f __clone (libc.so.6)

Stack trace of thread 4397:
    #0 0x00007fcba8e502bd poll (libc.so.6)
    #1 0x00007fcba937abf9 n/a (libglib-2.0.so.0)
    #2 0x00007fcba937ad0c g_main_context_iteration (libglib-2.0.so.0)
    #3 0x00007fcba937ad51 n/a (libglib-2.0.so.0)
    #4 0x00007fcba93a1ac5 n/a (libglib-2.0.so.0)
    #5 0x00007fcba9119297 start_thread (libpthread.so.0)
    #6 0x00007fcba8e5a25f __clone (libc.so.6)

Stack trace of thread 4430:
    #0 0x00007fcba8e502bd poll (libc.so.6)
    #1 0x00007fcba519eee1 n/a (libpulse.so.0)
    #2 0x00007fcba51906f1 pa_mainloop_poll (libpulse.so.0)
    #3 0x00007fcba5190d8e pa_mainloop_iterate (libpulse.so.0)
    #4 0x00007fcba5190e40 pa_mainloop_run (libpulse.so.0)
    #5 0x00007fcba519ee29 n/a (libpulse.so.0)
    #6 0x00007fcb9a447fe8 n/a (libpulsecommon-10.0.so)
    #7 0x00007fcba9119297 start_thread (libpthread.so.0)
    #8 0x00007fcba8e5a25f __clone (libc.so.6)

Stack trace of thread 4438:
    #0 0x00007fcba911f39d pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
    #1 0x00007fcb9bb90d10 PR_WaitCondVar (libnspr4.so)
    #2 0x00007fcba417b811 n/a (libmozjs-38.so)
    #3 0x00007fcb9bb9688b n/a (libnspr4.so)
    #4 0x00007fcba9119297 start_thread (libpthread.so.0)
    #5 0x00007fcba8e5a25f __clone (libc.so.6)

Stack trace of thread 8496:
    #0 0x00007fcba8e553b9 syscall (libc.so.6)
    #1 0x00007fcba93bfc7a g_cond_wait_until (libglib-2.0.so.0)
    #2 0x00007fcba934f121 n/a (libglib-2.0.so.0)
    #3 0x00007fcba93a2464 n/a (libglib-2.0.so.0)
    #4 0x00007fcba93a1ac5 n/a (libglib-2.0.so.0)
    #5 0x00007fcba9119297 start_thread (libpthread.so.0)
    #6 0x00007fcba8e5a25f __clone (libc.so.6)

Stack trace of thread 4400:
    #0 0x00007fcba8e502bd poll (libc.so.6)
    #1 0x00007fcba937abf9 n/a (libglib-2.0.so.0)
    #2 0x00007fcba937ad0c g_main_context_iteration (libglib-2.0.so.0)
    #3 0x00007fcb8cbfc55d n/a (libdconfsettings.so)
    #4 0x00007fcba93a1ac5 n/a (libglib-2.0.so.0)
    #5 0x00007fcba9119297 start_thread (libpthread.so.0)
    #6 0x00007fcba8e5a25f __clone (libc.so.6)

Stack trace of thread 8509:
    #0 0x00007fcba8e553b9 syscall (libc.so.6)
    #1 0x00007fcba93bfc7a g_cond_wait_until (libglib-2.0.so.0)
    #2 0x00007fcba934f121 n/a (libglib-2.0.so.0)
    #3 0x00007fcba93a2464 n/a (libglib-2.0.so.0)
    #4 0x00007fcba93a1ac5 n/a (libglib-2.0.so.0)
    #5 0x00007fcba9119297 start_thread (libpthread.so.0)
    #6 0x00007fcba8e5a25f __clone (libc.so.6)

Stack trace of thread 4437:
    #0 0x00007fcba911f39d pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
    #1 0x00007fcb9bb90d10 PR_WaitCondVar (libnspr4.so)
    #2 0x00007fcba417b811 n/a (libmozjs-38.so)
    #3 0x00007fcb9bb9688b n/a (libnspr4.so)
    #4 0x00007fcba9119297 start_thread (libpthread.so.0)
    #5 0x00007fcba8e5a25f __clone (libc.so.6)

Stack trace of thread 4439:
    #0 0x00007fcba911f39d pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
    #1 0x00007fcb9bb90d10 PR_WaitCondVar (libnspr4.so)
    #2 0x00007fcba417b811 n/a (libmozjs-38.so)
    #3 0x00007fcb9bb9688b n/a (libnspr4.so)
    #4 0x00007fcba9119297 start_thread (libpthread.so.0)
    #5 0x00007fcba8e5a25f __clone (libc.so.6)

Stack trace of thread 8507:
    #0 0x00007fcba8e553b9 syscall (libc.so.6)
    #1 0x00007fcba93bfc7a g_cond_wait_until (libglib-2.0.so.0)
    #2 0x00007fcba934f121 n/a (libglib-2.0.so.0)
    #3 0x00007fcba93a2464 n/a (libglib-2.0.so.0)
    #4 0x00007fcba93a1ac5 n/a (libglib-2.0.so.0)
    #5 0x00007fcba9119297 start_thread (libpthread.so.0)
    #6 0x00007fcba8e5a25f __clone (libc.so.6)

Stack trace of thread 8508:
    #0 0x00007fcba8e553b9 syscall (libc.so.6)
    #1 0x00007fcba93bfc7a g_cond_wait_until (libglib-2.0.so.0)
    #2 0x00007fcba934f121 n/a (libglib-2.0.so.0)
    #3 0x00007fcba93a2464 n/a (libglib-2.0.so.0)
    #4 0x00007fcba93a1ac5 n/a (libglib-2.0.so.0)
    #5 0x00007fcba9119297 start_thread (libpthread.so.0)
    #6 0x00007fcba8e5a25f __clone (libc.so.6)

Stack trace of thread 4441:
    #0 0x00007fcba911f39d pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
    #1 0x00007fcb9bb90d10 PR_WaitCondVar (libnspr4.so)
    #2 0x00007fcba417b811 n/a (libmozjs-38.so)
    #3 0x00007fcb9bb9688b n/a (libnspr4.so)
    #4 0x00007fcba9119297 start_thread (libpthread.so.0)
    #5 0x00007fcba8e5a25f __clone (libc.so.6)

Stack trace of thread 4440:
    #0 0x00007fcba911f39d pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
    #1 0x00007fcb9bb90d10 PR_WaitCondVar (libnspr4.so)
    #2 0x00007fcba417b811 n/a (libmozjs-38.so)
    #3 0x00007fcb9bb9688b n/a (libnspr4.so)
    #4 0x00007fcba9119297 start_thread (libpthread.so.0)
    #5 0x00007fcba8e5a25f __clone (libc.so.6)

Stack trace of thread 4398:
    #0 0x00007fcba8e502bd poll (libc.so.6)
    #1 0x00007fcba937abf9 n/a (libglib-2.0.so.0)
    #2 0x00007fcba937af92 g_main_loop_run (libglib-2.0.so.0)
    #3 0x00007fcba9962426 n/a (libgio-2.0.so.0)
    #4 0x00007fcba93a1ac5 n/a (libglib-2.0.so.0)
    #5 0x00007fcba9119297 start_thread (libpthread.so.0)
    #6 0x00007fcba8e5a25f __clone (libc.so.6)
(In reply to fedor from comment #91)
> Still getting crashes after updating gjs to 1.48.4 unfortunately
>
> Arch Linux

Your backtrace is missing debug symbols. Can you please install them and retry? Are there any steps to reproduce this crash?

Still, I think you are running into a different bug, because the one in comment #3 has a deep stack inside libmozjs, and yours does not.
I'm on Arch with the same latest versions as fedor listed above and I got an assertion crash yesterday:

Jun 16 19:00:27 pc org.gnome.Shell.desktop[553]: Gjs:ERROR:./gjs/jsapi-util-root.h:317:void GjsMaybeOwned<T>::trace(JSTracer*, const char*) [with T = JS::Value]: assertion failed: (!m_rooted)
Jun 16 19:00:27 pc systemd[1]: Started Process Core Dump (PID 2741/UID 0).

Stack trace of thread 553:
    #0 0x00007fc60d49a670 raise (libc.so.6)
    #1 0x00007fc60d49bd00 abort (libc.so.6)
    #2 0x00007fc60da9ac9d g_assertion_message (libglib-2.0.so.0)
    #3 0x00007fc60da9ad2a g_assertion_message_expr (libglib-2.0.so.0)
    #4 0x00007fc60f965c54 n/a (libgjs.so.0)
    #5 0x00007fc608851f6d n/a (libmozjs-38.so)
    #6 0x00007fc60882e5cd n/a (libmozjs-38.so)
    #7 0x00007fc608b85e90 n/a (libmozjs-38.so)
    #8 0x00007fc608bb1ede n/a (libmozjs-38.so)
    #9 0x00007fc608bb28c0 n/a (libmozjs-38.so)
    #10 0x00007fc608bb2b0d n/a (libmozjs-38.so)
    #11 0x00007fc608bb2ed4 n/a (libmozjs-38.so)
    #12 0x00007fc60f97ef49 gjs_schedule_gc_if_needed (libgjs.so.0)
    #13 0x00007fc60f97efb4 gjs_call_function_value (libgjs.so.0)
    #14 0x00007fc60f95a0b5 gjs_closure_invoke (libgjs.so.0)
    #15 0x00007fc60f971e1c n/a (libgjs.so.0)
    #16 0x00007fc60dd4cead g_closure_invoke (libgobject-2.0.so.0)
    #17 0x00007fc60dd68f1c n/a (libgobject-2.0.so.0)
    #18 0x00007fc60da75333 n/a (libglib-2.0.so.0)
    #19 0x00007fc60da748b5 g_main_context_dispatch (libglib-2.0.so.0)
    #20 0x00007fc60da74c78 n/a (libglib-2.0.so.0)
    #21 0x00007fc60da74f92 g_main_loop_run (libglib-2.0.so.0)
    #22 0x00007fc60f237fdc meta_run (libmutter-0.so.0)
    #23 0x0000000000401ff7 main (gnome-shell)
    #24 0x00007fc60d48743a __libc_start_main (libc.so.6)
    #25 0x000000000040212a n/a (gnome-shell)
(In reply to Christian Stadelmann from comment #92)
> (In reply to fedor from comment #91)
> > Still getting crashes after updating gjs to 1.48.4 unfortunately
> >
> > Arch Linux
>
> Your backtrace is missing debug symbols. Can you please install them and
> retry? Are there any steps to reproduce this crash?
>
> Still, I think you are running into a different bug, because the one in
> comment #3 has a deep stack inside libmozjs, and yours does not.

So I built gjs 1.48.4 with debug flags and got this:

Message: Process 852 (gnome-shell) of user 1000 dumped core.

Stack trace of thread 852:
    #0 0x00007f4f051f5735 _ZN2js9GCMethodsIP8JSObjectE16needsPostBarrierES2_ (libgjs.so.0)
    #1 0x00007f4f0330f8b5 g_main_context_dispatch (libglib-2.0.so.0)
    #2 0x00007f4f0330fc78 n/a (libglib-2.0.so.0)
    #3 0x00007f4f0330ff92 g_main_loop_run (libglib-2.0.so.0)
    #4 0x00007f4f04ad2fdc meta_run (libmutter-0.so.0)
    #5 0x0000000000401ff7 main (gnome-shell)
    #6 0x00007f4f02d2243a __libc_start_main (libc.so.6)
    #7 0x000000000040212a n/a (gnome-shell)

I got this crash approximately 5-10 minutes after the start of the GNOME session, while surfing the web in google-chrome.

Is there anything else needed to be built with debug flags on?
>Is there anything else needed to be built with debug flags on? https://wiki.archlinux.org/index.php/Debug_-_Getting_Traces
Fedor: I have opened bug 783935 for this, please follow up there.

Mark Blakeney: Would it be possible to get a stack trace with debug symbols and also execute `call gjs_dumpstack()` in GDB? If so, please open a new bug. Reproducer instructions would also be very helpful.

All readers:
============

The problems originally described by the stack traces on this bug report have supposedly been fixed now. Please do not post new stack traces on this bug unless you are *sure* that they describe the original problem, and that the fix in 1.48.4 was faulty.

Of course, there may be one or even several problems still in existence that cause gnome-shell to crash for you! Here's what you can do instead.

- Check if your stack trace matches one of these bugs. These are the gnome-shell crashes currently open (or opened but later closed due to lack of information):
  * bug 782464
  * bug 782692
  * bug 783771
  * bug 783935
  Post reproducer info there, stack traces with debug symbols, and output of `call gjs_dumpstack()` from GDB. If the bug was closed as INCOMPLETE but you can provide the missing information, fantastic! Feel free to reopen it.
- If none of the above bugs match your stack trace, and no-one else has reported a similar stack trace to yours in the meantime, then please open a new bug.

The reason I ask this is not to be bureaucratic or to deny that crashes are happening, but to keep the information manageable for myself as I fix these bugs. If all of the stack traces from unrelated problems are posted here, then I will lose track of which ones are fixed and which ones are not.

Thank you.
*** Bug 783904 has been marked as a duplicate of this bug. ***
gnome-shell still crashes on Wayland!! Version 3.24.2.
Created attachment 354522 [details] Still gnome-shell crash on Wayland!! Version 3.24.2
Jeyhunn: Thanks for the report, but I'm afraid it is not very helpful in that form. Can you please read comment 96 and check if your stack trace matches one of the crasher bugs listed there?
(In reply to Philip Chimento from comment #100)
> Jeyhunn: Thanks for the report, but I'm afraid it is not very helpful in
> that form. Can you please read comment 96 and check if your stack trace
> matches one of the crasher bugs listed there?

Hi Philip Chimento,

It seems like bug 783771, but in my case gnome-shell suddenly crashed after playing HTML5 video (YouTube) in Chrome.
Created attachment 354530 [details] Systemd CoreDump Info for Gnome-Shell crash
It is almost certainly bug 783771, please follow up there. If you can provide the missing information it would be helpful.