After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 791761 - Occasional crash on startup due to fatal error "toggling down object GSettings that's already queued to toggle up"
Occasional crash on startup due to fatal error "toggling down object GSetting...
Status: RESOLVED OBSOLETE
Product: glib
Classification: Platform
Component: gsettings
2.52.x
Other Linux
: Normal blocker
: ---
Assigned To: Allison Karlitskaya (desrt)
gtkdev
Depends on:
Blocks:
 
 
Reported: 2017-12-18 23:25 UTC by Adam Williamson
Modified: 2018-05-24 20:03 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
backtrace of the most recent occurrence of the crash (49.96 KB, text/plain)
2017-12-18 23:25 UTC, Adam Williamson
Details

Description Adam Williamson 2017-12-18 23:25:18 UTC
Created attachment 365726 [details]
backtrace of the most recent occurrence of the crash

I've been de-duping erroneously duplicated crashes and sorting them into groups downstream (in Fedora) lately. This bug is one case that's been clarified by that.

The downstream report I'm using for this crash is https://bugzilla.redhat.com/show_bug.cgi?id=1370073 . The crash is pretty simple: sometimes, GNOME crashes during startup. The backtrace indicates the crash was caused by a fatal error in gjs:

"toggling down object GSettings that's already queued to toggle up"

This can be found in gjs here:

https://gitlab.gnome.org/GNOME/gjs/blob/master/gi/object.cpp#L1099

The desktop automated tests we run via openQA - https://openqa.fedoraproject.org/ - run into this crash periodically when simply logging into a freshly-installed system. The most recent case of such a crash I have found is https://openqa.fedoraproject.org/tests/180655 , which occurred with Fedora 26 three days ago (2017-12-15), with the latest Fedora 26 gnome-shell and gjs packages: gnome-shell-3.24.3-2.fc26.x86_64 and gjs-1.48.7-1.fc26.x86_64 .

I am not currently 100% sure if this crash is still present in Fedora 27 and Rawhide; the most recent Shell crash I could find in a Fedora 27 test is a little over a month old and its logs have been garbage collected, and Rawhide is currently too broken for it to be practical to find a crash like this (there are just too many failures at the same point for other reasons). But after quite a lot of poking in gnome-shell and gjs (and glib) commit logs and bugs I could not find any upstream bug report or commit indicating this crash had been specifically encountered and fixed, so I figured it's at least worth reporting.

I'll attach a full backtrace from the most recent F26 occurrence.
Comment 1 Adam Williamson 2017-12-18 23:37:37 UTC
I can follow the backtrace about as far as thinking that this is *probably* happening in one of the g_settings_backend_(foo)_changed() functions in glib/glio/gsettingsbackend.c functions that handles signals relating to gsettings keys changing in some way, but that's about as far as I can get.
Comment 2 Adam Williamson 2017-12-18 23:47:17 UTC
It's not much use, but just for the record, this is what's in the user journal around the time of the crash:

Dec 15 06:39:13 localhost.localdomain gnome-session-binary[1313]: Entering running state
Dec 15 06:39:13 localhost.localdomain dbus-daemon[1310]: [session uid=1000 pid=1310] Successfully activated service 'org.freedesktop.Tracker1'
Dec 15 06:39:13 localhost.localdomain systemd[1290]: Started Tracker metadata database store and lookup manager.
Dec 15 06:39:13 localhost.localdomain tracker-store.desktop[1721]: (uint32 1,)
Dec 15 06:39:13 localhost.localdomain dbus-daemon[1310]: [session uid=1000 pid=1310] Successfully activated service 'org.gnome.evolution.dataserver.AddressBook9'
Dec 15 06:39:13 localhost.localdomain systemd[1290]: Started Evolution address book service.
Dec 15 06:39:14 localhost.localdomain dbus-daemon[1310]: [session uid=1000 pid=1310] Activating via systemd: service name='org.gtk.vfs.Metadata' unit='gvfs-metadata.service' requested by ':1.64' (uid=1000 pid=1716 comm="/usr/libexec/evolution-calendar-factory-subprocess" label="unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023")
Dec 15 06:39:14 localhost.localdomain systemd[1290]: Starting Virtual filesystem metadata service...
Dec 15 06:39:14 localhost.localdomain dbus-daemon[1310]: [session uid=1000 pid=1310] Successfully activated service 'org.gtk.vfs.Metadata'
Dec 15 06:39:14 localhost.localdomain systemd[1290]: Started Virtual filesystem metadata service.
Dec 15 06:39:14 localhost.localdomain gnome-shell[1398]: Error looking up permission: GDBus.Error:org.freedesktop.portal.Error.NotFound: No entry for geolocation
Dec 15 06:39:14 localhost.localdomain gsd-color[1584]: failed to get edid: unable to get EDID for output
Dec 15 06:39:14 localhost.localdomain gsd-color[1584]: unable to get EDID for xrandr-Virtual-1: unable to get EDID for output
Dec 15 06:39:14 localhost.localdomain gnome-shell[1398]: STACK_OP_ADD: window 0x2200001 already in stack
Dec 15 06:39:14 localhost.localdomain gnome-shell[1398]: STACK_OP_ADD: window 0x2200001 already in stack
Dec 15 06:39:15 localhost.localdomain libcanberra-login-sound.desktop[1720]: Failed to play sound: File or data not found
Dec 15 06:39:15 localhost.localdomain gnome-shell[1398]: GNOME Shell started at Fri Dec 15 2017 09:39:11 GMT-0500 (EST)
Dec 15 06:39:16 localhost.localdomain gnome-shell[1398]: toggling down object GSettings that's already queued to toggle up
Dec 15 06:39:18 localhost.localdomain abrt-applet[1738]: Error reading events from display: Broken pipe
Dec 15 06:39:18 localhost.localdomain evolution-alarm[1732]: Error reading events from display: Broken pipe
Dec 15 06:39:18 localhost.localdomain gnome-software[1729]: Error reading events from display: Connection reset by peer
Dec 15 06:39:18 localhost.localdomain org.gnome.Shell.desktop[1398]: (EE)
Dec 15 06:39:18 localhost.localdomain org.gnome.Shell.desktop[1398]: Fatal server error:
Dec 15 06:39:18 localhost.localdomain org.gnome.Shell.desktop[1398]: (EE) failed to read Wayland events: Broken pipe
Dec 15 06:39:18 localhost.localdomain org.gnome.Shell.desktop[1398]: (EE)

And the system journal:

Dec 15 06:39:13 localhost.localdomain dbus-daemon[610]: [system] Activating via systemd: service name='org.freedesktop.locale1' unit='dbus-org.freedesktop.locale1.service' requested by ':1.74' (uid=1000 pid=1594 comm="/usr/libexec/gsd-keyboard " label="unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023")
Dec 15 06:39:13 localhost.localdomain systemd[1]: Starting Locale Service...
Dec 15 06:39:13 localhost.localdomain dbus-daemon[610]: [system] Successfully activated service 'org.freedesktop.locale1'
Dec 15 06:39:13 localhost.localdomain systemd[1]: Started Locale Service.
Dec 15 06:39:13 localhost.localdomain audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-localed comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Dec 15 06:39:14 localhost.localdomain gsd-color[1139]: unable to get EDID for xrandr-Virtual-1: unable to get EDID for output
Dec 15 06:39:15 localhost.localdomain dbus-daemon[610]: [system] Activating service name='org.freedesktop.problems' requested by ':1.78' (uid=1000 pid=1738 comm="abrt-applet " label="unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023") (using servicehelper)
Dec 15 06:39:15 localhost.localdomain dbus-daemon[610]: [system] Successfully activated service 'org.freedesktop.problems'
Dec 15 06:39:16 localhost.localdomain audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=geoclue comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Dec 15 06:39:16 localhost.localdomain kernel: do_trap: 68 callbacks suppressed
Dec 15 06:39:16 localhost.localdomain kernel: traps: gnome-shell[1398] trap int3 ip:7fd27ddd4e51 sp:7fffd776ed30 error:0 in libglib-2.0.so.0.5200.3[7fd27dd85000+110000]
Dec 15 06:39:16 localhost.localdomain audit[1398]: ANOM_ABEND auid=1000 uid=1000 gid=1000 ses=2 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 pid=1398 comm="gnome-shell" exe="/usr/bin/gnome-shell" sig=5 res=1
Dec 15 06:39:16 localhost.localdomain dbus-daemon[610]: [system] Activating via systemd: service name='org.freedesktop.fwupd' unit='fwupd.service' requested by ':1.79' (uid=1000 pid=1729 comm="/usr/bin/gnome-software --gapplication-service " label="unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023")
Dec 15 06:39:16 localhost.localdomain systemd[1]: Starting Firmware update daemon...
Dec 15 06:39:16 localhost.localdomain audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-coredump@0-1932-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Dec 15 06:39:16 localhost.localdomain systemd[1]: Created slice system-systemd\x2dcoredump.slice.
Dec 15 06:39:16 localhost.localdomain systemd[1]: Started Process Core Dump (PID 1932/UID 0).
Dec 15 06:39:17 localhost.localdomain fwupd[1933]: disabling plugin because: failed to startup amt: Unable to find a ME interface
Dec 15 06:39:17 localhost.localdomain fwupd[1933]: disabling plugin because: failed to startup dell: Firmware updating not supported
Dec 15 06:39:17 localhost.localdomain fwupd[1933]: disabling plugin because: failed to startup test: Test plugin is only used for continuous integration
Dec 15 06:39:17 localhost.localdomain fwupd[1933]: disabling plugin because: failed to coldplug raspberrypi: Raspberry PI firmware updating not supported, no /boot/start.elf
Dec 15 06:39:17 localhost.localdomain fwupd[1933]: disabling plugin because: failed to coldplug synapticsmst: MST firmware updating not supported by OEM
Dec 15 06:39:17 localhost.localdomain fwupd[1933]: disabling plugin because: failed to coldplug uefi: UEFI firmware updating not supported
Dec 15 06:39:17 localhost.localdomain fwupd[1933]: using plugins: altos, colorhug, dfu, ebitdo, steelseries, udev, unifying, upower, usb
Dec 15 06:39:17 localhost.localdomain fwupd[1933]: Daemon ready for requests
Dec 15 06:39:17 localhost.localdomain dbus-daemon[610]: [system] Successfully activated service 'org.freedesktop.fwupd'
Dec 15 06:39:17 localhost.localdomain audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=fwupd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Dec 15 06:39:17 localhost.localdomain systemd[1]: Started Firmware update daemon.
Dec 15 06:39:18 localhost.localdomain polkitd[651]: Unregistered Authentication Agent for unix-session:2 (system bus name :1.48, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_US.UTF-8) (disconnected from bus)
Dec 15 06:39:19 localhost.localdomain gdm-password][1278]: pam_unix(gdm-password:session): session closed for user test
Dec 15 06:39:19 localhost.localdomain audit[1278]: USER_END pid=1278 uid=0 auid=1000 ses=2 subj=system_u:system_r:xdm_t:s0-s0:c0.c1023 msg='op=PAM:session_close grantors=pam_selinux,pam_loginuid,pam_selinux,pam_keyinit,pam_namespace,pam_keyinit,pam_limits,pam_systemd,pam_unix,pam_gnome_keyring acct="test" exe="/usr/libexec/gdm-session-worker" hostname=localhost.localdomain addr=? terminal=/dev/tty2 res=success'
Dec 15 06:39:19 localhost.localdomain audit[1278]: USER_LOGOUT pid=1278 uid=0 auid=1000 ses=2 subj=system_u:system_r:xdm_t:s0-s0:c0.c1023 msg='uid=1000 exe="/usr/libexec/gdm-session-worker" hostname=? addr=? terminal=? res=success'
Dec 15 06:39:19 localhost.localdomain audit[1278]: CRED_DISP pid=1278 uid=0 auid=1000 ses=2 subj=system_u:system_r:xdm_t:s0-s0:c0.c1023 msg='op=PAM:setcred grantors=pam_unix,pam_gnome_keyring acct="test" exe="/usr/libexec/gdm-session-worker" hostname=localhost.localdomain addr=? terminal=/dev/tty2 res=success'
Dec 15 06:39:19 localhost.localdomain gnome-shell[841]: Failed to apply DRM plane transform 0: Invalid argument
Dec 15 06:39:19 localhost.localdomain gsd-color[1139]: unable to get EDID for xrandr-Virtual-1: unable to get EDID for output
Dec 15 06:39:19 localhost.localdomain systemd-logind[646]: Removed session 2.
Dec 15 06:39:19 localhost.localdomain systemd[1]: Stopping User Manager for UID 1000...
Dec 15 06:39:19 localhost.localdomain systemd[1]: Stopped User Manager for UID 1000.
Dec 15 06:39:19 localhost.localdomain audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=user@1000 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Dec 15 06:39:19 localhost.localdomain systemd[1]: Removed slice User Slice of test.
Dec 15 06:39:19 localhost.localdomain gsd-color[1139]: unable to get EDID for xrandr-Virtual-1: unable to get EDID for output
Dec 15 06:39:20 localhost.localdomain audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-coredump@0-1932-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Dec 15 06:39:22 localhost.localdomain kernel: random: crng init done
Dec 15 06:39:22 localhost.localdomain abrt-notification[2061]: Process 1398 (gnome-shell) crashed in _g_log_abort()
Comment 3 Philip Withnall 2017-12-19 09:17:04 UTC
Philip Chimento, do you have any ideas about this? It looks like some kind of missing synchronisation between GSettings and its binding in GJS to me, but I don’t really know anything about this toggle up/down stuff.

Adam, thanks for reporting.
Comment 4 Florian Müllner 2017-12-19 18:43:31 UTC
Bug 702280 suggests the code in question was refactored, but I'm not entirely sure about the version - I *think* code prior to January 17 should be in 3.24/f26 ...
Comment 5 Adam Williamson 2017-12-19 18:50:26 UTC
Well, I did come across that crash, but note it was actually slightly different: the error in that trace is "toggling down object GDBusConnection that's already queued to toggle up" (GDBusConnection, not GSettings).

There's an obvious potentially-related commit that's in 3.26 / master but not 3.24: "object: Queue object toggles and drain them in idle" (which was committed as 9f0e50f5d01314ca3b9182724a27d5278943fcf6 , reverted as 96299394adf714058c947499fb7fe4a6025aa706 , then committed again as 0cf7f839a917ff3346d3266d491eed67230f6128 . But the bug report for that commit - https://bugzilla.gnome.org/show_bug.cgi?id=778862 - mentions lots of things but *not* any crashes precisely like this one, and with my limited knowledge of this code and C++ I couldn't satisfy myself that it clearly would address the cause of this crash. Maybe you can see that it does, though, in which case the obvious idea would be to apply it to 3.24 / F26 too...
Comment 6 Adam Williamson 2017-12-19 18:54:50 UTC
On the timing, 1.48.0 was tagged on 2017-03-20, and that tag is present on master branch. Further 1.48 release tags are only on the gnome-3-24 branch. So I'm figuring everything on master branch up to 2017-03-20, presumably including whatever refactoring was referred to in 702280, is in the gnome-3-24 branch.
Comment 7 Florian Müllner 2017-12-19 19:41:13 UTC
(In reply to Adam Williamson from comment #5)
> Well, I did come across that crash, but note it was actually slightly
> different: the error in that trace is "toggling down object GDBusConnection
> that's already queued to toggle up" (GDBusConnection, not GSettings).

Right, but the warning in gjs is related to ownership handling of GObjects, which is the super class of both GDBusConnection and GSettings.

But now that you mention it, this may be a gnome-shell issue. I'm fairly certain that connecting a signal handler used to keep a wrapper object alive (that is, prevent it from getting garbage collected) - is this correct and still the case Philip (Chimento, not Withnall)?

Otherwise there's code in js/ui/status/accessibility.js that relies on this and will need fixing ...
Comment 8 Philip Chimento 2017-12-23 16:44:48 UTC
(In reply to Florian Müllner from comment #4)
> Bug 702280 suggests the code in question was refactored, but I'm not
> entirely sure about the version - I *think* code prior to January 17 should
> be in 3.24/f26 ...

I'm not so sure that bug was related, but the gc_blocked thing was refactored out even before I became the GJS maintainer.

(In reply to Florian Müllner from comment #7)
> (In reply to Adam Williamson from comment #5)
> > Well, I did come across that crash, but note it was actually slightly
> > different: the error in that trace is "toggling down object GDBusConnection
> > that's already queued to toggle up" (GDBusConnection, not GSettings).
> 
> Right, but the warning in gjs is related to ownership handling of GObjects,
> which is the super class of both GDBusConnection and GSettings.
> 
> But now that you mention it, this may be a gnome-shell issue. I'm fairly
> certain that connecting a signal handler used to keep a wrapper object alive
> (that is, prevent it from getting garbage collected) - is this correct and
> still the case Philip (Chimento, not Withnall)?

I don't believe that is the case (and may have only unintentionally worked before I refactored that code, since the garbage collector has gotten a lot better in recent Firefox releases.) If it were, that would be a huge source of memory leaks.
Comment 9 Ariel 2017-12-24 19:27:51 UTC
Since upgrading to fedora 27/shell 3.26, I'm seeing several daily gnomeshell crashes, it's so unstable. Sometimes while using nautilus, sometimes while coming out of suspend-to-RAM and trying to log back in.

I had to move away from wayland, back to Xorg so at least the system is usable and I don't lose my ongoing work when gnome-shell crashes in the middle of a session taking down the entire session.




systemd-coredump[5390]: Process 2039 (gnome-shell) of user 1000 dumped core.
                                                            
                                                            Stack trace of thread 2039:
                                                            #0  0x00007f560cb8be51 _g_log_abort (libglib-2.0.so.0)
                                                            #1  0x00007f560cb8ce8c g_log_default_handler (libglib-2.0.so.0)
                                                            #2  0x00005571bb877945 default_log_handler (gnome-shell)
                                                            #3  0x00007f560cb8d11d g_logv (libglib-2.0.so.0)
                                                            #4  0x00007f560cb8d28f g_log (libglib-2.0.so.0)
                                                            #5  0x00007f561434e09e n/a (libgjs.so.0)
                                                            #6  0x00007f5609a65563 _ZL14FinalizeArenasPN2js6FreeOpEPPNS_2gc11ArenaHe
                                                            #7  0x00007f5609ac02b3 _ZN2js2gc10ArenaLists16forceFinalizeNowEPNS_6Free
                                                            #8  0x00007f5609a66831 _ZN2js2gc10ArenaLists30queueForegroundObjectsForS
                                                            #9  0x00007f5609a7d1e9 _ZN2js2gc9GCRuntime22beginSweepingZoneGroupEv (li
                                                            #10 0x00007f5609a7d9c8 _ZN2js2gc9GCRuntime15beginSweepPhaseEb (libmozjs-
                                                            #11 0x00007f5609a7fbe3 _ZN2js2gc9GCRuntime23incrementalCollectSliceERNS_
                                                            #12 0x00007f5609a805d2 _ZN2js2gc9GCRuntime7gcCycleEbRNS_11SliceBudgetEN2
                                                            #13 0x00007f5609a8082d _ZN2js2gc9GCRuntime7collectEbNS_11SliceBudgetEN2J
                                                            #14 0x00007f5609a80ba4 _ZN2js2gc9GCRuntime2gcE18JSGCInvocationKindN2JS8g
                                                            #15 0x00007f561435a1fc n/a (libgjs.so.0)
                                                            #16 0x00007f560ce62e88 g_object_unref (libgobject-2.0.so.0)
                                                            #17 0x00005571bb8774cb main (gnome-shell)
                                                            #18 0x00007f560afc04da __libc_start_main (libc.so.6)
                                                            #19 0x00005571bb8775ba _start (gnome-shell)
                                                            
                                                            Stack trace of thread 2043:
                                                            #0  0x00007f560b37e81b pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
                                                            #1  0x00007f55e6a4e64b util_queue_thread_func (radeonsi_dri.so)
                                                            #2  0x00007f55e6a4e387 impl_thrd_routine (radeonsi_dri.so)
                                                            #3  0x00007f560b37836d start_thread (libpthread.so.0)
                                                            #4  0x00007f560b0b0b8f __clone (libc.so.6)
                                                            
                                                            Stack trace of thread 2046:
                                                            #0  0x00007f560b37e81b pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
                                                            #1  0x00007f55e6a4e64b util_queue_thread_func (radeonsi_dri.so)
                                                            #2  0x00007f55e6a4e387 impl_thrd_routine (radeonsi_dri.so)
                                                            #3  0x00007f560b37836d start_thread (libpthread.so.0)
                                                            #4  0x00007f560b0b0b8f __clone (libc.so.6)
                                                            
                                                            Stack trace of thread 2047:
                                                            #0  0x00007f560b37e81b pthread_cond_wait@@GLIBC_2.3.2 (libpthread.so.0)
                                                            #1  0x00007f55e6a4e64b util_queue_thread_func (radeonsi_dri.so)
                                                            #2  0x00007f55e6a4e387 impl_thrd_routine (radeonsi_dri.so)
                                                            #3  0x00007f560b37836d start_thread (libpthread.so.0)
                                                            #4  0x00007f560b0b0b8f __clone (libc.so.6)
                                                            
                                                            Stack trace of thread 2040:
                                                            #0  0x00007f560b0a4a9d poll (libc.so.6)
                                                            #1  0x00007f560cb86569 g_main_context_iterate.isra.25 (libglib-2.0.so.0)
                                                            #2  0x00007f560cb8667c g_main_context_iteration (libglib-2.0.so.0)
                                                            #3  0x00007f560cb866c1 glib_worker_main (libglib-2.0.so.0)
Comment 10 GNOME Infrastructure Team 2018-05-24 20:03:42 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/glib/issues/1317.