GNOME Bugzilla – Bug 780213
gdm enters a respawn loop once the main process dies (insufficient permissions)
Last modified: 2017-05-10 19:00:00 UTC
Version: 3.22.3
Bug-Debian: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=857995
Once the main gdm process dies, we enter a respawn loop, effectively DoSing the system.
Reproducing the problem is rather simple: kill the main gdm process via kill -9 and let systemd respawn it, or switch to the console as root and run "systemctl restart gdm".
I could reproduce the problem in a Fedora 25 VM and on an up-to-date Debian Stretch system with GNOME 3.22.
I get lots of
gdm-x-session[1540]: (EE) xf86OpenConsole: Switching VT failed
in the journal (full log attached).
I assume the interaction with logind is somehow broken.
https://bugzilla.redhat.com/show_bug.cgi?id=1335511 is most likely the identical bug report, but it was closed due to EOL.
If you need more information, please don't hesitate to ask.
Created attachment 348193 [details] journal log
Created attachment 348195 [details] journal log Fedora 25
The interesting bit is that it's gdm itself doing the respawning, i.e. it's not the main gdm process which dies and is restarted by systemd, but gdm repeatedly trying to start a new session.
Mai 02 13:29:52 pluto systemd[1]: Starting GNOME Display Manager...
Mai 02 13:29:52 pluto systemd[1]: Started GNOME Display Manager.
Mai 02 13:29:52 pluto gdm-launch-environment][755]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:00 pluto gdm-password][1509]: pam_unix(gdm-password:session): session opened for user michael by (uid=0)
Mai 02 13:30:48 pluto systemd[1]: Stopping GNOME Display Manager...
Mai 02 13:30:48 pluto gdm3[740]: GLib: g_hash_table_find: assertion 'version == hash_table->version' failed
Mai 02 13:30:48 pluto systemd[1]: Stopped GNOME Display Manager.
Mai 02 13:30:48 pluto systemd[1]: Starting GNOME Display Manager...
Mai 02 13:30:48 pluto systemd[1]: Started GNOME Display Manager.
Mai 02 13:30:48 pluto gdm-launch-environment][2286]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:48 pluto gdm3[2282]: GdmDisplay: display lasted 0.187789 seconds
Mai 02 13:30:48 pluto gdm3[2282]: Child process -2290 was already dead.
Mai 02 13:30:48 pluto gdm3[2282]: Child process 2286 was already dead.
Mai 02 13:30:48 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:48 pluto gdm-launch-environment][2306]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:48 pluto gdm3[2282]: Child process -2309 was already dead.
Mai 02 13:30:48 pluto gdm3[2282]: Child process 2306 was already dead.
Mai 02 13:30:48 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:48 pluto gdm-launch-environment][2314]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:48 pluto gdm3[2282]: GdmDisplay: display lasted 0.128089 seconds
Mai 02 13:30:48 pluto gdm3[2282]: Child process -2317 was already dead.
Mai 02 13:30:48 pluto gdm3[2282]: Child process 2314 was already dead.
Mai 02 13:30:48 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:48 pluto gdm-launch-environment][2333]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:48 pluto gdm3[2282]: Child process -2336 was already dead.
Mai 02 13:30:48 pluto gdm3[2282]: Child process 2333 was already dead.
Mai 02 13:30:48 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:48 pluto gdm-launch-environment][2340]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:49 pluto gdm3[2282]: GdmDisplay: display lasted 0.140515 seconds
Mai 02 13:30:49 pluto gdm3[2282]: Child process -2343 was already dead.
Mai 02 13:30:49 pluto gdm3[2282]: Child process 2340 was already dead.
Mai 02 13:30:49 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:49 pluto gdm-launch-environment][2359]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:49 pluto gdm3[2282]: Child process -2362 was already dead.
Mai 02 13:30:49 pluto gdm3[2282]: Child process 2359 was already dead.
Mai 02 13:30:49 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:49 pluto gdm-launch-environment][2366]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:49 pluto gdm3[2282]: GdmDisplay: display lasted 0.139748 seconds
Mai 02 13:30:49 pluto gdm3[2282]: Child process -2369 was already dead.
Mai 02 13:30:49 pluto gdm3[2282]: Child process 2366 was already dead.
Mai 02 13:30:49 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:49 pluto gdm-launch-environment][2385]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:49 pluto gdm3[2282]: Child process -2388 was already dead.
Mai 02 13:30:49 pluto gdm3[2282]: Child process 2385 was already dead.
Mai 02 13:30:49 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:49 pluto gdm-launch-environment][2392]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:49 pluto gdm3[2282]: GdmDisplay: display lasted 0.128068 seconds
Mai 02 13:30:49 pluto gdm3[2282]: Child process -2395 was already dead.
Mai 02 13:30:49 pluto gdm3[2282]: Child process 2392 was already dead.
Mai 02 13:30:49 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:49 pluto gdm-launch-environment][2411]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:49 pluto gdm3[2282]: Child process -2414 was already dead.
Mai 02 13:30:49 pluto gdm3[2282]: Child process 2411 was already dead.
Mai 02 13:30:49 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:49 pluto gdm-launch-environment][2418]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:49 pluto gdm3[2282]: GdmDisplay: display lasted 0.142297 seconds
Mai 02 13:30:49 pluto gdm3[2282]: Child process -2421 was already dead.
Mai 02 13:30:49 pluto gdm3[2282]: Child process 2418 was already dead.
Mai 02 13:30:49 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:49 pluto gdm-launch-environment][2437]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:49 pluto gdm3[2282]: Child process -2440 was already dead.
Mai 02 13:30:49 pluto gdm3[2282]: Child process 2437 was already dead.
Mai 02 13:30:49 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:50 pluto gdm-launch-environment][2444]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:50 pluto gdm3[2282]: GdmDisplay: display lasted 0.140668 seconds
Mai 02 13:30:50 pluto gdm3[2282]: Child process -2447 was already dead.
Mai 02 13:30:50 pluto gdm3[2282]: Child process 2444 was already dead.
Mai 02 13:30:50 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:50 pluto gdm-launch-environment][2463]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:50 pluto gdm3[2282]: Child process -2466 was already dead.
Mai 02 13:30:50 pluto gdm3[2282]: Child process 2463 was already dead.
Mai 02 13:30:50 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:50 pluto gdm-launch-environment][2470]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:50 pluto gdm3[2282]: GdmDisplay: display lasted 0.125414 seconds
Mai 02 13:30:50 pluto gdm3[2282]: Child process -2473 was already dead.
Mai 02 13:30:50 pluto gdm3[2282]: Child process 2470 was already dead.
Mai 02 13:30:50 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:50 pluto gdm-launch-environment][2489]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:50 pluto gdm3[2282]: Child process -2492 was already dead.
Mai 02 13:30:50 pluto gdm3[2282]: Child process 2489 was already dead.
Mai 02 13:30:50 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:50 pluto gdm-launch-environment][2496]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:50 pluto gdm3[2282]: GdmDisplay: display lasted 0.127995 seconds
Mai 02 13:30:50 pluto gdm3[2282]: Child process -2499 was already dead.
Mai 02 13:30:50 pluto gdm3[2282]: Child process 2496 was already dead.
Mai 02 13:30:50 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:50 pluto gdm-launch-environment][2515]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:50 pluto systemd[1]: Stopping GNOME Display Manager...
Mai 02 13:30:51 pluto gdm3[2282]: GdmLocalDisplayFactory: Failed to issue method call: GDBus.Error:org.freedesktop.DBus.Error.NoReply: Message recipient disconnected from message bus without replying
Mai 02 13:30:51 pluto gdm3[2282]: Child process -2518 was already dead.
Mai 02 13:30:51 pluto gdm3[2282]: Child process 2515 was already dead.
Mai 02 13:30:51 pluto systemd[1]: Stopped GNOME Display Manager.
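The loop in this log can be spotted mechanically: each failed greeter start logs a "GdmDisplay: display lasted N seconds" line with N well under one second. Below is a minimal shell sketch that counts such lines in a journal excerpt. The function name count_short_displays and the 1-second threshold are assumptions made for illustration; gdm itself defines neither.

```shell
#!/bin/sh
# Count "GdmDisplay: display lasted N seconds" lines on stdin where N is
# below a threshold (default: 1 second). The function name and the
# threshold are hypothetical, chosen for this sketch only.
count_short_displays() {
    threshold="${1:-1}"
    awk -v limit="$threshold" '
        /GdmDisplay: display lasted/ {
            # The duration is the field right before the word "seconds".
            for (i = 2; i <= NF; i++)
                if ($i == "seconds" && ($(i - 1) + 0) < (limit + 0)) n++
        }
        END { print n + 0 }
    '
}

# Three entries copied from the log in this comment.
sample='Mai 02 13:30:48 pluto gdm3[2282]: GdmDisplay: display lasted 0.187789 seconds
Mai 02 13:30:48 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:48 pluto gdm3[2282]: GdmDisplay: display lasted 0.128089 seconds'

short_count=$(printf '%s\n' "$sample" | count_short_displays)
echo "short-lived displays: $short_count"
```

On a live system the same function could presumably be fed from "journalctl -u gdm -b", assuming the unit is reachable under the name gdm.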
Some additional information, which might be helpful:
User 109 is the gdm system uid. After starting gdm, I get the systemd-cgls output shown at the end of this comment.
Stopping gdm.service does *not* stop session-c2.scope and all its processes. So, if I run
systemctl stop gdm.service
systemctl stop session-c2.scope
systemctl start gdm.service
then the gdm service is properly started.
│ └─user-109.slice
│   ├─user@109.service
│   │ ├─at-spi-dbus-bus.service
│   │ │ ├─3299 /usr/lib/at-spi2-core/at-spi-bus-launcher
│   │ │ ├─3304 /usr/bin/dbus-daemon --config-file=/usr/share/defaults/at-spi2/accessibility.conf --nofork --print-address 3
│   │ │ └─3306 /usr/lib/at-spi2-core/at-spi2-registryd --use-gnome-session
│   │ ├─dbus.service
│   │ │ ├─3276 /usr/bin/dbus-daemon --session --address=systemd: --nofork --nopidfile --systemd-activation
│   │ │ └─3317 /usr/lib/x86_64-linux-gnu/gconf/gconfd-2
│   │ ├─xdg-permission-store.service
│   │ │ └─3324 /usr/lib/flatpak/xdg-permission-store
│   │ └─init.scope
│   │   ├─3269 /lib/systemd/systemd --user
│   │   └─3270 (sd-pam)
│   └─session-c2.scope
│     ├─3265 gdm-session-worker [pam/gdm-launch-environment]
│     ├─3274 /usr/lib/gdm3/gdm-wayland-session gnome-session --autostart /usr/share/gdm/greeter/autostart
│     ├─3278 /usr/lib/gnome-session/gnome-session-binary --autostart /usr/share/gdm/greeter/autostart
│     ├─3286 /usr/bin/gnome-shell
│     ├─3292 /usr/bin/Xwayland :1024 -rootless -noreset -listen 4 -listen 5 -displayfd 6
│     ├─3312 /usr/bin/pulseaudio --start --log-target=syslog
│     ├─3315 /usr/lib/x86_64-linux-gnu/pulse/gconf-helper
│     └─3333 /usr/lib/gnome-settings-daemon/gnome-settings-daemon
├─init.scope
A workaround I found was to add
ExecStopPost=/bin/loginctl kill-user Debian-gdm
to the gdm service unit (Debian-gdm is the system user name that is used in Debian).
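One way to try this workaround without editing the packaged unit file would be a systemd drop-in. The following is only a sketch: it assumes the unit is named gdm.service, the drop-in file name kill-greeter.conf is made up, and DROPIN_DIR is a variable introduced here so that the snippet writes to a scratch directory by default rather than to /etc.

```shell
#!/bin/sh
# Write a drop-in that adds the ExecStopPost= workaround to gdm.service.
# DROPIN_DIR and the file name kill-greeter.conf are assumptions for this
# sketch. On a real system the directory would be
# /etc/systemd/system/gdm.service.d, and writing there requires root.
DROPIN_DIR="${DROPIN_DIR:-./gdm.service.d}"
GREETER_USER="${GREETER_USER:-Debian-gdm}"   # Debian's gdm system user

mkdir -p "$DROPIN_DIR"
cat > "$DROPIN_DIR/kill-greeter.conf" <<EOF
[Service]
ExecStopPost=/bin/loginctl kill-user $GREETER_USER
EOF

echo "wrote $DROPIN_DIR/kill-greeter.conf"
# After installing the file under /etc/systemd/system/gdm.service.d/,
# run "systemctl daemon-reload" so that systemd picks it up.
```

Other distributions use a different greeter account name, so GREETER_USER would need adjusting there.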
Created attachment 351358 [details] [review]
halfline patch
After discussing with halfline on IRC, he sent me the attached patch.
I just tried it now and it seems to work as expected (the gdm-session-worker process of the greeter is killed).
However, it seems that the pulseaudio is properly killed and the session stays open.
I'm also seeing gnome-shell crashes; this needs to be investigated as well, I guess.
(In reply to Laurent Bigonville from comment #6)
> Created attachment 351358 [details] [review]
> halfline patch
>
> After discussing with halfline on IRC, he sent me the attached patch.
>
> I just tried it now and it seems to work as expected (the gdm-session-worker
> process of the greeter is killed)
>
> However it seems that the pulseaudio is properly killed and the session
> stays open.
I assume you mean that the pulseaudio process is *not* properly killed, keeping the logind session in closing state? That means on every restart we pile up old session scopes which are never cleaned up.
We could try starting pulseaudio as a systemd user service for the gdm user session.
@bigon: could you try that, please?
(In reply to Michael Biebl from comment #7)
> We could try starting pulseaudio as a systemd user service for the gdm user
> session.
> @bigon: could you try that, please?
# ln -s /usr/lib/systemd/user/pulseaudio.socket /usr/lib/systemd/user/sockets.target.wants/pulseaudio.socket
(and then reboot)
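The symlink above lives under /usr/lib/systemd/user, so it applies to every user's systemd instance. A per-account variant for the greeter user alone might look like the sketch below, which creates the same kind of link in that account's own configuration directory. The helper name enable_user_socket is hypothetical, and the greeter's home directory on Debian is assumed to be /var/lib/gdm3; the demo runs against a scratch directory instead.

```shell
#!/bin/sh
# Enable a user unit for a single account by creating the kind of symlink
# that "systemctl --user enable" would place in that user's config dir.
# Arguments: home directory of the account, full path of the socket unit.
# The function name is hypothetical.
enable_user_socket() {
    home_dir="$1"
    unit_path="$2"
    wants_dir="$home_dir/.config/systemd/user/sockets.target.wants"
    mkdir -p "$wants_dir"
    ln -sf "$unit_path" "$wants_dir/$(basename "$unit_path")"
}

# Demo against a scratch directory so nothing on the system is touched.
# For the real greeter account the first argument would be its home
# directory (assumed to be /var/lib/gdm3 on Debian).
scratch_home=$(mktemp -d)
enable_user_socket "$scratch_home" /usr/lib/systemd/user/pulseaudio.socket
link="$scratch_home/.config/systemd/user/sockets.target.wants/pulseaudio.socket"
readlink "$link"
```

When done for a real account, the created directories should probably end up owned by that account.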
attachment 351358 [details] [review] pushed as commit 81f61eb
(I think the gnome-shell crashes alluded to in comment 6 can be fixed by moving Xwayland to its own process group, like https://da.gd/bMwU does, but that should probably get filed separately.)
Regarding pulseaudio, I think the problem is more the fact that it doesn't exit in time and the previous instance is reused by the new GDM. The old logind session is still open (or in a closing state)
(In reply to Laurent Bigonville from comment #11)
> Regarding pulseaudio, I think the problem is more the fact that it doesn't
> exit in time and the previous instance is reused by the new GDM.
>
> The old logind session is still open (or in a closing state)
Just wanted to confirm that this observation is correct.
If I run "systemctl restart gdm", the pulseaudio process does not immediately exit but waits for a given timeout for clients to connect before it shuts down. As the new gdm session has been started in the meantime, the old pulseaudio process keeps running and the old session scope remains active and in state "closing".
If I run "systemctl stop gdm; sleep 30; systemctl start gdm", then the old pulseaudio process terminates properly and the old session scope along with it.
For the Debian package we are considering turning the pulseaudio daemon into a user service (i.e. enabling pulseaudio.socket for the Debian-gdm user).
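For testing, the fixed "sleep 30" could be replaced by polling logind until the old greeter session is gone. The sketch below only illustrates that idea. All function names are hypothetical, the 60-second timeout is an arbitrary choice, and it assumes that the third column of "loginctl list-sessions --no-legend" is the user name.

```shell
#!/bin/sh
# Restart gdm only after the old greeter sessions are gone, instead of
# sleeping for a fixed 30 seconds. All function names are hypothetical.

# List logind sessions as "SESSION UID USER ..." lines. Kept in its own
# function so it can be replaced when trying the logic without logind.
list_sessions() {
    loginctl list-sessions --no-legend
}

# Succeed once USER has no sessions left; fail after TIMEOUT seconds.
wait_for_user_sessions_gone() {
    user="$1"
    timeout="${2:-60}"
    waited=0
    # awk exits 0 while a session of the user is still listed.
    while list_sessions | awk -v u="$user" '$3 == u { found = 1 } END { exit !found }'; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Stop gdm, wait for the Debian-gdm sessions to disappear, start it again.
restart_gdm_cleanly() {
    systemctl stop gdm.service
    wait_for_user_sessions_gone Debian-gdm 60 || return 1
    systemctl start gdm.service
}
```

Calling restart_gdm_cleanly as root would then stand in for the "stop; sleep 30; start" sequence. A session that is stuck in state "closing" still shows up in the list, so the function waits for it too.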