Bug 780213 - gdm enters a respawn loop once the main process dies (insufficient permissions)
Status: RESOLVED FIXED
Product: gdm
Classification: Core
Component: general
Version: 3.22.x
Hardware: Other Linux
Importance: Normal major
Target Milestone: ---
Assigned To: GDM maintainers
Depends on:
Blocks:
 
 
Reported: 2017-03-17 17:16 UTC by Michael Biebl
Modified: 2017-05-10 19:00 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
journal log (425.40 KB, text/plain)
2017-03-17 17:17 UTC, Michael Biebl
journal log Fedora 25 (1.76 MB, text/plain)
2017-03-17 17:22 UTC, Michael Biebl
halfline patch (2.47 KB, patch)
2017-05-08 16:40 UTC, Laurent Bigonville
committed

Description Michael Biebl 2017-03-17 17:16:08 UTC
Version: 3.22.3
Bug-Debian: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=857995

Once the main gdm process dies, we enter a respawn loop, effectively DoSing your system.

Reproducing the problem is rather simple. Kill the main gdm process via kill -9 and let systemd respawn it, or switch to the console as root and run
"systemctl restart gdm"

I could reproduce the problem in a Fedora 25 VM and on an up-to-date Debian Stretch system with GNOME 3.22.

I get lots of 
gdm-x-session[1540]: (EE) xf86OpenConsole: Switching VT failed
in the journal (full log attached).

I assume the interaction with logind is somehow broken.

https://bugzilla.redhat.com/show_bug.cgi?id=1335511 is most likely the identical bug report, but it was closed due to EOL.

If you need more information, please don't hesitate to ask.
Comment 1 Michael Biebl 2017-03-17 17:17:24 UTC
Created attachment 348193 [details]
journal log
Comment 2 Michael Biebl 2017-03-17 17:22:09 UTC
Created attachment 348195 [details]
journal log Fedora 25
Comment 3 Michael Biebl 2017-05-02 11:51:15 UTC
The interesting bit is that gdm itself is doing the respawning, i.e. it's not the main gdm process dying and being restarted by systemd, but gdm repeatedly trying to start a new session.

Mai 02 13:29:52 pluto systemd[1]: Starting GNOME Display Manager...
Mai 02 13:29:52 pluto systemd[1]: Started GNOME Display Manager.
Mai 02 13:29:52 pluto gdm-launch-environment][755]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:00 pluto gdm-password][1509]: pam_unix(gdm-password:session): session opened for user michael by (uid=0)
Mai 02 13:30:48 pluto systemd[1]: Stopping GNOME Display Manager...
Mai 02 13:30:48 pluto gdm3[740]: GLib: g_hash_table_find: assertion 'version == hash_table->version' failed
Mai 02 13:30:48 pluto systemd[1]: Stopped GNOME Display Manager.
Mai 02 13:30:48 pluto systemd[1]: Starting GNOME Display Manager...
Mai 02 13:30:48 pluto systemd[1]: Started GNOME Display Manager.
Mai 02 13:30:48 pluto gdm-launch-environment][2286]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:48 pluto gdm3[2282]: GdmDisplay: display lasted 0.187789 seconds
Mai 02 13:30:48 pluto gdm3[2282]: Child process -2290 was already dead.
Mai 02 13:30:48 pluto gdm3[2282]: Child process 2286 was already dead.
Mai 02 13:30:48 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:48 pluto gdm-launch-environment][2306]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:48 pluto gdm3[2282]: Child process -2309 was already dead.
Mai 02 13:30:48 pluto gdm3[2282]: Child process 2306 was already dead.
Mai 02 13:30:48 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:48 pluto gdm-launch-environment][2314]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:48 pluto gdm3[2282]: GdmDisplay: display lasted 0.128089 seconds
Mai 02 13:30:48 pluto gdm3[2282]: Child process -2317 was already dead.
Mai 02 13:30:48 pluto gdm3[2282]: Child process 2314 was already dead.
Mai 02 13:30:48 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:48 pluto gdm-launch-environment][2333]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:48 pluto gdm3[2282]: Child process -2336 was already dead.
Mai 02 13:30:48 pluto gdm3[2282]: Child process 2333 was already dead.
Mai 02 13:30:48 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:48 pluto gdm-launch-environment][2340]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:49 pluto gdm3[2282]: GdmDisplay: display lasted 0.140515 seconds
Mai 02 13:30:49 pluto gdm3[2282]: Child process -2343 was already dead.
Mai 02 13:30:49 pluto gdm3[2282]: Child process 2340 was already dead.
Mai 02 13:30:49 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:49 pluto gdm-launch-environment][2359]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:49 pluto gdm3[2282]: Child process -2362 was already dead.
Mai 02 13:30:49 pluto gdm3[2282]: Child process 2359 was already dead.
Mai 02 13:30:49 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:49 pluto gdm-launch-environment][2366]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:49 pluto gdm3[2282]: GdmDisplay: display lasted 0.139748 seconds
Mai 02 13:30:49 pluto gdm3[2282]: Child process -2369 was already dead.
Mai 02 13:30:49 pluto gdm3[2282]: Child process 2366 was already dead.
Mai 02 13:30:49 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:49 pluto gdm-launch-environment][2385]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:49 pluto gdm3[2282]: Child process -2388 was already dead.
Mai 02 13:30:49 pluto gdm3[2282]: Child process 2385 was already dead.
Mai 02 13:30:49 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:49 pluto gdm-launch-environment][2392]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:49 pluto gdm3[2282]: GdmDisplay: display lasted 0.128068 seconds
Mai 02 13:30:49 pluto gdm3[2282]: Child process -2395 was already dead.
Mai 02 13:30:49 pluto gdm3[2282]: Child process 2392 was already dead.
Mai 02 13:30:49 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:49 pluto gdm-launch-environment][2411]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:49 pluto gdm3[2282]: Child process -2414 was already dead.
Mai 02 13:30:49 pluto gdm3[2282]: Child process 2411 was already dead.
Mai 02 13:30:49 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:49 pluto gdm-launch-environment][2418]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:49 pluto gdm3[2282]: GdmDisplay: display lasted 0.142297 seconds
Mai 02 13:30:49 pluto gdm3[2282]: Child process -2421 was already dead.
Mai 02 13:30:49 pluto gdm3[2282]: Child process 2418 was already dead.
Mai 02 13:30:49 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:49 pluto gdm-launch-environment][2437]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:49 pluto gdm3[2282]: Child process -2440 was already dead.
Mai 02 13:30:49 pluto gdm3[2282]: Child process 2437 was already dead.
Mai 02 13:30:49 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:50 pluto gdm-launch-environment][2444]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:50 pluto gdm3[2282]: GdmDisplay: display lasted 0.140668 seconds
Mai 02 13:30:50 pluto gdm3[2282]: Child process -2447 was already dead.
Mai 02 13:30:50 pluto gdm3[2282]: Child process 2444 was already dead.
Mai 02 13:30:50 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:50 pluto gdm-launch-environment][2463]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:50 pluto gdm3[2282]: Child process -2466 was already dead.
Mai 02 13:30:50 pluto gdm3[2282]: Child process 2463 was already dead.
Mai 02 13:30:50 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:50 pluto gdm-launch-environment][2470]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:50 pluto gdm3[2282]: GdmDisplay: display lasted 0.125414 seconds
Mai 02 13:30:50 pluto gdm3[2282]: Child process -2473 was already dead.
Mai 02 13:30:50 pluto gdm3[2282]: Child process 2470 was already dead.
Mai 02 13:30:50 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:50 pluto gdm-launch-environment][2489]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:50 pluto gdm3[2282]: Child process -2492 was already dead.
Mai 02 13:30:50 pluto gdm3[2282]: Child process 2489 was already dead.
Mai 02 13:30:50 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:50 pluto gdm-launch-environment][2496]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:50 pluto gdm3[2282]: GdmDisplay: display lasted 0.127995 seconds
Mai 02 13:30:50 pluto gdm3[2282]: Child process -2499 was already dead.
Mai 02 13:30:50 pluto gdm3[2282]: Child process 2496 was already dead.
Mai 02 13:30:50 pluto gdm3[2282]: Unable to kill session worker process
Mai 02 13:30:50 pluto gdm-launch-environment][2515]: pam_unix(gdm-launch-environment:session): session opened for user Debian-gdm by (uid=0)
Mai 02 13:30:50 pluto systemd[1]: Stopping GNOME Display Manager...
Mai 02 13:30:51 pluto gdm3[2282]: GdmLocalDisplayFactory: Failed to issue method call: GDBus.Error:org.freedesktop.DBus.Error.NoReply: Message recipient disconnected from message bus without replying
Mai 02 13:30:51 pluto gdm3[2282]: Child process -2518 was already dead.
Mai 02 13:30:51 pluto gdm3[2282]: Child process 2515 was already dead.
Mai 02 13:30:51 pluto systemd[1]: Stopped GNOME Display Manager.
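A quick way to tell whether a system is stuck in this loop is to count the "display lasted" messages in the journal. The following is only a sketch: the helper name count_short_displays is made up for illustration, and it assumes the message text is the same as in the log above.

```shell
#!/bin/sh
# Hypothetical helper: count the "GdmDisplay: display lasted N seconds"
# lines in journal text read from stdin and print the number of matches.
# The match string is taken from the log excerpt above.
count_short_displays() {
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so "|| true" keeps the function's exit status at 0.
    grep -c 'GdmDisplay: display lasted' || true
}
```

For example, "journalctl -b | count_short_displays" run shortly after "systemctl restart gdm" should print a rapidly growing number on an affected system, and 0 or a small number otherwise.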
Comment 4 Michael Biebl 2017-05-02 11:55:29 UTC
Some additional information, which might be helpful:

User 109 is the gdm system uid. After starting gdm, I get the systemd-cgls output shown at the end of this comment.
Stopping gdm.service does *not* stop session-c2.scope and all its processes.

So, if I run
 systemctl stop gdm.service
 systemctl stop session-c2.scope
 systemctl start gdm.service

Then the gdm service is properly started.


│ └─user-109.slice
│   ├─user@109.service
│   │ ├─at-spi-dbus-bus.service
│   │ │ ├─3299 /usr/lib/at-spi2-core/at-spi-bus-launcher
│   │ │ ├─3304 /usr/bin/dbus-daemon --config-file=/usr/share/defaults/at-spi2/accessibility.conf --nofork --print-address 3
│   │ │ └─3306 /usr/lib/at-spi2-core/at-spi2-registryd --use-gnome-session
│   │ ├─dbus.service
│   │ │ ├─3276 /usr/bin/dbus-daemon --session --address=systemd: --nofork --nopidfile --systemd-activation
│   │ │ └─3317 /usr/lib/x86_64-linux-gnu/gconf/gconfd-2
│   │ ├─xdg-permission-store.service
│   │ │ └─3324 /usr/lib/flatpak/xdg-permission-store
│   │ └─init.scope
│   │   ├─3269 /lib/systemd/systemd --user
│   │   └─3270 (sd-pam)
│   └─session-c2.scope
│     ├─3265 gdm-session-worker [pam/gdm-launch-environment]
│     ├─3274 /usr/lib/gdm3/gdm-wayland-session gnome-session --autostart /usr/share/gdm/greeter/autostart
│     ├─3278 /usr/lib/gnome-session/gnome-session-binary --autostart /usr/share/gdm/greeter/autostart
│     ├─3286 /usr/bin/gnome-shell
│     ├─3292 /usr/bin/Xwayland :1024 -rootless -noreset -listen 4 -listen 5 -displayfd 6
│     ├─3312 /usr/bin/pulseaudio --start --log-target=syslog
│     ├─3315 /usr/lib/x86_64-linux-gnu/pulse/gconf-helper
│     └─3333 /usr/lib/gnome-settings-daemon/gnome-settings-daemon
├─init.scope
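The manual sequence above hard-codes session-c2.scope, but the session ID changes from boot to boot. A possible sketch for finding the leftover greeter sessions, assuming that "loginctl list-sessions --no-legend" prints the session ID in the first column and the user name in the third (sessions_of_user is a hypothetical helper name):

```shell
#!/bin/sh
# Hypothetical helper: print the logind session IDs owned by user $1.
# Expects the output of "loginctl list-sessions --no-legend" on stdin,
# with the session ID in column 1 and the user name in column 3.
sessions_of_user() {
    awk -v user="$1" '$3 == user { print $1 }'
}

# Possible usage, as root (Debian-gdm is the greeter user on Debian):
#   systemctl stop gdm.service
#   loginctl list-sessions --no-legend | sessions_of_user Debian-gdm |
#       while read -r id; do systemctl stop "session-$id.scope"; done
#   systemctl start gdm.service
```

This only automates the three commands from this comment; it does not address why the scope survives the service stop in the first place.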
Comment 5 Michael Biebl 2017-05-02 12:06:59 UTC
A workaround I found was to add
ExecStopPost=/bin/loginctl kill-user Debian-gdm 

(Debian-gdm is the system user name that is used in Debian)
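One way to apply this without editing the packaged unit file would be a systemd drop-in. The sketch below only prints the drop-in content; the function name, the drop-in file name, and the assumption that the line belongs in the [Service] section of gdm.service are mine, not part of the workaround as reported.

```shell
#!/bin/sh
# Hypothetical helper: print a drop-in that runs "loginctl kill-user"
# for the greeter account ($1) after the service has stopped.
# Assumption: the ExecStopPost= line goes into the [Service] section.
gdm_kill_user_dropin() {
    printf '[Service]\nExecStopPost=/bin/loginctl kill-user %s\n' "$1"
}

# Possible usage, as root (path and file name are assumptions; adjust
# the unit name to whatever your distribution ships):
#   mkdir -p /etc/systemd/system/gdm.service.d
#   gdm_kill_user_dropin Debian-gdm \
#       > /etc/systemd/system/gdm.service.d/kill-greeter.conf
#   systemctl daemon-reload
```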
Comment 6 Laurent Bigonville 2017-05-08 16:40:10 UTC
Created attachment 351358 [details] [review]
halfline patch

After discussing with halfline on IRC, he sent me the attached patch.

I just tried it now and it seems to work as expected (the gdm-session-worker process of the greeter is killed)

However it seems that the pulseaudio is properly killed and the session stays open.

I'm also seeing gnome-shell crashes; this needs to be investigated as well, I guess.
Comment 7 Michael Biebl 2017-05-08 17:01:03 UTC
(In reply to Laurent Bigonville from comment #6)
> Created attachment 351358 [details] [review]
> halfline patch
> 
> After discussing with halfline on IRC, he sent me the attached patch.
> 
> I just tried it now and it seems to work as expected (the gdm-session-worker
> process of the greeter is killed)
> 
> However it seems that the pulseaudio is properly killed and the session
> stays open.

I assume you mean that the pulseaudio process is *not* properly killed, keeping the logind session in closing state? Which means on every restart we pile up old session scopes which are never cleaned up.

We could try starting pulseaudio as a systemd user service for the gdm user session. 
@bigon: could you try that, please?
Comment 8 Michael Biebl 2017-05-08 17:44:46 UTC
(In reply to Michael Biebl from comment #7)
> We could try starting pulseaudio as a systemd user service for the gdm user
> session. 
> @bigon: could you try that, please?

# ln -s /usr/lib/systemd/user/pulseaudio.socket /usr/lib/systemd/user/sockets.target.wants/pulseaudio.socket
(and then reboot)
Comment 9 Ray Strode [halfline] 2017-05-08 20:49:10 UTC
attachment 351358 [details] [review] pushed as commit 81f61eb
Comment 10 Ray Strode [halfline] 2017-05-08 20:51:15 UTC
(I think the gnome-shell crashes alluded to in comment 6 can be fixed by moving Xwayland to its own process group, like https://da.gd/bMwU does, but that should probably get filed separately.)
Comment 11 Laurent Bigonville 2017-05-09 11:42:28 UTC
Regarding pulseaudio, I think the problem is more the fact that it doesn't exit in time and the previous instance is reused by the new GDM.

The old logind session is still open (or in a closing state)
Comment 12 Michael Biebl 2017-05-09 20:19:21 UTC
(In reply to Laurent Bigonville from comment #11)
> Regarding pulseaudio, I think the problem is more the fact that it doesn't
> exit in time and the previous instance is reused by the new GDM.
> 
> The old logind session is still open (or in a closing state)

Just wanted to confirm that this observation is correct.
If I run "systemctl restart gdm", the pulseaudio process does not immediately exit but waits for a given timeout for clients to connect before it shuts down.
As the new gdm session has been started in the meantime, the old pulseaudio process keeps running and the old session scope remains active and in state "closing".

If I run "systemctl stop gdm; sleep 30; systemctl start gdm", then the old pulseaudio process terminates properly and the old session scope along with it.

For the Debian package we are considering turning the pulseaudio daemon into a user service (i.e. enabling pulseaudio.socket for the Debian-gdm user).
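For testing, the fixed "sleep 30" could be replaced by polling until the old pulseaudio process is gone. This is only a sketch under assumptions: wait_until and no_pulseaudio_for are hypothetical helper names, and the check assumes the process is called exactly "pulseaudio" and runs under the greeter account.

```shell
#!/bin/sh
# Hypothetical helper: run a command once per second until it succeeds.
# $1 = timeout in seconds, remaining arguments = command to poll.
# Returns 0 as soon as the command succeeds, 1 once the timeout is hit.
wait_until() {
    timeout=$1
    shift
    elapsed=0
    until "$@"; do
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Hypothetical helper: succeeds once pgrep finds no process named
# exactly "pulseaudio" running as user $1.
no_pulseaudio_for() {
    ! pgrep -u "$1" -x pulseaudio >/dev/null
}

# Possible usage, as root:
#   systemctl stop gdm
#   wait_until 30 no_pulseaudio_for Debian-gdm
#   systemctl start gdm
```

The 30 second limit mirrors the sleep used above; it is not a value taken from pulseaudio's configuration.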