GNOME Bugzilla – Bug 782660
gnome-shell login screen crashes on shutdown
Last modified: 2021-07-05 13:52:13 UTC
GDM kills the entire process group associated with the login screen when shutting down. This means Xwayland may die before mutter, and mutter crashes when Xwayland dies. There are two patches here. The first one is tested and verified by Laurent Bigonville. The second is something I think we should probably hold off on for now, but I'm posting it to get it archived in case we need it.
Created attachment 351900 [details] [review]
wayland: start Xwayland in its own process group

We expect mutter to manage the life cycle of its Xwayland instance. We enforce this expectation by crashing if Xwayland dies for any reason we aren't expecting.

One legitimate reason Xwayland could die unexpectedly just before mutter is if the process group that mutter and Xwayland are running in is killed. In that case, which one dies first is a game of chance. Mutter should not crash if Xwayland happens to die first in such a scenario.

This commit moves Xwayland to its own process group, so killing mutter's process group won't kill Xwayland, but will instead just kill mutter. For extra resilience, this commit makes sure Xwayland kills itself anytime mutter dies.
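The process-group idea in the commit message above can be illustrated outside of mutter. The following is a minimal sketch in Python, not the patch itself (mutter is written in C and the attachment is not reproduced here). The helper name spawn_in_own_process_group and the sleeping child standing in for Xwayland are assumptions made for illustration. It only shows the first half of the change, moving the child to its own process group; the "Xwayland kills itself when mutter dies" part is not covered.

```python
import os
import subprocess
import sys


def spawn_in_own_process_group(argv):
    """Start a child process that leads a new process group.

    Hypothetical helper for illustration; not taken from the mutter patch.
    """
    # setpgid(0, 0) runs in the child between fork and exec and makes the
    # child the leader of a fresh process group, so a signal sent to the
    # parent's process group is no longer delivered to the child.
    return subprocess.Popen(argv, preexec_fn=lambda: os.setpgid(0, 0))


# A sleeping interpreter stands in for Xwayland here (assumption).
child = spawn_in_own_process_group(
    [sys.executable, "-c", "import time; time.sleep(30)"]
)

parent_pgid = os.getpgid(0)         # process group of this (parent) process
child_pgid = os.getpgid(child.pid)  # process group of the spawned child

# Clean up the stand-in child explicitly, since the parent now owns its
# life cycle.
child.terminate()
child.wait()
```

With this arrangement a signal sent to the parent's process group reaches only the parent, which then remains responsible for shutting the child down itself.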
Created attachment 351901 [details] [review]
wayland: don't core dump when the X server dies or disconnects

It makes no sense for us to drop core when the X server dies. We didn't do anything wrong and the core dump implicates us unfairly (confusing everyone). Furthermore, there are legitimate reasons why the X connection can go away out from under us (like the session getting killed explicitly, or the display manager clearing the server of clients in preparation for ending the session).

This commit changes the code to handle the X server going away gracefully instead of crashing. If it goes away, we treat it as a hint that the session is ending and just exit.
Comment on attachment 351901 [details] [review]
wayland: don't core dump when the X server dies or disconnects

(marking rejected since I'd rather not introduce a meta_exit() call into the code if we can help it)
Review of attachment 351900 [details] [review]:

This looks like a reasonable thing to do for now. Though, isn't there still a risk that we'd dump core when the X server exits? The concern is that if we tried to access the X connection after it is terminated, but before X has closed and sent SIGTERM to mutter, we'd still hit the g_error() you remove in the other patch.
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org. As part of that, we are mass-closing older open tickets in bugzilla.gnome.org which have not seen updates for a longer time (resources are unfortunately quite limited so not every ticket can get handled). If you can still reproduce the situation described in this ticket in a recent and supported software version, then please follow https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines and create a new ticket at https://gitlab.gnome.org/GNOME/mutter/-/issues/ Thank you for your understanding and your help.