GNOME Bugzilla – Bug 677875
Fail whale prevents the user from closing applications
Last modified: 2021-06-14 18:21:16 UTC
So it turns out that the shell process has this bug where it often crashes if I drag a window to a new workspace. This is not fatal; the shell is simply restarted and life is good. But if you do this twice within a short time (because it doesn't crash every time, it's natural to just retry), then you are presented with this "Oh noz! Something has gone wrong" dialog and there is *no way* to dismiss it even though everything is actually fine. This causes data loss, e.g. unsaved work in already open applications.

I'm sure there is nothing but good intentions behind this design, but unfortunately that doesn't make it right. The problem is that your assumption "two crashes within a short time in a component means it's game-over" is just wrong. There ought to be at least a way to dismiss this "Oh Noz! We know much better!" dialog for the rare cases where the operator actually knows better than you or the gnome-session developers what he is doing. I would suggest having a small grey X in the top-right for this.

Thanks for considering.
In cases where it's not the window manager crashing, the operator can work around the fail whale screen by hitting alt-spacebar and minimizing the window from the resulting menu, or by pressing alt-f4 to close the window entirely. That won't help if it's gnome-shell that's crashing, and it's not discoverable anyway, so it's hardly optimal. Adding an X button won't solve much for your particular case, though, since the user won't have a window manager anymore and focus won't behave right.

There are a few reports of this already. See bug 677129, bug 658671, and bug 649661 (I really need to do some triage).

I totally agree that presenting the fail whale in the situation you fell into was wrong. Retitling bug to emphasize that aspect of the report.
So right now we have:

  /* if a component crashes twice within a minute, we count that as a fatal error */
  #define _GSM_APP_RESPAWN_RATELIMIT_SECONDS 60

If the whole point is to detect crash loops, 60 seconds is a bit on the high side.
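For illustration, the "two crashes within the window is fatal" heuristic described in that comment could be sketched roughly as below. This is not the gnome-session implementation: `respawn_tracker`, `respawn_tracker_record_crash` and `RESPAWN_RATELIMIT_SECONDS` are hypothetical names made up for this sketch, and only the 60-second value comes from the define quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for the 60-second window quoted in this thread. */
#define RESPAWN_RATELIMIT_SECONDS 60

/* Hypothetical per-component state: when the component last crashed. */
typedef struct {
    bool   has_crashed;  /* false until the first crash is recorded */
    time_t last_crash;   /* timestamp of the most recent crash */
} respawn_tracker;

/*
 * Record a crash at time `now` and report whether the heuristic would
 * treat it as fatal, i.e. whether the previous crash happened less than
 * `window` seconds before this one.
 */
static bool respawn_tracker_record_crash(respawn_tracker *t, time_t now, int window)
{
    assert(t != NULL);

    bool fatal = t->has_crashed &&
                 difftime(now, t->last_crash) < (double)window;

    /* Always remember the latest crash so the next one is compared to it. */
    t->has_crashed = true;
    t->last_crash = now;
    return fatal;
}
```

Under this sketch, two crashes 30 seconds apart (a user retrying a drag that sometimes crashes the shell) are reported as fatal with the 60-second window, while a shorter window, say 10 seconds, would let the second respawn go ahead and still catch a component that dies immediately on every start.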
Adding a close button makes no sense at all. The bug is simply that the fail whale wasn't supposed to happen. No need for drama about knowing better or operators. Just need to fix the problem.
(In reply to comment #3)
> Adding a close button makes no sense at all. The bug is simply that the fail
> whale wasn't supposed to happen. No need for drama about knowing better or
> operators. Just need to fix the problem.

Sorry, but "wasn't supposed to happen" is just naive. This is software - there will always be bugs... pretending there aren't is just an unhealthy attitude (which in this case actually lost data for me) with some pretty interesting implications...
(In reply to comment #1)
> I totally agree that presenting fail whale in the situation you fell into, was
> wrong.

Better heuristics are certainly welcome...

> Retitling bug to emphasize that aspect of the report.

... but at the end of the day heuristics are just that: heuristics. I very specifically chose the title '"Oh Noz!" dialog cannot be dismissed' to reflect that view.

Thinking more about it, here's what I want on a more abstract level: It needs to be possible to properly close (maybe "end" is a better word) the applications / browser-tabs / remote shells / IM conversations / chat sessions / whatever (including saving data) before ending the session. So I'm changing the summary to "fail whale prevents user from closing applications" to try to capture this. Feel free to come up with better wording but please don't change it to something else (better to just WONTFIX it, I guess).

Also, I honestly don't care how we let the user achieve this [1], but the point is that you just don't nuke the session and throw away the user's data *just because* of some heuristic. Because that's just rude.

[1] : dismissing the entire fail whale is one option, making it non-full screen is another, I'm sure it's possible to come up with better ideas...
Hey, sorry if I'm not being as clear as I could be. You've mentioned two orthogonal problems in your comment 0:

1) 'You are presented with this "Oh noz! Something has gone wrong" dialog and there is *no way* to dismiss it even though everything is actually fine'

2) 'The problem is that your assumption "two crashes within a short time in a component means it's game-over" is just wrong.'

As mentioned earlier, 1) has been filed 4 times already (I really need to dupe them together into one bug), so I was really hoping to make this bug be used to track 2). Is that reasonable? If not, I can file a new bug to track 2), and we can leave this one open until I get around to doing the duping triage, and then it will get closed along with 3 of the other 4 bugs.

I don't care either way, but I don't want 2) to fall through the cracks.
I actually think my complaint is more about 1) than 2). Basically, I just want a way to close my applications and finish my business in case GNOME tells me that it needs to restart. I also think it needs to be better supported than some magic WM key-combo. Think about it: it's user data you are messing with. So having some way for the user to rescue his data is key here. If we don't want to do that, I'd rather you just close the bug WONTFIX. (Ideally GNOME would be more stateless and less dependent on things starting in the right order. That way we would be able to survive processes segfaulting a lot better than we do today. Of course there's a cost to that etc etc.)
*** Bug 692769 has been marked as a duplicate of this bug. ***
*** Bug 677129 has been marked as a duplicate of this bug. ***
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org. As part of that, we are mass-closing older open tickets in bugzilla.gnome.org which have not seen updates for a longer time (resources are unfortunately quite limited so not every ticket can get handled). If you can still reproduce the situation described in this ticket in a recent and supported software version of gnome-session, then please follow https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines and create a new ticket at https://gitlab.gnome.org/GNOME/gnome-session/-/issues/ Thank you for your understanding and your help.