GNOME Bugzilla – Bug 660343
use of GnomeWallClock broke gnome-screensaver on OpenBSD
Last modified: 2011-10-03 21:21:52 UTC
Hi.

The following commit:

----------------------------------------------------------------------
From 62c7262736e93dbeb5be80bfc5c7e9200b1f0c01 Mon Sep 17 00:00:00 2001
From: Colin Walters <walters@verbum.org>
Date: Thu, 01 Sep 2011 16:54:41 +0000
Subject: Use GnomeWallClock

This centralizes clock handling.

https://bugzilla.gnome.org/show_bug.cgi?id=657959
----------------------------------------------------------------------

broke gnome-screensaver on OpenBSD.
When trying to lock, the display hangs completely: I can still see the desktop but I cannot click anywhere. The process then goes up to 100% CPU.
Reverting the commit makes it work fine again.

This is what I get when attaching gdb to the process:

(gdb) bt
Trace 228617
Created attachment 197640
backtrace

Seems the bt was mangled, here's the complete one.
From the backtrace it looks like g_time_zone_new() hangs/never returns on OpenBSD. To be sure, if you attach several times the backtrace is always at line 392 of g_time_zone_new ?
(In reply to comment #2)
> To be sure, if you attach several times the backtrace is always at line 392 of
> g_time_zone_new ?

Not always. After letting it loop for about 20 secs, I'm getting either the first backtrace I posted (g_time_zone_new()) or this new one. It still has to do with gdatetime/gtimezone though.
Created attachment 197733
backtrace2
CCing walters, who wrote GnomeWallClock
This code:

  while (manager->priv->fading) {
          gtk_main_iteration ();
  }

Is pretty horrible. I'm not seeing why it would cause this specific problem, but it may be the source of unexpected reentrancy.

Are there multiple threads involved here? Can you get a "t a a bt"?
(In reply to comment #6)
> Is pretty horrible. I'm not seeing why it would cause this specific problem,
> but it may be the source of unexpected reentrancy.
>
> Are there multiple threads involved here? Can you get a "t a a bt"?

Done. I attached several times and the output can be different.

Side note: unexpected reentrancy can cause immediate issues on OpenBSD, or at least issues that other OSes may not run into right away, because our threads are still stuck in the '80s (userland threads playing with O_NONBLOCK on fds).
Created attachment 197775
backtrace with threads
Does reverting the patch fix it?

Also, even more stack traces would be useful. See if you can identify a pattern. For this purpose, I recommend:

  for x in $(seq 500); do gstack $(pidof gnome-screensaver) > /tmp/stack-$x.txt; sleep 3; done
(In reply to comment #9)
> Does reverting the patch fix it?

If you mean the commit I mentioned in my original post, then yes, reverting it fixes the issue (as I wrote).

> Also, even more stack traces would be useful. See if you can identify a
> pattern. For this purpose, I recommend:
>
> for x in $(seq 500); do gstack $(pidof gnome-screensaver) > /tmp/stack-$x.txt;
> sleep 3; done

Good idea, will do that. I'll play with gdb -batch though, no gstack here.
> Also, even more stack traces would be useful. See if you can identify a
> pattern. For this purpose, I recommend:

Hi Colin.

Running gdb in batch mode, I came up with this backtrace "loop" (running it for about 10 minutes, it seems to enter the same paths in this order over and over...).
Created attachment 198089
backtrace loop
(In reply to comment #11)
> > Also, even more stack traces would be useful. See if you can identify a
> > pattern. For this purpose, I recommend:
>
> Hi Colin.
>
> Running gdb in batch mode, I came up with this backtrace "loop" (running it for
> about 10 minutes, it seems to enter the same paths in this order over and
> over...).

OK, note though that the whole point of this is to update the clock. On OpenBSD we'll hit the GnomeWallClock fallback path, which simply wakes up every second, even when only minutes are displayed, because we have to handle time zone changes or the system clock being set. So it's totally expected to see the same backtraces often.

An easy and likely explanation for why we don't see this bug on Linux is that we have timerfd support there and so just wake up once a minute, which makes the wakeup a lot less likely (but still possible) to happen during a screensaver fade.
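To illustrate the difference in wakeup frequency described above, here is a minimal sketch in plain C. It is not the gnome-desktop code: the function names and the `kernel_reports_clock_set` flag are hypothetical, and the only assumptions taken from the comment are that the timerfd path can sleep until the next minute while the fallback path has to wake up at least once per second.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define USEC_PER_SEC    INT64_C(1000000)
#define USEC_PER_MINUTE (60 * USEC_PER_SEC)

/* Microseconds from real_now_usec (wall-clock time since the epoch)
 * until the next full minute. At an exact minute boundary the result
 * is one whole minute. */
int64_t
usec_until_next_minute (int64_t real_now_usec)
{
  assert (real_now_usec >= 0);
  return USEC_PER_MINUTE - (real_now_usec % USEC_PER_MINUTE);
}

/* How long a (hypothetical) minute-resolution clock may sleep.
 *
 * kernel_reports_clock_set == true models the timerfd case: the kernel
 * interrupts the sleep if the clock is set, so sleeping until the next
 * minute is safe.
 *
 * kernel_reports_clock_set == false models the fallback case: nothing
 * tells us the clock changed, so the sleep is capped at one second and
 * the time is re-read on every wakeup. */
int64_t
next_wakeup_delay_usec (int64_t real_now_usec, bool kernel_reports_clock_set)
{
  int64_t until_minute = usec_until_next_minute (real_now_usec);

  if (kernel_reports_clock_set)
    return until_minute;

  return until_minute < USEC_PER_SEC ? until_minute : USEC_PER_SEC;
}
```

With these definitions, a clock read at 1.5 seconds past the minute would sleep 58.5 seconds in the timerfd case but only 1 second in the fallback case, so the fallback path has roughly sixty times as many chances to fire while the screensaver is fading.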
Oh, this is actually really trivial to reproduce with this patch to gnome-desktop:

diff --git a/libgnome-desktop/gnome-datetime-source.c b/libgnome-desktop/gnome-datetime-source.c
index 05ec80a..0756410 100644
--- a/libgnome-desktop/gnome-datetime-source.c
+++ b/libgnome-desktop/gnome-datetime-source.c
@@ -26,6 +26,8 @@
 #define GNOME_DESKTOP_USE_UNSTABLE_API
 #include "gnome-datetime-source.h"
+#undef HAVE_TIMERFD
+
 #ifdef HAVE_TIMERFD
 #include <sys/timerfd.h>
Created attachment 198152
GnomeWallClock: Fix non-Linux fallback code

This fixes the "infinite loop in gnome-screensaver" bug. Only dispatch in cancel on set when the monotonic timeout has expired, otherwise we will drop into a tight loop.
Created attachment 198153
GnomeWallClock: Fix non-Linux fallback code

Don't #undef HAVE_TIMERFD
(In reply to comment #16)
> Created an attachment (id=198153)
> GnomeWallClock: Fix non-Linux fallback code
>
> Don't #undef HAVE_TIMERFD

This works like a charm, thanks!
Attachment 198153 pushed as 3114767 - GnomeWallClock: Fix non-Linux fallback code