Bug 660343 - use of GnomeWallClock broke gnome-screensaver on OpenBSD
Status: RESOLVED FIXED
Product: glib
Classification: Platform
Component: general
Version: unspecified
Hardware: Other OpenBSD
Importance: Normal major
Target Milestone: 2.30
Assigned To: gtkdev
QA Contact: gtkdev
Depends on:
Blocks:
 
 
Reported: 2011-09-28 09:13 UTC by Antoine Jacoutot
Modified: 2011-10-03 21:21 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
backtrace (3.21 KB, text/plain)
2011-09-28 09:16 UTC, Antoine Jacoutot
backtrace2 (4.87 KB, text/plain)
2011-09-29 08:16 UTC, Antoine Jacoutot
backtrace with threads (10.33 KB, text/plain)
2011-09-29 14:52 UTC, Antoine Jacoutot
backtrace loop (35.24 KB, text/plain)
2011-10-03 14:08 UTC, Antoine Jacoutot
GnomeWallClock: Fix non-Linux fallback code (1.80 KB, patch)
2011-10-03 20:37 UTC, Colin Walters
Status: none
GnomeWallClock: Fix non-Linux fallback code (1.62 KB, patch)
2011-10-03 20:38 UTC, Colin Walters
Status: committed

Description Antoine Jacoutot 2011-09-28 09:13:59 UTC
Hi.

The following commit:

----------------------------------------------------------------------
From 62c7262736e93dbeb5be80bfc5c7e9200b1f0c01 Mon Sep 17 00:00:00 2001
From: Colin Walters <walters@verbum.org>
Date: Thu, 01 Sep 2011 16:54:41 +0000
Subject: Use GnomeWallClock

This centralizes clock handling.

https://bugzilla.gnome.org/show_bug.cgi?id=657959
----------------------------------------------------------------------

broke gnome-screensaver on OpenBSD.
When I try to lock the screen, the display hangs completely: I can still see the desktop but I cannot click anywhere. The process then goes up to 100% CPU.
Reverting the commit makes it work fine again.

This is what I get when attaching gdb to the process:

(gdb) bt
#0  g_time_zone_new at gtimezone.c line 392
#1  g_date_time_new_now_local at gdatetime.c line 728
#2  update_clock at gnome-wall-clock.c line 176
#3  g_datetime_source_dispatch at gnome-datetime-source.c line 142
#4  g_main_context_dispatch at gmain.c line 2441
#5  g_main_context_iterate at gmain.c line 3089
#6  g_main_context_iteration at gmain.c line 3152
#7  gtk_main_iteration from /usr/local/lib/libgtk-3.so.1.0
#8  gs_manager_set_active
#9  gs_monitor_new
#10 gs_marshal_BOOLEAN__BOOLEAN
#11 g_closure_invoke at gclosure.c line 774
#12 signal_emit_unlocked_R at gsignal.c line 3272
#13 g_signal_emit_valist at gsignal.c line 3013
#14 g_signal_emit at gsignal.c line 3060
#15 gs_listener_set_active
#16 gs_monitor_new
#17 g_closure_invoke at gclosure.c line 774
#18 signal_emit_unlocked_R at gsignal.c line 3272
#19 g_signal_emit_valist at gsignal.c line 3003
#20 g_signal_emit at gsignal.c line 3060
#21 gs_listener_set_active
#22 dbus_connection_dispatch from /usr/local/lib/libdbus-1.so.9.1
#23 dbus_server_setup_with_g_main from /usr/local/lib/libdbus-glib-1.so.4.3
#24 g_main_context_dispatch at gmain.c line 2441
#25 g_main_context_iterate at gmain.c line 3089
#26 g_main_loop_run at gmain.c line 3297
#27 gtk_main from /usr/local/lib/libgtk-3.so.1.0
#28 main
#0  g_time_zone_new at gtimezone.c line 392
#1  g_date_time_new_now_local at gdatetime.c line 728

Comment 1 Antoine Jacoutot 2011-09-28 09:16:19 UTC
Created attachment 197640
backtrace

Seems the bt was mangled, here's the complete one.
Comment 2 Ray Strode [halfline] 2011-09-28 14:49:01 UTC
From the backtrace it looks like g_time_zone_new() hangs/never returns on OpenBSD.


To be sure: if you attach several times, is the backtrace always at line 392 of g_time_zone_new?
Comment 3 Antoine Jacoutot 2011-09-29 08:15:39 UTC
(In reply to comment #2)
> To be sure: if you attach several times, is the backtrace always at line 392 of
> g_time_zone_new?

Not always. After letting it loop for about 20 secs, I'm getting either the first backtrace I posted (g_time_zone_new()) or this new one. It still has to do with gdatetime/gtimezone though.
Comment 4 Antoine Jacoutot 2011-09-29 08:16:07 UTC
Created attachment 197733
backtrace2
Comment 5 Matthias Clasen 2011-09-29 12:08:05 UTC
CCing walters, who wrote GnomeWallClock
Comment 6 Colin Walters 2011-09-29 14:40:47 UTC
This code:

                while (manager->priv->fading) {
                        gtk_main_iteration ();
                }

Is pretty horrible.  I'm not seeing why it would cause this specific problem, but it may be the source of unexpected reentrancy.

Are there multiple threads involved here?  Can you get a "t a a bt"?
Comment 7 Antoine Jacoutot 2011-09-29 14:52:15 UTC
(In reply to comment #6)
> Is pretty horrible.  I'm not seeing why it would cause this specific problem,
> but it may be the source of unexpected reentrancy.
> 
> Are there multiple threads involved here?  Can you get a "t a a bt"?

Done. I attached several times and the output can be different.

Side note: unexpected reentrancy can cause immediate issues on OpenBSD, or at least issues that other OSes may not run into right away. That is because our threads are still stuck in the '80s (userland threads playing with O_NONBLOCK on fds).
Comment 8 Antoine Jacoutot 2011-09-29 14:52:36 UTC
Created attachment 197775
backtrace with threads
Comment 9 Colin Walters 2011-09-29 15:10:18 UTC
Does reverting the patch fix it?

Also, even more stack traces would be useful.  See if you can identify a pattern.  For this purpose, I recommend:

for x in $(seq 500); do gstack $(pidof gnome-screensaver) > /tmp/stack-$x.txt; sleep 3; done
Comment 10 Antoine Jacoutot 2011-09-29 17:46:43 UTC
(In reply to comment #9)
> Does reverting the patch fix it?

If you mean the commit I mentioned in my original post, then yes, reverting it fixes the issue (as I wrote).

> Also, even more stack traces would be useful.  See if you can identify a
> pattern.  For this purpose, I recommend:
> 
> for x in $(seq 500); do gstack $(pidof gnome-screensaver) > /tmp/stack-$x.txt;
> sleep 3; done

Good idea, will do that. I'll play with gdb -batch though, no gstack here.
Comment 11 Antoine Jacoutot 2011-10-03 14:07:57 UTC
> Also, even more stack traces would be useful.  See if you can identify a
> pattern.  For this purpose, I recommend:

Hi Colin.

Running gdb in batch mode, I came up with this backtrace "loop" (running it for about 10 minutes, it seems to enter the same paths in this order over and over...).
Comment 12 Antoine Jacoutot 2011-10-03 14:08:23 UTC
Created attachment 198089
backtrace loop
Comment 13 Colin Walters 2011-10-03 15:18:53 UTC
(In reply to comment #11)
> > Also, even more stack traces would be useful.  See if you can identify a
> > pattern.  For this purpose, I recommend:
> 
> Hi Colin.
> 
> Running gdb in batch mode, I came up with this backtrace "loop" (running it for
> about 10 minutes, it seems to enter the same paths in this order over and
> over...).

OK, but note that the whole point of this code is to update the clock. On OpenBSD we hit the GnomeWallClock fallback path, which simply wakes up every second, even when only minutes are displayed, because we have to handle time zone changes or the system clock being set.

So it's totally expected to see the same backtraces often.

An easy and likely explanation for why we don't see this bug on Linux is that there we have timerfd support and so wake up just once a minute; a wakeup is then a lot less likely (but still possible) to land during a screensaver fade.
Comment 14 Colin Walters 2011-10-03 20:07:22 UTC
Oh this is actually really trivial to reproduce with this patch to gnome-desktop:

diff --git a/libgnome-desktop/gnome-datetime-source.c b/libgnome-desktop/gnome-datetime-source.c
index 05ec80a..0756410 100644
--- a/libgnome-desktop/gnome-datetime-source.c
+++ b/libgnome-desktop/gnome-datetime-source.c
@@ -26,6 +26,8 @@
 #define GNOME_DESKTOP_USE_UNSTABLE_API
 #include "gnome-datetime-source.h"
 
+#undef HAVE_TIMERFD
+
 #ifdef HAVE_TIMERFD
 #include <sys/timerfd.h>
Comment 15 Colin Walters 2011-10-03 20:37:26 UTC
Created attachment 198152
GnomeWallClock: Fix non-Linux fallback code

This fixes the "infinite loop in gnome-screensaver" bug.  In
cancel-on-set mode, only dispatch when the monotonic timeout has
expired; otherwise we will drop into a tight loop.
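
The difference described in the patch comment above can be sketched with a small stand-alone model. This is not the attached patch: ModelSource, both check functions and count_dispatches are hypothetical names, and the "unconditional" variant is only my guess at the kind of check the description implies, not the actual pre-fix code. The model assumes the fallback polls once per second and that the wall clock and monotonic clock advance together.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define USEC_PER_SEC INT64_C(1000000)

/* Hypothetical model of the fallback (non-timerfd) source state. */
typedef struct {
    int64_t real_expiration;    /* wall-clock deadline, microseconds */
    int64_t wakeup_expiration;  /* monotonic deadline of the periodic poll */
    bool    cancel_on_set;      /* caller wants to notice clock changes */
} ModelSource;

typedef bool (*ExpiryCheck)(const ModelSource *s,
                            int64_t real_now, int64_t mono_now);

/* Assumed buggy shape: in cancel-on-set mode the source is always ready. */
static bool expired_unconditional(const ModelSource *s,
                                  int64_t real_now, int64_t mono_now)
{
    (void) mono_now;
    return s->real_expiration <= real_now || s->cancel_on_set;
}

/* Fixed shape: ready only once the monotonic poll deadline has passed. */
static bool expired_after_wakeup(const ModelSource *s,
                                 int64_t real_now, int64_t mono_now)
{
    return s->real_expiration <= real_now
        || (s->cancel_on_set && mono_now >= s->wakeup_expiration);
}

/* Count how many of `passes` main-loop passes, spaced `step_us` apart,
 * would dispatch the source under the given check. */
static int count_dispatches(ExpiryCheck check, int passes, int64_t step_us)
{
    assert(passes >= 0 && step_us > 0);
    ModelSource s = { 60 * USEC_PER_SEC, USEC_PER_SEC, true };
    int64_t now = 0;            /* both clocks share this value in the model */
    int dispatched = 0;

    for (int i = 0; i < passes; i++) {
        now += step_us;
        if (check(&s, now, now)) {
            dispatched++;
            s.wakeup_expiration = now + USEC_PER_SEC;  /* re-arm the poll */
        }
    }
    return dispatched;
}
```

Over 50 passes spaced 100 ms apart (5 seconds in total), the unconditional check dispatches on every pass, while the monotonic check dispatches once per second. A source that is ready on every pass keeps a loop such as the gtk_main_iteration() loop quoted in comment 6 spinning without ever blocking, which matches the 100% CPU symptom in the report.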
Comment 16 Colin Walters 2011-10-03 20:38:06 UTC
Created attachment 198153
GnomeWallClock: Fix non-Linux fallback code

Don't #undef HAVE_TIMERFD
Comment 17 Antoine Jacoutot 2011-10-03 21:09:19 UTC
(In reply to comment #16)
> Created an attachment (id=198153)
> GnomeWallClock: Fix non-Linux fallback code
> 
> Don't #undef HAVE_TIMERFD

This works like a charm, thanks!
Comment 18 Colin Walters 2011-10-03 21:21:49 UTC
Attachment 198153 pushed as 3114767 - GnomeWallClock: Fix non-Linux fallback code