GNOME Bugzilla – Bug 632301
Race in g_main_loop_quit() leads to hang
Last modified: 2011-09-10 00:46:04 UTC
In g_main_loop_quit() it does a g_main_context_wakeup_unlocked(), which checks poll_waiting: if (g_thread_supported() && context->poll_waiting) { context->poll_waiting = FALSE; write (context->wake_up_pipe[1], "A", 1); } Unfortunately there is a window where poll_waiting is FALSE, but we are going to set it (and then poll) - without looking at is_running again. The poll will then never wake up and the mainloop will never quit. Here are the debug prints I added showing the sequence, the poll at the end never returns: (./test-runner-shutdown:7156): GLib-DEBUG: 0xb4c00e40 g_main_loop_quit enter (./test-runner-shutdown:7156): GLib-DEBUG: 0xb4c00e40 g_main_loop_run iteration complete (./test-runner-shutdown:7156): GLib-DEBUG: 0xb4c00e40 g_main_loop_run about to iterate (./test-runner-shutdown:7156): GLib-DEBUG: LOCK g_main_loop_quit (./test-runner-shutdown:7156): GLib-DEBUG: 0xb4c00e40 g_main_context_wakeup_unlocked poll_waiting=0 (./test-runner-shutdown:7156): GLib-DEBUG: UNLOCK g_main_loop_quit (./test-runner-shutdown:7156): GLib-DEBUG: 0xb4c00e40 g_main_loop_quit leave (./test-runner-shutdown:7156): GLib-DEBUG: UNLOCK g_main_context_iterate (./test-runner-shutdown:7156): GLib-DEBUG: LOCK g_main_context_acquire (./test-runner-shutdown:7156): GLib-DEBUG: UNLOCK g_main_context_acquire (./test-runner-shutdown:7156): GLib-DEBUG: LOCK g_main_context_iterate (./test-runner-shutdown:7156): GLib-DEBUG: UNLOCK g_main_context_iterate (./test-runner-shutdown:7156): GLib-DEBUG: LOCK g_main_context_prepare (./test-runner-shutdown:7156): GLib-DEBUG: 0xb4c00e40 g_main_context_prepare set poll_waiting=1 (./test-runner-shutdown:7156): GLib-DEBUG: UNLOCK g_main_context_prepare (./test-runner-shutdown:7156): GLib-DEBUG: LOCK g_main_context_query (./test-runner-shutdown:7156): GLib-DEBUG: UNLOCK g_main_context_query (./test-runner-shutdown:7156): GLib-DEBUG: LOCK g_main_context_poll (./test-runner-shutdown:7156): GLib-DEBUG: UNLOCK g_main_context_poll (./test-runner-shutdown:7156): GLib-DEBUG: 0xb4c00e40 g_main_context_poll polling
One fix might be to check is_running in g_main_context_prepare() within the same locked region of code that sets poll_waiting, so we guarantee that we only set poll_waiting if still is_running. Another fix might be to not look at poll_waiting in g_main_context_wakeup_unlocked(), just always unconditionally wake up. I'm not sure if this check is important somehow or just an optimization (or if it's an important optimization). I suppose that the wakeup pipe could block if overstuffed? It looks pretty simple to just make prepare() bail out if !is_running, though it does make the code a touch less elegant - right now is_running is not really leaking into the single-iteration code.
Created attachment 172504 [details] [review] If main loop is woken up prior to prepare(), do a nonblocking poll() The problem is that g_main_loop_quit() checks is_running, then unlocks at least once during which time is_running can be false, and then does prepare() which sets poll_waiting and decides to block. So then we'll block with is_running=false and never wake up.
attached kind of a hacky fix. my idea to check is_running in prepare() obviously won't work since is_running is in the loop, not the context. I have a somewhat nicer fix in mind, hang on
My nicer fix idea was to just set poll_waiting right away at the top of g_main_context_iterate() but the problem is people calling prepare, etc. directly without using g_main_context_iterate(). So it isn't a workable idea I guess. prepare() is already too late.
*** Bug 658182 has been marked as a duplicate of this bug. ***
(In reply to comment #1) > Another fix might be to not look at poll_waiting in > g_main_context_wakeup_unlocked(), just always unconditionally wake up. this is what my patch in bug 658182 does, though I'm also not sure it's right. > I'm not > sure if this check is important somehow or just an optimization (or if it's an > important optimization). It's been there at least since the mainloop rewrite for glib 2.0... > I suppose that the wakeup pipe could block if overstuffed? In glib master, this uses GWakeup now, which either uses an eventfd (which can't "fill up") or a non-blocking pipe (in which case if the pipe got full, further writes would fail with -1/EAGAIN, but it wouldn't matter because if that happens it means we're already definitely going to wake up anyway).
*** This bug has been marked as a duplicate of bug 583511 ***