GNOME Bugzilla – Bug 572861
gtester gets stuck intermittently
Last modified: 2010-04-21 12:03:49 UTC
gtester intermittently gets stuck, with the forked child process left as a zombie. It happens very seldom, but when the tests run constantly under buildbot it becomes an annoyingly common occurrence. I was able to trigger the hang with the following test case:

    /* gcc -Wall -O2 -g -o test test.c `pkg-config --cflags --libs glib-2.0` */
    #include <glib.h>

    static void test_dummy(void)
    {
        /* all ok, incredibly fast */
    }

    int main(int argc, char **argv)
    {
        g_test_init(&argc, &argv, NULL);
        g_test_add_func("/dummy", test_dummy);
        return g_test_run();
    }

    $ while true; do gtester test; done

Tested with 2.16.6, going to try trunk next.
Happening with trunk as well. The parent gtester is blocking on poll():
+ Trace 212811
Created attachment 129342
add extra debugging to a few places

With the added debugging, the log for a successful case is:

    g_spawn_async_with_pipes..
    returned from g_spawn_async_with_pipes
    g_child_watch_add_full
    check_for_child_exited == 0
    check_for_child_exited count < count
    check_for_child_exited count < count
    check_for_child_exited count < count
    g_child_watch_signal_handler single
    check_for_child_exited > 0
    child_watch_cb

while a failed case looks like:

    g_spawn_async_with_pipes..
    returned from g_spawn_async_with_pipes
    g_child_watch_add_full
    check_for_child_exited == 0
    check_for_child_exited count < count
    check_for_child_exited count < count
    check_for_child_exited count < count
    check_for_child_exited count < count
    check_for_child_exited count < count
    check_for_child_exited count < count
    check_for_child_exited count < count
    check_for_child_exited count < count
    g_child_watch_signal_handler single
So it would seem that the signal handler is run, but poll() isn't interrupted?

    static void
    g_child_watch_signal_handler (int signum)
    {
      child_watch_count ++;

      if (child_watch_init_state == CHILD_WATCH_INITIALIZED_THREADED)
        {
          write (child_watch_wake_up_pipe[1], "B", 1);
        }
      else
        {
          /* We count on the signal interrupting the poll in the same thread. */
        }
    }

What if, in the single-threaded case, the process is not yet blocking in poll() when the signal arrives?
Adding g_timeout_add_seconds(60, return TRUE) to main() in gtester.c, before the for-loop, somehow avoids this. Not by unblocking after a minute, as I was expecting: as far as I have been able to test, it now never hangs at all. I can't quite understand why.

I wonder why the child watch doesn't work the same way as the main context wake_up_pipe (with the SIGCHLD handler always writing to a pipe that wakes up the main context), and instead has odd-looking special casing between the single-threaded and multi-threaded cases?
*** This bug has been marked as a duplicate of bug 578295 ***