GNOME Bugzilla – Bug 627724
gtester hangs on startup
Last modified: 2017-03-15 12:37:29 UTC
With latest master of glib/dconf (not sure if it's relevant, but it appears on the trace), gtester hangs on startup and spins using 100% CPU forever. Trace: (gdb) bt
+ Trace 223362
Ryan suggested a call to g_bus_get_sync at the very beginning of the test program. This indeed seems to work around the issue.
I need stack traces for all threads (t a a bt)
I have the same problem when trying to run ephy with current webkit. (gdb) thread apply all bt Thread 3 (Thread 0xb4e47b70 (LWP 11015)):
+ Trace 223364
Yeah, similar thing.
+ Trace 223365
Looks like a problem with GType holding a lock while calling the init() function for a GObject-derived type... reassigning to gobject.
Actually looks like it is because of code in the GClassInitFunc, not the instance initializer (e.g. "static void my_type_init(MyType *instance)" function). Maybe the answer is simply "don't create object instances in GClassInitFunc functions". A workaround for WebKit could be to move the instance-creating code elsewhere, maybe the instance initializer.
(In reply to comment #5)
> Actually looks like it is because of code in the GClassInitFunc, not the
> instance initializer (e.g. "static void my_type_init(MyType *instance)"
> function).
>
> Maybe the answer is simply "don't create object instances in GClassInitFunc
> functions".
>
> A workaround for WebKit could be to move the instance-creating code elsewhere,
> maybe the instance initializer.

Right. This is actually a fairly recent thing: we call an init function for WebKit on demand in all class_init functions (an ill-conceived idea IMHO, but it was done a long time ago), and we just added some gsettings stuff there that calls g_settings_new(). I guess that's where the problem is coming from. I suppose we can start looking into moving it somewhere else, but I wouldn't be surprised if this hits more people in the future, to be honest.
A little hard to write this down clearly in the docs:

In a class init function, don't do anything that causes other threads to initialize classed types.

?
(In reply to comment #7)
> A little hard to write this down clearly in the docs:
>
> In a class init function, don't do anything that causes other threads to
> initialize classed types.
>
> ?

Actually threading doesn't matter does it? Calling any function that results in type registration will cause a deadlock, no?
It's a recursive lock, so I believe we would be fine if this happened in the same thread.
*** Bug 628889 has been marked as a duplicate of this bug. ***
Maybe, as a workaround, gdbus could do things like

  static volatile GType _g_volatile_type;

  _g_volatile_type = G_SIMPLE_ASYNC_RESULT_TYPE;
  _g_volatile_type = G_ASYNC_RESULT_TYPE
  [...]

before launching the helper thread. Would need to do it for all types we know the helper thread needs for initialization (not a lot I guess).

(The real fix, of course, is to fix gobject so it doesn't hold any locks while calling out to user code. But that seems a lot more complicated.)
(In reply to comment #11)
> Maybe, as a workaround, gdbus could do things like
>
> static volatile GType _g_volatile_type;
>
> _g_volatile_type = G_SIMPLE_ASYNC_RESULT_TYPE;
> _g_volatile_type = G_ASYNC_RESULT_TYPE
> [...]
>
> before launching the helper thread. Would need to do it for all types we know
> the helper thread needs for initialization (not a lot I guess).
>
> (The real fix, of course, is to fix gobject so it doesn't hold any locks while
> calling out to user code. But that seems a lot more complicated.)

I've now done this, see http://git.gnome.org/browse/glib/commit/?id=0b74058fa3144f85b5fefd4c81129b971010452a
Bug 660423 might be a duplicate of this bug; it has a similar backtrace, and was observed with glib 2.28.6, which contains the code that is supposed to work around this bug.
*** Bug 670479 has been marked as a duplicate of this bug. ***
Hm. Actually, no. This is a different bug. And I don't think this can be called gobject's fault; it was caused by the fact that _g_dbus_worker_new() used to block waiting for the worker thread to be fully set up, and the worker thread registered some types during setup, but the _g_dbus_worker_new() thread was already holding the gtype locks. So this should have been fixed by Colin's patch for bug 651650 (which made _g_dbus_worker_new() no longer block), and so it should be safe to remove the ensure_required_types() hack at this point.
See https://bugzilla.gnome.org/show_bug.cgi?id=674885 which shows the problem extends beyond GDBus.
*** This bug has been marked as a duplicate of bug 674885 ***