GNOME Bugzilla – Bug 613822
gobject signal connect/disconnect not thread safe
Last modified: 2010-10-21 13:28:19 UTC
We found a nasty case where you can't connect a handler and disconnect another handler from the same object at the same time because the closure stuff is not protected by a mutex in GObject. Bugs #346801 comment #8 and #403150 could also be related to this. In gobject.c: g_object_watch_closure() modifies the closure array (it can even re-alloc it) And object_remove_closure() can access it at the same time. Solution: Add a lock around accesses to the closure array. Ideally, it would be per-object. But since we suck, I guess it has to be global. Relevant extract from the stack trace: Program terminated with signal 11, Segmentation fault.
+ Trace 221073
Thread 17 (process 2012)
Created attachment 157163 [details] Test case Here's a test case using g_object_weak_ref/unref
(In reply to comment #1) > Here's a test case using g_object_weak_ref/unref (which obviously means that the closure array is not the only one affected :))
Weak ref/unref are also not thread safe, but its a different issue from the closures. That said, both should be fixed.
Created attachment 157180 [details] [review] Patch Here's the patch, please someone review and commit
Review of attachment 157180 [details] [review]: Appart from the one comment below, the rest looks good to me. ::: gobject/gobject.c @@ +2258,3 @@ * will only be notified when there is exactly one of them. */ + g_assert (tstack.n_toggle_refs == 1); What happens if one thread do a g_object_unref(), so refcount goes down to 1 and this is called) while another thread does g_object_add_toggle_ref(). Which will do g_realloc() and may invalidate the buffer. func = tstack.toggle_refs[0].notify; data = tstack.toggle_refs[0].data; object = tstack.object; UNLOCK(); func(data, object, is_last);
(In reply to comment #5) > + g_assert (tstack.n_toggle_refs == 1); > > What happens if one thread do a g_object_unref(), so refcount goes > down to 1 and this is called) while another thread does > g_object_add_toggle_ref(). Which will do g_realloc() and may > invalidate the buffer. That's exactly why in this patch I'm not using the original buffer, but a _copy_: ToggleRefStack tstack, *tstackptr; G_LOCK (toggle_refs_mutex); tstackptr = g_datalist_id_get_data (&object->qdata, quark_toggle_refs); tstack = *tstackptr; G_UNLOCK (toggle_refs_mutex);
Sorry, I missed that.. Looks good to me then.
this is the wrong approach to this (high level) problem. it would be good to read the discussion here: http://www.boost.org/doc/libs/1_40_0/doc/html/signals2/thread-safety.html since the issues are identical. it was not easy to make boost::signals2 work this way. i don't think there are shortcuts here.
(In reply to comment #8) > this is the wrong approach to this (high level) problem. [...] it was not easy > to make boost::signals2 work this way. i don't think there are shortcuts here. I don't really know how the signals in boost work (and how they different from GObject), but the rest of the signals code in gsignal.c is already locked using global locks and seems to work fine (as long as you don't care about multi-threaded performance).
(In reply to comment #8) > this is the wrong approach to this (high level) problem. it would be > good to read the discussion here: > > http://www.boost.org/doc/libs/1_40_0/doc/html/signals2/thread-safety.html > > since the issues are identical. it was not easy to make > boost::signals2 work this way. i don't think there are shortcuts > here. I haven't had the time to look into the Boost doc in detail yet, but note that in this bug (at least in two of the three cases) we're not protecting the emission of signals or anything, we're just protecting the access to a shared array, so that you can e.g. call g_object_weak_ref() twice from two different threads. The emission itself is done when the object is destroyed.
ping
(In reply to comment #8) > this is the wrong approach to this (high level) problem. This bug is still present and I cannot see the relation to the problem discussed in the Boost documentation. Again, this has nothing to do with signals, here we're just protecting access to shared arrays.
connecting to a signal involves modifying the state of an array or list. emitting a signal means traversing the array or list. this means that the problem is isomorphous to the one faced by Boost when attempting to providing safe concurrent access to an array/list of closures.
At least one of us is misunderstanding or overlooking something, so let's try to find out what happens :) g_object_weak_ref() adds a callback to an internal array. g_object_weak_unref() removes that callback. Both functions modify the same array, and that's all they do. Nothing more, nothing less. There are no signal emissions here. Since they can be called concurrently the array has to be protected with a mutex. The only other place in the code where that array is used (which is when callbacks are actually invoked) is when the object is being destroyed, but at that point there are no concurrency problems.
Comment on attachment 157180 [details] [review] Patch commit b7616114c6c1884c3a183a4d83156bdf2151b731
This problem has been fixed in the development version. The fix will be available in the next major software release. Thank you for your bug report.