After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 613822 - gobject signal connect/disconnect not thread safe
gobject signal connect/disconnect not thread safe
Status: RESOLVED FIXED
Product: glib
Classification: Platform
Component: gobject
unspecified
Other Linux
: Normal normal
: ---
Assigned To: gtkdev
gtkdev
Depends on:
Blocks:
 
 
Reported: 2010-03-24 16:35 UTC by Olivier Crête
Modified: 2010-10-21 13:28 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Test case (881 bytes, text/plain)
2010-03-26 13:29 UTC, Alberto Garcia
  Details
Patch (4.95 KB, patch)
2010-03-26 15:21 UTC, Alberto Garcia
committed Details | Review

Description Olivier Crête 2010-03-24 16:35:17 UTC
We found a nasty case where you can't connect a handler and disconnect
another handler from the same object at the same time because the closure stuff is not protected by a mutex in GObject.

Bugs #346801 comment #8 and #403150 could also be related to this.

In gobject.c:

g_object_watch_closure() modifies the closure array (it can even re-alloc it)

And object_remove_closure() can access it at the same time.

Solution: Add a lock around accesses to the closure array. Ideally, it would be per-object. But since we suck, I guess it has to be global.


Relevant extract from the stack trace:

Program terminated with signal 11, Segmentation fault.

Thread 17 (process 2012)

  • #0 realloc
    from /lib/libc.so.6
  • #1 IA__g_realloc
    at /home/bifh2/fremantle-arm-prereleased.cs2007q3/work/glib2.0-2.20.3/glib/gmem.c line 170
  • #2 IA__g_object_watch_closure
    at /home/bifh2/fremantle-arm-prereleased.cs2007q3/work/glib2.0-2.20.3/gobject/gobject.c line 3046
  • #3 IA__g_cclosure_new_object
    at /home/bifh2/fremantle-arm-prereleased.cs2007q3/work/glib2.0-2.20.3/gobject/gobject.c line 3105
  • #4 IA__g_signal_connect_object
    at /home/bifh2/fremantle-arm-prereleased.cs2007q3/work/glib2.0-2.20.3/gobject/gobject.c line 2932
  • #5 _element_added_callback
    at fs-element-added-notifier.c line 368

Comment 1 Alberto Garcia 2010-03-26 13:29:49 UTC
Created attachment 157163 [details]
Test case

Here's a test case using g_object_weak_ref/unref
Comment 2 Alberto Garcia 2010-03-26 13:40:40 UTC
(In reply to comment #1)
> Here's a test case using g_object_weak_ref/unref

(which obviously means that the closure array is not the only one affected :))
Comment 3 Olivier Crête 2010-03-26 13:41:36 UTC
Weak ref/unref are also not thread safe, but its a different issue from the closures. That said, both should be fixed.
Comment 4 Alberto Garcia 2010-03-26 15:21:07 UTC
Created attachment 157180 [details] [review]
Patch

Here's the patch, please someone review and commit
Comment 5 Olivier Crête 2010-03-26 15:57:38 UTC
Review of attachment 157180 [details] [review]:

Appart from the one comment below, the rest looks good to me.

::: gobject/gobject.c
@@ +2258,3 @@
    * will only be notified when there is exactly one of them.
    */
+  g_assert (tstack.n_toggle_refs == 1);

What happens if one thread do a g_object_unref(), so refcount goes down to 1 and this is called) while another thread does g_object_add_toggle_ref(). Which will do g_realloc() and may invalidate the buffer.

func = tstack.toggle_refs[0].notify;
data = tstack.toggle_refs[0].data;
object = tstack.object;
UNLOCK();
func(data, object, is_last);
Comment 6 Alberto Garcia 2010-03-26 16:04:44 UTC
(In reply to comment #5)
> +  g_assert (tstack.n_toggle_refs == 1);
>
> What happens if one thread do a g_object_unref(), so refcount goes
> down to 1 and this is called) while another thread does
> g_object_add_toggle_ref(). Which will do g_realloc() and may
> invalidate the buffer.

That's exactly why in this patch I'm not using the original buffer,
but a _copy_:

  ToggleRefStack tstack, *tstackptr;

  G_LOCK (toggle_refs_mutex);
  tstackptr = g_datalist_id_get_data (&object->qdata, quark_toggle_refs);
  tstack = *tstackptr;
  G_UNLOCK (toggle_refs_mutex);
Comment 7 Olivier Crête 2010-03-26 16:12:06 UTC
Sorry, I missed that.. Looks good to me then.
Comment 8 Paul Davis 2010-03-26 19:59:30 UTC
this is the wrong approach to this (high level) problem. it would be good to read the discussion here:

http://www.boost.org/doc/libs/1_40_0/doc/html/signals2/thread-safety.html

since the issues are identical. it was not easy to make boost::signals2 work this way. i don't think there are shortcuts here.
Comment 9 Olivier Crête 2010-03-26 20:13:16 UTC
(In reply to comment #8)
> this is the wrong approach to this (high level) problem. [...] it was not easy > to make boost::signals2 work this way. i don't think there are shortcuts here.

I don't really know how the signals in boost work (and how they different from GObject), but the rest of the signals code in gsignal.c is already locked using global locks and seems to work fine (as long as you don't care about multi-threaded performance).
Comment 10 Alberto Garcia 2010-03-26 21:22:39 UTC
(In reply to comment #8)
> this is the wrong approach to this (high level) problem. it would be
> good to read the discussion here:
>
> http://www.boost.org/doc/libs/1_40_0/doc/html/signals2/thread-safety.html
>
> since the issues are identical. it was not easy to make
> boost::signals2 work this way. i don't think there are shortcuts
> here.

I haven't had the time to look into the Boost doc in detail yet, but
note that in this bug (at least in two of the three cases) we're not
protecting the emission of signals or anything, we're just protecting
the access to a shared array, so that you can e.g. call
g_object_weak_ref() twice from two different threads. The emission
itself is done when the object is destroyed.
Comment 11 Alberto Garcia 2010-05-19 19:34:30 UTC
ping
Comment 12 Alberto Garcia 2010-10-11 16:34:41 UTC
(In reply to comment #8)
> this is the wrong approach to this (high level) problem.

This bug is still present and I cannot see the relation to the problem
discussed in the Boost documentation.

Again, this has nothing to do with signals, here we're just protecting
access to shared arrays.
Comment 13 Paul Davis 2010-10-11 16:38:45 UTC
connecting to a signal involves modifying the state of an array or list. emitting a signal means traversing the array or list. this means that the problem is isomorphous to the one faced by Boost when attempting to providing safe concurrent access to an array/list of closures.
Comment 14 Alberto Garcia 2010-10-11 17:12:31 UTC
At least one of us is misunderstanding or overlooking something, so
let's try to find out what happens :)

g_object_weak_ref() adds a callback to an internal array.

g_object_weak_unref() removes that callback.

Both functions modify the same array, and that's all they do.
Nothing more, nothing less. There are no signal emissions here.

Since they can be called concurrently the array has to be protected
with a mutex.

The only other place in the code where that array is used (which is
when callbacks are actually invoked) is when the object is being
destroyed, but at that point there are no concurrency problems.
Comment 15 Javier Jardón (IRC: jjardon) 2010-10-21 13:27:41 UTC
Comment on attachment 157180 [details] [review]
Patch

commit b7616114c6c1884c3a183a4d83156bdf2151b731
Comment 16 Javier Jardón (IRC: jjardon) 2010-10-21 13:28:19 UTC
This problem has been fixed in the development version. The fix will be available in the next major software release. Thank you for your bug report.