After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 704568 - GDBus: threading issue in g_dbus_connection_remove_filter
GDBus: threading issue in g_dbus_connection_remove_filter
Status: RESOLVED FIXED
Product: glib
Classification: Platform
Component: gdbus
unspecified
Other Linux
: Normal normal
: ---
Assigned To: David Zeuthen (not reading bugmail)
gtkdev
Depends on:
Blocks: 754078
 
 
Reported: 2013-07-19 15:54 UTC by Michal Hruby
Modified: 2015-08-25 12:54 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
gdbus: take the lock in g_dbus_connection_add_filter() later (1.14 KB, patch)
2015-08-20 15:35 UTC, Thomas Haller
accepted-commit_now Details | Review
gdbus: avoid duplicate filter ids in g_dbus_connection_add_filter() (1.39 KB, patch)
2015-08-20 15:35 UTC, Thomas Haller
reviewed Details | Review
gdbus: avoid race during g_dbus_connection_remove_filter() (7.24 KB, patch)
2015-08-20 15:35 UTC, Thomas Haller
none Details | Review
gdbus: fix race condition in connection filter freeing (9.57 KB, patch)
2015-08-21 21:43 UTC, Dan Winship
committed Details | Review

Description Michal Hruby 2013-07-19 15:54:21 UTC
Removing a GDBus filter is not thread-safe, if a DBus message is being processed and at that point g_dbus_connection_remove_filter() is called, the filter will be removed and the data freed using the provided free function. The problem is that at that point the filter function can still be invoked with (now invalid) data.

See this snippet from on_worker_message_received:

  ...
  CONNECTION_LOCK (connection);
  num_filters = connection->filters->len;
  filters = g_new0 (FilterCallback, num_filters);
  for (n = 0; n < num_filters; n++)
    {
      FilterData *data = connection->filters->pdata[n];
      filters[n].func = data->filter_function;
      filters[n].user_data = data->user_data;
    }
  CONNECTION_UNLOCK (connection);

<---- g_dbus_connection_remove_filter() called at this point from different thread, user_data is freed ---->

  /* then call the filters in order (without holding the lock) */
  for (n = 0; n < num_filters; n++)
    {
      message = filters[n].func (connection,
                                 message,
                                 TRUE,
                                 filters[n].user_data);
      if (message == NULL)
        break;
      g_dbus_message_lock (message);
    }

Perhaps a simple solution to this is to free the user_data in idle callback of the reader thread.
Comment 1 Allison Karlitskaya (desrt) 2013-08-01 07:42:15 UTC
MOAR MUTEXES!!!
Comment 2 Thomas Haller 2015-08-20 15:35:08 UTC
Created attachment 309732 [details] [review]
gdbus: take the lock in g_dbus_connection_add_filter() later
Comment 3 Thomas Haller 2015-08-20 15:35:14 UTC
Created attachment 309733 [details] [review]
gdbus: avoid duplicate filter ids in g_dbus_connection_add_filter()
Comment 4 Thomas Haller 2015-08-20 15:35:19 UTC
Created attachment 309734 [details] [review]
gdbus: avoid race during g_dbus_connection_remove_filter()

If the filter is being invoked during g_dbus_connection_remove_filter(),
don't destroy the user-data right away. Instead mark the filter to
be removed and let the worker thread invoke the callback instead.

That changes behavior in that in face of a race, we invoke the callback
no longer on the thread that calls g_dbus_connection_remove_filter().
Comment 5 Dan Winship 2015-08-21 21:43:41 UTC
Created attachment 309847 [details] [review]
gdbus: fix race condition in connection filter freeing

If you called g_dbus_connection_remove_filter() on a filter while it
was running (or about to be run) in another thread, its GDestroyNotify
would be run immediately, potentially causing the filter thread to
crash.

Fix this by refcounting the filters, and using the existing mechanism
for running a GDestroyNotify in another thread in the case where the
the gdbus thread is the one that frees it.

Also, add a bit of documentation explaining this (and add a related
clarification to g_dbus_connection_signal_subscribe()).

============

Alternate approach; I think this is a little cleaner.
Comment 6 Colin Walters 2015-08-23 16:50:15 UTC
Review of attachment 309847 [details] [review]:

::: gio/gdbusconnection.c
@@ +2166,2 @@
   guint                       id;
+  guint                       ref_count;

Doesn't this need to be volatile and use atomics?  It looks like it's used in both on_worker_message_received() which is in the worker thread, and in remove_filter, which is from the user thread.
Comment 7 Dan Winship 2015-08-24 13:37:07 UTC
ref_count is only ever examined/modified while holding CONNECTION_LOCK so it should be fine. (see also bug 705152 comment 7 and later)
Comment 8 Colin Walters 2015-08-24 18:20:29 UTC
(In reply to Dan Winship from comment #7)
> ref_count is only ever examined/modified while holding CONNECTION_LOCK so it
> should be fine. (see also bug 705152 comment 7 and later)

Ah, makes sense yes.
Comment 9 Dan Winship 2015-08-24 20:30:56 UTC
Attachment 309847 [details] pushed as 7da3922 - gdbus: fix race condition in connection filter freeing
Comment 10 Colin Walters 2015-08-24 20:31:59 UTC
Review of attachment 309732 [details] [review]:

Sure.
Comment 11 Colin Walters 2015-08-24 20:38:55 UTC
Review of attachment 309733 [details] [review]:

Did you actually hit this?

See also https://bugzilla.gnome.org/show_bug.cgi?id=687098
It might be worth breaking that out into more reusable code mapping guint -> object.

Your implementation here has quite a major performance cliff.
Comment 12 Colin Walters 2015-08-24 20:39:24 UTC
Comment on attachment 309734 [details] [review]
gdbus: avoid race during g_dbus_connection_remove_filter()

Marking this as obsolete in favor of danw's patch.
Comment 13 Colin Walters 2015-08-25 12:54:33 UTC
Moved the filter overflow patch into https://bugzilla.gnome.org/show_bug.cgi?id=754078