GNOME Bugzilla – Bug 704568
GDBus: threading issue in g_dbus_connection_remove_filter
Last modified: 2015-08-25 12:54:33 UTC
Removing a GDBus filter is not thread-safe, if a DBus message is being processed and at that point g_dbus_connection_remove_filter() is called, the filter will be removed and the data freed using the provided free function. The problem is that at that point the filter function can still be invoked with (now invalid) data. See this snippet from on_worker_message_received: ... CONNECTION_LOCK (connection); num_filters = connection->filters->len; filters = g_new0 (FilterCallback, num_filters); for (n = 0; n < num_filters; n++) { FilterData *data = connection->filters->pdata[n]; filters[n].func = data->filter_function; filters[n].user_data = data->user_data; } CONNECTION_UNLOCK (connection); <---- g_dbus_connection_remove_filter() called at this point from different thread, user_data is freed ----> /* then call the filters in order (without holding the lock) */ for (n = 0; n < num_filters; n++) { message = filters[n].func (connection, message, TRUE, filters[n].user_data); if (message == NULL) break; g_dbus_message_lock (message); } Perhaps a simple solution to this is to free the user_data in idle callback of the reader thread.
MOAR MUTEXES!!!
Created attachment 309732 [details] [review] gdbus: take the lock in g_dbus_connection_add_filter() later
Created attachment 309733 [details] [review] gdbus: avoid duplicate filter ids in g_dbus_connection_add_filter()
Created attachment 309734 [details] [review] gdbus: avoid race during g_dbus_connection_remove_filter() If the filter is being invoked during g_dbus_connection_remove_filter(), don't destroy the user-data right away. Instead mark the filter to be removed and let the worker thread invoke the callback instead. That changes behavior in that in face of a race, we invoke the callback no longer on the thread that calls g_dbus_connection_remove_filter().
Created attachment 309847 [details] [review] gdbus: fix race condition in connection filter freeing If you called g_dbus_connection_remove_filter() on a filter while it was running (or about to be run) in another thread, its GDestroyNotify would be run immediately, potentially causing the filter thread to crash. Fix this by refcounting the filters, and using the existing mechanism for running a GDestroyNotify in another thread in the case where the the gdbus thread is the one that frees it. Also, add a bit of documentation explaining this (and add a related clarification to g_dbus_connection_signal_subscribe()). ============ Alternate approach; I think this is a little cleaner.
Review of attachment 309847 [details] [review]: ::: gio/gdbusconnection.c @@ +2166,2 @@ guint id; + guint ref_count; Doesn't this need to be volatile and use atomics? It looks like it's used in both on_worker_message_received() which is in the worker thread, and in remove_filter, which is from the user thread.
ref_count is only ever examined/modified while holding CONNECTION_LOCK so it should be fine. (see also bug 705152 comment 7 and later)
(In reply to Dan Winship from comment #7) > ref_count is only ever examined/modified while holding CONNECTION_LOCK so it > should be fine. (see also bug 705152 comment 7 and later) Ah, makes sense yes.
Attachment 309847 [details] pushed as 7da3922 - gdbus: fix race condition in connection filter freeing
Review of attachment 309732 [details] [review]: Sure.
Review of attachment 309733 [details] [review]: Did you actually hit this? See also https://bugzilla.gnome.org/show_bug.cgi?id=687098 It might be worth breaking that out into more reusable code mapping guint -> object. Your implementation here has quite a major performance cliff.
Comment on attachment 309734 [details] [review] gdbus: avoid race during g_dbus_connection_remove_filter() Marking this as obsolete in favor of danw's patch.
Moved the filter overflow patch into https://bugzilla.gnome.org/show_bug.cgi?id=754078