After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 753874 - [review] fix races in manager/prop_filter()
[review] fix races in manager/prop_filter()
Status: RESOLVED FIXED
Product: NetworkManager
Classification: Platform
Component: general
git master
Other Linux
: Normal normal
: ---
Assigned To: NetworkManager maintainer(s)
NetworkManager maintainer(s)
Depends on:
Blocks:
 
 
Reported: 2015-08-20 12:31 UTC by Thomas Haller
Modified: 2015-08-21 15:32 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Thomas Haller 2015-08-20 12:31:17 UTC
please review
Comment 1 Thomas Haller 2015-08-20 12:32:21 UTC
th/prop-filter-bgo753874
Comment 2 Dan Winship 2015-08-20 13:03:43 UTC
> manager: don't invoke non-thread-safe operations during prop_filter()

>+			reply = g_dbus_message_new_method_error (pfd->message,
>+			                                         NM_PERM_DENIED_ERROR,
>+			                                         (error_message = "Object doesn't exist."));

NM_PERM_DENIED_ERROR isn't the right error here. It should be 'org.freedesktop.DBus.Error.UnknownObject'. But alternatively, you could just add a mutex to NMBusManager so that you can make the call from the filter thread, and then just let gdbus return the error itself if the object isn't there.


> manager: check for object type in do_set_property_check()

(Likewise, the error here should be 'org.freedesktop.DBus.Error.InvalidArgs')



> manager: fix race subscribing prop_filter()

I think this is a bug in g_dbus_connection_remove_filter(); it ought to be guaranteed that if you remove a filter while it's being run, that the GDestroyNotify doesn't run until after the filter finishes running.

And this patch does not fix the problem; if you remove the filter after on_worker_message_received() has copied the filter list but before it invokes our filter, then the user_data will get g_slice_free()ed, and so the g_weak_ref_get() in prop_filter() will read freed memory.
Comment 3 Thomas Haller 2015-08-20 13:45:38 UTC
(In reply to Dan Winship from comment #2)
> > manager: don't invoke non-thread-safe operations during prop_filter()
> 
> >+			reply = g_dbus_message_new_method_error (pfd->message,
> >+			                                         NM_PERM_DENIED_ERROR,
> >+			                                         (error_message = "Object doesn't exist."));
> 
> NM_PERM_DENIED_ERROR isn't the right error here. It should be
> 'org.freedesktop.DBus.Error.UnknownObject'. But alternatively, you could
> just add a mutex to NMBusManager so that you can make the call from the
> filter thread, and then just let gdbus return the error itself if the object
> isn't there.

Yes, we could make NMBusManager thread-safe. But I think this is simpler, isn't it? We have very few functions that are used multi-threaded (prop_filter() being one of them). It seems simpler to keep the thread-safe API as small as possible.

Error fixed.


> > manager: check for object type in do_set_property_check()
> 
> (Likewise, the error here should be 'org.freedesktop.DBus.Error.InvalidArgs')

fixed.

> > manager: fix race subscribing prop_filter()
> 
> I think this is a bug in g_dbus_connection_remove_filter(); it ought to be
> guaranteed that if you remove a filter while it's being run, that the
> GDestroyNotify doesn't run until after the filter finishes running.
> 
> And this patch does not fix the problem; if you remove the filter after
> on_worker_message_received() has copied the filter list but before it
> invokes our filter, then the user_data will get g_slice_free()ed, and so the
> g_weak_ref_get() in prop_filter() will read freed memory.

Oh right. This is a race and indeed a bug in g_dbus_connection_remove_filter() because you cannot unsubscribe safely. I added a semi-workaround for that issue.

I didn't see that issue. The patch was for another race: even if g_dbus_connection_remove_filter() would delay calling the destroy-function until all filters ran, it could still invoke the filter with a disposed/disposing/post-finalized NMManager instance.



Repushed. How about now?
Comment 4 Thomas Haller 2015-08-20 16:25:03 UTC
(In reply to Thomas Haller from comment #3)
> (In reply to Dan Winship from comment #2)
> > > manager: fix race subscribing prop_filter()
> > 
> > I think this is a bug in g_dbus_connection_remove_filter(); it ought to be
> > guaranteed that if you remove a filter while it's being run, that the
> > GDestroyNotify doesn't run until after the filter finishes running.

FTR: it's bug 704568.
Comment 5 Dan Winship 2015-08-21 14:58:24 UTC
(In reply to Thomas Haller from comment #3)
> I didn't see that issue. The patch was for another race: even if
> g_dbus_connection_remove_filter() would delay calling the destroy-function
> until all filters ran, it could still invoke the filter with a
> disposed/disposing/post-finalized NMManager instance.

Ah, right.

Though unlikely, the new code could still hit g_dbus_connection_remove_filter()'s race condition though:

          (worker thread)
          on_worker_message_received()
            filters[n] = connection->filters[n]

  (main thread)
  g_dbus_connection_remove_filter()
    _set_prop_filter_free()
    g_idle_add (_set_prop_filter_free2)
  ...
  _set_prop_filter_free2()
    g_slice_free()

          message = filters[n].func (...)
            g_weak_ref_get()

So maybe it's not worth trying to work around this while the bug still exists in glib, and you should just drop the last fixup.

Everything else looks good.