After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 665733 - GDBusConnection holds lock while calling destroynotify
GDBusConnection holds lock while calling destroynotify
Status: RESOLVED FIXED
Product: glib
Classification: Platform
Component: gdbus
unspecified
Other All
: Normal normal
: ---
Assigned To: David Zeuthen (not reading bugmail)
gtkdev
Depends on:
Blocks:
 
 
Reported: 2011-12-07 15:10 UTC by Allison Karlitskaya (desrt)
Modified: 2011-12-12 16:58 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Allison Karlitskaya (desrt) 2011-12-07 15:10:11 UTC
On g_dbus_connection_unregister_object(), the user_data is freed while holding the connection lock.  If the destroynotify attempts to re-enter the connection then it deadlocks.

We either need to have a recursive lock here, or not make callbacks while holding locks.

Normally I'd suggest that "not make callbacks while holding locks" is the obviously correct answer, but after looking at the code it seems like it might be difficult -- so recursive locks ftw?
Comment 1 David Zeuthen (not reading bugmail) 2011-12-07 15:12:59 UTC
(In reply to comment #0)
> On g_dbus_connection_unregister_object(), the user_data is freed while holding
> the connection lock.  If the destroynotify attempts to re-enter the connection
> then it deadlocks.
> 
> We either need to have a recursive lock here, or not make callbacks while
> holding locks.

I think just deleting the following fragment from call_destroy_notify() in gdbusconnection.c should do the trick

  if ((context == current_context) ||
      (current_context == NULL && context == g_main_context_default ()))
    {
      callback (user_data);
    }
  else

It might cause a test or two to fall over, but I think that just might be test suite problems.

> Normally I'd suggest that "not make callbacks while holding locks" is the
> obviously correct answer, but after looking at the code it seems like it might
> be difficult -- so recursive locks ftw?

I don't like recursive locks.
Comment 2 Allison Karlitskaya (desrt) 2011-12-07 15:14:57 UTC
> I think just deleting the following fragment from call_destroy_notify() in
> gdbusconnection.c should do the trick
> 
>   if ((context == current_context) ||
>       (current_context == NULL && context == g_main_context_default ()))
>     {
>       callback (user_data);
>     }
>   else

Definitely.  I discarded that possibility because I considered it slightly ugly.

> I don't like recursive locks.

That is the correct reply. :)
Comment 3 David Zeuthen (not reading bugmail) 2011-12-07 15:31:35 UTC
OK, fixed here: http://git.gnome.org/browse/glib/commit/?id=70dacf83d23ed468ff60972fd166769482d7195f
Comment 4 Allison Karlitskaya (desrt) 2011-12-11 17:03:55 UTC
I'm not crazy about this solution.  It's starting to cause lots of weird problems in code that wasn't expecting that it would work this way.

It's also a little bit icky in the sense that if you call _unregister_object() and then drop your main context (as could happen if the thread exporting the object is exiting now) then the destroynotify will never be run...

Could we figure out a way to defer the free until after dropping the lock but before _unregister_object() returns?
Comment 5 Simon McVittie 2011-12-12 15:03:48 UTC
(In reply to comment #4)
> Could we figure out a way to defer the free until after dropping the lock but
> before _unregister_object() returns?

See libdbus for an example of how ugly this can get. (Admittedly, it's about 17 times worse in libdbus, because libdbus pretends to be OOM-safe, so you have to allocate all the linked-list nodes you could possibly need in advance... which means performance in the common, non-OOM case is pessimal.)
Comment 6 David Zeuthen (not reading bugmail) 2011-12-12 16:58:14 UTC
As discussed on IRC, we decided to go with the current solution so closing agian.