After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 671988 - Quickly registering / unregistering objects on bus causes crash
Quickly registering / unregistering objects on bus causes crash
Status: RESOLVED FIXED
Product: glib
Classification: Platform
Component: gdbus
2.31.x
Other Linux
: Normal major
: ---
Assigned To: David Zeuthen (not reading bugmail)
gtkdev
Depends on:
Blocks:
 
 
Reported: 2012-03-13 13:59 UTC by Marco Trevisan (Treviño)
Modified: 2012-03-19 21:15 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
BAMF crash on call_in_idle_cb (8.88 KB, text/plain)
2012-03-13 22:18 UTC, Marco Trevisan (Treviño)
  Details
gdbusconnection: get the interface vtable after getting the connection (1.50 KB, patch)
2012-03-19 19:15 UTC, Marco Trevisan (Treviño)
none Details | Review

Description Marco Trevisan (Treviño) 2012-03-13 13:59:28 UTC
When quickly registering and un-registering objects on the bus using gdbus, it possible to get crashes due to the assert in call_in_idle_cb.

Basically that function is called even when there's an invalid vtable, and the assert makes everything crash.
I guess that the idle that calls it should be removed when the vtable is unregistered.

I'm attaching a stacktrace taken from launchpad bug http:://pad.lv/926208
The problem has been noticed in bamfdaemon but also in other packages.
Comment 1 Marco Trevisan (Treviño) 2012-03-13 22:18:17 UTC
Created attachment 209654 [details]
BAMF crash on call_in_idle_cb

I forgot to attach the stacktrace. Here you are.
Comment 2 David Zeuthen (not reading bugmail) 2012-03-16 16:00:50 UTC
Fixing up component (I only watch bugs assigned to gdbus).
Comment 3 David Zeuthen (not reading bugmail) 2012-03-16 16:02:29 UTC
I will need to look at the source code to see what's going on, please paste a link to an online source viewer for the version this is happening with.
Comment 4 Marco Trevisan (Treviño) 2012-03-16 16:06:35 UTC
The problem it's happening on the current BAMF version (but not only, just google for that callback function), you can get that at:
https://code.launchpad.net/~unity-team/bamf/trunk

The most relevant gdbus code is generated by gdbus-codegen, so I'd prefer not introduce workarounds in BAMF itself, but keep using that higher level code.
Comment 5 Allison Karlitskaya (desrt) 2012-03-19 14:14:36 UTC
David: could it be possible that the problem is that the message gets dispatched by the worker into the main thread before the object is unregistered and doesn't get checked again before actually being dispatched (ie: similar to the cancellable stuff we solved recently)?
Comment 6 David Zeuthen (not reading bugmail) 2012-03-19 16:02:05 UTC
(In reply to comment #5)
> David: could it be possible that the problem is that the message gets
> dispatched by the worker into the main thread before the object is unregistered
> and doesn't get checked again before actually being dispatched (ie: similar to
> the cancellable stuff we solved recently)?

No, we do check whether the object is still registered _just_ before doing the callback to the user, see

 http://git.gnome.org/browse/glib/tree/gio/gdbusconnection.c?id=2.31.20#n4668

so this whole thing looks like memory corruption to me. I asked the reported on IRC to check for valgrind, haven't heard back...
Comment 7 David Zeuthen (not reading bugmail) 2012-03-19 16:12:16 UTC
(also note that the vtable is copied at object registration time.. so there is no chance that we're accessing something the user has now freed...)
Comment 8 Marco Trevisan (Treviño) 2012-03-19 17:27:02 UTC
I've ran bamfdaemon in valgrind, there were some read errors (some of them fixed now) reported in this launchpad bug: http://pad.lv/929468

I've fixed the errors related to bamf itself, but there you can also see one more related to gdbus:

==21512== Invalid read of size 4
==21512== at 0x47F92C5: call_in_idle_cb (gdbusconnection.c:4678)
==21512== by 0x495B4AF: g_idle_dispatch (gmain.c:4629)
==21512== by 0x495DA49: g_main_context_dispatch (gmain.c:2510)
==21512== by 0x495DE54: g_main_context_iterate.isra.21 (gmain.c:3118)
==21512== by 0x495E29A: g_main_loop_run (gmain.c:3312)
==21512== by 0x4A434D2: (below main) (libc-start.c:226)
==21512== Address 0x7a45c70 is 0 bytes inside a block of size 12 free'd
==21512== at 0x402B06C: free (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==21512== by 0x4963B7A: standard_free (gmem.c:98)
==21512== by 0x4963CEF: g_free (gmem.c:252)
==21512== by 0x47F40EB: exported_interface_free (gdbusconnection.c:3881)
==21512== by 0x494C297: g_hash_table_remove_node (ghash.c:484)
==21512== by 0x494CA19: g_hash_table_remove_internal (ghash.c:1274)
==21512== by 0x47FC93A: g_dbus_connection_unregister_object (gdbusconnection.c:5059)
==21512== by 0x481319A: remove_connection_locked (gdbusinterfaceskeleton.c:739)
==21512== by 0x48148D2: g_dbus_interface_skeleton_unexport (gdbusinterfaceskeleton.c:969)
==21512== by 0x806F89B: bamf_view_dispose (bamf-view.c:681)
==21512== by 0x48C6921: g_object_unref (gobject.c:2971)
==21512== by 0x48C453B: g_cclosure_marshal_VOID__VOID (gmarshal.c:85)
==21512== by 0x48C2F8B: g_closure_invoke (gclosure.c:774)
==21512== by 0x48D4844: signal_emit_unlocked_R (gsignal.c:3302)
==21512== by 0x48DC0C1: g_signal_emit_valist (gsignal.c:3033)


So, maybe the problem happens during g_dbus_interface_skeleton_unexport that is called in the dispose function of the skeleton's extended class.
Maybe is not even really needed to call that, since it should happen automagically when destroying that class (that extens the skeleton, as said)... But the read error seems to stay there (?).

If you can, please check if bamf-view.c [1] is doing something wrong.

[1] http://is.gd/0j9QT3
Comment 9 Allison Karlitskaya (desrt) 2012-03-19 17:36:38 UTC
(In reply to comment #6)
> No, we do check whether the object is still registered _just_ before doing the
> callback to the user, see
> 
>  http://git.gnome.org/browse/glib/tree/gio/gdbusconnection.c?id=2.31.20#n4668

The first thing the code you linked to does is to pull the vtable out of the invocation's qdata.  The invocation doesn't have its own copy of the vtable -- it's set here:

  http://git.gnome.org/browse/glib/tree/gio/gdbusconnection.c?id=2.31.20#n4798

And presumably it's getting GDBus's internal copy of the vtable (the one that is freed when the object is unregistered).

This seems to explain the problem rather conclusively...
Comment 10 David Zeuthen (not reading bugmail) 2012-03-19 17:40:18 UTC
Interesting. Presumably we can just move

  vtable = g_object_get_data (G_OBJECT (invocation), "g-dbus-interface-vtable");
  g_assert (vtable != NULL && vtable->method_call != NULL);

down after the check. Marco: any chance you can try this and see if the problem goes away?
Comment 11 Marco Trevisan (Treviño) 2012-03-19 17:44:56 UTC
I look, even if for me is quite hard to reproduce the issue, but I'll try.
Comment 12 Marco Trevisan (Treviño) 2012-03-19 19:15:43 UTC
Created attachment 210108 [details] [review]
gdbusconnection: get the interface vtable after getting the connection

This patch does what you asked, I had no crash so far, but for me it was hard to reproduce them even without it...
Comment 13 David Zeuthen (not reading bugmail) 2012-03-19 21:15:39 UTC
OK, I've committed this with a slightly more informative commit message and a reference to this bug

 http://git.gnome.org/browse/glib/commit/?id=322c6e93444e74ae99b73f1ae7e3b55563470431

Thanks,
David