GNOME Bugzilla – Bug 703045
race condition in g_dbus_connection_call_internal()
Last modified: 2017-05-11 14:12:50 UTC
I saw this in valgrind when testing SyncEvolution. Introducing multithreading led to this problem: Second thread calls g_object_unref() on EBookClientView: ==4577== Invalid read of size 4 ==4577== at 0x8F5B200: g_dbus_connection_call_internal (gdbusconnection.c:5420) ==4577== by 0x8F68813: g_dbus_proxy_call_internal (gdbusproxy.c:2775) ==4577== by 0x8F6A233: g_dbus_proxy_call (gdbusproxy.c:3017) ==4577== by 0x734828A: proxy_method_call (e-gdbus-templates.c:1962) ==4577== by 0x734C92C: e_gdbus_proxy_method_call_void (e-gdbus-templates.c:1972) ==4577== by 0x5298ABB: book_client_view_dispose (e-book-client-view.c:832) ==4577== by 0x9620733: g_object_unref (gobject.c:2987) ==4577== by 0x724E55: SyncEvo::EvolutionContactSource::listAllItems(std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > >&) (EvolutionContactSource.h:39) ==4577== by 0x823834: SyncEvo::SyncSourceRevisions::detectChanges(SyncEvo::ConfigNode&, SyncEvo::SyncSourceRevisions::ChangeMode) (SyncSource.cpp:1168) ==4577== by 0x8DBF0F: SyncEvo::TrackingSyncSource::beginSync(std::string const&, std::string const&) (TrackingSyncSource.cpp:113) ==4577== by 0x81AEDF: SyncEvo::SyncSourceSession::startDataRead(char const*, char const*) (SyncSource.cpp:516) ==4577== by 0x7C84F7: boost::function2<unsigned short, char const*, char const*>::operator()(char const*, char const*) const (function_template.hpp:760) ==4577== by 0x7E126A: SyncEvo::OperationWrapperSwitch<unsigned short ()(char const*, char const*), 2, unsigned short>::operator()(SyncEvo::SyncSource&, char const*, char const*) const (SyncSource.h:980) ==4577== by 0x8472BC: SyncEvolution_StartDataRead (SynthesisDBPlugin.cpp:618) ==4577== by 0x94B2CC: sysync::TPluginApiDS::apiReadSyncSet(bool) (pluginapids.cpp:1063) ==4577== by 0x96EF67: sysync::TCustomImplDS::implStartDataRead() (customimplds.cpp:1708) ==4577== by 0x9BA5F9: sysync::TStdLogicDS::performStartSync() (stdlogicds.cpp:302) ==4577== by 0x9BAAF8: sysync::StartSyncThreadFunc(sysync::TThreadObject*, unsigned long) (stdlogicds.cpp:453) ==4577== by 0x901730: PosixThreadFunc (platform_thread.cpp:50) ==4577== by 0x9D84B4F: start_thread (pthread_create.c:304) ==4577== by 0xA4FEA7C: clone (clone.S:112) Main thread dispatches glib events: ==4577== Address 0x19df7b58 is 24 bytes inside a block of size 48 free'd ==4577== at 0x4C27D4E: free (vg_replace_malloc.c:427) ==4577== by 0x8F0B447: g_simple_async_result_finalize (gsimpleasyncresult.c:257) ==4577== by 0x96207C7: g_object_unref (gobject.c:3024) ==4577== by 0x8F5C71D: g_dbus_connection_call_done (gdbusconnection.c:5342) ==4577== by 0x8F0C276: g_simple_async_result_complete (gsimpleasyncresult.c:775) ==4577== by 0x8F0C378: complete_in_idle_cb (gsimpleasyncresult.c:787) ==4577== by 0x9AA4104: g_main_context_dispatch (gmain.c:2715) ==4577== by 0x9AA4437: g_main_context_iterate.isra.24 (gmain.c:3290) ==4577== by 0x9AA44F3: g_main_context_iteration (gmain.c:3351) ==4577== by 0x865ABD: SyncEvo::CondTimedWaitGLib(pthread_cond_t*, pthread_mutex_t*, bool&, long) (SyncContext.cpp:3008) ==4577== by 0x9019BE: sysync::TThreadObject::waitfor(int) (platform_thread.cpp:214) I think g_dbus_proxy_call_internal() is at fault here. Patch will follow.
Created attachment 247718 [details] [review] avoid risky read operation
I guess it's plausible this could happen but only if you're running the thread-default mainloop from another thread. Which is a very weird thing to do, but, hey, IIRC I've seen the Evolution codebase do things like this. IIRC Ryan did the CallState changes so touched the code last. Adding as Cc. The patch you proposed should not be committed. We really want to G_DBUS_DEBUG=call to work as advertised. I think we should just add some simple reference counting to CallState instead.
I can confirm this read-after-free issue. I have a little different call stack, but this is the same bug. Invalid read of size 4 at 0x5AB91BD: g_dbus_connection_call_internal (gdbusconnection.c:5571) by 0x5ABB868: g_dbus_connection_call (gdbusconnection.c:5845) by 0x5AC0F86: has_connection (gdbusnameowning.c:431) by 0x5AC1AB2: g_bus_own_name_on_connection (gdbusnameowning.c:550) by 0x73FDC5: XXX::setName ... by 0x6494B3: XXX::runTest Address 0x1a0245b8 is 24 bytes inside a block of size 48 free'd at 0x4C28577: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) by 0x5FFBF7E: g_free (gmem.c:197) by 0x60125CA: g_slice_free1 (gslice.c:1124) by 0x5A65A37: g_simple_async_result_finalize (gsimpleasyncresult.c:259) by 0x5D70FCA: g_object_unref (gobject.c:3197) by 0x5AB949E: g_dbus_connection_call_done (gdbusconnection.c:5493) by 0x5A66786: g_simple_async_result_complete (gsimpleasyncresult.c:777) by 0x5A667E8: complete_in_idle_cb (gsimpleasyncresult.c:789) by 0x5FF62A5: g_main_context_dispatch (gmain.c:3066) by 0x5FF6627: g_main_context_iterate.isra.24 (gmain.c:3713) by 0x5FF6A39: g_main_loop_run (gmain.c:3907)
Ryan, any comment? Apparently I am not the only one running into this.
The bug #748263 seems to be a duplicate of this bug, though it contains two different patches. (In reply to David Zeuthen (not reading bugmail) from comment #2) > I guess it's plausible this could happen but only if you're running the > thread-default mainloop from another thread. Which is a very weird thing to > do, but, hey, IIRC I've seen the Evolution codebase do things like this. It'll be useful to have some pointers. Such things shouldn't be done and the evolution/-data-server code should be fixed to not do that. The bug #748263 comment #11 contains an information from an address sanitizer. If I read it correctly then the catch up is that all the work is done in a dedicated thread, which doesn't have set a thread-default main context, thus the main() loop is used for idle callbacks (and such), thus the GTask itself finished in a free main() loop, while the dedicated thread is waiting for the thread scheduler to continue to run. There is no g_main_context_iterate() or similar involved here.
Created attachment 351646 [details] [review] 0001-gdbusconnection-Add-refcounting-on-CallState-to-avoi.patch
Milan, since this is biting Evolution the most it seems, can you review please? (And in general, you're a smart guy who knows C...don't be afraid to jump in and write some glib patches sometimes ;) Writing patches for glib isn't my full time job either, I just do it with a portion of my time since it's a crucial dependency for my projects)
Oh wait sorry, you did in https://bugzilla.gnome.org/show_bug.cgi?id=748263 and it got stalled. Sorry, I take that back! I'll dup this one on that one, since there's more discussion there.
*** This bug has been marked as a duplicate of bug 748263 ***