GNOME Bugzilla – Bug 683830
gconf should initialize DBus Glib for threading
Last modified: 2018-08-17 13:58:42 UTC
Created attachment 224056 [details] [review] Patch to fix this issue GConf with the DBus backend has been crashing a couple of multi-threaded applications due to the 1.6.4 libdbus update, with a stacktrace that looks similar to: _dbus_watch_invalidate (watch=0x0) at ../../dbus/dbus-watch.c:154 free_watches (transport=0x7f12ee9c9ed0) at ../../dbus/dbus-transport-socket.c:83 socket_disconnect (transport=0x7f12ee9c9ed0) at ../../dbus/dbus-transport-socket.c:987 _dbus_transport_disconnect (transport=0x7f12ee9c9ed0) at ../../dbus/dbus-transport.c:507 _dbus_transport_disconnect (transport=0x7f12ee9c9ed0) at ../../dbus/dbus-transport.c:498 An example is in https://bugzilla.gnome.org/show_bug.cgi?id=659756. The workaround used for that bug was to pull in libdbus just to call dbus_g_thread_init(). However, this is ugly, and requires an explicit dependency on libdbus for applications that use gconf, even if they do not use dbus directly. This patch calls dbus_g_thread_init() before establishing the connection to DBus to avoid the race condition that leads up to the above stack trace.
Comment on attachment 224056 [details] [review] Patch to fix this issue dbus_g_thread_init can't be called multi times, so calling it from a library will break applications that call it explicitly.
Yes it can. It just needs to be marked as such in documentation. Please see https://bugs.freedesktop.org/show_bug.cgi?id=54770.
well it's still going to be a source of bugs. The problem is you can't use dbus-glib before initializing threading if you initialize threading later (since you'll start calling pthread functions on fake mutexes that were created before threading was initialized etc).
Ugh, okay, that sounds bad. On the other hand, having every non-libdbus-using application add an explicit libdbus dependency just to call dbus_g_thread_init() doesn't seem particularly sane.
no good answer here. One idea would be to port GConf to gdbus, but that's an awful lot of churn for a deprecated library that's on its way out the door.
That is true. I looked at GConf's code -- it doesn't look like it uses threads. In fact, it looks like all the threading's getting done in gdbus. Could gdbus initialize libdbus for threads then? Or perhaps could we make libdbus default to thread-safe operation, with an opt-out function?
I don't think it's a big enough problem to solve really. The only app that i've heard of running into problems has been evolution, and it's already taken steps to fix things. In hindsight, we probably shouldn't have moved gconf to dbus when it was already on its death bed, but probably easier to just fix up the apps that run into issues at this point.
(In reply to comment #6) > That is true. I looked at GConf's code -- it doesn't look like it uses threads. > In fact, it looks like all the threading's getting done in gdbus. dbus-glib, which is what GConf uses, is not GDBus. > Could gdbus > initialize libdbus for threads then? GDBus does not use libdbus: it's a reimplementation of the DBus protocol. > Or perhaps could we make libdbus default > to thread-safe operation, with an opt-out function? this would obviously be a backward compatibility break for libdbus/dbus-glib.
(In reply to comment #7) > I don't think it's a big enough problem to solve really. The only app that > i've heard of running into problems has been evolution, and it's already taken > steps to fix things. In hindsight, we probably shouldn't have moved gconf to > dbus when it was already on its death bed, but probably easier to just fix up > the apps that run into issues at this point. And colord, and gwibber, and Banshee, and Tomboy. Can we move gconf back to whatever it was using before? Also, there weren't issues with gconf + dbus in Ubuntu 12.04 -- these were only introduced in 12.10, due to the libdbus upgrade, if my suspicions are correct.
(In reply to comment #8) > (In reply to comment #6) > > That is true. I looked at GConf's code -- it doesn't look like it uses threads. > > In fact, it looks like all the threading's getting done in gdbus. > > dbus-glib, which is what GConf uses, is not GDBus. Yes, I'm quite well aware of that. Since the crash was happening in libdbus, I figured it might be that dbus-glib and gdbus were fighting over libdbus without locking. > > Could gdbus > > initialize libdbus for threads then? > > GDBus does not use libdbus: it's a reimplementation of the DBus protocol. It is? Why did we need yet another reimplementation of the DBus protocol? > > Or perhaps could we make libdbus default > > to thread-safe operation, with an opt-out function? > > this would obviously be a backward compatibility break for libdbus/dbus-glib. Why so? From what I can see, only the following changes need to be made: 1. Enable thread-safe operation by default 2. Make dbus_g_thread_init() no-op, and deprecate it 3. Add a new function to allow turning off of thread-safety. The above steps can be done without causing an ABI break. As for behavioural changes, applications that do not require thread-safety won't be affected, and applications which were crashing due to this issue in the first place would no longer crash. Which leaves applications that require thread-safety to be turned off. The question is, why would you require that? The only thing I can possibly think of would be related to performance, but if you're using DBus in such a way that your bottleneck is lock-contention within libdbus, I think you've got some way more serious issues here. How many applications fall into this category anyway?
(In reply to comment #10) > > GDBus does not use libdbus: it's a reimplementation of the DBus protocol. > > It is? Why did we need yet another reimplementation of the DBus protocol? There were a number of reasons (some actually dealing with fundamental problems in the way libdbus managed multiple threads actually), but that's probably a discussion for a different forum. > > > Or perhaps could we make libdbus default > > > to thread-safe operation, with an opt-out function? > > > > this would obviously be a backward compatibility break for libdbus/dbus-glib. > > Why so? From what I can see, only the following changes need to be made: > 1. Enable thread-safe operation by default > 2. Make dbus_g_thread_init() no-op, and deprecate it > 3. Add a new function to allow turning off of thread-safety. Can you file this suggestion on bugzilla.freedesktop.org ? That way Simon, Colin, etc can chime in.
Here you go: https://bugs.freedesktop.org/show_bug.cgi?id=54972
(In reply to comment #3) > The problem is you can't use > dbus-glib before initializing threading if you initialize threading later > (since you'll start calling pthread functions on fake mutexes that were created > before threading was initialized etc). This is meant to work: libdbus keeps a list of the locations of all of its mutexes, and when you initialize threading, libdbus goes through the list replacing them with real mutexes. However, dbus-glib isn't thread-safe. Using the dbus-glib main loop glue to dispatch libdbus *might* work; the dbus-glib specialized GType system (and everything built on it, like exported objects and DBusGProxy) certainly won't. (In reply to comment #10) > 1. Enable thread-safe operation by default As I commented on the freedesktop bug, we can't. dbus_threads_init_default() allocates memory, and libdbus claims to support systems and environments where malloc() can fail. Some uses of mutexes in libdbus are in functions that can fail, so in theory, those functions could dbus_threads_init_default() first, and fail if that fails (i.e. lazy initialization). However, some uses are in functions that have no way to signal failure; those functions can't really use dbus_threads_init_default(). dbus_g_thread_init() currently ignores the return value from dbus_threads_init_default(), which is clearly wrong: it should g_error ("out of memory") if dbus_threads_init_default() fails.
Hi I can confirm this bug happening on fedora 18 (x86_64) with banshee. Here's the original bug report I wrote which includes a stack trace of banshee: https://bugzilla.redhat.com/show_bug.cgi?id=867133 For the record, on debian/ubuntu they have seem to patched gconf so apps that worked before wont crash, here's ubuntu bug with the patch: https://bugs.launchpad.net/ubuntu/+source/banshee/+bug/1048341 Cheers,
(In reply to comment #14) > For the record, on debian/ubuntu they have seem to patched gconf so apps that > worked before wont crash, here's ubuntu bug with the patch: For the record, this is a Ubuntu-only change so far, and after reading this discussion I’m not going to import it in Debian.
Any reason why not using dbus_threads_init_default (). According to documentation it's allowed to call it multiple times. http://dbus.freedesktop.org/doc/api/html/group__DBusThreads.html#ga33b6cf3b8f1e41bad5508f84758818a7
(In reply to comment #16) > Any reason why not using dbus_threads_init_default (). > According to documentation it's allowed to call it multiple times. > http://dbus.freedesktop.org/doc/api/html/group__DBusThreads.html#ga33b6cf3b8f1e41bad5508f84758818a7 Yes, the reason is that if you call it after using the dbus-glib API from somewhere else, it will trigger crashes.
I think the central issue here is that to be thread-safe, dbus_threads_init_default() must be called before starting your second thread; but if your only use of libdbus is via long dependency chains and you're not writing in C, you can't necessarily achieve that. In the short term, it would be best for Banshee (and other affected applications) to call dbus_threads_init_default() as early as possible, before starting a second thread (in their main()-equivalent, if C# has one). Making libdbus thread-safe-by-default would be great, *if* someone can do it without breaking API/ABI, which is by no means trivial. I think the first step would be to make the atomic operation fallback, on Unix platforms that don't support atomic operations, use a global pthread_mutex_t rather than going via the DBusMutex abstraction.
this bug also affects rawstudio since Fedora 18 , now I see this bug on Fedora 20
GConf has been deprecated since 2011. GConf is not under active development anymore. Its codebase has been archived: https://gitlab.gnome.org/Archive/gconf/commits/master dconf and gsettings are its successors. See https://developer.gnome.org/gio/stable/ch34.html and https://developer.gnome.org/GSettings/ for porting info. Closing this report as WONTFIX as part of Bugzilla Housekeeping to reflect reality. Feel free to open a task in GNOME Gitlab if the issue described in this task still applies to a recent + supported version of dconf/gsettings. Thanks!