GNOME Bugzilla – Bug 723276
Locale related crash in EDS from openismus-work-3-8 branch
Last modified: 2014-02-14 09:50:29 UTC
Created attachment 267621 [details] [review] Patch that avoids crash when changing locale There is a crash in EDS as originally reported by Patrick Ohly and it is reproducible using the latest version of testpim.py from http://cgit.freedesktop.org/SyncEvolution/syncevolution/ and EDS from openismus-work-3-8 branch, found at https://git.gnome.org/browse/evolution-data-server/log/?h=openismus-work-3-8 . It seems that this happens because when locales change, there is an attempt to access members of a book's private structure while it was or is in course of being freed in another thread. With the attached patch I have not been able to reproduce the crash any more. Crash back trace:
+ Trace 233097
$1 = (EDataBook *) 0x8280c60 (gdb) p book->priv $2 = (EDataBookPrivate *) 0xaaaaaaaa Thank you, Alexandru
It is not clear to me what the root cause of this bug is and thus what the proper fix needs to look like. Can someone explain that to me? The invalid pointer was handed to data_book_localed_appeared and thus data_book_localed_ready/data_book_locale_changed as part of the g_bus_watch_name() watch. data_book_finalize() does destroy that watch in data_book_finalize, so at least data_book_localed_appeared should be safe. In addition to that watch, EDataBook also tracks the Locale1Proxy construction with book->priv->localed_cancel. That operation should have been canceled. Is there perhaps some missing error checking here? static void data_book_localed_ready (GObject *source_object, GAsyncResult *res, gpointer user_data) { EDataBook *book = (EDataBook *)user_data; GError *error = NULL; book->priv->localed_proxy = e_dbus_locale1_proxy_new_finish (res, &error); if (book->priv->localed_proxy == NULL) { g_warning ("Error fetching localed proxy: %s", error->message); g_error_free (error); } data_book_localed_ready uses the book pointer before checking whether the operation succeeded. If it was canceled, accessing "book" is suspect and probably must be avoided. Or perhaps the operation couldn't be canceled in time (race condition between threads, operation already complete but not finalized when EDataBook destroys itself)? The patch introduces a custom pointer tracking mechanism. If we have to go down that route, then I think a generic mechanism from glib would be better, like GWeakRef. For reference, this problem showed up in combination with some automated tests in SyncEvolution. See https://bugs.freedesktop.org/show_bug.cgi?id=59571#c20
Looking at the stacktrace - this looks wrong, it's the old locale change handling which was in EDataBook. We changed this to be handled in EDataBookFactory, which is the right thing to do since there is absolutely only ever one data book factory in the addressbook process, the EDataBookFactory effectively has ownership of every EDataBook on the server process. With this setup there is much less confusion and separate EDataBook instances competing to watch the locale1 bus name. Has this been tested on the fixed upstream code at all ?
(In reply to comment #0) > Created an attachment (id=267621) [details] [review] > Patch that avoids crash when changing locale > > There is a crash in EDS as originally reported by Patrick Ohly and it is > reproducible using the latest version of testpim.py from > http://cgit.freedesktop.org/SyncEvolution/syncevolution/ and EDS from > openismus-work-3-8 branch Ah, I'm sorry I missed this comment as I was reading this quickly. The fix I made for handling locale changes was only ever done in master and I did not have time to land these changes on openismus-work-3-8 I would recommend looking at the following commit and trying to apply the appropriate port to openismus-work-3-8 if you need to work with that 3.8 based branch. commit c8914fc768a69360cef64a65424e05a2f39d25b3 Author: Tristan Van Berkom <tristanvb@openismus.com> Date: Sat Nov 30 01:19:04 2013 +0900 EDataBookFactor / EDataBook: Migrated locale handling to EDataBookFactory The EDataBookFactory is the server side singleton, this is the right place to handle locale changes, now each EDataBook is set into the correct locale by the factory, when books are created, and when the locale change notification arrives.
Tristan, thanks for pointing out the fix for the problem from the master branch. Alexandru, I agree with Tristan, that commit needs to be back-ported.
Created attachment 268307 [details] [review] Patch pointed by Tristan ported to openismus-work-3-8 branch
I have attached a patch that contains the changes done by Tristan ported to the openismus-3-8 branch. I did not reproduce the crash with this patch.
So, is this patch fine? As I said, on my side I was unable to reproduce the crash.
(In reply to comment #7) > So, is this patch fine? As I said, on my side I was unable to reproduce the > crash. Looks all around good to me - you seem to have caught the ref-counting issue with e_book_backend_set_locale() which I fixed in a subsequent patch, I was a bit worried you might miss that one. There is one ref-counting bug in your patch: +gboolean +e_data_book_set_locale (EDataBook *book, + const gchar *locale, + GCancellable *cancellable, + GError **error) +{ + EBookBackend *backend; + gboolean success; + + g_return_val_if_fail (E_IS_DATA_BOOK (book), FALSE); + + backend = e_data_book_get_backend (book); + success = e_book_backend_set_locale (backend, + locale, + cancellable, + error); + + if (success) + e_dbus_address_book_set_locale (book->priv->dbus_interface, locale); + + g_object_unref (backend); + + return success; +} In the master version we use e_data_book_ref_backend() (which doesnt exist in the 3.8 branch) - since you are using e_data_book_get_backend() you must not unref the backend (this will surely lead to crashes).
When updatng the patch, please include a commit message. The message should reference this bug and the commits from the master branch which were cherry-picked. Something like this: addressbook: fix localed race condition <problem description here> (cherry picked from commit Fixes BGO #723276
Created attachment 268920 [details] [review] Patch pointed by Tristan ported to openismus-work-3-8 branch Thanks for the observations, I have updated the patch accordingly
I tested the updated patch with testLocalePhone, without running into the issue I had earlier. I pushed the patch to openismus-work-3-9. => resolved. Thanks everyone.