After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 723276 - Locale related crash in EDS from openismus-work-3-8 branch
Locale related crash in EDS from openismus-work-3-8 branch
Status: RESOLVED FIXED
Product: evolution-data-server
Classification: Platform
Component: general
unspecified
Other Linux
: Normal normal
: ---
Assigned To: Alexandru Costache
Evolution QA team
Depends on:
Blocks:
 
 
Reported: 2014-01-30 10:57 UTC by Alexandru Costache
Modified: 2014-02-14 09:50 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Patch that avoids crash when changing locale (5.69 KB, patch)
2014-01-30 10:57 UTC, Alexandru Costache
none Details | Review
Patch pointed by Tristan ported to openismus-work-3-8 branch (18.62 KB, patch)
2014-02-06 16:44 UTC, Alexandru Costache
needs-work Details | Review
Patch pointed by Tristan ported to openismus-work-3-8 branch (17.97 KB, patch)
2014-02-12 15:19 UTC, Alexandru Costache
committed Details | Review

Description Alexandru Costache 2014-01-30 10:57:13 UTC
Created attachment 267621 [details] [review]
Patch that avoids crash when changing locale

There is a crash in EDS as originally reported by Patrick Ohly and it is reproducible using the latest version of testpim.py from http://cgit.freedesktop.org/SyncEvolution/syncevolution/ and EDS from openismus-work-3-8 branch, found at https://git.gnome.org/browse/evolution-data-server/log/?h=openismus-work-3-8 . It seems that this happens because when locales change, there is an attempt to access members of a book's private structure while it was or is in course of being freed in another thread. With the attached patch I have not been able to reproduce the crash any more.

Crash back trace:

  • #0 data_book_locale_changed
    at e-data-book.c line 2046
  • #1 data_book_localed_ready
    at e-data-book.c line 2077
  • #2 g_simple_async_result_complete
    from /usr/lib/libgio-2.0.so.0
  • #3 complete_in_idle_cb
    from /usr/lib/libgio-2.0.so.0
  • #4 g_idle_dispatch
    from /usr/lib/libglib-2.0.so.0
  • #5 g_main_context_dispatch
    from /usr/lib/libglib-2.0.so.0
  • #6 g_main_context_iterate.clone.4
    from /usr/lib/libglib-2.0.so.0
  • #7 g_main_loop_run
    from /usr/lib/libglib-2.0.so.0
  • #8 dbus_server_run_server
    at e-dbus-server.c line 222
  • #9 ffi_call_SYSV
    from /usr/lib/libffi.so.5
  • #10 ffi_call
    from /usr/lib/libffi.so.5
  • #11 g_cclosure_marshal_generic_va
    from /usr/lib/libgobject-2.0.so.0
  • #12 g_type_class_meta_marshalv
    from /usr/lib/libgobject-2.0.so.0
  • #13 _g_closure_invoke_va
    from /usr/lib/libgobject-2.0.so.0
  • #14 g_signal_emit_valist
    from /usr/lib/libgobject-2.0.so.0
  • #15 g_signal_emit
    from /usr/lib/libgobject-2.0.so.0
  • #16 e_dbus_server_run
    at e-dbus-server.c line 414
  • #17 main
    at evolution-addressbook-factory.c line 132
$1 = (EDataBook *) 0x8280c60
(gdb) p book->priv
$2 = (EDataBookPrivate *) 0xaaaaaaaa

Thank you,
Alexandru
Comment 1 Patrick Ohly 2014-01-30 12:12:10 UTC
It is not clear to me what the root cause of this bug is and thus what the proper fix needs to look like. Can someone explain that to me?

The invalid pointer was handed to data_book_localed_appeared and thus 
data_book_localed_ready/data_book_locale_changed as part of the g_bus_watch_name() watch. data_book_finalize() does destroy that watch in data_book_finalize, so at least data_book_localed_appeared should be safe.

In addition to that watch, EDataBook also tracks the Locale1Proxy construction with book->priv->localed_cancel. That operation should have been canceled.

Is there perhaps some missing error checking here?

static void
data_book_localed_ready (GObject *source_object,
			 GAsyncResult *res,
			 gpointer user_data)
{
	EDataBook *book = (EDataBook *)user_data;
	GError *error = NULL;

	book->priv->localed_proxy = e_dbus_locale1_proxy_new_finish (res, &error);

	if (book->priv->localed_proxy == NULL) {
		g_warning ("Error fetching localed proxy: %s", error->message);
		g_error_free (error);
	}

data_book_localed_ready uses the book pointer before checking whether the operation succeeded. If it was canceled, accessing "book" is suspect and probably must be avoided.

Or perhaps the operation couldn't be canceled in time (race condition between threads, operation already complete but not finalized when EDataBook destroys itself)?

The patch introduces a custom pointer tracking mechanism. If we have to go down that route, then I think a generic mechanism from glib would be better, like GWeakRef.

For reference, this problem showed up in combination with some automated tests in SyncEvolution. See https://bugs.freedesktop.org/show_bug.cgi?id=59571#c20
Comment 2 Tristan Van Berkom 2014-01-31 09:15:39 UTC
Looking at the stacktrace - this looks wrong, it's the old locale change
handling which was in EDataBook.

We changed this to be handled in EDataBookFactory, which is the right
thing to do since there is absolutely only ever one data book factory
in the addressbook process, the EDataBookFactory effectively has ownership
of every EDataBook on the server process.

With this setup there is much less confusion and separate EDataBook
instances competing to watch the locale1 bus name.

Has this been tested on the fixed upstream code at all ?
Comment 3 Tristan Van Berkom 2014-01-31 09:19:35 UTC
(In reply to comment #0)
> Created an attachment (id=267621) [details] [review]
> Patch that avoids crash when changing locale
> 
> There is a crash in EDS as originally reported by Patrick Ohly and it is
> reproducible using the latest version of testpim.py from
> http://cgit.freedesktop.org/SyncEvolution/syncevolution/ and EDS from
> openismus-work-3-8 branch

Ah, I'm sorry I missed this comment as I was reading this quickly.

The fix I made for handling locale changes was only ever done in
master and I did not have time to land these changes on openismus-work-3-8

I would recommend looking at the following commit and trying to apply
the appropriate port to openismus-work-3-8 if you need to work with
that 3.8 based branch.

commit c8914fc768a69360cef64a65424e05a2f39d25b3
Author: Tristan Van Berkom <tristanvb@openismus.com>
Date:   Sat Nov 30 01:19:04 2013 +0900

    EDataBookFactor / EDataBook: Migrated locale handling to EDataBookFactory
    
    The EDataBookFactory is the server side singleton, this is the right
    place to handle locale changes, now each EDataBook is set into the
    correct locale by the factory, when books are created, and when the
    locale change notification arrives.
Comment 4 Patrick Ohly 2014-01-31 09:25:42 UTC
Tristan, thanks for pointing out the fix for the problem from the master branch.

Alexandru, I agree with Tristan, that commit needs to be back-ported.
Comment 5 Alexandru Costache 2014-02-06 16:44:16 UTC
Created attachment 268307 [details] [review]
Patch pointed by Tristan ported to openismus-work-3-8 branch
Comment 6 Alexandru Costache 2014-02-06 16:58:58 UTC
I have attached a patch that contains the changes done by Tristan ported to the openismus-3-8 branch. I did not reproduce the crash with this patch.
Comment 7 Alexandru Costache 2014-02-12 08:40:43 UTC
So, is this patch fine? As I said, on my side I was unable to reproduce the crash.
Comment 8 Tristan Van Berkom 2014-02-12 09:47:30 UTC
(In reply to comment #7)
> So, is this patch fine? As I said, on my side I was unable to reproduce the
> crash.

Looks all around good to me - you seem to have caught the ref-counting issue
with e_book_backend_set_locale() which I fixed in a subsequent patch, I was
a bit worried you might miss that one.

There is one ref-counting bug in your patch:

+gboolean
+e_data_book_set_locale (EDataBook *book,
+			const gchar *locale,
+			GCancellable *cancellable,
+			GError **error)
+{
+	EBookBackend *backend;
+	gboolean success;
+
+	g_return_val_if_fail (E_IS_DATA_BOOK (book), FALSE);
+
+	backend = e_data_book_get_backend (book);
+	success = e_book_backend_set_locale (backend,
+					     locale,
+					     cancellable,
+					     error);
+
+	if (success)
+		e_dbus_address_book_set_locale (book->priv->dbus_interface, locale);
+
+	g_object_unref (backend);
+
+	return success;
+}

In the master version we use e_data_book_ref_backend() (which doesnt exist
in the 3.8 branch) - since you are using e_data_book_get_backend() you must
not unref the backend (this will surely lead to crashes).
Comment 9 Patrick Ohly 2014-02-12 12:43:23 UTC
When updatng the patch, please include a commit message. The message should reference this bug and the commits from the master branch which were cherry-picked. Something like this:

addressbook: fix localed race condition

<problem description here>

(cherry picked from commit 

Fixes BGO #723276
Comment 10 Alexandru Costache 2014-02-12 15:19:53 UTC
Created attachment 268920 [details] [review]
Patch pointed by Tristan ported to openismus-work-3-8 branch

Thanks for the observations, I have updated the patch accordingly
Comment 11 Patrick Ohly 2014-02-14 09:50:00 UTC
I tested the updated patch with testLocalePhone, without running into the issue I had earlier. I pushed the patch to openismus-work-3-9.

=> resolved. Thanks everyone.