After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 777307 - race condition between gdbus signal callback and g_bus_unwatch_name - can access freed memory
race condition between gdbus signal callback and g_bus_unwatch_name - can acc...
Status: RESOLVED FIXED
Product: glib
Classification: Platform
Component: gdbus
unspecified
Other All
: Normal normal
: ---
Assigned To: gtkdev
gtkdev
Depends on:
Blocks:
 
 
Reported: 2017-01-16 08:02 UTC by insun.pyo
Modified: 2017-07-21 06:04 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Delegate g_free(client) from its context (1.47 KB, patch)
2017-01-16 08:02 UTC, insun.pyo
none Details | Review
gdbus: Fix atomic accesses to global name watch ID (1.64 KB, patch)
2017-02-03 10:04 UTC, Philip Withnall
committed Details | Review
workaround code (11.16 KB, patch)
2017-02-06 02:38 UTC, insun.pyo
rejected Details | Review
gdbus: Fix race in name watching on connection teardown (4.96 KB, patch)
2017-02-06 12:38 UTC, Philip Withnall
committed Details | Review

Description insun.pyo 2017-01-16 08:02:05 UTC
Created attachment 343528 [details] [review]
Delegate g_free(client) from its context

The on_name_owner_changed() can access the variable "client" in free memory, when another thread call g_bus_unwatch_name() asynchronously.

The emit_signal_instance_in_idle_cb () checks signal_instance->connection->map_id_to_signal_data with connection lock, and executes real signal handler.

In the case of gdbusnamewatching, g_bus_unwaching_name() call g_dbus_connection_signal_unsubscribe() and free client variable. 

----------- scenario -------------
1. [A thread] At emit_signal_instance_in_idle_cb, lock connection
2. [A thread] emit_signal_instance_in_idle_cb() check signal and set "has_subscription = TRUE;"
3. [A thread] unlock connection

4. [B thread] At g_bus_unwatch_name(), calls g_dbus_connection_signal_unsubscribe().
5. [B thread] do "g_free (client);"

6. [A thread] At emit_signal_instance_in_idle_cb, do "signal_instance->callback".
7. [A thread] on_name_owner_changed is called, and access client variable.
              But client is already freed on 5 steps.
Comment 1 Colin Walters 2017-01-16 22:24:34 UTC
I believe there's a bug here, but what you wrote implies that something is accessing "client" when its refcount is zero, which I think we should fix instead somehow.
Comment 2 Milan Crha 2017-02-02 09:55:14 UTC
I guess the below is the same report. It's with 2.50.2, when running unit tests for evolution-data-server:

> ==20828==ERROR: AddressSanitizer: heap-use-after-free on address 0x60b000007ba0 at pc 0x7f4940354499 bp 0x7f492ba93d80 sp 0x7f492ba93d70
> READ of size 8 at 0x60b000007ba0 thread T1
>     #0 0x7f4940354498 in on_connection_disconnected glib-2.50.2/gio/gdbusnamewatching.c:259
>     #1 0x7f493cbd3c57 in ffi_call_unix64 (/lib64/libffi.so.6+0x5c57)
>     #2 0x7f493cbd36b9 in ffi_call (/lib64/libffi.so.6+0x56b9)
>     #3 0x7f493fa2880c in g_cclosure_marshal_generic glib-2.50.2/gobject/gclosure.c:1490
>     #4 0x7f493fa24938 in g_closure_invoke glib-2.50.2/gobject/gclosure.c:804
>     #5 0x7f493fa71348 in signal_emit_unlocked_R glib-2.50.2/gobject/gsignal.c:3635
>     #6 0x7f493fa6f10b in g_signal_emit_valist glib-2.50.2/gobject/gsignal.c:3391
>     #7 0x7f493fa6ff07 in g_signal_emit glib-2.50.2/gobject/gsignal.c:3447
>     #8 0x7f49403255dd in emit_closed_in_idle glib-2.50.2/gio/gdbusconnection.c:1358
>     #9 0x7f493f37f048 in g_idle_dispatch glib-2.50.2/glib/gmain.c:5545
>     #10 0x7f493f375abc in g_main_dispatch glib-2.50.2/glib/gmain.c:3203
>     #11 0x7f493f379f4c in g_main_context_dispatch glib-2.50.2/glib/gmain.c:3856
>     #12 0x7f493f37a522 in g_main_context_iterate glib-2.50.2/glib/gmain.c:3929
>     #13 0x7f493f37b074 in g_main_loop_run glib-2.50.2/glib/gmain.c:4125
>     #14 0x7f493e983913 in source_registry_object_manager_thread evolution-data-server/src/libedataserver/e-source-registry.c:1167
>     #15 0x7f493f3fb049 in g_thread_proxy glib-2.50.2/glib/gthread.c:784
>     #16 0x7f49411506c9 in start_thread (/lib64/libpthread.so.0+0x76c9)
>     #17 0x7f493e15bf6e in clone (/lib64/libc.so.6+0x107f6e)
> 
> 0x60b000007ba0 is located 80 bytes inside of 104-byte region [0x60b000007b50,0x60b000007bb8)
> freed by thread T0 here:
>     #0 0x7f4942145af0 in free (/usr/lib64/libasan.so.3+0xc6af0)
>     #1 0x7f493f39047c in g_free glib-2.50.2/glib/gmem.c:189
>     #2 0x7f49403536fe in client_unref glib-2.50.2/gio/gdbusnamewatching.c:108
>     #3 0x7f4940356c7f in g_bus_unwatch_name glib-2.50.2/gio/gdbusnamewatching.c:845
>     #4 0x404310 in main evolution-data-server/src/calendar/libedata-cal/evolution-calendar-factory-subprocess.c:220
>     #5 0x7f493e074400 in __libc_start_main (/lib64/libc.so.6+0x20400)
> 
> previously allocated by thread T0 here:
>     #0 0x7f4942146010 in calloc (/usr/lib64/libasan.so.3+0xc7010)
>     #1 0x7f493f390386 in g_malloc0 glib-2.50.2/glib/gmem.c:124
>     #2 0x7f493f3906c6 in g_malloc0_n glib-2.50.2/glib/gmem.c:355
>     #3 0x7f4940355858 in g_bus_watch_name glib-2.50.2/gio/gdbusnamewatching.c:563
>     #4 0x40429f in main evolution-data-server/src/calendar/libedata-cal/evolution-calendar-factory-subprocess.c:198
>     #5 0x7f493e074400 in __libc_start_main (/lib64/libc.so.6+0x20400)
> 
> Thread T1 created by T0 here:
>     #0 0x7f49420b0488 in pthread_create (/usr/lib64/libasan.so.3+0x31488)
>     #1 0x7f493f46862b in g_system_thread_new glib-2.50.2/glib/gthread-posix.c:1170
>     #2 0x7f493f3fb31f in g_thread_new_internal glib-2.50.2/glib/gthread.c:874
>     #3 0x7f493f3fb179 in g_thread_new glib-2.50.2/glib/gthread.c:827
>     #4 0x7f493e984e55 in source_registry_initable_init evolution-data-server/src/libedataserver/e-source-registry.c:1385
>     #5 0x7f49401fd473 in g_initable_init glib-2.50.2/gio/ginitable.c:112
>     #6 0x7f493e986201 in e_source_registry_new_sync evolution-data-server/src/libedataserver/e-source-registry.c:1767
>     #7 0x7f493ee3d289 in subprocess_factory_initable_init evolution-data-server/src/libebackend/e-subprocess-factory.c:160
>     #8 0x7f49401fd473 in g_initable_init glib-2.50.2/gio/ginitable.c:112
>     #9 0x7f49401fd732 in g_initable_new_valist glib-2.50.2/gio/ginitable.c:228
>     #10 0x7f49401fd5a4 in g_initable_new glib-2.50.2/gio/ginitable.c:146
>     #11 0x7f4940caff74 in e_subprocess_cal_factory_new evolution-data-server/src/calendar/libedata-cal/e-subprocess-cal-factory.c:174
>     #12 0x404241 in main evolution-data-server/src/calendar/libedata-cal/evolution-calendar-factory-subprocess.c:191
>     #13 0x7f493e074400 in __libc_start_main (/lib64/libc.so.6+0x20400)
> 
> SUMMARY: AddressSanitizer: heap-use-after-free glib-2.50.2/gio/gdbusnamewatching.c:259 in on_connection_disconnected
Comment 3 Philip Withnall 2017-02-03 10:04:35 UTC
Created attachment 344842 [details] [review]
gdbus: Fix atomic accesses to global name watch ID
Comment 4 Philip Withnall 2017-02-03 10:08:50 UTC
This patch fixes an unrelated issue which I spotted while looking at gdbusnamewatching.c.

The thread safety problem in this bug is not trivial to fix, mostly because calls to on_connection_disconnected() and on_name_owner_changed() could effectively happen in any thread (they are emitted in the main thread for the GDBusConnection, which could be different from the thread where g_bus_watch_name() was called).

I think we need to either:
 - do all reads and writes of the Client struct in the stored Client->main_context (so they’re serialised by the main context); or
 - add a lock to the entire struct (so they’re serialised by the lock).

We could fix the immediate problem here by changing on_connection_disconnected() to look up the Client struct from map_id_to_client, just like g_bus_unwatch_name() does. But that won’t fix the other problems in the code, where functions like call_vanished_handler(), which can be run in any thread, change state in the Client.
Comment 5 Philip Withnall 2017-02-05 13:58:41 UTC
Review of attachment 344842 [details] [review]:

Allison says: this one’s fine, obvs.
Comment 6 Philip Withnall 2017-02-05 14:01:54 UTC
Comment on attachment 344842 [details] [review]
gdbus: Fix atomic accesses to global name watch ID

Attachment 344842 [details] pushed as c131865 - gdbus: Fix atomic accesses to global name watch ID
Comment 7 insun.pyo 2017-02-06 02:38:11 UTC
Created attachment 345006 [details] [review]
workaround code
Comment 8 Philip Withnall 2017-02-06 12:36:48 UTC
Review of attachment 345006 [details] [review]:

I don’t think that introducing new public API is the right thing to do here. The two typical patterns for handling this kind of thing are:
 a) Hold a strong reference in the signal callback closure itself (i.e. call ref() on the pointer passed in as the user_data to g_dbus_connection_signal_subscribe()). This forces the user_data to stay alive until after the signal is explicitly disconnected.
 b) Hold a weak reference in the signal callback closure and atomically resolve this to a strong reference or NULL in the callback — i.e. grab a strong reference on the user_data if it still exists; otherwise bail out of the callback. This means the callback can be gracefully executed even after the object is destroyed.

I think approach b) is the better approach here, as it means we don’t have to complicate the reference counting and finalisation code for Client by working out when to explicitly disconnect the signal handlers. We can implement weak references on the Client by atomically looking it up in the map_id_to_client hash table. Patch coming.
Comment 9 Philip Withnall 2017-02-06 12:38:54 UTC
Created attachment 345030 [details] [review]
gdbus: Fix race in name watching on connection teardown

If g_dbus_unwatch_name() is called from one thread at the same time as
the GDBusConnection is emitting ::disconnected in another thread, there
will be a race and the handler for ::disconnected may end up using
memory after it’s freed.

Fix this by serialising through the map_id_to_client, so that
on_connection_disconnected() atomically gets a strong reference to the
Client, or NULL.
Comment 10 Matthias Clasen 2017-05-31 17:21:30 UTC
Review of attachment 345030 [details] [review]:

Makes sense to me.

::: gio/gdbusnamewatching.c
@@ +251,3 @@
 
+/* Return a reference to the #Client for @watcher_id, or %NULL if it’s been
+ * unwatched. This is safe to call from any thread. */

Personal pet peeve of mine: I prefer multiline comments to be lined up as

/* bla
 * bla bla
 * bla
 */

Feel free to ignore
Comment 11 Philip Withnall 2017-06-01 10:53:14 UTC
Thanks for the review. Pushed without changing the comment because I can’t be bothered and am on the fence about formatting. Noted it for future though.

Attachment 345030 [details] pushed as 4dd1b17 - gdbus: Fix race in name watching on connection teardown
Comment 12 Milan Crha 2017-07-21 06:04:39 UTC
Downstream bug report with glib2-2.52.3:
https://bugzilla.redhat.com/show_bug.cgi?id=1473339

You might consider adding the change to the stable branch as well, unless you think it's too risky for it and requires more testing.