After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 788368 - Race condition in GDbusObjectManagerClient
Race condition in GDbusObjectManagerClient
Status: RESOLVED FIXED
Product: glib
Classification: Platform
Component: gio
unspecified
Other Linux
: Normal normal
: ---
Assigned To: gtkdev
gtkdev
Depends on:
Blocks:
 
 
Reported: 2017-09-30 09:37 UTC by Beniamino Galvani
Modified: 2017-10-11 15:36 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
[PATCH] gio: fix race condition in GDBusObjectManagerClient (1.07 KB, patch)
2017-09-30 09:39 UTC, Beniamino Galvani
none Details | Review
[PATCH v2] gio: fix race condition in GDBusObjectManagerClient (1.61 KB, patch)
2017-10-02 11:48 UTC, Beniamino Galvani
committed Details | Review

Description Beniamino Galvani 2017-09-30 09:37:49 UTC
We see frequent crashes of nmcli in the GLib code when the object
manager client is initialized. For example:

Thread 1:
#0  in _int_malloc (av=av@entry=0x7f8470000020, bytes=bytes@entry=128) at malloc.c:3782
#1  in __GI___libc_malloc (bytes=128) at malloc.c:2897
#2  in g_realloc (mem=0x0, n_bytes=128) at gmem.c:159
#3  in g_ptr_array_maybe_expand (array=array@entry=0x7f847800b4e0, len=len@entry=1) at garray.c:1118
#4  in g_ptr_array_add (array=0x7f847800b4e0, data=0x7f847000aec0) at garray.c:1382
#5  in g_main_context_check (context=context@entry=0x7f847000ad60, max_priority=0, fds=fds@entry=0x7f847000afb0, n_fds=n_fds@entry=1) at gmain.c:3817
#6  in g_main_context_iterate (context=0x7f847000ad60, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at gmain.c:3924
#7  in g_main_loop_run (loop=0x7f847000ac60) at gmain.c:4123
#8  in initable_init (initable=0x7f8470009a50, cancellable=0x0, error=0x7f8486763a98) at gdbusproxy.c:1949
#9  in g_initable_new_valist (object_type=<optimized out>, first_property_name=first_property_name@entry=0x7f8490ccaf90 "g-connection", var_args=var_args@entry=0x7f8486763910, cancellable=cancellable@entry=0x0, error=error@entry=0x7f8486763a98) at ginitable.c:228
#10 in g_initable_new (object_type=object_type@entry=94165753751728, cancellable=cancellable@entry=0x0, error=error@entry=0x7f8486763a98, first_property_name=first_property_name@entry=0x7f8490ccaf90 "g-connection") at ginitable.c:146
#11 in add_interfaces (manager=manager@entry=0x55a4ac4e3460 [GDBusObjectManagerClient], object_path=0x7f847c06d5e0 "/org/freedesktop/NetworkManager/IP4Config/75", ifaces_and_properties=<optimized out>, name_owner=name_owner@entry=0x7f8478019310 ":1.1680") at gdbusobjectmanagerclient.c:1495
#12 in process_get_all_result (manager=0x55a4ac4e3460 [GDBusObjectManagerClient], value=<optimized out>, name_owner=0x7f8478019310 ":1.1680") at gdbusobjectmanagerclient.c:1647
#13 in initable_init (initable=0x55a4ac4e3460, cancellable=0x0, error=0x7f8486763d20) at gdbusobjectmanagerclient.c:1400
#14 in async_init_thread (task=0x55a4ac4e9840 [GTask], source_object=<optimized out>, task_data=<optimized out>, cancellable=<optimized out>) at gasyncinitable.c:257
...

Thread 2:
#0  in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  in __GI_abort () at abort.c:90
#2  in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x7f848ffd4608 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:196
#3  in _int_free (ar_ptr=0x7f8478000020, ptr=<optimized out>, str=0x7f848ffd46a0 "free(): invalid next size (normal)", action=3) at malloc.c:5023
#4  in _int_free (av=0x7f8478000020, p=<optimized out>, have_lock=0) at malloc.c:3845
#5  in g_free (mem=0x7f847801ca30) at gmem.c:189
#6  in g_hash_table_resize (hash_table=hash_table@entry=0x55a4ac4cb580 = {...}) at ghash.c:619
#7  in g_hash_table_insert_node (hash_table=0x55a4ac4cb580 = {...}) at ghash.c:645
#8  in g_hash_table_insert_node (hash_table=hash_table@entry=0x55a4ac4cb580 = {...}, node_index=node_index@entry=4, key_hash=key_hash@entry=3982537675, new_key=new_key@entry=0x55a4ac4e97c0, new_value=new_value@entry=0x55a4ac4e2160, keep_new_key=keep_new_key@entry=0, reusing_key=reusing_key@entry=0) at ghash.c:978
#9  in g_hash_table_insert_internal (hash_table=0x55a4ac4cb580 = {...}, key=0x55a4ac4e97c0, value=0x55a4ac4e2160, keep_new_key=0) at ghash.c:1229
#10 in add_interfaces (manager=manager@entry=0x55a4ac4e3460 [GDBusObjectManagerClient], object_path=0x55a4ac4e9640 "/org/freedesktop/NetworkManager/IP6Config/85", ifaces_and_properties=<optimized out>, name_owner=0x7f8478019310 ":1.1680") at gdbusobjectmanagerclient.c:1558
#11 in on_control_proxy_g_signal (proxy=<optimized out>, sender_name=<optimized out>, signal_name=0x55a4ac4e7bb0 "InterfacesAdded", parameters=0x7f8478009070, user_data=0x55a4ac4e3460) at gdbusobjectmanagerclient.c:1672
#12 in ffi_call_unix64 () at ../src/x86/unix64.S:76
#13 in ffi_call (cif=cif@entry=0x7ffd3f42aea0, fn=<optimized out>, rvalue=0x7ffd3f42ae10, avalue=avalue@entry=0x7ffd3f42ad90) at ../src/x86/ffi64.c:522
#18 in <emit signal ??? on instance 0x7f8478004400 [GDBusProxy]> (instance=instance@entry=0x7f8478004400, signal_id=<optimized out>, detail=detail@entry=0) at gsignal.c:3447
    #14 in g_cclosure_marshal_generic (closure=0x7f8478019f70, return_gvalue=0x0, n_param_values=<optimized out>, param_values=<optimized out>, invocation_hint=<optimized out>, marshal_data=0x0) at gclosure.c:1490
    #15 in g_closure_invoke (closure=0x7f8478019f70, return_value=return_value@entry=0x0, n_param_values=4, param_values=param_values@entry=0x7ffd3f42b0a0, invocation_hint=invocation_hint@entry=0x7ffd3f42b040) at gclosure.c:804
    #16 in signal_emit_unlocked_R (node=node@entry=0x7f8478018290, detail=detail@entry=0, instance=instance@entry=0x7f8478004400, emission_return=emission_return@entry=0x0, instance_and_params=instance_and_params@entry=0x7ffd3f42b0a0) at gsignal.c:3635
    #17 in g_signal_emit_valist (instance=<optimized out>, signal_id=<optimized out>, detail=<optimized out>, var_args=var_args@entry=0x7ffd3f42b260) at gsignal.c:3391
#19 in on_signal_received (connection=<optimized out>, sender_name=0x7f847c0721f0 ":1.1680", object_path=<optimized out>, interface_name=<optimized out>, signal_name=0x7f847c0722a0 "InterfacesAdded", parameters=0x7f8478009070, user_data=0x7f8478015d60) at gdbusproxy.c:917
#20 in emit_signal_instance_in_idle_cb (data=data@entry=0x7f847c074950) at gdbusconnection.c:3705
#21 in g_idle_dispatch (source=0x7f847c04d0d0, callback=0x7f8490c4a740 <emit_signal_instance_in_idle_cb>, user_data=0x7f847c074950) at gmain.c:5543
...

It seems to me that the call to:

  g_hash_table_insert (manager->priv->map_object_path_to_object_proxy, ...)

in add_interfaces() in not protected by locks and can race with other
operations on the hash table in other threads.
Comment 1 Beniamino Galvani 2017-09-30 09:39:37 UTC
Created attachment 360695 [details] [review]
[PATCH] gio: fix race condition in GDBusObjectManagerClient
Comment 2 Colin Walters 2017-09-30 13:24:54 UTC
Review of attachment 360695 [details] [review]:

Some searching in the history led me to https://git.gnome.org/browse/glib/commit/?id=4e7ef619ebe025bd97d2a3ef8dd8c1b7baa903b2
and from the looks of it, indeed this was just an oversight.
Comment 3 Philip Withnall 2017-09-30 19:42:49 UTC
Review of attachment 360695 [details] [review]:

I’d prefer it if the g_hash_table_insert() call were moved higher up in the function, inside the existing critical section. Introducing a second critical section to add_interfaces() seems like it has more potential to cause problems.
Comment 4 Beniamino Galvani 2017-10-02 11:48:06 UTC
Created attachment 360761 [details] [review]
[PATCH v2] gio: fix race condition in GDBusObjectManagerClient

(In reply to Philip Withnall from comment #3)
> I’d prefer it if the g_hash_table_insert() call were moved higher up in the
> function, inside the existing critical section. Introducing a second
> critical section to add_interfaces() seems like it has more potential to
> cause problems.

Makes sense, patch updated.
Comment 5 Philip Withnall 2017-10-02 12:26:42 UTC
Review of attachment 360761 [details] [review]:

That looks good to me. Does it fix the nmcli problem you were seeing?
Comment 6 Beniamino Galvani 2017-10-02 13:49:35 UTC
(In reply to Philip Withnall from comment #5)
> That looks good to me. Does it fix the nmcli problem you were seeing?

We have a test suite [1] that runs nightly on internal infrastructure
and in September we got 10 crash reports that look related to this
issue.

Now tests are running with the patched version; I have to wait some
days to confirm that the fix is working.

[1] https://github.com/NetworkManager/NetworkManager-ci
Comment 7 Philip Withnall 2017-10-02 14:07:39 UTC
Ah, excellent. Let’s wait a week or two and see if any failures appear there before pushing to master. Thanks a lot, that CI stuff is very helpful.
Comment 8 Philip Withnall 2017-10-11 08:46:46 UTC
Beniamino, have any problems showed up in the CI testing over the last week?
Comment 9 Beniamino Galvani 2017-10-11 15:22:50 UTC
(In reply to Philip Withnall from comment #8)
> Beniamino, have any problems showed up in the CI testing over the last week?

No, I haven't seen any problem since we switched to the patched version.
Comment 10 Philip Withnall 2017-10-11 15:31:37 UTC
(In reply to Beniamino Galvani from comment #9)
> (In reply to Philip Withnall from comment #8)
> > Beniamino, have any problems showed up in the CI testing over the last week?
> 
> No, I haven't seen any problem since we switched to the patched version.

OK, I’ll push it then. Thanks a lot for the patch and for testing. :-)