After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 748996 - GNetworkAddress is not thread safe
GNetworkAddress is not thread safe
Status: RESOLVED OBSOLETE
Product: glib
Classification: Platform
Component: gio
2.42.x
Other Linux
: Normal normal
: ---
Assigned To: gtkdev
gtkdev
: 741238 (view as bug list)
Depends on:
Blocks:
 
 
Reported: 2015-05-06 08:34 UTC by Milan Crha
Modified: 2018-05-24 17:49 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
proposed glib patch (6.12 KB, patch)
2015-05-06 09:06 UTC, Milan Crha
rejected Details | Review

Description Milan Crha 2015-05-06 08:34:39 UTC
I sometimes get a runtime warning
> GLib-GIO-CRITICAL **: g_network_address_set_addresses: assertion 'addresses != NULL && addr->priv->sockaddrs == NULL' failed
when running evolution from scratch. it's not there always, and it's sometimes only once there, other times multiple times there.

I tried to investigate it and it turned out that the 'addresses' is not NULL, but also 'addr->priv->sockaddrs' is not NULL, aka it's already set. This setting is done in Thread 15 and Thread 8 at the same time, as can be seen here (I added there some debugging and locking around the backtrace capturing, thus the lines do not match actual source files):

Thread 8 (Thread 0x7f4360ff9700 (LWP 6406))

  • #0 syscall
    at ../sysdeps/unix/sysv/linux/x86_64/syscall.S line 38
  • #1 g_mutex_lock_slowpath
    at gthread-posix.c line 1308
  • #2 g_mutex_lock
    at gthread-posix.c line 1332
  • #3 g_network_address_set_addresses
    at gnetworkaddress.c line 237
  • #4 g_network_address_address_enumerator_next
    at gnetworkaddress.c line 912
  • #5 g_socket_address_enumerator_next
    at gsocketaddressenumerator.c line 83
  • #6 g_proxy_address_enumerator_next
    at gproxyaddressenumerator.c line 210
  • #7 g_socket_address_enumerator_next
    at gsocketaddressenumerator.c line 83
  • #8 g_network_monitor_base_can_reach
    at gnetworkmonitorbase.c line 194
  • #9 g_network_monitor_can_reach
    at gnetworkmonitor.c line 139
  • #10 camel_network_service_can_reach_sync
    at camel-network-service.c line 930
  • #11 store_maybe_connect_sync
    at camel-store.c line 314
  • #12 camel_store_get_folder_info_sync
  • #13 store_get_folder_info_thread
    at camel-store.c line 1676
  • #14 g_task_thread_pool_thread
    at gtask.c line 1215
  • #15 g_thread_pool_thread_proxy
    at gthreadpool.c line 307
  • #16 g_thread_proxy
    at gthread.c line 764
  • #17 start_thread
    at pthread_create.c line 310
  • #18 clone
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S line 109

Comment 1 Milan Crha 2015-05-06 08:36:17 UTC
I can add some locking on the camel side, but it still feels more correct to make the GNetworkAddress thread safe, than adding thread safety around the network monitor usages in each application.
Comment 2 Milan Crha 2015-05-06 09:06:23 UTC
Created attachment 302960 [details] [review]
proposed glib patch

for glib;

This adds locking around addr->priv->sockaddrs, which fixes the issue spotted in evolution.
Comment 3 Dan Winship 2015-05-16 14:35:06 UTC
Comment on attachment 302960 [details] [review]
proposed glib patch

GLib objects are not expected to be thread-safe unless explicitly documented as such, so this is theoretically NOTABUG.

But if it's useful and doesn't complicate things too much...

This could be done with atomic ops rather than mutexes though (and if two threads try to set it at the same time you just throw away the second result).

Also, if GNetworkAddress is going to be thread-safe, then GNetworkService needs to be too.
Comment 4 Milan Crha 2015-10-15 08:32:50 UTC
I think this thread unsafety can be a cause for a downstream crash report:
https://bugzilla.redhat.com/show_bug.cgi?id=1271786

Could you comment on it, what your opinion is, please?
Comment 5 Dan Winship 2015-10-15 12:41:16 UTC
Yes, it looks like camel reuses the same GNetworkAddress in multiple threads, which is illegal. Though given that there's no "g_socket_connectable_copy()", it would be impossible for it not to... So I guess we should make all of the built-in connectables thread-safe.
Comment 6 Milan Crha 2016-01-26 16:21:04 UTC
(In reply to Dan Winship from comment #5)
> Yes, it looks like camel reuses the same GNetworkAddress in multiple
> threads, which is illegal. Though given that there's no
> "g_socket_connectable_copy()", it would be impossible for it not to... So I
> guess we should make all of the built-in connectables thread-safe.

The thread safety would be very nice (or provide the copy function, because it's impossible to case-by-case it, especially considering g_network_address_new_loopback() behaviour, though the thread safety is much better than the copy function).

Also, any progress on this, please?

I made a simple workaround on the Camel side, when possible for evolution-data-server 3.19.90+ and 3.18.5+.
Comment 7 Dan Winship 2016-01-26 16:30:12 UTC
(In reply to Milan Crha from comment #6)
> Also, any progress on this, please?

I was hoping you'd submit an updated patch... (and in "git format-patch" format.)

(In reply to Dan Winship from comment #3)
> This could be done with atomic ops rather than mutexes though (and if two
> threads try to set it at the same time you just throw away the second
> result).

(In reply to Dan Winship from comment #5)
> So I guess we should make all of the built-in connectables thread-safe.
Comment 8 Milan Crha 2016-01-26 18:21:39 UTC
Oh, I misunderstood it then. It seemed to me that you added a new patch (wrong reading on my side, I'm sorry) and you are waiting for a review from someone else, someone from glib.
Comment 9 Michael Catanzaro 2017-05-10 01:07:23 UTC
I think you should just add a copy function. GLib objects should probably only be threadsafe when there's a compelling reason for it. There is a cost to extra locking....
Comment 10 Philip Withnall 2017-05-10 08:40:37 UTC
(In reply to Michael Catanzaro from comment #9)
> I think you should just add a copy function. GLib objects should probably
> only be threadsafe when there's a compelling reason for it. There is a cost
> to extra locking....

Correct, I’d want to see a copy function here rather than adding locking inside GLib. The general position of GLib on locking is that it should be done outside GLib unless it really has to be; otherwise the cost of locking is imposed on all users of GLib, rather than just the few apps which are threaded.
Comment 11 Philip Withnall 2017-05-10 08:41:32 UTC
Review of attachment 302960 [details] [review]:

No locking please.

::: gio/gnetworkaddress.c
@@ +105,3 @@
     {
+      g_list_free_full (addr->priv->sockaddrs, g_object_unref);
+      addr->priv->sockaddrs = NULL;

If you want to split this g_list_free_full() change out into a separate patch, that would be useful.
Comment 12 Milan Crha 2017-05-10 12:28:58 UTC
(In reply to Philip Withnall from comment #10)
> ... the few apps which are threaded.

I do not agree with the above quoted statement. As long as an application uses GLib sync and async APIs it is threaded, indirectly, even if itself doesn't create threads directly. The GTask/GSimpleAsyncRequest is used transparently in the background of GLib/glib-networking without bothering users with those tiny details. That makes GNetworkAdress somewhat special. Just my personal opinion.

The copy function won't change much, when one thread runs the query as in comment #0 and another calls "create a copy", no? Unless you mean to have a copy function and still let the API user do locking on its own.

Also, having 'copy' function on a reference-based object is rather unusual (then a change in the "copied" object doesn't change "all instances", as it is usually done on GObject descendants). The 'usual' word is crucial, as there are surely exceptions. If you want to add GNetworkAddress to that group of exceptions, then disregard this paragraph. :)
Comment 13 Philip Withnall 2017-05-10 12:51:18 UTC
Sorry, I didn’t read through the backtrace properly before.

The bug here seems to be that libcamel is using a non-thread-safe API (GNetworkMonitor) from two threads. It doesn’t matter that g_network_monitor_get() returns the same network monitor instance for both threads: it’s the caller’s responsibility to add a lock around usage of it.

There’s an argument to be made that GNetworkMonitor should be thread safe since it’s typically used as a global singleton (through g_network_monitor_get()). However, once you start making it threadsafe, you have to start making various other classes threadsafe too, and it all balloons into a lot of locking which is pointless overhead for most applications.

If people are going to make that argument, my suggestion would be to not add locking to GNetworkMonitor, but to make g_network_monitor_get() return a per-thread monitor singleton instead.

(In reply to Milan Crha from comment #12)
> (In reply to Philip Withnall from comment #10)
> > ... the few apps which are threaded.
> 
> I do not agree with the above quoted statement. As long as an application
> uses GLib sync and async APIs it is threaded, indirectly, even if itself
> doesn't create threads directly.

Yes, but most of the GLib APIs are not used from these worker threads, or are used with appropriate external locking.

> Unless you mean to have a
> copy function and still let the API user do locking on its own.

Yes.
Comment 14 Dan Winship 2017-05-10 13:50:14 UTC
(In reply to Philip Withnall from comment #13)
> The bug here seems to be that libcamel is using a non-thread-safe API
> (GNetworkMonitor) from two threads. It doesn’t matter that
> g_network_monitor_get() returns the same network monitor instance for both
> threads: it’s the caller’s responsibility to add a lock around usage of it.

No, that's wrong. All global singletons must be thread safe, because they might be used by multiple libraries in the app that know nothing about each other.

> If people are going to make that argument, my suggestion would be to not add
> locking to GNetworkMonitor, but to make g_network_monitor_get() return a
> per-thread monitor singleton instead.

That would be an API break. (It is currently safe [modulo bugs] to pass a GNetworkMonitor to another thread.)
Comment 15 Philip Withnall 2017-05-10 14:15:11 UTC
(In reply to Dan Winship from comment #14)
> (In reply to Philip Withnall from comment #13)
> > The bug here seems to be that libcamel is using a non-thread-safe API
> > (GNetworkMonitor) from two threads. It doesn’t matter that
> > g_network_monitor_get() returns the same network monitor instance for both
> > threads: it’s the caller’s responsibility to add a lock around usage of it.
> 
> No, that's wrong. All global singletons must be thread safe, because they
> might be used by multiple libraries in the app that know nothing about each
> other.
> 
> > If people are going to make that argument, my suggestion would be to not add
> > locking to GNetworkMonitor, but to make g_network_monitor_get() return a
> > per-thread monitor singleton instead.
> 
> That would be an API break. (It is currently safe [modulo bugs] to pass a
> GNetworkMonitor to another thread.)

Argh, you’re right. That means adding locking to GNetworkMonitor then. Looking more closely at how GNetworkMonitor modifies the passed-in GSocketConnectable, it potentially also means making the GSocketConnectable threadsafe, as you and Milan said before. If we can find a way to make the monitor threadsafe without having to also make every GSocketConnectable instance threadsafe, that would be good, since there could be implementations of GSocketConnectable coming from outside GLib.
Comment 16 Michael Catanzaro 2018-02-09 20:31:14 UTC
*** Bug 741238 has been marked as a duplicate of this bug. ***
Comment 17 Michael Catanzaro 2018-02-09 20:41:46 UTC
*** Bug 741238 has been marked as a duplicate of this bug. ***
Comment 18 Michael Catanzaro 2018-02-09 20:42:52 UTC
(In reply to Dan Winship from comment #14)
> No, that's wrong. All global singletons must be thread safe, because they
> might be used by multiple libraries in the app that know nothing about each
> other.

I've never heard this rule before; rather, I'd assumed that the global singletons should only be accessed from the main thread. This really deserves to be documented... somewhere.
Comment 19 Philip Withnall 2018-02-12 13:05:37 UTC
(In reply to Michael Catanzaro from comment #18)
> (In reply to Dan Winship from comment #14)
> > No, that's wrong. All global singletons must be thread safe, because they
> > might be used by multiple libraries in the app that know nothing about each
> > other.
> 
> I've never heard this rule before; rather, I'd assumed that the global
> singletons should only be accessed from the main thread. This really
> deserves to be documented... somewhere.

Patch welcome.
Comment 20 Milan Crha 2018-02-12 13:23:55 UTC
(In reply to Michael Catanzaro from comment #18)
> This really deserves to be documented... somewhere.

Documentation is not any fix.

I agree with Dan here (also see comment #5), the related structures should be made thread safe. I also cannot make a copy of the default GNetworkMonitor structure, because I do not know its type (and it's quite bad idea to dig for it with any glib foo, just to try to workaround some issue in the glib).
Comment 21 Philip Withnall 2018-02-12 13:49:45 UTC
(In reply to Milan Crha from comment #20)
> (In reply to Michael Catanzaro from comment #18)
> > This really deserves to be documented... somewhere.
> 
> Documentation is not any fix.

It’s a fix for the fact that people don’t realise the global singletons are thread-safe.

> I agree with Dan here (also see comment #5), the related structures should
> be made thread safe. I also cannot make a copy of the default
> GNetworkMonitor structure, because I do not know its type (and it's quite
> bad idea to dig for it with any glib foo, just to try to workaround some
> issue in the glib).

Yes, that also needs to be done.
Comment 22 Michael Catanzaro 2018-03-21 00:50:18 UTC
Adjusting component to gio in preparation for GitLab migration
Comment 23 GNOME Infrastructure Team 2018-05-24 17:49:06 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/glib/issues/1033.