GNOME Bugzilla – Bug 785110
Thread-safety issues with using the same SoupSession synchronously from multiple threads
Last modified: 2018-06-21 05:13:02 UTC
See attached patch to the get.c and run it like this against the simple_httpd example: > (update the port) > ./get -q http://127.0.0.1:36577/get.c When keeping this running forever, the following warning is printed sometimes > (get:818): libsoup-WARNING **: (soup-connection.c:660):soup_connection_set_state: runtime check failed: (state == SOUP_CONNECTION_IDLE || state == SOUP_CONNECTION_DISCONNECTED) When stopping the server while it is running, everything is exploding spectacularly (but not every time): > (get:3053): GLib-CRITICAL **: g_main_context_ref: assertion 'g_atomic_int_get (&context->ref_count) > 0' failed > /get.c: 4 Could not connect: Connection refused > /get.c: 4 Could not connect: Connection refused > > (get:3053): GLib-CRITICAL **: g_main_context_ref: assertion 'g_atomic_int_get (&context->ref_count) > 0' failed > > (get:3053): GLib-CRITICAL **: g_main_context_unref: assertion 'g_atomic_int_get (&context->ref_count) > 0' failed > (get:3116): GLib-GObject-CRITICAL **: g_object_unref: assertion 'G_IS_OBJECT (object)' failed > fish: “./get -q http://127.0.0.1:36577…” terminated by signal SIGSEGV (Address boundary error) The same also happens (every few runs) when starting it without the server running at all. With a single thread (commenting out that for loop), this doesn't seem to happen.
Created attachment 355922 [details] [review] Update get.c to run requests forever from multiple threads
Backtrace for the first one: (get:4696): libsoup-WARNING **: (soup-connection.c:660):soup_connection_set_state: runtime check failed: (state == SOUP_CONNECTION_IDLE || state == SOUP_CONNECTION_DISCONNECTED) Thread 2 "get" received signal SIGTRAP, Trace/breakpoint trap. [Switching to Thread 0x7ffff2086700 (LWP 4700)] _g_log_abort (breakpoint=breakpoint@entry=1) at ././glib/gmessages.c:549 549 ././glib/gmessages.c: No such file or directory. (gdb) thread apply all bt
+ Trace 237666
Thread 7 (Thread 0x7fffebfff700 (LWP 4705))
Thread 2 (Thread 0x7ffff2086700 (LWP 4700))
Backtrace of the second (get:5665): GLib-CRITICAL **: g_main_context_ref: assertion 'g_atomic_int_get (&context->ref_count) > 0' failed (get:5665): GLib-CRITICAL **: g_main_context_ref: assertion 'g_atomic_int_get (&context->ref_count) > 0' failed /get.c: 4 Could not connect: Connection refused Thread 10 "get" received signal SIGTRAP, Trace/breakpoint trap. [Switching to Thread 0x7fffea7fc700 (LWP 5677)] _g_log_abort (breakpoint=breakpoint@entry=1) at ././glib/gmessages.c:549 549 ././glib/gmessages.c: No such file or directory. (gdb) thread apply all bt
+ Trace 237667
Thread 11 (Thread 0x7fffe9084700 (LWP 5678))
Thread 10 (Thread 0x7fffea7fc700 (LWP 5677))
Thread 9 (Thread 0x7fffeaffd700 (LWP 5676))
Thread 8 (Thread 0x7fffeb7fe700 (LWP 5675))
Thread 7 (Thread 0x7fffebfff700 (LWP 5674))
Thread 6 (Thread 0x7ffff0883700 (LWP 5673))
Thread 5 (Thread 0x7ffff1084700 (LWP 5672))
Thread 4 (Thread 0x7ffff1885700 (LWP 5671))
Thread 3 (Thread 0x7fffe9885700 (LWP 5670))
Thread 2 (Thread 0x7ffff2086700 (LWP 5669))
Thread 1 (Thread 0x7ffff7fa9e00 (LWP 5665))
Some details for the second (the main context has a refcount of 0):
+ Trace 237668
$1 = (SoupSocketProperties *) 0x7fffd4023ac0 (gdb) print props->async_context $2 = (GMainContext *) 0x555555776300 (gdb) print *props->async_context $3 = { mutex = { p = 0x7fffec0014c0, i = {3959428288, 32767} }, cond = { p = 0x0, i = {0, 0} }, owner = 0x7fffec0049f0, owner_count = 36577, waiters = 0x7fffec004a10, ref_count = 0, sources = 0x0, pending_dispatches = 0xa779c, timeout = 0, next_id = 0, source_lists = 0x0, in_check_or_prepare = 0, need_wakeup = 35, poll_records = 0x1, n_poll_records = 3959436048, cached_poll_array = 0x7fffec004de0, cached_poll_array_size = 1433887552, wakeup = 0x7ffff7b7c6a1, wake_up_rec = { fd = 0, events = 0, revents = 0 }, poll_changed = 0, poll_func = 0x7fffec004a30, time = 36577, ---Type <return> to continue, or q <return> to quit--- time_is_fresh = -335525296 }
And details for the first (note that this is without stopping the server, so it's probably unexpected indeed that the remote is disconnected... but nonetheless it seems like something that can happen in real situations):
+ Trace 237669
state=SOUP_CONNECTION_REMOTE_DISCONNECTED) at soup-connection.c:659 659 g_warn_if_fail (state == SOUP_CONNECTION_IDLE || (gdb) print state $1 = SOUP_CONNECTION_REMOTE_DISCONNECTED
Could you try with the patch from bug #762138 comment #3, please? It will address the second part of this bug, but I'm not sure about the other parts. In case it'll fix it for you completely, then this bug can be closed as a duplicate of the older bug report.
So far it looks all good here, but I'll let it running over night to be sure.
(get:19856): libsoup-WARNING **: (soup-connection.c:670):soup_connection_set_state: runtime check failed: (state == SOUP_CONNECTION_IDLE || state == SOUP_CONNECTION_DISCONNECTED) Happened again
Thanks for a quick testing. In that case let's keep this for the first issue, the bug #762138 for the second.
Yeah, the other bug did not happen over night also. Only the warning from comment 8 quite a few times.
*** Bug 796607 has been marked as a duplicate of this bug. ***
This MR fixed the remaining issue : https://gitlab.gnome.org/GNOME/libsoup/merge_requests/7
After a couple rounds of discussion, I've merged that MR. Glad you found that it fixed bug #762138 as well. My main concern now is there are still other objects in libsoup which do not use threadsafe refcounting: SoupMessageBody, SoupMessageHeaders, SoupMessageQueue, and SoupServer. Generally, it's expected that structs and objects in GNOME are not threadsafe, except that reffing/unreffing is threadsafe. I understand that SoupMessage and SoupServer are not expected to be threadsafe, unlike SoupSession, and I also understand that using atomic operations can have unexpected costs, but it seems like generally a good idea for ref/unref operations to be atomic unless there is a compelling reason otherwise, right?
(In reply to Michael Catanzaro from comment #13) >...it seems like generally a good idea for ref/unref operations to be atomic > unless there is a compelling reason otherwise, right? +1 from my side. It'll be better if libsoup uses atomic ref/unref on its non-GObject-derived structures. And on those you named I do not think there would be any significant performance impact (those are not ref/unref-ed that often anyway, are they?).
(In reply to Milan Crha from comment #14) > (In reply to Michael Catanzaro from comment #13) > >...it seems like generally a good idea for ref/unref operations to be atomic > > unless there is a compelling reason otherwise, right? right :) > > +1 from my side. It'll be better if libsoup uses atomic ref/unref on its > non-GObject-derived structures. And on those you named I do not think there > would be any significant performance impact (those are not ref/unref-ed that > often anyway, are they?). It shouldn't cause a performance impact (we use that same non-gobject-based-but-atomic-refcounted technique in GStreamer and it hasn't been an issue :) ). I'll have a look and propose new issue and merge-request for those.