GNOME Bugzilla – Bug 684238
SoupConnection:state thread-safety issues
Last modified: 2012-11-15 14:20:06 UTC
Reproduced on 638aa6e28d708152cafe7ddbced952f5af2e6237 : (plop:28931): libsoup-CRITICAL **: soup_socket_get_iostream: assertion `SOUP_IS_SOCKET (sock)' failed (plop:28931): GLib-GObject-CRITICAL **: g_object_ref: assertion `G_IS_OBJECT (object)' failed [Thread 0x7fffeccc3700 (LWP 28957) exited] Program received signal SIGSEGV, Segmentation fault.
+ Trace 230862
Thread 140737085699840 (LWP 28959)
Ok reproduced on top of master : Backtrace from first error message : (aldgate-pc:18767): GLib-GIO-CRITICAL **: g_socket_condition_check: assertion `G_IS_SOCKET (socket)' failed Program received signal SIGTRAP, Trace/breakpoint trap.
+ Trace 230875
Thread 140737166194432 (LWP 18779)
Private structure from within soup_connection_get_state : (gdb) p *priv $4 = {socket = 0x0, remote_uri = 0x2aea680, proxy_uri = 0x0, proxy_resolver = 0x0, use_gproxyresolver = 0, tlsdb = 0x0, ssl = 0, ssl_strict = 0, ssl_fallback = 0, async_context = 0x0, use_thread_context = 0, cur_item = 0x0, state = SOUP_CONNECTION_DISCONNECTED, unused_timeout = 0, io_timeout = 0, idle_timeout = 0, idle_timeout_src = 0x0, reusable = 1}
Adding bt all : (aldgate-pc:9010): GLib-GIO-CRITICAL **: g_socket_condition_check: assertion `G_IS_SOCKET (socket)' failed Program received signal SIGTRAP, Trace/breakpoint trap.
+ Trace 230923
Thread 40 (Thread 0x7fffecc3e700 (LWP 9054))
Created attachment 225328 [details] [review] maybe-workaround This should fix the specific case in the backtrace, though it's not a complete fix.
Created attachment 225583 [details] [review] SoupConnection: fix a race condition with non-keepalive messages When a SoupSession sets a connection back to IDLE on a non-keepalive message, the connection then disconnects itself. However, it had been briefly setting its state to IDLE before disconnecting. Although this wasn't supposed to be observable (it doesn't emit a notification), in a SoupSessionSync, it could be observed by another thread, which might then try to claim the connection to send another request on, causing problems when the first thread then disconnected it. Fix this by inlining clear_current_item() into soup_connection_set_state() and reordering the code to not change priv->state until after deciding whether or not it's going to disconnect.
original actually broke more than it fixed. This patch should fix this specific race condition, and it small enough to be safe to 2.40.1; there are still other potential race conditions caused by the fact that SoupConnection:state can change when the session isn't holding conn_lock, but fixing the problem in general will take a lot more changes. Can you test this patch to see if it fixes the crash for you?
Sure, thanks for that fix. However I don't reproduce easily, so if it doesn't happen for a week it'll probably be good.
Similar downstream bug report from evoltuion-3.2.3 (rather old): https://bugzilla.redhat.com/show_bug.cgi?id=867069
+ Trace 231051
Thread 1 (Thread 0x7ff5c9427700 (LWP 6325))
Hey there, I almost forgot to get back to that bug. Basically I haven't had a single crash with the attached patch. So I guess we can close the issue. Thanks Dan.
committed the fix (but unfortunately forgot about this bug pre-2.40.1, so it will have to wait until 2.40.2). Leaving the bug open for the larger problem (comment 6) Milan: that bug is using SoupSessionAsync, so it shouldn't be this bug (since you're not allowed to use an async session from multiple threads at once).
(In reply to comment #10) > Milan: that bug is using SoupSessionAsync, so it shouldn't be this bug (since > you're not allowed to use an async session from multiple threads at once). Oh, it seemed similar to me, at least from the crashing function (comment #1) point of view. Looking into the backtrace itself, in comment #8, it is from EWS, and there had been done various fixes with thread-accessing libsoup's async session already, thus it should be fixed in 3.6.x of evolution-ews already, only between multiple other bugs.
(In reply to comment #6) > there are still other > potential race conditions caused by the fact that SoupConnection:state can > change when the session isn't holding conn_lock, but fixing the problem in > general will take a lot more changes. This was fixed in master 2 weeks ago; I forgot that this bug was still open.