GNOME Bugzilla – Bug 651146
Incorrect SoupConnection assignment on redirects
Last modified: 2012-04-11 12:58:51 UTC
Every time I opened some url (I cannot mention it here as it's some Igalia's internal URL) I was hitting this assertion: libsoup-CRITICAL **: set_current_item: assertion `priv->cur_item == NULL' failed After some investigation it turned out that the problem is the following: 1) SoupMsg1 got redirected 2) A queued message (SoupMsg2) asks for a connection 2.1) As SoupMsg1 got redirected its SoupConnection is marked as IDLE 2.2) The reference from SoupConnection to SoupMessageQueueItem is cleared 2.3) the current SoupConnection assignment algorithm finds that the SoupConnection is in IDLE state and then assign it to SoupMsg2 All this is OK but, the problem is that the SoupMessageQueueItem for SoupMsg1 still keeps a reference to the SoupConnection, so it will not get a new connection when calling soup_session_get_connection() and thus both messages will share the same connection. This is not a problem in debug builds as the assertion returns FALSE and the connection is not really shared, but for release builds SoupMessages will effectively use the same SoupConnection.
Created attachment 188667 [details] [review] Patch
Comment on attachment 188667 [details] [review] Patch item->conn should be getting cleared at the point where the connection gets marked IDLE (also, a test case would be great...)
Created attachment 188867 [details] [review] Patch Still need to figure out the test case
This looks like it could also fix 650923? =)
(In reply to comment #4) > This looks like it could also fix 650923? =) I think so, I was getting that error also sometimes in the same situation, and your description of the problem looks like quite related. Have you tried the patch?
Just did, it seems to be fixed by your patch, yes =D \o/
*** Bug 650923 has been marked as a duplicate of this bug. ***
Comment on attachment 188867 [details] [review] Patch this looks right (assuming "make check" still passes with it) (at some point, in some branch, I had a "soup_message_queue_item_clear_connection()" method that did the IDLE/unref/NULL thing, and then used that everywhere appropriate, but I guess that never made it to master...) I would really really like to have a test case for this though...
Comment on attachment 188867 [details] [review] Patch ok, tried writing a test, but it would be very tricky, since soup_session_pause_message() won't work at that spot. so ok, commit it
*** Bug 640096 has been marked as a duplicate of this bug. ***
(In reply to comment #9) > (From update of attachment 188867 [details] [review]) > ok, tried writing a test, but it would be very tricky, since > soup_session_pause_message() won't work at that spot. so ok, commit it Unfortunately I couldn't devote too much time to libsoup these days, I have not even tried to write that test. Thx for taking care of it. Committed c56c66aa2625bc95f54f8a1972e13503f454a51a
*** Bug 652368 has been marked as a duplicate of this bug. ***
This broke "make check". We both should have tested that... Anyway, https-via-proxy is now broken, because the connection gets released in between the CONNECT and the following request. I think the correct fix here is that rather than unreffing and NULLing item->conn from soup_message_io_stop(), we should be moving the "soup_connection_set_state (io->item->conn, SOUP_CONNECTION_IDLE)" *out* of soup_message_io_stop() (and into SoupSessionAsync/Sync). Need to figure out what to do in the abnormal io_clean/stop cases though.
(In reply to comment #13) > This broke "make check". We both should have tested that... :( my bad. I totally forgot about checking the test results... > I think the correct fix here is that rather than unreffing and NULLing > item->conn from soup_message_io_stop(), we should be moving the > "soup_connection_set_state (io->item->conn, SOUP_CONNECTION_IDLE)" *out* of > soup_message_io_stop() (and into SoupSessionAsync/Sync). Need to figure out > what to do in the abnormal io_clean/stop cases though. Which cases?
(In reply to comment #14) > > soup_message_io_stop() (and into SoupSessionAsync/Sync). Need to figure out > > what to do in the abnormal io_clean/stop cases though. > > Which cases? If we're removing the SOUP_CONNECTION_IDLE/unref from soup_connection_io_stop(), then we need to make sure that every code path that would previously have passed through there now does the IDLE/unref somewhere else. soup_message_io_stop() is currently only called from soup_message_io_cleanup(). soup_message_io_cleanup() is called from: - soup_message_io_finished() (the normal case, in which the IDLE/unref will happen from the completion_cb()) - new_iostate() - soup_message_finalize() - soup_http_input_stream_seek() The simple fix may be to add a gboolean param to soup_message_io_cleanup() saying whether or not to drop the connection.
(In reply to comment #15) > (In reply to comment #14) > > > soup_message_io_stop() (and into SoupSessionAsync/Sync). Need to figure out > > > what to do in the abnormal io_clean/stop cases though. > > > > Which cases? > > If we're removing the SOUP_CONNECTION_IDLE/unref from > soup_connection_io_stop(), then we need to make sure that every code path that > would previously have passed through there now does the IDLE/unref somewhere > else. OK you mean the other situations where it's called, I got confused with the "abnormal" word :)
(In reply to comment #15) > (In reply to comment #14) > The simple fix may be to add a gboolean param to soup_message_io_cleanup() > saying whether or not to drop the connection. I don't think this will work because it is not possible to know the right value for the boolean argument for all cases. For example, soup_message_io_cleanup() is called from soup_message_io_finished() that is part of many different code paths. I think we could try another approach. The problem is that 2 different queue items can share a SoupConnection because we are not removing the connection from the queue item in some cases. What about checking in soup_session_get_connection() (in the if block at the beginning) if the connection is being used by some other item, and then in those cases, do not return TRUE and ask for another connection? I know that we weren't fixing the underlying problem, but I found quite difficult to change that critical code without breaking anything.
Created attachment 191820 [details] [review] Patch I was thinking in something like this (which passes the tests BTW)
Any comment Dan?
(In reply to comment #19) > Any comment Dan? It seems like it just pushes the race condition around; there is still a period of time when msg1's queueitem claims it owns the connection, but it doesn't really. I'd feel better about the patch if we had a test case for it. I tried to create one based on your description from comment 0, and I came up with a test that fails, but it's not failing in quite the way you describe, and your patch doesn't fix it.
Created attachment 193442 [details] [review] SoupSession: make pause/unpause work in any state
Created attachment 193443 [details] [review] redirect-test: add a test for connections getting wrongly reused
Created attachment 197742 [details] [review] redirect-test: add a test for connections getting wrongly reused Basically the same test but with a couple of important changes: 1) instead of unpausing msg1 on msg2's "wrote-headers" we do it on "got-body". That way we ensure that msg2 properly finishes before msg1, and thus it's not cancelled by soup_test_session_abort. This is not strictly needed by the test (as we can plainly ignore the outcome of the msg2). 2) I added a check just after unpausing msg1 that tests if both messages are actually sharing the same connection. That's the place to do it because msg2 is running and msg1 was previously restarted. This test still needs Dan's patch to make pause/unpause work in any state.
Created attachment 197743 [details] [review] SoupSession: set the connection to IDLE on unqueuing the messages With this new patch SoupConnections are only set to IDLE state when unqueuing the messages. That way we can be sure that 2 running messages will ever share a connection. The previously submitted test works fine with this patch and fails with current master. I had to fix also a bug in SoupSessionSync that appeared after removing the soup_connection_set_status(IDLE) from soup_message_io_stop. Instead of checking for !=disconnected connections (because set_status(IDLE) was causing non-keepalive connections to disconnect) we have to check that the IO is not in progress (which probes that soup_message_io_stop was called).
Yay! I tweaked the test case to not need to include private headers and committed everything. To be included in 2.36.1.
(In reply to comment #25) > Yay! > > I tweaked the test case to not need to include private headers and > committed everything. To be included in 2.36.1. Cool, smart tweak BTW
This change, in particular the early return for paused messages in process_queue_item() causes messages to be leaked when aborting an async SoupSession, see bug #673905.