After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 651146 - Incorrect SoupConnection assignment on redirects
Incorrect SoupConnection assignment on redirects
Status: RESOLVED FIXED
Product: libsoup
Classification: Core
Component: Misc
unspecified
Other Linux
: Normal normal
: ---
Assigned To: libsoup-maint@gnome.bugs
libsoup-maint@gnome.bugs
: 640096 650923 652368 (view as bug list)
Depends on:
Blocks:
 
 
Reported: 2011-05-26 13:25 UTC by Sergio Villar
Modified: 2012-04-11 12:58 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Patch (1014 bytes, patch)
2011-05-26 13:33 UTC, Sergio Villar
needs-work Details | Review
Patch (1.13 KB, patch)
2011-05-30 09:59 UTC, Sergio Villar
accepted-commit_now Details | Review
Patch (1.15 KB, patch)
2011-07-12 16:15 UTC, Sergio Villar
none Details | Review
SoupSession: make pause/unpause work in any state (5.72 KB, patch)
2011-08-08 19:41 UTC, Dan Winship
committed Details | Review
redirect-test: add a test for connections getting wrongly reused (4.81 KB, patch)
2011-08-08 19:41 UTC, Dan Winship
none Details | Review
redirect-test: add a test for connections getting wrongly reused (5.48 KB, patch)
2011-09-29 10:47 UTC, Sergio Villar
none Details | Review
SoupSession: set the connection to IDLE on unqueuing the messages (2.04 KB, patch)
2011-09-29 10:51 UTC, Sergio Villar
committed Details | Review

Description Sergio Villar 2011-05-26 13:25:52 UTC
Every time I opened some url (I cannot mention it here as it's some Igalia's internal URL) I was hitting this assertion:

libsoup-CRITICAL **: set_current_item: assertion `priv->cur_item == NULL' failed

After some investigation it turned out that the problem is the following:

1) SoupMsg1 got redirected
2) A queued message (SoupMsg2) asks for a connection
2.1) As SoupMsg1 got redirected its SoupConnection is marked as IDLE
2.2) The reference from SoupConnection to SoupMessageQueueItem is cleared 
2.3) the current SoupConnection assignment algorithm finds that the SoupConnection is in IDLE state and then assign it to SoupMsg2

All this is OK but, the problem is that the SoupMessageQueueItem for SoupMsg1 still keeps a reference to the SoupConnection, so it will not get a new connection when calling soup_session_get_connection() and thus both messages will share the same connection.

This is not a problem in debug builds as the assertion returns FALSE and the connection is not really shared, but for release builds SoupMessages will effectively use the same SoupConnection.
Comment 1 Sergio Villar 2011-05-26 13:33:26 UTC
Created attachment 188667 [details] [review]
Patch
Comment 2 Dan Winship 2011-05-27 18:22:39 UTC
Comment on attachment 188667 [details] [review]
Patch

item->conn should be getting cleared at the point where the connection gets marked IDLE

(also, a test case would be great...)
Comment 3 Sergio Villar 2011-05-30 09:59:13 UTC
Created attachment 188867 [details] [review]
Patch

Still need to figure out the test case
Comment 4 Gustavo Noronha (kov) 2011-05-30 13:49:29 UTC
This looks like it could also fix 650923? =)
Comment 5 Sergio Villar 2011-05-30 14:51:19 UTC
(In reply to comment #4)
> This looks like it could also fix 650923? =)

I think so, I was getting that error also sometimes in the same situation, and your description of the problem looks like quite related. Have you tried the patch?
Comment 6 Gustavo Noronha (kov) 2011-05-31 13:39:11 UTC
Just did, it seems to be fixed by your patch, yes =D \o/
Comment 7 Gustavo Noronha (kov) 2011-05-31 13:39:55 UTC
*** Bug 650923 has been marked as a duplicate of this bug. ***
Comment 8 Dan Winship 2011-05-31 14:04:52 UTC
Comment on attachment 188867 [details] [review]
Patch

this looks right (assuming "make check" still passes with it)

(at some point, in some branch, I had a "soup_message_queue_item_clear_connection()" method that did the IDLE/unref/NULL thing, and then used that everywhere appropriate, but I guess that never made it to master...)

I would really really like to have a test case for this though...
Comment 9 Dan Winship 2011-06-07 20:33:32 UTC
Comment on attachment 188867 [details] [review]
Patch

ok, tried writing a test, but it would be very tricky, since soup_session_pause_message() won't work at that spot. so ok, commit it
Comment 10 Dan Winship 2011-06-07 20:44:19 UTC
*** Bug 640096 has been marked as a duplicate of this bug. ***
Comment 11 Sergio Villar 2011-06-08 06:51:59 UTC
(In reply to comment #9)
> (From update of attachment 188867 [details] [review])
> ok, tried writing a test, but it would be very tricky, since
> soup_session_pause_message() won't work at that spot. so ok, commit it

Unfortunately I couldn't devote too much time to libsoup these days, I have not even tried to write that test. Thx for taking care of it.

Committed c56c66aa2625bc95f54f8a1972e13503f454a51a
Comment 12 Dan Winship 2011-06-13 16:38:38 UTC
*** Bug 652368 has been marked as a duplicate of this bug. ***
Comment 13 Dan Winship 2011-06-13 16:44:28 UTC
This broke "make check". We both should have tested that...

Anyway, https-via-proxy is now broken, because the connection gets released in between the CONNECT and the following request.

I think the correct fix here is that rather than unreffing and NULLing item->conn from soup_message_io_stop(), we should be moving the "soup_connection_set_state (io->item->conn, SOUP_CONNECTION_IDLE)" *out* of soup_message_io_stop() (and into SoupSessionAsync/Sync). Need to figure out what to do in the abnormal io_clean/stop cases though.
Comment 14 Sergio Villar 2011-06-15 09:38:13 UTC
(In reply to comment #13)
> This broke "make check". We both should have tested that...

:( my bad. I totally forgot about checking the test results...

> I think the correct fix here is that rather than unreffing and NULLing
> item->conn from soup_message_io_stop(), we should be moving the
> "soup_connection_set_state (io->item->conn, SOUP_CONNECTION_IDLE)" *out* of
> soup_message_io_stop() (and into SoupSessionAsync/Sync). Need to figure out
> what to do in the abnormal io_clean/stop cases though.

Which cases?
Comment 15 Dan Winship 2011-06-15 13:15:19 UTC
(In reply to comment #14)
> > soup_message_io_stop() (and into SoupSessionAsync/Sync). Need to figure out
> > what to do in the abnormal io_clean/stop cases though.
> 
> Which cases?

If we're removing the SOUP_CONNECTION_IDLE/unref from soup_connection_io_stop(), then we need to make sure that every code path that would previously have passed through there now does the IDLE/unref somewhere else.

soup_message_io_stop() is currently only called from soup_message_io_cleanup().

soup_message_io_cleanup() is called from:

    - soup_message_io_finished() (the normal case, in which the IDLE/unref will happen from the completion_cb())

    - new_iostate()

    - soup_message_finalize()

    - soup_http_input_stream_seek()

The simple fix may be to add a gboolean param to soup_message_io_cleanup() saying whether or not to drop the connection.
Comment 16 Sergio Villar 2011-06-15 15:11:59 UTC
(In reply to comment #15)
> (In reply to comment #14)
> > > soup_message_io_stop() (and into SoupSessionAsync/Sync). Need to figure out
> > > what to do in the abnormal io_clean/stop cases though.
> > 
> > Which cases?
> 
> If we're removing the SOUP_CONNECTION_IDLE/unref from
> soup_connection_io_stop(), then we need to make sure that every code path that
> would previously have passed through there now does the IDLE/unref somewhere
> else.

OK you mean the other situations where it's called, I got confused with the "abnormal" word :)
Comment 17 Sergio Villar 2011-07-08 17:59:57 UTC
(In reply to comment #15)
> (In reply to comment #14)
> The simple fix may be to add a gboolean param to soup_message_io_cleanup()
> saying whether or not to drop the connection.

I don't think this will work because it is not possible to know the right value for the boolean argument for all cases. For example, soup_message_io_cleanup() is called from soup_message_io_finished() that is part of many different code paths.

I think we could try another approach. The problem is that 2 different queue items can share a SoupConnection because we are not removing the connection from the queue item in some cases. What about checking in soup_session_get_connection() (in the if block at the beginning) if the connection is being used by some other item, and then in those cases, do not return TRUE and ask for another connection?

I know that we weren't fixing the underlying problem, but I found quite difficult to change that critical code without breaking anything.
Comment 18 Sergio Villar 2011-07-12 16:15:57 UTC
Created attachment 191820 [details] [review]
Patch

I was thinking in something like this (which passes the tests BTW)
Comment 19 Sergio Villar 2011-08-05 17:24:28 UTC
Any comment Dan?
Comment 20 Dan Winship 2011-08-08 19:40:59 UTC
(In reply to comment #19)
> Any comment Dan?

It seems like it just pushes the race condition around; there is still a period of time when msg1's queueitem claims it owns the connection, but it doesn't really.

I'd feel better about the patch if we had a test case for it. I tried to create one based on your description from comment 0, and I came up with a test that fails, but it's not failing in quite the way you describe, and your patch doesn't fix it.
Comment 21 Dan Winship 2011-08-08 19:41:24 UTC
Created attachment 193442 [details] [review]
SoupSession: make pause/unpause work in any state
Comment 22 Dan Winship 2011-08-08 19:41:26 UTC
Created attachment 193443 [details] [review]
redirect-test: add a test for connections getting wrongly reused
Comment 23 Sergio Villar 2011-09-29 10:47:00 UTC
Created attachment 197742 [details] [review]
redirect-test: add a test for connections getting wrongly reused

Basically the same test but with a couple of important changes:

1) instead of unpausing msg1 on msg2's "wrote-headers" we do it on "got-body". That way we ensure that msg2 properly finishes before msg1, and thus it's not cancelled by soup_test_session_abort. This is not strictly needed by the test (as we can plainly ignore the outcome of the msg2).

2) I added a check just after unpausing msg1 that tests if both messages are actually sharing the same connection. That's the place to do it because msg2 is running and msg1 was previously restarted.

This test still needs Dan's patch to make pause/unpause work in any state.
Comment 24 Sergio Villar 2011-09-29 10:51:29 UTC
Created attachment 197743 [details] [review]
SoupSession: set the connection to IDLE on unqueuing the messages

With this new patch SoupConnections are only set to IDLE state when unqueuing the messages. That way we can be sure that 2 running messages will ever share a connection.

The previously submitted test works fine with this patch and fails with current master. I had to fix also a bug in SoupSessionSync that appeared after removing the soup_connection_set_status(IDLE) from soup_message_io_stop. Instead of checking for !=disconnected connections (because set_status(IDLE) was causing non-keepalive connections to disconnect) we have to check that the IO is not in progress (which probes that soup_message_io_stop was called).
Comment 25 Dan Winship 2011-09-29 14:38:48 UTC
Yay!

I tweaked the test case to not need to include private headers and
committed everything. To be included in 2.36.1.
Comment 26 Sergio Villar 2011-09-29 15:18:37 UTC
(In reply to comment #25)
> Yay!
> 
> I tweaked the test case to not need to include private headers and
> committed everything. To be included in 2.36.1.

Cool, smart tweak BTW
Comment 27 Sven Neumann 2012-04-11 12:58:51 UTC
This change, in particular the early return for paused messages in process_queue_item() causes messages to be leaked when aborting an async SoupSession, see bug #673905.