After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 640414 - Try to force keep-alive for HTTP/1.0 connections
Try to force keep-alive for HTTP/1.0 connections
Status: RESOLVED FIXED
Product: libsoup
Classification: Core
Component: HTTP Transport
2.33.x
Other Linux
: Normal normal
: ---
Assigned To: libsoup-maint@gnome.bugs
libsoup-maint@gnome.bugs
Depends on:
Blocks:
 
 
Reported: 2011-01-24 13:31 UTC by Sergio Villar
Modified: 2011-02-16 10:08 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Patch (3.71 KB, patch)
2011-01-24 13:34 UTC, Sergio Villar
reviewed Details | Review
Patch v2 (3.35 KB, patch)
2011-01-25 09:35 UTC, Sergio Villar
accepted-commit_now Details | Review

Description Sergio Villar 2011-01-24 13:31:52 UTC
Trying to keep connections alive is a good idea if we plan to issue multiple consecutive requests to the same host in a row, as new connections would not have to be created, we can just reuse the old ones.

This is the default behaviour for HTTP/1.1, but that is not the case for servers using HTTP/1.0. You might think that nobody is using that but huge websites as Wikipedia actually do.
Comment 1 Sergio Villar 2011-01-24 13:34:35 UTC
Created attachment 179164 [details] [review]
Patch

This patch adds a new field to SoupSessionHost that tracks the HTTP version used by the remote host. If HTTP/1.0 is used then we add the "Connection: keep-alive" header to the following requests to that host.
Comment 2 Sergio Villar 2011-01-24 13:40:13 UTC
Just some numbers. I tracked the amount of new connections performing the following test:

1) Went to http://www.wikipedia.org
2) Clicked on Portuguese version
3) Clicked on the featured article of the day
4) Clicked on the first link of the article

This threw the following numbers:
   * current version:  152 new connections
   * patched version:  33 new connections
Comment 3 Dan Winship 2011-01-24 22:49:40 UTC
Comment on attachment 179164 [details] [review]
Patch

>+		soup_message_headers_append (item->msg->request_headers,
>+					     "Connection", "keep-alive");

The standard "spelling" is "Keep-Alive" with capital letters, and there are probably some stupid servers out there that will only recognize it that way.

Additionally, you should make sure it's not already set:

        conn_header = soup_message_headers_get_list (item->msg->request_headers);
        if (!conn_header || (!soup_header_contains (conn_header, "Keep-Alive") &&
                             !soup_header_contains (conn_header, "close")))
                soup_message_headers_append (...)



Beyond that... I'm not sure it's really worth doing the whole keeping-track-of-1.1-vs-1.0 thing: the "Connection: Keep-Alive" is harmless (other than being a few additional bytes) when sent to an HTTP/1.1 server... So we could just add the header unconditionally.

If we are keeping the 1.1-vs-1.0 split, then:

>+	/* The session will listen to changes in http version if
>+	 * needed when the message is started.
>+	 */
>+	g_signal_handlers_disconnect_by_func (msg, http_version_changed, session);

redirect_handler is definitely the wrong place for this; you should connect to the signal from queue_message() rather than from soup_session_send_queue_item(), and you should use the SoupMessageQueueItem rather than the SoupSession as the user_data. And then you don't have to worry about removing it because soup_session_unqueue_item() will do it for you.

And then you should extend do_msg_reuse_test() in misc-test.c to verify that the handler does get disconnected in this case. (Although I think that currently only tests SoupMessage signals and won't notice if a GObject::notify signal handler is leaked...)

>+	g_object_get (msg, SOUP_MESSAGE_HTTP_VERSION, &version, NULL);

Use the direct accessor (soup_message_get_http_version()). It's faster.

>+	/* Do never upgrade the HTTP version for a host. Note that all
>+	 * the response fields (including the http version) are reset
>+	 * just before sending a message.
>+	 */

Yeah... so really, what you want is to connect to got-headers, not notify::http-version. (Also, grammar note: Lose the word "Do" in that comment.)
Comment 4 Sergio Villar 2011-01-25 08:41:14 UTC
(In reply to comment #3)
> (From update of attachment 179164 [details] [review])
> >+		soup_message_headers_append (item->msg->request_headers,
> >+					     "Connection", "keep-alive");
> 
> The standard "spelling" is "Keep-Alive" with capital letters, and there are
> probably some stupid servers out there that will only recognize it that way.
> 
> Additionally, you should make sure it's not already set:
> 
>         conn_header = soup_message_headers_get_list
> (item->msg->request_headers);
>         if (!conn_header || (!soup_header_contains (conn_header, "Keep-Alive")
> &&
>                              !soup_header_contains (conn_header, "close")))
>                 soup_message_headers_append (...)

Good point

> Beyond that... I'm not sure it's really worth doing the whole
> keeping-track-of-1.1-vs-1.0 thing: the "Connection: Keep-Alive" is harmless
> (other than being a few additional bytes) when sent to an HTTP/1.1 server... So
> we could just add the header unconditionally.

Me neither. That's the first thing I thought, for HTTP/1.1 we would be just sending a couple more bytes than needed. Actually firefox always send that header unconditionally. Chrome only sends it for HTTP/1.0
 
> If we are keeping the 1.1-vs-1.0 split, then:
> 
> >+	/* The session will listen to changes in http version if
> >+	 * needed when the message is started.
> >+	 */
> >+	g_signal_handlers_disconnect_by_func (msg, http_version_changed, session);
> 
> redirect_handler is definitely the wrong place for this; you should connect to
> the signal from queue_message() rather than from
> soup_session_send_queue_item(), and you should use the SoupMessageQueueItem
> rather than the SoupSession as the user_data. And then you don't have to worry
> about removing it because soup_session_unqueue_item() will do it for you.

The disconnection in redirect handler was just used to avoid double connection, but indeed it'd be better to do it how you say.

> And then you should extend do_msg_reuse_test() in misc-test.c to verify that
> the handler does get disconnected in this case. (Although I think that
> currently only tests SoupMessage signals and won't notice if a GObject::notify
> signal handler is leaked...)

I'll check
 
> >+	g_object_get (msg, SOUP_MESSAGE_HTTP_VERSION, &version, NULL);
> 
> Use the direct accessor (soup_message_get_http_version()). It's faster.
> 
> >+	/* Do never upgrade the HTTP version for a host. Note that all
> >+	 * the response fields (including the http version) are reset
> >+	 * just before sending a message.
> >+	 */
> 
> Yeah... so really, what you want is to connect to got-headers, not
> notify::http-version. (Also, grammar note: Lose the word "Do" in that comment.)

Yeah that was also my first idea but I changed my mind afterwards, don't recall the reason tough :)
Comment 5 Sergio Villar 2011-01-25 09:35:28 UTC
Created attachment 179274 [details] [review]
Patch v2

Second version of the patch with the requested changes.

The logic to keep track of HTTP/1.0 hosts is really straightforward and IMHO will not cause a noticeable impact on performance. So we could keep it instead of unconditionally adding a new header, which is not that bad by itself, but will save us some network traffic.

Changing the test is not needed now as we connect to "got-headers"
Comment 6 Dan Winship 2011-02-15 20:23:04 UTC
Comment on attachment 179274 [details] [review]
Patch v2

> The logic to keep track of HTTP/1.0 hosts is really straightforward and IMHO
> will not cause a noticeable impact on performance.

OK, I always worry about adding more cases that have to lock mutexes, but it's probably not a big deal.
Comment 7 Sergio Villar 2011-02-16 10:08:50 UTC
Committed 382ae1da1a23631e5c1274129077be4fcfcf84c4