After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 646959 - SoupSession never forgets a host
SoupSession never forgets a host
Status: RESOLVED FIXED
Product: libsoup
Classification: Core
Component: Misc
unspecified
Other Linux
: Normal normal
: ---
Assigned To: libsoup-maint@gnome.bugs
libsoup-maint@gnome.bugs
Depends on:
Blocks:
 
 
Reported: 2011-04-06 21:24 UTC by Xan Lopez
Modified: 2011-09-19 22:04 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Patch (4.36 KB, patch)
2011-08-26 17:27 UTC, Sergio Villar
reviewed Details | Review
Patch (4.19 KB, patch)
2011-09-08 17:32 UTC, Sergio Villar
reviewed Details | Review
Patch (3.43 KB, patch)
2011-09-19 07:55 UTC, Sergio Villar
accepted-commit_now Details | Review

Description Xan Lopez 2011-04-06 21:24:47 UTC
Today I couldn't open the new gnome.org in Epiphany without restarting the browser, apparently because SoupSession will never forget a host once it learns about it. Dan (wisely) says this is bad, and other browsers handle things fine, so we should fix this.
Comment 1 Dan Winship 2011-04-07 11:48:31 UTC
Right, so in more detail:

SoupSession keeps a hash table of SoupHosts, keyed by scheme+host+port, which are used in particular to keep track of open connections to a host. And each SoupHost also contains a SoupAddress, which is used when opening new connections to that host. And since no SoupHost ever goes away, that means that IP addresses are never re-resolved.

(In addition to the not-re-resolving problem, there's also a "the hash table grows without bounds" problem.)

So I think when a SoupHost has 0 connections, we should set two timeouts, one short-ish and one long-ish, both of which get removed if a new connection is started before they trigger. The short timeout would free-and-regenerate the session's SoupAuth (which will cause it to be re-resolved the next time a connection is made). The long timeout would free the SoupHost itself.

(Actually, I guess we only need one timeout, and we'd set up the short one first, and then set up the long one from the short one's callback.)

Although, this breaks DNS pinning (qv http://en.wikipedia.org/wiki/DNS_rebinding). Do you know what other WebKit ports do here?
Comment 2 Sergio Villar 2011-08-26 17:27:27 UTC
Created attachment 194866 [details] [review]
Patch

Something like this?

I set the timeouts to 5 and 10 minutes. That's of course a totally arbitrary decision.
Comment 3 Dan Winship 2011-08-27 15:13:18 UTC
Comment on attachment 194866 [details] [review]
Patch

>+	guint        lease_timeout;

not sure I like "lease". How about "keep_alive", like in the timeout name.

Also, you can't use g_timeout_add/g_source_remove, because they only work with the default GMainContext. You have to use soup_timeout_add(), and pass the session's priv->async_context, and then keep the GSource returned from that. Hm... the refcounting in those methods is a little broken. You'll have to ref the source since you're keeping a pointer to it, and then unref it when you're done with it.

>+	if (host->lease_timeout) {
>+		g_source_remove (host->lease_timeout);
>+		host->lease_timeout = 0;
>+	}

and this would be

    g_source_destroy (host->keep_alive_source);
    g_source_unref (host->keep_alive_source);

There's no need to set the variable to 0/NULL, because host gets freed at the end anyway.

>+typedef struct {
>+	SoupSessionHost *host;
>+	SoupSession *session;
>+	guint ref_count;
>+} SessionHostHelper;

No need for that. Just add a session field to SoupSessionHost. (I don't think you need the ref count.)
Comment 4 Sergio Villar 2011-09-08 17:32:05 UTC
Created attachment 196017 [details] [review]
Patch

New version addressing Dan's requests
Comment 5 Dan Winship 2011-09-16 01:48:07 UTC
Comment on attachment 196017 [details] [review]
Patch

OK, looks good, except that I changed my mind a little... there really isn't any point in keeping the SoupHost around after freeing the SoupAddress; it's trivial to recreate it later if we need it. So can you get rid of the first timeout, and just have a single timeout and free it when that expires?

>+	g_source_unref (host->keep_alive_src);
>+	host->keep_alive_src = NULL;

This should go in free_host() instead. And g_source_destroy() it as well. (Not needed for this case, but consider the case when a SoupSession is finalized while a keep_alive_src is active.)

>+	g_hash_table_remove (priv->hosts, host->uri);

and you should put a comment above that line pointing out that it results in host being freed, since I missed that the first time I read it and thought that you were leaking the host...
Comment 6 Sergio Villar 2011-09-19 07:55:29 UTC
Created attachment 196901 [details] [review]
Patch
Comment 7 Dan Winship 2011-09-19 22:04:49 UTC
i committed this so it could make it into .92