GNOME Bugzilla – Bug 646959
SoupSession never forgets a host
Last modified: 2011-09-19 22:04:49 UTC
Today I couldn't open the new gnome.org in Epiphany without restarting the browser, apparently because SoupSession will never forget a host once it learns about it. Dan (wisely) says this is bad, and other browsers handle things fine, so we should fix this.
Right, so in more detail: SoupSession keeps a hash table of SoupHosts, keyed by scheme+host+port, which are used in particular to keep track of open connections to a host. And each SoupHost also contains a SoupAddress, which is used when opening new connections to that host. And since no SoupHost ever goes away, that means that IP addresses are never re-resolved. (In addition to the not-re-resolving problem, there's also a "the hash table grows without bounds" problem.) So I think when a SoupHost has 0 connections, we should set two timeouts, one short-ish and one long-ish, both of which get removed if a new connection is started before they trigger. The short timeout would free-and-regenerate the session's SoupAuth (which will cause it to be re-resolved the next time a connection is made). The long timeout would free the SoupHost itself. (Actually, I guess we only need one timeout, and we'd set up the short one first, and then set up the long one from the short one's callback.) Although, this breaks DNS pinning (qv http://en.wikipedia.org/wiki/DNS_rebinding). Do you know what other WebKit ports do here?
Created attachment 194866 [details] [review] Patch Something like this? I set the timeouts to 5 and 10 minutes. That's of course a totally arbitrary decision.
Comment on attachment 194866 [details] [review] Patch >+ guint lease_timeout; not sure I like "lease". How about "keep_alive", like in the timeout name. Also, you can't use g_timeout_add/g_source_remove, because they only work with the default GMainContext. You have to use soup_timeout_add(), and pass the session's priv->async_context, and then keep the GSource returned from that. Hm... the refcounting in those methods is a little broken. You'll have to ref the source since you're keeping a pointer to it, and then unref it when you're done with it. >+ if (host->lease_timeout) { >+ g_source_remove (host->lease_timeout); >+ host->lease_timeout = 0; >+ } and this would be g_source_destroy (host->keep_alive_source); g_source_unref (host->keep_alive_source); There's no need to set the variable to 0/NULL, because host gets freed at the end anyway. >+typedef struct { >+ SoupSessionHost *host; >+ SoupSession *session; >+ guint ref_count; >+} SessionHostHelper; No need for that. Just add a session field to SoupSessionHost. (I don't think you need the ref count.)
Created attachment 196017 [details] [review] Patch New version addressing Dan's requests
Comment on attachment 196017 [details] [review] Patch OK, looks good, except that I changed my mind a little... there really isn't any point in keeping the SoupHost around after freeing the SoupAddress; it's trivial to recreate it later if we need it. So can you get rid of the first timeout, and just have a single timeout and free it when that expires? >+ g_source_unref (host->keep_alive_src); >+ host->keep_alive_src = NULL; This should go in free_host() instead. And g_source_destroy() it as well. (Not needed for this case, but consider the case when a SoupSession is finalized while a keep_alive_src is active.) >+ g_hash_table_remove (priv->hosts, host->uri); and you should put a comment above that line pointing out that it results in host being freed, since I missed that the first time I read it and thought that you were leaking the host...
Created attachment 196901 [details] [review] Patch
i committed this so it could make it into .92