Bug 757248 - use DNS for deciding auto-search
Status: RESOLVED OBSOLETE
Product: epiphany
Classification: Core
Component: General
Version: 3.27.x
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: Epiphany Maintainers
QA Contact: Epiphany Maintainers
Duplicates: 741294 771990
Depends on:
Blocks:
 
 
Reported: 2015-10-28 11:35 UTC by kapouer
Modified: 2018-08-03 20:41 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Add check for dns name resolution (1.65 KB, patch)
2018-02-10 22:18 UTC, Jan-Michael Brummer
needs-work

Description kapouer 2015-10-28 11:35:51 UTC
The official list of TLDs is here:
http://data.iana.org/TLD/tlds-alpha-by-domain.txt

It could be used by epiphany to decide whether to prepend http:// to the
input or to search for it.

This is related to
https://bugzilla.gnome.org/show_bug.cgi?id=681022
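
A minimal sketch of the idea in this description, assuming the IANA file has been loaded into an array of TLD strings. The array `sample_tlds` is a hypothetical four-entry excerpt, and `has_known_tld` is an illustrative name, not an Epiphany function. It compares the text after the last dot against the list, ignoring case.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical excerpt; a real build would load the full IANA file. */
static const char *const sample_tlds[] = { "COM", "NET", "ORG", "FR" };

/* Case-insensitive string equality for ASCII text. */
static bool
equals_ignore_case (const char *a, const char *b)
{
  while (*a != '\0' && *b != '\0') {
    if (tolower ((unsigned char) *a) != tolower ((unsigned char) *b))
      return false;
    a++;
    b++;
  }
  return *a == *b;  /* both strings must end together */
}

/* Return true if the text after the last '.' matches a listed TLD. */
static bool
has_known_tld (const char *host)
{
  assert (host != NULL);

  const char *dot = strrchr (host, '.');
  if (dot == NULL || dot[1] == '\0')
    return false;  /* no dot at all, or an empty final label */

  for (size_t i = 0; i < sizeof sample_tlds / sizeof sample_tlds[0]; i++) {
    if (equals_ignore_case (dot + 1, sample_tlds[i]))
      return true;
  }
  return false;
}
```

With this excerpt, "example.com" is accepted while "localhost" and "hello.world" are not. Any TLD missing from the loaded list is rejected, so the result is only as current as the list.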
Comment 1 Michael Catanzaro 2015-10-28 14:52:06 UTC
OK, I have no better solution.

There are multiple other bugs about this, but I can't immediately find them.
Comment 2 Michael Catanzaro 2016-08-08 21:10:37 UTC
Looks like we already use something similar, via soup_ltd_domain_is_public_suffix:

https://publicsuffix.org/

It's maintained in libsoup, where it hasn't been updated in three years:

https://git.gnome.org/browse/libsoup/tree/data/effective_tld_names.dat

It carries an explicit warning that it's wrong to use it to decide whether to auto-search:

"""Some people use the PSL to determine what is a valid domain name and what isn't. This is dangerous, particularly in these days where new gTLDs are arriving at a rapid pace, if your software does not regularly receive PSL updates, because it will erroneously think new gTLDs are not valid. The DNS is the proper source for this information. If you must use it for this purpose, please do not bake static copies of the PSL into your software with no update mechanism."""

Indeed, the list in libsoup has not been updated in three years.
Comment 3 Michael Catanzaro 2016-08-08 21:11:14 UTC
(In reply to Michael Catanzaro from comment #2)
> Looks like we already use something similar, via
> soup_ltd_domain_is_public_suffix:

Twiddled that, I mean soup_tld_domain_is_public_suffix()
Comment 4 Michael Catanzaro 2016-08-08 21:16:59 UTC
(In reply to Michael Catanzaro from comment #2)
> Indeed, the list in libsoup has not been updated in three years.

Bug #769650
Comment 5 Michael Catanzaro 2016-08-08 21:20:25 UTC
(In reply to Michael Catanzaro from comment #2)
> The DNS is the proper source for this information.

You know, this should have been really blindingly obvious... but I didn't think of that.
Comment 6 Michael Catanzaro 2016-09-26 17:29:32 UTC
*** Bug 771990 has been marked as a duplicate of this bug. ***
Comment 7 Jan-Michael Brummer 2018-02-10 22:18:33 UTC
Created attachment 368225
Add check for dns name resolution

The patch adds a DNS name resolution check.
Comment 8 Michael Catanzaro 2018-02-11 19:40:58 UTC
*** Bug 741294 has been marked as a duplicate of this bug. ***
Comment 9 Michael Catanzaro 2018-02-11 21:02:55 UTC
Review of attachment 368225:

::: embed/ephy-embed-utils.c
@@ +207,3 @@
+  resolver = g_resolver_get_default ();
+
+  list = g_resolver_lookup_by_name (resolver, address, NULL, NULL);

We can't hang the UI process to do a DNS lookup. This needs to use g_resolver_lookup_by_name_async.

And that's going to complicate everything, because then we need to turn ephy_embed_utils_address_is_valid into an asynchronous function. That will make it much harder to call. Fortunately, it's only used in one place: ephy_embed_utils_normalize_or_autosearch_address. But then we have to turn that into an async function, and it's used in a couple more places (ephy-location-controller.c, ephy-notebook.c, ephy-web-view-test.c). This might take some effort to get right.

Lastly, we have one privacy problem: this is a dangerous information leak if the user is using certain SOCKS proxies where DNS resolution is expected to be performed by the proxy rather than locally (e.g. Tor). In that case, it's better to fail and force the user to manually type in the URI scheme "https://" rather than make a DNS request. So we should check if a proxy is in use before doing this. And that also has to be done asynchronously. Look in Source/WebCore/platform/network/soup/DNSSoup.cpp for an example of how to do this.

@@ +233,3 @@
            g_regex_match (get_non_search_regex (), address, 0, NULL) ||
            is_public_domain (address) ||
+           is_bang_search (address) ||

Probably should check is_bang_search at the top, after "scheme ||", since that's less work than checking for a file on disk, regex matching, parsing the public suffix list, or making a DNS query.
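
To illustrate the ordering point: C's || operator evaluates left to right and stops at the first true operand, so putting the cheap check first skips the costly ones. In this sketch both predicates are hypothetical stand-ins, not Epiphany's real is_bang_search or its other checks. The leading '!' rule in particular is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Counts calls to the expensive stand-in so the short-circuit is observable. */
static int expensive_calls = 0;

/* Hypothetical stand-in: treats a leading '!' as a bang search. */
static bool
is_bang_search_stub (const char *address)
{
  return address[0] == '!';
}

/* Hypothetical stand-in for the costly checks (file on disk, regex,
 * public suffix list, DNS query). Here it only looks for a dot. */
static bool
expensive_check_stub (const char *address)
{
  expensive_calls++;
  return strchr (address, '.') != NULL;
}

/* Cheap check first: the expensive stand-in runs only if it fails. */
static bool
address_is_valid_sketch (const char *address)
{
  assert (address != NULL);
  return is_bang_search_stub (address) || expensive_check_stub (address);
}
```

Calling address_is_valid_sketch on a string that starts with '!' leaves expensive_calls unchanged, while any other input increments it once.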

@@ +234,3 @@
            is_public_domain (address) ||
+           is_bang_search (address) ||
+           is_dns_resolvable (address);

I guess you've checked to make sure this code is only ever reached when the address looks like a website that's missing from the public suffix list, and not a search term, right? We don't want to be sending the user's search terms out in plaintext to the DNS server.
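
One possible guard for this concern, sketched under assumptions: reject anything that is not shaped like a DNS hostname before it can reach a resolver. `looks_like_hostname` is a hypothetical helper, not part of the patch or of Epiphany. The rules it applies (dot-separated labels of 1 to 63 letters, digits or hyphens, no leading or trailing hyphen, 253 characters at most) are the usual hostname limits, and ASCII-only input is assumed.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true only if text is shaped like a DNS hostname. */
static bool
looks_like_hostname (const char *text)
{
  assert (text != NULL);

  size_t len = strlen (text);
  if (len == 0 || len > 253)
    return false;

  size_t label_len = 0;
  for (size_t i = 0; i < len; i++) {
    unsigned char c = (unsigned char) text[i];
    if (c == '.') {
      if (label_len == 0 || text[i - 1] == '-')
        return false;  /* empty label, or label ending in '-' */
      label_len = 0;
    } else if (isalnum (c) || c == '-') {
      if (c == '-' && label_len == 0)
        return false;  /* label starting with '-' */
      if (++label_len > 63)
        return false;  /* label too long */
    } else {
      return false;  /* spaces and punctuation mark a search term */
    }
  }
  /* The final label must be non-empty and must not end in '-'. */
  return label_len > 0 && text[len - 1] != '-';
}
```

A multi-word query such as "how to bake bread" is rejected, so it would never be sent to a DNS server. A single word such as "weather" still passes, so this guard narrows the leak rather than eliminating it. Internationalized names would need conversion to their ASCII form first.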
Comment 10 Michael Catanzaro 2018-02-11 21:04:12 UTC
(In reply to Michael Catanzaro from comment #9)
> In that case, it's
> better to fail and force the user to manually type in the URI scheme
> "https://" rather than make a DNS request.

I wonder how Firefox and Chrome handle this.
Comment 11 Michael Catanzaro 2018-02-24 23:30:07 UTC
(In reply to Michael Catanzaro from comment #10)
> I wonder how Firefox and Chrome handle this.

This is what we need to examine and figure out.
Comment 12 Michael Catanzaro 2018-02-24 23:31:02 UTC
(In reply to Michael Catanzaro from comment #11)
> This is what we need to examine and figure out.

We need to understand this to reenable DNS prefetch in the location entry, as well, which is probably quite important for performance (bug #661455)....
Comment 13 Jan-Michael Brummer 2018-03-01 21:47:08 UTC
Current findings:
 - Firefox stops prefetch as soon as a proxy is in use
 - Resolving flow:
   - Special check for .onion: if true, UNKNOWN_HOST
   - Whitelist check if it is a valid hostname, if not UNKNOWN_HOST
   - Try to convert to IP
   - if no match could be found:
     - start DNS lookup (async)
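
The resolving flow above can be sketched as a pure decision function. This is an illustration of the listed steps, not Firefox's code. `ResolveDecision`, `classify_host` and the helpers are hypothetical names, and the character whitelist (letters, digits, '-', '.', ':') is an assumption. The proxy check and the asynchronous lookup itself are left to the caller.

```c
#include <arpa/inet.h>   /* inet_pton (POSIX) */
#include <assert.h>
#include <ctype.h>
#include <netinet/in.h>  /* struct in6_addr */
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>  /* AF_INET, AF_INET6 */

typedef enum {
  RESOLVE_UNKNOWN_HOST,   /* refuse without touching the network */
  RESOLVE_IS_IP_LITERAL,  /* input converts to an IP address directly */
  RESOLVE_NEEDS_DNS       /* caller should start an asynchronous lookup */
} ResolveDecision;

/* Step 1: case-insensitive check for a ".onion" suffix. */
static bool
ends_with_onion (const char *host)
{
  static const char suffix[] = ".onion";
  size_t host_len = strlen (host);
  size_t suffix_len = sizeof suffix - 1;
  if (host_len < suffix_len)
    return false;
  const char *tail = host + host_len - suffix_len;
  for (size_t i = 0; i < suffix_len; i++) {
    if (tolower ((unsigned char) tail[i]) != suffix[i])
      return false;
  }
  return true;
}

/* Step 2: assumed whitelist; ':' is allowed so IPv6 literals pass. */
static bool
chars_are_whitelisted (const char *host)
{
  if (host[0] == '\0')
    return false;
  for (const char *p = host; *p != '\0'; p++) {
    unsigned char c = (unsigned char) *p;
    if (!isalnum (c) && c != '-' && c != '.' && c != ':')
      return false;
  }
  return true;
}

/* Step 3: try to convert the text to an IPv4 or IPv6 address. */
static bool
is_ip_literal (const char *host)
{
  unsigned char buf[sizeof (struct in6_addr)];
  return inet_pton (AF_INET, host, buf) == 1 ||
         inet_pton (AF_INET6, host, buf) == 1;
}

static ResolveDecision
classify_host (const char *host)
{
  assert (host != NULL);
  if (ends_with_onion (host))
    return RESOLVE_UNKNOWN_HOST;
  if (!chars_are_whitelisted (host))
    return RESOLVE_UNKNOWN_HOST;
  if (is_ip_literal (host))
    return RESOLVE_IS_IP_LITERAL;
  return RESOLVE_NEEDS_DNS;  /* step 4: DNS lookup, done asynchronously */
}
```

Keeping the decision separate from the lookup means that only inputs classified as RESOLVE_NEEDS_DNS ever reach the resolver, and the function can be exercised without network access.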
Comment 14 GNOME Infrastructure Team 2018-08-03 20:41:22 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/epiphany/issues/282.