Bug 583795 - no more automatic search in location entry
Status: RESOLVED FIXED
Product: epiphany
Classification: Core
Component: Backend
Version: git master
Hardware/OS: Other Linux
Priority: Normal
Severity: normal
Target Milestone: ---
Assigned To: Xan Lopez
QA Contact: Epiphany Maintainers
Duplicates: 525683
Depends on:
Blocks:
Reported: 2009-05-25 12:27 UTC by Baptiste Mille-Mathias
Modified: 2009-12-17 18:41 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
- Add automatic search (1.18 KB, patch) - 2009-05-25 19:12 UTC, Michael Kuhn - flag: none
- Use load-error signal (2.99 KB, patch) - 2009-05-28 18:07 UTC, Michael Kuhn - flag: needs-work
- Use load-error signal (3.67 KB, patch) - 2009-07-11 10:37 UTC, Michael Kuhn - flag: none
- [PATCH] Use the load-error signal to reimplement search on the location bar (3.67 KB, patch) - 2009-07-30 15:10 UTC, Gustavo Noronha (kov) - flag: none
- patch updated for the refactoring of uri normalization (4.62 KB, patch) - 2009-08-20 21:51 UTC, Gustavo Noronha (kov) - flag: none
- using also gregex (6.62 KB, patch) - 2009-08-23 17:39 UTC, Gustavo Noronha (kov) - flag: none
- with the other changes done (6.51 KB, patch) - 2009-08-23 20:41 UTC, Gustavo Noronha (kov) - flag: none
- one more try (3.78 KB, patch) - 2009-08-26 17:33 UTC, Gustavo Noronha (kov) - flag: none

Description Baptiste Mille-Mathias 2009-05-25 12:27:53 UTC
I typed some words in the URL location entry, pressed <Enter>, and ended up with the following message in the browser page:
=====================================================
Unable to load page

Problem occurred while loading the URL http://blah foo bar/

Cannot resolve hostname
=====================================================
Instead I expected it to look up the words in Google automatically, and I think it used to work like that before.

cheers

Version: 2.27.2
Comment 1 Michael Kuhn 2009-05-25 19:12:26 UTC
Created attachment 135337 [details] [review]
Add automatic search

I'm using the attached patch to enable automatic search in the location entry.
I don't know if this is the right way to do it, but it works well for me.
Comment 2 Gustavo Noronha (kov) 2009-05-25 23:31:07 UTC
Except for the braces in the else block, this looks good to me. We should probably try to make this more reliable, though. Your code would choke on a search like this:

how do I open .txt files?

Even though it is obviously not an address. Since Epiphany already depends on libsoup, should we try to add http://, and pass the resulting string to soup_uri_new, and if it returns NULL, we go ahead and use the original string as a search string?
Comment 3 Michael Kuhn 2009-05-26 10:23:12 UTC
I tried your idea. The problem is that soup_uri_new seems to accept pretty much anything as a valid URI. For example, it returned a URI for "http://how do I open .txt files?".
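The leniency Michael describes is the crux of the problem: soup_uri_new() does no real hostname validation. A minimal sketch of the kind of stricter check being discussed (hypothetical code, not from any of the attached patches; the function name is made up for illustration):

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical stricter check: a plausible hostname contains only
 * letters, digits, hyphens, and dots, with at least one dot and no
 * spaces.  soup_uri_new() performs no such validation, which is why
 * "http://how do I open .txt files?" parses "successfully". */
static int
looks_like_hostname (const char *s)
{
  int has_dot = 0;

  if (*s == '\0')
    return 0;

  for (; *s != '\0'; s++)
    {
      if (*s == '.')
        has_dot = 1;
      else if (!isalnum ((unsigned char) *s) && *s != '-')
        return 0;
    }

  return has_dot;
}
```

Under this check, "www.gnome.org" would be treated as an address while "how do I open .txt files?" would fall through to a search, since the spaces and '?' fail the character test.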
Comment 4 Xan Lopez 2009-05-28 16:25:18 UTC
(In reply to comment #2)
> Except for the braces in the else block, this looks good to me. We should
> probably try to make this more reliable, though. Your code would choke on a
> search like this:
> 
> how do I open .txt files?
> 
> Even though it is obviously not an address. Since Epiphany already depends on
> libsoup, should we try to add http://, and pass the resulting string to
> soup_uri_new, and if it returns NULL, we go ahead and use the original string
> as a search string?
> 

I was thinking of connecting to 'load-error' and letting DNS figure out for us whether the URL exists. That seems the most reliable way to go about this, IMHO.
Comment 5 Michael Kuhn 2009-05-28 18:07:06 UTC
Created attachment 135504 [details] [review]
Use load-error signal

I updated my patch to use the load-error signal and it's working pretty well. The only side effect is that mistyped URLs are now searched for. For example, typing "google.foo" now triggers a search.
Comment 6 Gustavo Noronha (kov) 2009-06-25 18:39:47 UTC
+    wembed->priv->loading_uri = g_strdup (effective_url);
+
+    webkit_web_view_open (wembed->priv->web_view, effective_url);
+
+    g_free (effective_url);

The patch is looking good, if we think load-error is good enough for this. I would like to see an extra check there to make sure we only search because of DNS errors, not because of any other kind of load error. Also, you don't need this g_strdup/g_free dance here: you already strdup'ed the string when you called g_strdup_printf a few lines earlier, so just assign it. Finally, you want to use webkit_web_view_load_uri, since _open is deprecated.
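The g_strdup/g_free point is about ownership: g_strdup_printf already returns a freshly allocated string, so the caller can keep that pointer directly rather than duplicating it and freeing the original. A plain-stdlib sketch of the same idea (hypothetical names, using malloc/snprintf in place of the GLib calls):

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* build_url() returns a freshly allocated string, just like
 * g_strdup_printf does in the patch. */
static char *
build_url (const char *host)
{
  size_t len = strlen ("http://") + strlen (host) + 1;
  char *url = malloc (len);

  if (url != NULL)
    snprintf (url, len, "http://%s", host);

  return url;
}

struct embed_priv
{
  char *loading_uri;
};

/* Take ownership of the string the caller already allocated:
 * no extra strdup/free round-trip is needed. */
static void
set_loading_uri (struct embed_priv *priv, char *owned_url)
{
  free (priv->loading_uri);
  priv->loading_uri = owned_url;
}
```

The reviewed patch did the equivalent of `priv->loading_uri = strdup (url); free (url);`, which allocates and frees one string for nothing.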
Comment 7 Michael Kuhn 2009-07-11 10:37:07 UTC
Created attachment 138236 [details] [review]
Use load-error signal

I'm attaching an updated version of the patch, which works with the current master.
Comment 8 Gustavo Noronha (kov) 2009-07-30 15:10:54 UTC
Created attachment 139564 [details] [review]
[PATCH] Use the load-error signal to reimplement search on the location bar


Changes slightly modified by Gustavo Noronha Silva to apply comments
about also checking whether the error is related to DNS (and the other
error soup uses when we try to load multiple words as a URI).

Bug #583795
---
 embed/ephy-embed.c |   44 ++++++++++++++++++++++++++++++++++++++++++++
 embed/ephy-embed.h |    5 +++--
 src/ephy-link.c    |    5 +++++
 3 files changed, 52 insertions(+), 2 deletions(-)
Comment 9 Gustavo Noronha (kov) 2009-07-30 15:11:35 UTC
This is my proposed patch. It fixes the problems I was having with the original patch.
Comment 10 Michael Kuhn 2009-08-10 15:38:44 UTC
Just found a little problem with this approach: Searching for a word with umlauts (like "zurück") triggers a search for "xn--zurck-mva". I guess this is due to libsoup or WebKit converting the domain name according to the IDNA standard.
Comment 11 Gustavo Noronha (kov) 2009-08-20 21:51:27 UTC
Created attachment 141299 [details] [review]
patch updated for the refactoring of uri normalization

This is the same patch, updated after the small refactoring in uri normalization.
Comment 12 Xan Lopez 2009-08-21 12:45:18 UTC
(In reply to comment #11)
> Created an attachment (id=141299) [details]
> patch updated for the refactoring of uri normalization
> 
> This is the same patch, updated after the small refactoring in uri
> normalization.

This still has the problem mentioned in comment #10. Maybe using load-error was a bad idea, and we should just use a clever regexp to decide whether to load or google the text (it would be good to see what Gecko does, since we are just copying here).
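The up-front heuristic Xan suggests can be sketched without any regex machinery. This is a simplified illustration of the classification logic (hypothetical function, not the GRegex-based code that later patches add):

```c
#include <string.h>

/* Hypothetical up-front classifier: decide before loading whether the
 * location-entry text looks like an address or a search, instead of
 * waiting for a load-error.  Roughly the shape of check the later
 * patches implement with a GRegex. */
static int
looks_like_address (const char *text)
{
  const char *p;

  /* Explicit schemes and well-known names are always addresses. */
  if (strncmp (text, "http://", 7) == 0 ||
      strncmp (text, "https://", 8) == 0 ||
      strncmp (text, "about:", 6) == 0 ||
      strcmp (text, "localhost") == 0)
    return 1;

  /* Anything containing whitespace is a search ("how do I open
   * .txt files?"), even if it also contains dots. */
  for (p = text; *p != '\0'; p++)
    if (*p == ' ' || *p == '\t')
      return 0;

  /* A bare word with a dot ("gnome.org") is probably an address. */
  return strchr (text, '.') != NULL;
}
```

A side benefit of classifying before the load starts is that the text never reaches libsoup's IDNA normalization, so a word like "zurück" is searched as typed rather than as "xn--zurck-mva" (the problem from comment #10).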

Also, the 'handle_load_error' thing seems to behave the opposite way one would expect from the doc/name: if it's TRUE the view *won't* show an error page (as the doc says it will), but will google the text instead. In any case I think the function should be made private (with a leading _), and we should get the embed by going up the hierarchy from the view instead of asking for the active one.

All of this applies in case we'll still use load-error of course, otherwise we'd replace the URI to load in-place.
Comment 13 Gustavo Noronha (kov) 2009-08-23 17:39:00 UTC
Created attachment 141503 [details] [review]
using also gregex

Here's an updated patch that also uses a regex (before falling back to load-error); this addresses the problem with non-ASCII chars. I fixed the documentation of the function, but didn't address making the function private, or how to get the embed. It would be good to have this in the next release, so if you feel the patch is almost there and want to handle those, please do =D.
Comment 14 Gustavo Noronha (kov) 2009-08-23 20:41:38 UTC
Created attachment 141508 [details] [review]
with the other changes done
Comment 15 Dan Winship 2009-08-26 14:53:01 UTC
(In reply to comment #3)
> I tried your idea. The problem is that soup_uri_new seems to accept pretty much
> anything as a valid URI. For example, it returned a URI for "http://how do I
> open .txt files?".

Yeah, soup_uri_new() tries to be non-judgemental. There's a macro SOUP_URI_VALID_FOR_HTTP() that does some additional checks, although it doesn't check that the "hostname" is actually a syntactically valid hostname.

(In the past, that string would have been rejected because of the spaces, but we changed SoupURI to allow un-encoded spaces. But probably that's a bug and it should only allow them in the path/query/fragment, not in the hostname...)

(In reply to comment #14)
> Created an attachment (id=141508) [details]

+      (error->code == SOUP_STATUS_CANT_RESOLVE ||
+       error->code == SOUP_STATUS_SWITCHING_PROTOCOLS)) {

SOUP_STATUS_SWITCHING_PROTOCOLS does not mean what you think it does, and is not really relevant here.

Unfortunately, this test probably won't work if the user is using a proxy, because libsoup doesn't try to resolve the hostname itself in that case, it leaves it to the proxy, and it's not totally clear what error the proxy will return in that case if the hostname is invalid.

+    /* Remove slash that is added to the URI when DNS fails.
+     * FIXME: who is adding this slash here? */
+    if (error->code == SOUP_STATUS_CANT_RESOLVE)
+      tmp_uri[strlen (tmp_uri) - 1] = '\0';

The http spec defines that "http://foo" and "http://foo/" mean the same thing, and libsoup canonicalizes to the latter.

Trying to undo the effect of the string->url->string conversion is tricky. Eg, the code above would break if there was already a "/" in the search string (because then another one won't get added, and so you'll remove the last character of the search string instead). And if there's a ":" in the search string then all sorts of weird things could happen. Is there any way you can get back the original location bar contents instead?
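Dan's objection can be made concrete: chopping the last byte is only correct when canonicalization really did append a '/'. A small sketch contrasting the patch's unconditional truncation with a safer variant (illustrative code, not from the patch):

```c
#include <string.h>

/* What the patch effectively does: remove the last character
 * unconditionally.  For "blah foo bar/" that works, but for an input
 * that already contained a '/' ("foo/bar"), libsoup appends nothing,
 * and blind truncation corrupts the search string. */
static void
strip_last_char (char *s)
{
  size_t len = strlen (s);

  if (len > 0)
    s[len - 1] = '\0';
}

/* A safer variant only strips a character that actually is a
 * trailing slash, leaving other strings untouched. */
static void
strip_trailing_slash (char *s)
{
  size_t len = strlen (s);

  if (len > 0 && s[len - 1] == '/')
    s[len - 1] = '\0';
}
```

Even the safer variant cannot recover an original search string that legitimately ended in '/', which is why Dan asks whether the original location-bar contents can be retrieved instead of reversing the string->URL->string round-trip.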

+  priv->search_regex = g_regex_new ("(^localhost$|^[0-9]+\\.[0-9]+\\.[0-9]+\\.[0-9]$|"
+                                    "^[^[\\.[:space:]]+\\.[^\\.[:space:]]+.*$|"
+                                    "^https?://[^/\\.[:space:]]+.*$|"
+                                    "^about:.*$)",
+                                    G_REGEX_OPTIMIZE, G_REGEX_MATCH_NOTEMPTY, &error);

This confused me until I realized it's really "NON_search_regex"...

It doesn't match IPv6 address literals :)

It might work better to call soup_uri_new(), and if it fails or has an unrecognized protocol then treat it as a search, and if not, do a regex check on just the domain name? (Can also use g_hostname_is_ip_address! :)

Probably SoupURI should be fixed to be more stringent too? Although I don't want to break other use cases.
Comment 16 Gustavo Noronha (kov) 2009-08-26 17:33:03 UTC
Created attachment 141767 [details] [review]
one more try
Comment 17 Gustavo Noronha (kov) 2009-08-26 17:54:56 UTC
(In reply to comment #16)
> Created an attachment (id=141767) [details]
> one more try

Pushed to master.
Comment 18 Reinout van Schouwen 2009-12-17 18:41:49 UTC
*** Bug 525683 has been marked as a duplicate of this bug. ***