After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 517960 - Port the url bar completion func to GRegex
Port the url bar completion func to GRegex
Status: RESOLVED FIXED
Product: epiphany
Classification: Core
Component: [obsolete] URL bar
git master
Other Linux
: Normal normal
: gnome-2-24
Assigned To: Epiphany Maintainers
Epiphany Maintainers
Depends on:
Blocks:
 
 
Reported: 2008-02-21 21:47 UTC by Diego Escalante Urrelo (not reading bugmail)
Modified: 2008-08-14 22:48 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
20080221_bgo_517960_regexp-in-url-bar.diff (2.72 KB, patch)
2008-02-21 21:51 UTC, Diego Escalante Urrelo (not reading bugmail)
accepted-commit_after_freeze Details | Review
Awesome regexp matching (not intended to be used by humans) (25.07 KB, image/png)
2008-02-21 21:56 UTC, Diego Escalante Urrelo (not reading bugmail)
  Details
This shows the partially working multiple keyword matching (24.96 KB, image/png)
2008-02-21 21:57 UTC, Diego Escalante Urrelo (not reading bugmail)
  Details
Substring matching, from your mind to your browser. (30.78 KB, image/png)
2008-02-21 21:58 UTC, Diego Escalante Urrelo (not reading bugmail)
  Details
20080221_bgo_517960_regexp-in-url-bar.diff (6.22 KB, patch)
2008-02-21 22:25 UTC, Diego Escalante Urrelo (not reading bugmail)
none Details | Review
20080222_bgo_517960_regexp-in-url-bar.diff (3.28 KB, patch)
2008-02-22 19:59 UTC, Diego Escalante Urrelo (not reading bugmail)
none Details | Review
20080223_bgo_517960_regexp-in-url-bar.diff (9.88 KB, patch)
2008-02-23 05:46 UTC, Diego Escalante Urrelo (not reading bugmail)
none Details | Review
20080227_bgo_517960_regexp-in-url-bar.diff (9.77 KB, patch)
2008-02-27 08:20 UTC, Diego Escalante Urrelo (not reading bugmail)
none Details | Review
20080526_bgo_517960_regexp-location-bar: Updated to the ultimate awesomeness (10.86 KB, patch)
2008-05-26 18:43 UTC, Diego Escalante Urrelo (not reading bugmail)
none Details | Review
20080527_bgo_517960_regexp: Updated to not change highlight when browsing results (11.02 KB, patch)
2008-05-27 14:46 UTC, Diego Escalante Urrelo (not reading bugmail)
needs-work Details | Review
[PATCH] Added completion for the extra col (3.08 KB, patch)
2008-07-25 20:10 UTC, Diego Escalante Urrelo (not reading bugmail)
needs-work Details | Review
[PATCH] woohoo bar. (6.10 KB, patch)
2008-08-14 03:40 UTC, Diego Escalante Urrelo (not reading bugmail)
needs-work Details | Review
woohoo bar (61.92 KB, image/png)
2008-08-14 03:41 UTC, Diego Escalante Urrelo (not reading bugmail)
  Details
[PATCH] Replace the pointer voodoo with a regexp. (8.34 KB, patch)
2008-08-14 11:25 UTC, Diego Escalante Urrelo (not reading bugmail)
committed Details | Review

Description Diego Escalante Urrelo (not reading bugmail) 2008-02-21 21:47:42 UTC
Attached patch does the magic, note that this will fix a bunch of bugs:

Bug 151932 – Doesn't give URL suggestion based on substring
Bug 343906 – No unicode support for bookmark completion

And we might want to check this:
Bug 328162 – Diacritics in topic keywords

Patch follows
Comment 1 Diego Escalante Urrelo (not reading bugmail) 2008-02-21 21:51:27 UTC
Created attachment 105730 [details] [review]
20080221_bgo_517960_regexp-in-url-bar.diff

Enables the URL bar to use regexp, removing the old strncasecmp calls.
This simplifies the code and gives for free this benefits:
- multiple keyword matching (see screenshot, would need some thought)
- utf8 friendly matching
- substring matching
- awesomeness

Screenshots follow
Comment 2 Diego Escalante Urrelo (not reading bugmail) 2008-02-21 21:56:29 UTC
Created attachment 105731 [details]
Awesome regexp matching (not intended to be used by humans)

This is the case where you go nuts with your url bar.
Comment 3 Diego Escalante Urrelo (not reading bugmail) 2008-02-21 21:57:14 UTC
Created attachment 105732 [details]
This shows the partially working multiple keyword matching

Note that those xkcd bookmarks are marked as comic and xkcd topics.
Comment 4 Diego Escalante Urrelo (not reading bugmail) 2008-02-21 21:58:47 UTC
Created attachment 105733 [details]
Substring matching, from your mind to your browser.

This is highly important in my opinion, not always you remember the full url, sometimes you just recall a -say- "championsleague/2008" in the path, but actual behaviour forces you to visit history or recall the full url to get there. 
This unexpected benefit rocks :), at least for me.
Comment 5 Christian Persch 2008-02-21 22:08:04 UTC
Very nice!

+	ret = (g_regex_match_simple (key, item, G_REGEX_CASELESS, G_REGEX_MATCH_NOTEMPTY)
+		|| g_regex_match_simple (key, url, G_REGEX_CASELESS, G_REGEX_MATCH_NOTEMPTY)
+		|| g_regex_match_simple (key, keywords, G_REGEX_CASELESS, G_REGEX_MATCH_NOTEMPTY)
+		);

This compiles the same regex |key| 3 times. Should use g_regex_match instead with a compiled regex.

With that fixed, ok to commit after we branch for 2.22.
Comment 6 Diego Escalante Urrelo (not reading bugmail) 2008-02-21 22:25:38 UTC
Created attachment 105735 [details] [review]
20080221_bgo_517960_regexp-in-url-bar.diff

Updated to use a compiled regexp.
Comment 7 Diego Escalante Urrelo (not reading bugmail) 2008-02-22 19:59:30 UTC
Created attachment 105781 [details] [review]
20080222_bgo_517960_regexp-in-url-bar.diff

This new patch reuses the location entry regex, see the following text for the bits needed in ephy-location-entry.c|h.
Also this new patch follows the idea in 
  Bug 516550 – when entering part of url already in the history, put a relief on it

of re: for processing as regex (e-l-e sets the behaviour since we get the regex from there).

+
+GRegex *
+ephy_location_entry_get_regex (EphyLocationEntry *entry)
+{
+	EphyLocationEntryPrivate *priv = entry->priv;
+	
+	return priv->regex;
+}
Index: lib/widgets/ephy-location-entry.h
===================================================================
--- lib/widgets/ephy-location-entry.h	(revision 7972)
+++ lib/widgets/ephy-location-entry.h	(working copy)
@@ -91,6 +91,8 @@ gboolean	ephy_location_entry_get_can_und
 
 gboolean	ephy_location_entry_get_can_redo	(EphyLocationEntry *entry);
 
+GRegex *	ephy_location_entry_get_regex		(EphyLocationEntry *entry);
+
 gboolean	ephy_location_entry_reset		(EphyLocationEntry *entry);
 
 void		ephy_location_entry_undo_reset		(EphyLocationEntry *entry);
Comment 8 Diego Escalante Urrelo (not reading bugmail) 2008-02-23 05:46:19 UTC
Created attachment 105805 [details] [review]
20080223_bgo_517960_regexp-in-url-bar.diff

Updated, but now it uses some stuff related to the boldify bug, so please note I would probably commit this two bugs as a single fix.
Comment 9 Claudio Saavedra 2008-02-26 03:21:27 UTC
Just because I am procrastinating, I had a quick look at the patch and found a few leaks. Please plug them:

These should stay in:

-	g_free (item);
-	g_free (url);
-	g_free (keywords);

It would be nice if after unref'ing the object, you set the pointer to NULL:

+	if (priv->regex) {
+		g_regex_unref (priv->regex);
+               priv->regex = NULL;
+       }

Free pattern after this:

+	priv->regex = g_regex_new (pattern, 
+				G_REGEX_CASELESS, G_REGEX_MATCH_NOTEMPTY, NULL);


At least cdata and cdata_escaped should be freed here:

+	gtk_tree_model_get (tree_model, iter, priv->text_col, &cdata, -1);
+
+	g_value_init (&markup, G_TYPE_STRING);
+
+	/* This is because urls can break markup due to =, &, blah blah */
+	cdata_escaped = g_markup_escape_text (cdata, -1);
+	markup_final = g_regex_replace (priv->regex, cdata_escaped, -1, 0, "<b>\\0</b>", G_REGEX_MATCH_NOTEMPTY, NULL);
+	
+	g_value_take_string (&markup, markup_final);
+	g_object_set_property (G_OBJECT (cell), "markup", &markup);
+	g_value_unset (&markup);
+}
Comment 10 Diego Escalante Urrelo (not reading bugmail) 2008-02-27 08:17:53 UTC
Another unicode thing, bme should get regex joy also:

Bug 387357 – [bme] topic searching does not support unicode / non-english chars
Comment 11 Diego Escalante Urrelo (not reading bugmail) 2008-02-27 08:20:15 UTC
Created attachment 106051 [details] [review]
20080227_bgo_517960_regexp-in-url-bar.diff

Follow claudio's suggestions
Comment 12 Diego Escalante Urrelo (not reading bugmail) 2008-03-26 20:01:09 UTC
For the records, 
Bug 118618 – Bookmark nick names 

could also be fixed with this, or at least it would be easy to manipulate what users type in.
Comment 13 Diego Escalante Urrelo (not reading bugmail) 2008-05-19 04:36:15 UTC
For the records, there's a bad bug when the match is part of an escaped thing.

Eg: type 'a'
http://<b>a</b>.com
http://www.a.com/?blah=ewqewq&<b>a</b>mp;er=eqweq
                                 ^ epic failure.
Comment 14 Diego Escalante Urrelo (not reading bugmail) 2008-05-26 18:43:10 UTC
Created attachment 111562 [details] [review]
20080526_bgo_517960_regexp-location-bar: Updated to the ultimate awesomeness

This new patch avoids the entities problems, it's ussing a PangoAttrList instead of replacing text. It's a bit faster than the other one in my placebo tests.

Also, it's matching all the occurrences of the search string in the shown text, it can be easily changed in the code if we want only one match to be bold, this didn't affect performance in my test.
Comment 15 Diego Escalante Urrelo (not reading bugmail) 2008-05-26 18:46:58 UTC
Oh and the patch applies cleanly to .22 if anyone wants to try it.
Comment 16 Diego Escalante Urrelo (not reading bugmail) 2008-05-27 14:46:03 UTC
Created attachment 111610 [details] [review]
20080527_bgo_517960_regexp: Updated to not change highlight when browsing results

This fixes a small bug that changed the highlight when browsing the results with the keyboard.
Comment 17 Baptiste Mille-Mathias 2008-07-24 20:01:14 UTC
Hello,

is it possible to commit this patch, or does it have an issue ?

thanks
Comment 18 Diego Escalante Urrelo (not reading bugmail) 2008-07-25 20:10:51 UTC
Created attachment 115270 [details] [review]
[PATCH] Added completion for the extra col

This will also highlight the "extra" text, so you can search your history items
by title. Not sure if it's really good, but let's try it.
---
 lib/widgets/ephy-location-entry.c |   38 +++++++++++++++++++++++++++++++++++-
 src/ephy-location-action.c        |    6 ++++-
 2 files changed, 41 insertions(+), 3 deletions(-)
Comment 19 Diego Escalante Urrelo (not reading bugmail) 2008-07-25 20:13:15 UTC
This patch applies on top of the other one, it adds matching for the extra col, this means you can search your history items by title. I'm including it as another patch because i'm not sure if it's really useful, at first thought yes, but not sure, i'll try it.
For the records, this patches apply cleanly to 2-24 branch, so you can try them without problem.

In quick unscientific tests, the extra matching didn't slowed down the completion significantly.
Comment 20 Diego Escalante Urrelo (not reading bugmail) 2008-07-28 04:10:41 UTC
There's an ugly bug that makes the completion popup be blank, terminal prints this: 

(epiphany:10141): Gtk-CRITICAL **: gtk_tree_view_scroll_to_cell: assertion `tree_view->priv->tree != NULL' failed

I thought it was a pango issue, that something was broken in the markup, but turns out that it's the tree view that doesn't scroll to show what we need (that's what i'm understanding).

Anyone knows when/why this can happen?
Comment 21 Reinout van Schouwen 2008-07-28 07:52:12 UTC
kris is the treeview maintainer, you could try to ping him.
Comment 22 Paul Pogonyshev 2008-08-03 18:21:00 UTC
See also bug 546005.
Comment 23 Diego Escalante Urrelo (not reading bugmail) 2008-08-14 03:40:13 UTC
Created attachment 116543 [details] [review]
[PATCH] woohoo bar.

 lib/widgets/ephy-location-entry.c |  113 ++++++++----------------------------
 src/ephy-completion-model.c       |   16 ++----
 2 files changed, 30 insertions(+), 99 deletions(-)
Comment 24 Diego Escalante Urrelo (not reading bugmail) 2008-08-14 03:41:10 UTC
Created attachment 116544 [details]
woohoo bar

This applies on top of the other patches, it's just an experiment, screenshot attached.
Comment 25 Reinout van Schouwen 2008-08-14 08:08:16 UTC
Nice colors, Diego. :) But what do the smart bookmarks in the dropdown look like? They aren't in the screenshot.
Comment 26 Diego Escalante Urrelo (not reading bugmail) 2008-08-14 11:25:32 UTC
Created attachment 116565 [details] [review]
[PATCH] Replace the pointer voodoo with a regexp.


The behaviour is still the same as current 2-24, but the backend is now regexp
powered.
The reason to not enable substring matching for now is that it can be confusing
to have entries in the completion that at first sight have nothing to do with
the input in the location bar.

Hopefully we can find out what's so wrong with the bolding code and fix it. For
now, I would like to get this one to avoid the patch mess I already have.
Please test, someone, applies without a hitch to 2-24 branch, should have no
visible effect besides maybe a little performance improvement.
---
 lib/widgets/ephy-location-entry.c |   53 +++++++++++++--
 lib/widgets/ephy-location-entry.h |    9 ++-
 src/ephy-location-action.c        |  129 ++++++++++--------------------------
 3 files changed, 88 insertions(+), 103 deletions(-)
Comment 27 Diego Escalante Urrelo (not reading bugmail) 2008-08-14 11:39:45 UTC
(In reply to comment #25)
> Nice colors, Diego. :) But what do the smart bookmarks in the dropdown look
> like? They aren't in the screenshot.
> 

Ahem... because they don't fit anymore... :P. The code in GTK+ for sizing the completion popup has changed, now action area is out of screen most of the time with a fairly descent browsing history (i.e. lots of entries).
The new behaviour seems to be to give the entries the full height available, the action area goes offscreen in that case.
Xan did it, I blame him.
Although I still think we have to change that UI, it's hard to use right now if you have more than one or two smart bookmarks.

Now, regarding this last patch I attached. It's the simplest form and most up to date form of the regexp port. It *only* changes the current pointer voodoo backend to use a regexp backend. I even made the regexp replicate the current behaviour, the reason as explained in the patch comment is that otherwise it will be a bit weird for users to have entries that are not evidently connected to what they typed (the current behaviour).
Also, the bolding code seems to be firing a GTK+ bug or an Ephy bug that makes the completion allocate size for the entries that match the input BUT don't actually show anything.

So, action items:
  - someone please test the patch in a 2-24 build, it applies cleanly and requires no other stuff than a rebuild. 
    + if you see a weird behaviour regarding empty completion popups (the popup is sized correctly to the number of matches, but is a big white rectangle).
    + if you see a change in the current matching behaviour, it's a bug -for now. it is supposed to be exactly the same until the bolding bug is fixed.
  - someone please read the code and give me a clue on why the hell the model of the completion entry can be NULL in some circumstances.

That's all for now, testing is really appreciated. I want to land the basic infra for the cooler patches like the woohoo bar and matching on history titles (doing it with the current infra means yet another string to process with pointer voodoo).

Oh, one last thing: do we want a woohoo bar or a classic two column popup?. I mean, the big visible feature here will be matching substrings and bolding them in the popup, do we want to do that over the current UI (two columns) or in a woohoo bar UI (one column)? I have been using the two columns nicely, I'm testing the one column UI since today. But, what do you think at first thought?

two columns: just like it is now in ephy, plus bolding (ok this screenie misses the second column, i'm lazy) http://bugzilla.gnome.org/attachment.cgi?id=105733
one column: http://bugzilla.gnome.org/attachment.cgi?id=116544

Ok enough, please comment.
Comment 28 Christian Persch 2008-08-14 21:33:01 UTC
Looks good to me. Please commit to trunk and 2-24 :)
Comment 29 Diego Escalante Urrelo (not reading bugmail) 2008-08-14 22:48:49 UTC
Committed with modifications. Closing this bug, moving woohoo stuff to another bug: #541782.