Bug 595653 – Add back support for the "direction up" navigation

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 595653 - Add back support for the "direction up" navigation


Summary:	Add back support for the "direction up" navigation


Status:	RESOLVED FIXED

Product:	epiphany
Classification:	Core
Component:	General
Version:	unspecified
Hardware:	Other All

Importance:	Normal normal
Target Milestone:	---
Assigned To:	Epiphany Maintainers
QA Contact:	Epiphany Maintainers

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2009-09-19 08:01 UTC by Frederic Peters
Modified:	2009-09-22 09:29 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
Add back support for the "direction up" navigation (2.94 KB, patch) 2009-09-19 08:01 UTC, Frederic Peters	none	Details \| Review
Add back support for the "direction up" navigation (3.50 KB, patch) 2009-09-19 11:55 UTC, Frederic Peters	none	Details \| Review
Add back support for the "direction up" navigation (3.43 KB, patch) 2009-09-20 09:04 UTC, Frederic Peters	accepted-commit_now	Details \| Review

Description Frederic Peters 2009-09-19 08:01:14 UTC

ephy_web_view_can_go_up and ephy_web_view_get_go_up_list were empty
functions, that always returned FALSE and an empty list.

This patch implements them, with the same rules as before.

Comment 1 Frederic Peters 2009-09-19 08:01:18 UTC

Created attachment 143478 [details] [review]
Add back support for the "direction up" navigation

Comment 2 Xan Lopez 2009-09-19 08:19:55 UTC

Comment on attachment 143478 [details] [review]
Add back support for the "direction up" navigation

> ephy_web_view_can_go_up (EphyWebView *view)
> {
>-  return FALSE;
>+  SoupURI *uri;
>+  gboolean result;
>+
>+  uri = soup_uri_new (ephy_web_view_get_address (view));
>+  if (uri == NULL)
>+    return FALSE;
>+
>+  result = (uri->fragment || uri->query || strlen(uri->path) > 1);
>+  soup_uri_free (uri);

It does not really seem this will return FALSE even for the example mentioned in the documentation of the function, as it will return TRUE if there's any path, but we want FALSE for foo.com/index.html. Maybe gecko has an explicit check for index.html/htm if it's the first and only path? It would be worth checking the code.

>+
>+  return result;
> }
> 
> /**
>@@ -2088,7 +2098,49 @@ ephy_web_view_print_preview_navigate (EphyWebView *view,
> GSList *
> ephy_web_view_get_go_up_list (EphyWebView *view)
> {
>-  return NULL;
>+  SoupURI *uri;
>+  GSList *result = NULL;
>+  char *t1, *t2;
>+
>+  uri = soup_uri_new (ephy_web_view_get_address (view));
>+  if (uri == NULL)
>+    return NULL;
>+
>+  /* remove fragment, then query, then go up path */
>+  if (uri->fragment) {
>+    soup_uri_set_fragment (uri, NULL);
>+    result = g_slist_prepend(result, soup_uri_to_string (uri, FALSE));
>+  }
>+
>+  if (uri->query) {
>+    soup_uri_set_query (uri, NULL);
>+    result = g_slist_prepend (result, soup_uri_to_string (uri, FALSE));
>+  }
>+
>+  if (uri->path[strlen(uri->path)-1] != '/') {
>+    /* not a trailing slash, remove "file" part */
>+    t1 = strrchr (uri->path, '/');
>+    t2 = g_strndup (uri->path, t1-uri->path+1);
>+    soup_uri_set_path (uri, t2);
>+    g_free (t2);
>+    result = g_slist_prepend (result, soup_uri_to_string (uri, FALSE));
>+  }
>+
>+  while (strcmp(uri->path, "/") != 0) {
>+    /* chop trailing / */
>+    uri->path[strlen (uri->path)-1] = 0;
>+    t1 = strrchr (uri->path, '/');
>+    t2 = g_strndup (uri->path, t1-uri->path+1);
>+    soup_uri_set_path (uri, t2);
>+    g_free (t2);
>+    result = g_slist_prepend (result, soup_uri_to_string (uri, FALSE));
>+  }
>+
>+  result = g_slist_reverse (result);
>+
>+  soup_uri_free (uri);
>+
>+  return result;
> }

You really have to use UTF-8 functions, URIs are not ASCII only.

Comment 3 Frederic Peters 2009-09-19 08:42:48 UTC

(In reply to comment #2)

> It does not really seem this will return FALSE even for the example mentioned
> in the documentation of the function, as it will return TRUE if there's any
> path, but we want FALSE for foo.com/index.html. Maybe gecko has an explicit
> check for index.html/htm if it's the first and only path? It would be worth
> checking the code.

Epihany/gecko had no such explicit check (as the server can set any file as DirectoryIndex it wouldn't be that useful anyway); I ran it to check and http://www.gnome.org/index.html has the "up" navigation sensitive.

(for reference, if I have to look it up again, it's in mozilla_embed_get_uri_parent())

> You really have to use UTF-8 functions, URIs are not ASCII only.

The only character used here is '/', which is not ambiguous in UTF-8.

Comment 4 Xan Lopez 2009-09-19 08:50:47 UTC

(In reply to comment #3)
> (In reply to comment #2)
> 
> > It does not really seem this will return FALSE even for the example mentioned
> > in the documentation of the function, as it will return TRUE if there's any
> > path, but we want FALSE for foo.com/index.html. Maybe gecko has an explicit
> > check for index.html/htm if it's the first and only path? It would be worth
> > checking the code.
> 
> Epihany/gecko had no such explicit check (as the server can set any file as
> DirectoryIndex it wouldn't be that useful anyway); I ran it to check and
> http://www.gnome.org/index.html has the "up" navigation sensitive.
> 
> (for reference, if I have to look it up again, it's in
> mozilla_embed_get_uri_parent())
> 

I see, so the doc is wrong. Please update that too :)

> > You really have to use UTF-8 functions, URIs are not ASCII only.
> 
> The only character used here is '/', which is not ambiguous in UTF-8.

Neither strrchr nor strlen will do the right thing with UTF-8 encoded strings, and you are using both of those.

Comment 5 Xan Lopez 2009-09-19 08:51:44 UTC

(In reply to comment #4)
> > > You really have to use UTF-8 functions, URIs are not ASCII only.
> > 
> > The only character used here is '/', which is not ambiguous in UTF-8.
> 
> Neither strrchr nor strlen will do the right thing with UTF-8 encoded strings,
> and you are using both of those.

For reference, the correct replacements are g_utf8_strrchr and g_utf8_strlen.

Comment 6 Frederic Peters 2009-09-19 09:08:50 UTC

Really I do want the functions to act on bytes, not on characters, there is no need for UTF-8 functions here.

 - strlen(uri->path) > 1 is ok as the string will always start with /, an ascii character (but it could be replaced by g_utf8_strlen without harm);
 - path[strlen(uri->path)-1] != '/' is ok as the last byte will either be /, an ascii character, or something else (but '/' will never be part of an UTF-8 multibyte character) (here g_utf8_strlen() wouldn't do it, as it would return the number of characters, not the number of bytes, which would give a wrong array index);
 - uri->path[strlen (uri->path)-1] = 0; is ok as the last character is known to be '/', a single-byte character (g_utf8_strlen() would be wrong for the same reason as above);
 - strrchr (uri->path, '/'); is ok as '/' can only exist as a single byte character (could be replaced by g_utf8_strchr without harm).

Comment 7 Xan Lopez 2009-09-19 09:26:29 UTC

(In reply to comment #6)
> Really I do want the functions to act on bytes, not on characters, there is no
> need for UTF-8 functions here.
> 
>  - strlen(uri->path) > 1 is ok as the string will always start with /, an ascii
> character (but it could be replaced by g_utf8_strlen without harm);

I agree it does not matter here since you only want to check if there's more than one item, but there's no reason to mix functions IMHO.

>  - path[strlen(uri->path)-1] != '/' is ok as the last byte will either be /, an
> ascii character, or something else (but '/' will never be part of an UTF-8
> multibyte character) (here g_utf8_strlen() wouldn't do it, as it would return
> the number of characters, not the number of bytes, which would give a wrong
> array index);

That's what g_utf8_offset_to_pointer is. In this case since you want to check the last character I think it's better to search for the \0 byte and then just move backward one step with that function (which accepts negative offsets).

In general I *very* much prefer to make things explicit instead of relying on details of the implementation for its correctness, which are bound to be missed by people when refactoring the code.

>  - uri->path[strlen (uri->path)-1] = 0; is ok as the last character is known to
> be '/', a single-byte character (g_utf8_strlen() would be wrong for the same
> reason as above);

Same than above.

>  - strrchr (uri->path, '/'); is ok as '/' can only exist as a single byte
> character (could be replaced by g_utf8_strchr without harm).

Same than the first comment.

Comment 8 Frederic Peters 2009-09-19 11:55:04 UTC

Created attachment 143491 [details] [review]
Add back support for the "direction up" navigation

Comment 9 Frederic Peters 2009-09-19 11:55:45 UTC

This uses g_utf8_ functions, and update the comment of can_go_up().

Comment 10 Dan Winship 2009-09-19 14:35:51 UTC

(In reply to comment #4)
> Neither strrchr nor strlen will do the right thing with UTF-8 encoded strings,
> and you are using both of those.

strrchr does the right thing if you're scanning for an ASCII character, and strlen does the right thing if you want the length in bytes (which in this case he does). The entire point of UTF-8 as opposed to the other unicode encodings is that stuff like this still works.

(Also, uri->path is %-encoded, so it ought to be ASCII anyway...)

Comment 11 Xan Lopez 2009-09-19 15:11:26 UTC

(In reply to comment #10)
> (In reply to comment #4)
> > Neither strrchr nor strlen will do the right thing with UTF-8 encoded strings,
> > and you are using both of those.
> 
> strrchr does the right thing if you're scanning for an ASCII character, and
> strlen does the right thing if you want the length in bytes (which in this case
> he does). The entire point of UTF-8 as opposed to the other unicode encodings
> is that stuff like this still works.

As I said my point is simply that in code dealing with UTF-8 encoded strings I'd rather use UTF-8 functions just to be safer than sorry, especially when performance is not an issue at all. For example, when dealing with UTF-8 strings I think it's much better to do, in case that you want to access the last character in the string:

char* utf8_last_char (const char *p)
{
   glong len = g_utf8_strlen (p, -1);
   return g_utf8_offset_to_pointer (p, len - 1);
}

(or if you just can't stand going through the string twice, cycle through the string with the utf8_next function and return the previous to the null character).

I do agree p[strlen(p) - 1] works in this case, but at least I think it's better to make the code explicit than to rely in particularities that might not still be the case when this stuff is changed 3 years down the road, or that the next person touching this might miss. In any case...
> 
> (Also, uri->path is %-encoded, so it ought to be ASCII anyway...)

... since this is the case I guess this whole thing was completely pointless, so Frederic feel free to commit the first version just with the comment changed, and sorry for bothering you.

Comment 12 Frederic Peters 2009-09-20 09:04:02 UTC

Created attachment 143514 [details] [review]
Add back support for the "direction up" navigation

No problem, as I detected a problem with about: and data: URI scheme I didn't commit but modified the patch for a new review (straightforward but perhaps there are other URI schemes where path is not really a path).

Comment 13 Xan Lopez 2009-09-22 09:14:46 UTC

Comment on attachment 143514 [details] [review]
Add back support for the "direction up" navigation

Looks good to me, please commit.

Comment 14 Frederic Peters 2009-09-22 09:29:24 UTC

Thanks, pushed (as f279bbf2de28).