GNOME Bugzilla – Bug 767257
Prefer standard HTML5 icon resources when detecting web app icons
Last modified: 2016-11-11 16:53:29 UTC
mcatanzaro, I hope you can find the time to review this. (The first patch should take ten seconds to review; the others a little longer.) Three patches. Patch 0001 makes the existing code work. Just corrects a simple omission that made the current code always defer to favicons and never use higher resolution icons when found. Patch 0002 defers to requesting /favicon.ico if all other discovery methods fails. Same behavior as every web browsers since the 90s. (Is it necessary, though?) Patch 0003 adds support for <link rel="icon" sizes="256x256"> (high quality bitmap icon, minimum size is arbitrarily set to 96) and <link rel="icon" sizes="any"> (vector). These are the current web standard method for discovering icons and is used as the preferred method for discovering icons. The vendor-specific msapplication-tileimage is now the second choice. Similarly, the vendor-specific apple-touch-icon is now preferred over og:image as the latter more often than not represents the current web page and not the entire website; the former always represents the website. Some more details in the commit messages. For the 50 or so websites I tested tonight, the result (even scaled up 16x16) is better. Many more websites ended up with higher quality icon resources than before this patch. PS: I’ll follow up by making a document on the wiki for web authors when this issue is resolved.
Created attachment 329142 [details] [review] Patch one resolves a small omission in the current code that made it always defer to favicons.
Created attachment 329143 [details] [review] Patch two adds a fallback to optimistially request /favicon.ico when all else fails.
Created attachment 329144 [details] [review] Patch three adds support for HTML5 high resolution icon discovery and SVG.
Review of attachment 329142 [details] [review]: ::: embed/ephy-web-view.c @@ +2641,3 @@ if (data->icon_uri != NULL && data->icon_uri[0] != '\0') { *icon_uri = data->icon_uri; + *icon_result = data->icon_result; Good catch. I think it's best to treat the return values as independent here, though, so I'd prefer we move this up above the conditional rather than putting it here. That way we'll have: * One line of code to set icon_result * One conditional to set icon_uri * One conditional to set icon_color rather than putting the code to set icon_result inside the conditional that sets icon_uri. It begs the question of whether it's appropriate to ever set icon_result if icon_uri is NULL. Probably not, but let's trust the web extension to have returned correct values here. Perhaps we could get rid of this bool entirely (in a follow-up patch) if it's only used to indicate whether the URI is NULL?
Review of attachment 329143 [details] [review]: ::: embed/web-extension/ephy-web-dom-utils.c @@ +313,3 @@ + /* Last ditch effort: just try to grab the favicon from the default location. */ + if (image == NULL) + image = "/favicon.ico"; Looks good. @@ +317,1 @@ ret = (image != NULL && *image != '\0'); Ah, but now this will always be TRUE, so it doesn't make sense to keep this anymore. I guess this function no longer needs a return value? Then can we get rid of the return value of ephy_web_dom_utils_get_best_icon as well, and therefore get rid of the icon_result boolean? The places to remove it from: * fill_default_application_image_cb in window-commands.c * ephy_web_view_get_best_web_app_icon_finish * ephy_web_extension_proxy_get_best_web_app_icon_finish * handle_method_call in ephy-web-extension.c * ephy_web_dom_utils_get_best_icon I think we can remove it, but I didn't look closely at how it is used in fill_default_application_image_cb. Anyway, this patch causes the boolean to be always TRUE and means the failure case in fill_default_application_image_cb is now dead code, so this needs some more thought.
Review of attachment 329144 [details] [review]: Thanks a bunch for working on this; I'm really tired of our terrible web app icons. :) ::: embed/web-extension/ephy-web-dom-utils.c @@ +185,3 @@ + g_ascii_strcasecmp (rel, "icon") == 0 || + g_ascii_strcasecmp (rel, "shortcut icon") == 0 || + g_ascii_strcasecmp (rel, "icon shortcut") == 0)) { I think you also want to check for "shortcut-icon" here (see below). @@ +189,3 @@ + int width; + int height; + g_free (rel); Especially in a long function like this, I think we can improve readability by adding a blank line after the block of declarations and another after the free. I'm also not really fond of how we free rel in two places here -- it's confusing -- but I also couldn't easily come up with a clearer way to write this, so I suppose it's fine. Good job being careful not to leak it. @@ +194,3 @@ + if (g_ascii_strcasecmp (sizes, "any") == 0) { + /* TODO: Keep the SVG rather than rasterizing it to PNG. */ + image = webkit_dom_html_link_element_get_href (WEBKIT_DOM_HTML_LINK_ELEMENT (node)); What if image was already set on a previous iteration of the loop? I think you have to free it first, like you do down below. @@ +210,3 @@ + if (width >= 96 && width > largest_icon) { + if (image != NULL) + g_free (image); g_free is NULL-safe, so you can get rid of the NULL check (it's already checked inside g_free). Same with the normal standard library free (and also C++ delete). @@ +215,3 @@ + } + } + g_free (sizes); This is fine, but I think it would be more readable to put this free inside the scope of the check if (sizes != NULL), i.e. up one bracket here, since you have to check it anyway. @@ +226,3 @@ + *uri_out = g_strdup (image); + if (color_out != NULL) + *color_out = g_strdup (color); This code can never be reached, since we don't set color anywhere. So we don't need the color_out parameter at all. I see you added it here just to parallel the other functions here, where it is also generally unused, which was probably the right call as there's much value in parallelism, but I think we should clean up all of these functions in a trivial follow-up patch that removes the color parameter where it's currently never used. @@ +373,3 @@ + g_ascii_strcasecmp (rel, "icon") == 0 || + g_ascii_strcasecmp (rel, "shortcut icon") == 0 || + g_ascii_strcasecmp (rel, "icon shortcut") == 0)) You removed the shortcut-icon form (with the hyphen); looks like an accident, right? Also: do you still need to check these here at all? If they were found, then they would be found first by get_icon_from_html_icon, right? Maybe we don't need this get_icon_from_favicon function anymore at all; we can just fall back to /favicon.ico if none of the other functions found an icon?
Created attachment 329210 [details] [review] Icon work. (What am I supposed to write in the file attachment description filed?)
(In reply to Daniel Aleksandersen from comment #7) > (What am I supposed to write in the file attachment description > filed?) Whatever you want, so long as it's descriptive of what the patch does.
Review of attachment 329210 [details] [review]: Cool, this is looking very close now. I notice you fixed a few memory leaks in ephy-web-dom-utils that I hadn't noticed; thanks. This is worth noting in the commit message. ::: embed/web-extension/ephy-web-dom-utils.c @@ +192,3 @@ + int width; + int height; + g_free (rel); Please add a blank line above this free. I'd do it for you if this was the last thing remaining, but I noticed something else.... @@ +230,3 @@ + + if (uri_out != NULL) + *uri_out = g_strdup (image); I think this strdup is wrong; we leak image here. @@ +388,3 @@ + /* Last ditch effort: just fallback to the default favicon location. */ + if (image == NULL) + image = (char*)"/favicon.ico"; Style: (char *)"/favicon.ico"; I don't like this cast; remember that we use const on strings to indicate ownership, so you're casting an unowned string (which must not be freed) to an owned string (which must be freed), which is never right except when working with an API that doesn't respect our conventions. Instead, add a strdup here... @@ +392,2 @@ if (uri_out != NULL) *uri_out = g_strdup (image); ...and remove it from here. Notice that the preexisting code is leaking image here. (The return value of webkit_dom_html_link_element_get_href is a char *, so it needs to be freed or passed on to something that will free it.) So removing the strdup here should be safe.
Created attachment 329372 [details] [review] Resolve the four final issues identified by Michael.
Review of attachment 329372 [details] [review]: OK, let's do this. ::: embed/web-extension/ephy-web-dom-utils.c @@ +386,3 @@ + /* Last ditch effort: just fallback to the default favicon location. */ + if (image == NULL) + image = g_strdup("/favicon.ico"); I'll add the missing space here when I push it.
It doesn't work for every site, but about half the sites I tested look way better. Excellent.
(In reply to Michael Catanzaro from comment #11) > I'll add the missing space here when I push it. Yup forgot to do that... will land a follow-up.
The following fix has been pushed: 751dbe6 web-dom-utils: Remove return value from get_icon_from_favicon
Created attachment 329394 [details] [review] web-dom-utils: Remove return value from get_icon_from_favicon It always returns TRUE now.
Thanks for getting this in! :)
Note: I found this hidden documentation: https://wiki.gnome.org/Apps/Web/Applications If you want to update it, then feel free to link to it from the main Epiphany wiki page.
*** Bug 696726 has been marked as a duplicate of this bug. ***