GNOME Bugzilla – Bug 730129
The ad blocker takes ages to initialise
Last modified: 2014-06-12 09:20:22 UTC
I'm working on Epiphany for the Raspberry Pi. One of the problems is that it takes ages to start. To be precise it takes 27 seconds between the program is started and an empty page is displayed. Just the ad blocker takes 17 seconds! This is due to three reasons: 1 - Regexes are used for trivial things like checking whether the first character in a string is a “*”. 2 - Regexes are recreated every time instead of just creating and optimising them once. 3 - All the loading and parsing it done at startup delaying the display of the window. Fixing 1 and 2 reduced the startup time from 17 seconds to 6. This was not good enough for us so I also made Epiphany parse only 500 lines at a time to fix 3. I think that the fixes for 1 and 2 are definitely useful for upstream, not sure if you are interested in 3 too (and if you are maybe you could increase the number of the parsed lines).
Created attachment 276530 [details] [review] uri-tester: don't use regular expressions for trivial checks
Created attachment 276531 [details] [review] uri-tester: don't recompile the regexes every time they are used
Created attachment 276532 [details] [review] uri-tester: don't pass strings around just to free them later
Created attachment 276533 [details] [review] uri-tester: parse the pattern file in idles
I forgot to mention that I couldn't test it on master due to dependency problems, but the patches apply cleanly and that file hasn't been touched much.
I tried it now, and look like it really faster (the loading of the overview page), but I can't know on another pages. it look like the adblock plugins not really works everywhere (see bug 697329).
Review of attachment 276530 [details] [review]: Looks good to me. ::: embed/uri-tester.c @@ +503,1 @@ !g_hash_table_lookup (tester->priv->keys, sig)) You can probably keep this in one line now. @@ +513,1 @@ !g_hash_table_lookup (tester->priv->pattern, patt)) Ditto.
Review of attachment 276531 [details] [review]: Looks good too! ::: embed/uri-tester.c @@ +316,2 @@ opts = g_hash_table_lookup (tester->priv->optslist, patt); + if (opts && g_regex_match (tester->priv->regex_third_party, opts, 0, NULL)) Are we just searching for string ",third-party"? Do we really need a regexp for this too or can we use strstr? or is it because of the caseless option? @@ +579,3 @@ } + if (g_regex_match (tester->priv->regex_subdocument, opts, 0, NULL)) Ditto.
Review of attachment 276532 [details] [review]: Makes sense, yes :-)
Review of attachment 276533 [details] [review]: Instead of using idles, I think we should make all the file reads using the gio async API.
By the way, it might make sense to branch 2.4 before pushing the more invasive patches. I think the first two (or three) might be safe to push to stable, though.
I meant 3.12.
(In reply to comment #8) > Are we just searching for string ",third-party"? Do we really need a regexp for > this too or can we use strstr? or is it because of the caseless option? > > @@ +579,3 @@ > } > > + if (g_regex_match (tester->priv->regex_subdocument, opts, 0, NULL)) Susprisngly it takes the same time doing it with a regex and by changing the string to lower case and then comparing. (In reply to comment #10) > Review of attachment 276533 [details] [review]: > > Instead of using idles, I think we should make all the file reads using the gio > async API. My experience with very similar problems is that it's actually more efficient this way instead of just trying to do stuff when the data is available. I was trying to optimise for my specific case, not for gnome in general. If you want I can rewrite the patch in a nicer way for gnome, but I think I will ship this version for the Raspberry Pi. But I'm not that sure it's worth it making it async in general. Epiphany is very fast here on my 3 year old computer for instance.
(In reply to comment #11) > By the way, it might make sense to branch 2.4 before pushing the more invasive > patches. I think the first two (or three) might be safe to push to stable, > though. I think all patches fit in stable, it's a performance optimization, I don't think any of them are invasive.
Created attachment 276927 [details] [review] Use gio async API to parse the patterns file Alternative patch using gio async api instead of idles
Your patch works, but in practice is not really helpful (I'm try on a Raspberry Pi). The file is read completely before the UI is responsive making the startup slower then what you would have by just reading the file synchronously. Using G_PRIORITY_LOW makes the window appear and be responsive way before the file is parsed. On the other end it takes much longer (13 seconds longer!) to read the file this way than with my patch that uses idles, like I suspected. The disadvantages with my version of the patch are that it's a ugly way of doing it and that it actually freezes for little intervals (not noticable) in the idle callback. My suggestion for upstream is to stick to your patch but using a lower priority than G_PRIORITY_DEFAULT (G_PRIORITY_DEFAULT_IDLE could work). For the Raspberry Pi I will ship my version of the patch as every little time saving counts :)
I guess you are talking about ephy + wk1, because in wk2, the ad blocker should never affect when the window is shown, nor block the UI at all, since it's run in the web process, right after it's launched. So, since there aren't ui related main loop sources in the web process, I'm not sure we need to change the priority. OTOH, WebKit IO thread uses the default priority to schedule sources to the main thread, so maybe it's a good idea to use default_idle as you suggest. I expected reading the file asynchronously would take more time, but not that much, I wonder why.
Ah yes, sorry, we are using webkit1 on the Pi for various reasons.
Comment on attachment 276927 [details] [review] Use gio async API to parse the patterns file I had forgotten this, I've changed the priorities and pushed the patch to both branches.