After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 730129 - The ad blocker takes ages to initialise
The ad blocker takes ages to initialise
Status: RESOLVED FIXED
Product: epiphany
Classification: Core
Component: General
git master
Other Linux
: Normal normal
: ---
Assigned To: Epiphany Maintainers
Epiphany Maintainers
Depends on:
Blocks:
 
 
Reported: 2014-05-14 13:29 UTC by Marco Barisione
Modified: 2014-06-12 09:20 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
uri-tester: don't use regular expressions for trivial checks (1.30 KB, patch)
2014-05-14 13:32 UTC, Marco Barisione
committed Details | Review
uri-tester: don't recompile the regexes every time they are used (4.26 KB, patch)
2014-05-14 13:32 UTC, Marco Barisione
committed Details | Review
uri-tester: don't pass strings around just to free them later (5.45 KB, patch)
2014-05-14 13:32 UTC, Marco Barisione
committed Details | Review
uri-tester: parse the pattern file in idles (2.84 KB, patch)
2014-05-14 13:32 UTC, Marco Barisione
none Details | Review
Use gio async API to parse the patterns file (3.00 KB, patch)
2014-05-21 10:54 UTC, Carlos Garcia Campos
committed Details | Review

Description Marco Barisione 2014-05-14 13:29:22 UTC
I'm working on Epiphany for the Raspberry Pi. One of the problems is that it takes ages to start. To be precise it takes 27 seconds between the program is started and an empty page is displayed. Just the ad blocker takes 17 seconds!

This is due to three reasons:
1 - Regexes are used for trivial things like checking whether the first character in a string is a “*”.
2 - Regexes are recreated every time instead of just creating and optimising them once.
3 - All the loading and parsing it done at startup delaying the display of the window.

Fixing 1 and 2 reduced the startup time from 17 seconds to 6.
This was not good enough for us so I also made Epiphany parse only 500 lines at a time to fix 3.

I think that the fixes for 1 and 2 are definitely useful for upstream, not sure if you are interested in 3 too (and if you are maybe you could increase the number of the parsed lines).
Comment 1 Marco Barisione 2014-05-14 13:32:19 UTC
Created attachment 276530 [details] [review]
uri-tester: don't use regular expressions for trivial checks
Comment 2 Marco Barisione 2014-05-14 13:32:23 UTC
Created attachment 276531 [details] [review]
uri-tester: don't recompile the regexes every time they are used
Comment 3 Marco Barisione 2014-05-14 13:32:27 UTC
Created attachment 276532 [details] [review]
uri-tester: don't pass strings around just to free them later
Comment 4 Marco Barisione 2014-05-14 13:32:30 UTC
Created attachment 276533 [details] [review]
uri-tester: parse the pattern file in idles
Comment 5 Marco Barisione 2014-05-14 13:41:15 UTC
I forgot to mention that I couldn't test it on master due to dependency problems, but the patches apply cleanly and that file hasn't been touched much.
Comment 6 Yosef Or Boczko 2014-05-14 13:44:04 UTC
I tried it now, and look like it really faster (the loading of the overview page), but I can't
know on another pages. it look like the adblock plugins not really works everywhere (see
bug 697329).
Comment 7 Carlos Garcia Campos 2014-05-15 07:35:53 UTC
Review of attachment 276530 [details] [review]:

Looks good to me.

::: embed/uri-tester.c
@@ +503,1 @@
             !g_hash_table_lookup (tester->priv->keys, sig))

You can probably keep this in one line now.

@@ +513,1 @@
                 !g_hash_table_lookup (tester->priv->pattern, patt))

Ditto.
Comment 8 Carlos Garcia Campos 2014-05-15 07:47:29 UTC
Review of attachment 276531 [details] [review]:

Looks good too!

::: embed/uri-tester.c
@@ +316,2 @@
   opts = g_hash_table_lookup (tester->priv->optslist, patt);
+  if (opts && g_regex_match (tester->priv->regex_third_party, opts, 0, NULL))

Are we just searching for string ",third-party"? Do we really need a regexp for this too or can we use strstr? or is it because of the caseless option?

@@ +579,3 @@
     }
 
+    if (g_regex_match (tester->priv->regex_subdocument, opts, 0, NULL))

Ditto.
Comment 9 Carlos Garcia Campos 2014-05-15 07:56:39 UTC
Review of attachment 276532 [details] [review]:

Makes sense, yes :-)
Comment 10 Carlos Garcia Campos 2014-05-15 08:02:19 UTC
Review of attachment 276533 [details] [review]:

Instead of using idles, I think we should make all the file reads using the gio async API.
Comment 11 Claudio Saavedra 2014-05-15 09:04:33 UTC
By the way, it might make sense to branch 2.4 before pushing the more invasive patches. I think the first two (or three) might be safe to push to stable, though.
Comment 12 Claudio Saavedra 2014-05-15 09:04:58 UTC
I meant 3.12.
Comment 13 Marco Barisione 2014-05-15 14:49:15 UTC
(In reply to comment #8)
> Are we just searching for string ",third-party"? Do we really need a regexp for
> this too or can we use strstr? or is it because of the caseless option?
> 
> @@ +579,3 @@
>      }
> 
> +    if (g_regex_match (tester->priv->regex_subdocument, opts, 0, NULL))

Susprisngly it takes the same time doing it with a regex and by changing the string to lower case and then comparing.

(In reply to comment #10)
> Review of attachment 276533 [details] [review]:
> 
> Instead of using idles, I think we should make all the file reads using the gio
> async API.

My experience with very similar problems is that it's actually more efficient this way instead of just trying to do stuff when the data is available. I was trying to optimise for my specific case, not for gnome in general.
If you want I can rewrite the patch in a nicer way for gnome, but I think I will ship this version for the Raspberry Pi.
But I'm not that sure it's worth it making it async in general. Epiphany is very fast here on my 3 year old computer for instance.
Comment 14 Carlos Garcia Campos 2014-05-16 06:25:56 UTC
(In reply to comment #11)
> By the way, it might make sense to branch 2.4 before pushing the more invasive
> patches. I think the first two (or three) might be safe to push to stable,
> though.

I think all patches fit in stable, it's a performance optimization, I don't think any of them are invasive.
Comment 15 Carlos Garcia Campos 2014-05-21 10:54:54 UTC
Created attachment 276927 [details] [review]
Use gio async API to parse the patterns file

Alternative patch using gio async api instead of idles
Comment 16 Marco Barisione 2014-05-21 15:05:59 UTC
Your patch works, but in practice is not really helpful (I'm try on a Raspberry Pi). The file is read completely before the UI is responsive making the startup slower then what you would have by just reading the file synchronously.

Using G_PRIORITY_LOW makes the window appear and be responsive way before the file is parsed. On the other end it takes much longer (13 seconds longer!) to read the file this way than with my patch that uses idles, like I suspected.
The disadvantages with my version of the patch are that it's a ugly way of doing it and that it actually freezes for little intervals (not noticable) in the idle callback.

My suggestion for upstream is to stick to your patch but using a lower priority than G_PRIORITY_DEFAULT (G_PRIORITY_DEFAULT_IDLE could work).
For the Raspberry Pi I will ship my version of the patch as every little time saving counts :)
Comment 17 Carlos Garcia Campos 2014-05-21 15:14:56 UTC
I guess you are talking about ephy + wk1, because in wk2, the ad blocker should never affect when the window is shown, nor block the UI at all, since it's run in the web process, right after it's launched. So, since there aren't ui related main loop sources in the web process, I'm not sure we need to change the priority. OTOH, WebKit IO thread uses the  default priority to schedule sources to the main thread, so maybe it's a good idea to use default_idle as you suggest.

I expected reading the file asynchronously would take more time, but not that much, I wonder why.
Comment 18 Marco Barisione 2014-05-21 15:28:12 UTC
Ah yes, sorry, we are using webkit1 on the Pi for various reasons.
Comment 19 Carlos Garcia Campos 2014-06-12 09:20:11 UTC
Comment on attachment 276927 [details] [review]
Use gio async API to parse the patterns file

I had forgotten this, I've changed the priorities and pushed the patch to both branches.