Bug 775067 - The UI process main thread is blocked by libhttpseverywhere very often

Status: RESOLVED FIXED
Product: epiphany
Classification: Core
Component: General
Version: git master
OS: Other Linux
Importance: Normal blocker
Target Milestone: ---
Assigned To: Epiphany Maintainers
QA Contact: libhttpseverywhere maintainer(s)
Depends on:
Blocks: 775575
Reported: 2016-11-25 08:08 UTC by Carlos Garcia Campos
Modified: 2016-12-03 20:02 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments

uri-tester: Stop pretending we support multiple filters (9.25 KB, patch) - 2016-11-26 18:46 UTC, Carlos Garcia Campos - committed
uri-tester: Do not rewrite HTTPS urls (1.11 KB, patch) - 2016-11-26 18:46 UTC, Carlos Garcia Campos - committed
uri-tester: Never block the UI process main thread (58.19 KB, patch) - 2016-11-26 18:47 UTC, Carlos Garcia Campos - needs-work
uri-tester: Never block the UI process main thread (58.23 KB, patch) - 2016-11-26 19:00 UTC, Carlos Garcia Campos - none
uri-tester: Never block the UI process main thread (59.33 KB, patch) - 2016-11-27 10:03 UTC, Carlos Garcia Campos - none
uri-tester: Never block the UI process main thread (63.73 KB, patch) - 2016-11-29 13:09 UTC, Carlos Garcia Campos - committed

Description Carlos Garcia Campos 2016-11-25 08:08:20 UTC
This happens a lot with some websites, the UI freezes for a while, you can't even switch to another tab. This makes the whole UI a lot more unresponsive and laggish. I've checked what happened when the UI was blocked and found this:

  • #0 ??
    from /lib/x86_64-linux-gnu/libpcre.so.3
  • #1 ??
    from /lib/x86_64-linux-gnu/libpcre.so.3
  • #2 pcre_exec
    from /lib/x86_64-linux-gnu/libpcre.so.3
  • #3 g_match_info_next
    at gregex.c line 735
  • #4 g_regex_match_full
    at gregex.c line 1804
  • #5 g_regex_match
    at gregex.c line 1715
  • #6 https_everywhere_target_matches
    at /home/cgarcia/src/git/gnome/libhttpseverywhere/src/ruleset.vala line 264
  • #7 https_everywhere_context_rewrite
    at /home/cgarcia/src/git/gnome/libhttpseverywhere/src/context.vala line 153
  • #8 ephy_uri_tester_rewrite_uri
    from /home/cgarcia/gnome/lib/epiphany/libephymain.so
  • #9 ephy_uri_tester_return_response
    from /home/cgarcia/gnome/lib/epiphany/libephymain.so
  • #10 handle_method_call
    from /home/cgarcia/gnome/lib/epiphany/libephymain.so
  • #11 call_in_idle_cb
    at gdbusconnection.c line 4836
  • #12 g_main_dispatch
    at gmain.c line 3203
  • #13 g_main_context_dispatch
    at gmain.c line 3856
  • #14 g_main_context_iterate
    at gmain.c line 3929
  • #15 g_main_context_iteration
    at gmain.c line 3990
  • #16 g_application_run
    at gapplication.c line 2381
  • #17 main

WHAT? Why is the URI tester in the UI process? This makes the whole WebKit2 process separation pointless. We always refused to add a send-request signal to the UI process API when everybody asked for it, because we never want to send a sync message for every single resource load. Using sync D-Bus messages between the UI process and web extensions is also very discouraged; if I could find a way to disallow it, I would. I'm sure the reasons for moving the URI tester to the UI process are all quite valid, but the solution is not acceptable.
Comment 1 Michael Catanzaro 2016-11-25 16:30:38 UTC
Thanks for investigating this. https_everywhere_context_rewrite is not supposed to ever block the main thread -- it should be safe to use from the UI thread, that's why we made it async -- but clearly we messed up the implementation. I think we could fix this and make it robust to future such problems by switching it to use g_task_run_in_thread.

As for where the URI tester lives... the delay in doing a sync D-Bus call at send_request time should be extremely small relative to the 1-3s delay required to load HTTPS Everywhere filters. That has to happen before loading the first http:// URI, so it would be a major performance regression when opening http:// URIs in new tabs. That's what I was trying to avoid here. Surely blocking send_request is better than living with that delay? I dunno; I'm certainly open to reconsidering the design, but it's quite a hard problem. The only solution I see is to move the work out of the web process, and that means we have to use a sync D-Bus call in send_request. It doesn't have to be the UI process -- we could certainly spin up a helper process to run the URI tester -- but it seems easier to just fix https_everywhere_context_rewrite to do its work in a secondary thread, right?
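
For reference, a minimal sketch of the g_task_run_in_thread pattern mentioned in comment 1; everything except the GLib/GIO calls is hypothetical and merely stands in for the real rewrite entry points:

#include <gio/gio.h>

/* Runs in a worker thread from the GTask pool, so a slow regex match here
 * no longer blocks the caller's main context. */
static void
rewrite_thread_func (GTask        *task,
                     gpointer      source_object,
                     gpointer      task_data,
                     GCancellable *cancellable)
{
  const char *uri = task_data;
  char *rewritten = g_strdup (uri); /* placeholder for the actual blocking rewrite */

  g_task_return_pointer (task, rewritten, g_free);
}

static void
my_rewrite_async (const char          *uri,
                  GCancellable        *cancellable,
                  GAsyncReadyCallback  callback,
                  gpointer             user_data)
{
  GTask *task = g_task_new (NULL, cancellable, callback, user_data);

  g_task_set_task_data (task, g_strdup (uri), g_free);
  g_task_run_in_thread (task, rewrite_thread_func); /* callback completes in the caller's context */
  g_object_unref (task);
}

static char *
my_rewrite_finish (GAsyncResult *result, GError **error)
{
  return g_task_propagate_pointer (G_TASK (result), error);
}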
Comment 2 Michael Catanzaro 2016-11-26 12:57:06 UTC
(In reply to Michael Catanzaro from comment #1)
> I think we could fix this and make it robust to future such
> problems by switching it to use g_task_run_in_thread.

Actually, if it's slow enough for you to notice and file a bug report, that's really not sufficient; we should also optimize it as far as possible. That's a big problem independent of which process we perform the rewrite in. I thought only the initial ruleset load was expected to be slow.
Comment 3 Michael Catanzaro 2016-11-26 17:52:43 UTC
See also: bug #772818

I guess we have to go back to slow initialization and just try to make libhttpseverywhere faster, because I don't see other options here if you're not OK with sync D-Bus calls in send_request.
Comment 4 Carlos Garcia Campos 2016-11-26 18:45:15 UTC
(In reply to Michael Catanzaro from comment #2)
> (In reply to Michael Catanzaro from comment #1)
> > I think we could fix this and make it robust to future such
> > problems by switching it to use g_task_run_in_thread.
> 
> Actually if it's slow enough for you to notice it and file a bug report,
> that's really not sufficient, we should also optimize it as far as possible.
> That's a big problem independent of which process we perform the rewrite in.
> I thought only the initial ruleset load was expected to be slow.

The problem is that for pages without an https rule, we try to rewrite every single subresource request, and every XMLHttpRequest that might happen. If the page has a lot of subresources, Ephy freezes for a while until everything is loaded, and just clicking on a link in the same page makes everything freeze again. We should fix that of course, and also try to optimize libhttpseverywhere to work faster if possible, but no matter what we improve, we should never block the UI main thread for every single subresource load, and we should never send sync D-Bus messages from send-request callbacks. I have a patch to fix this; I've been playing with it and everything seems to work, so I'm going to submit it.
Comment 5 Carlos Garcia Campos 2016-11-26 18:46:20 UTC
Created attachment 340811 [details] [review]
uri-tester: Stop pretending we support multiple filters
Comment 6 Carlos Garcia Campos 2016-11-26 18:46:53 UTC
Created attachment 340812 [details] [review]
uri-tester: Do not rewrite HTTPS urls
Comment 7 Carlos Garcia Campos 2016-11-26 18:47:23 UTC
Created attachment 340813 [details] [review]
uri-tester: Never block the UI process main thread
Comment 8 Carlos Garcia Campos 2016-11-26 18:53:24 UTC
Bring back mail notifications
Comment 9 Michael Catanzaro 2016-11-26 18:57:29 UTC
(In reply to Carlos Garcia Campos from comment #8)
> Bring back mail notifications

Oops :)
Comment 10 Carlos Garcia Campos 2016-11-26 19:00:16 UTC
Created attachment 340817 [details] [review]
uri-tester: Never block the UI process main thread

Noticed a small leak in previous patch.
Comment 11 Michael Catanzaro 2016-11-26 19:00:36 UTC
Review of attachment 340811 [details] [review]:

The problem is we have no progress on WebKit content extensions in a long time. I'm actually planning to add EasyPrivacy support hopefully this cycle, and that really requires support for multiple filters in EphyUriTester. You couldn't have known this because I just finished adding it to the roadmap half an hour ago, but I've documented it there now: https://wiki.gnome.org/Apps/Web/Roadmap
Comment 12 Michael Catanzaro 2016-11-26 19:02:23 UTC
Review of attachment 340812 [details] [review]:

D'oh.

There is one more place in this file that has this same broken check. You should fix that too!
Comment 13 Carlos Garcia Campos 2016-11-26 19:02:52 UTC
(In reply to Michael Catanzaro from comment #11)
> Review of attachment 340811 [details] [review] [review]:
> 
> The problem is we have no progress on WebKit content extensions in a long
> time. I'm actually planning to add EasyPrivacy support hopefully this cycle,
> and that really requires support for multiple filters in EphyUriTester. You
> couldn't have known this because I just finished adding it to the roadmap
> half an hour ago, but I've documented it there now:
> https://wiki.gnome.org/Apps/Web/Roadmap

I'm confident Adri will find the time.
Comment 14 Carlos Garcia Campos 2016-11-26 19:04:31 UTC
(In reply to Michael Catanzaro from comment #12)
> Review of attachment 340812 [details] [review] [review]:
> 
> D'oh.
> 
> There is one more place in this file that has this same broken check. You
> should fix that too!

Fixed in the next patch, I guess; I haven't seen more of those.
Comment 15 Carlos Garcia Campos 2016-11-26 19:07:31 UTC
(In reply to Carlos Garcia Campos from comment #13)
> (In reply to Michael Catanzaro from comment #11)
> > Review of attachment 340811 [details] [review] [review] [review]:
> > 
> > The problem is we have no progress on WebKit content extensions in a long
> > time. I'm actually planning to add EasyPrivacy support hopefully this cycle,
> > and that really requires support for multiple filters in EphyUriTester. You
> > couldn't have known this because I just finished adding it to the roadmap
> > half an hour ago, but I've documented it there now:
> > https://wiki.gnome.org/Apps/Web/Roadmap
> 
> I'm confident Adri will find the time.

Otherwise I'll do it myself.
Comment 16 Carlos Garcia Campos 2016-11-26 19:17:17 UTC
Note that even though the UI process is never blocked now (you can switch to another tab or close it), HTTPS Everywhere still blocks the web process, making scrolling quite laggy on some pages, especially pages that load content while scrolling. And that happens after everything has been loaded. So, apart from making libhttpseverywhere faster if possible, we should not try to rewrite subresources if the main resource was not rewritten, for example. And we should definitely add a setting to disable it, or add UI to whitelist pages or whatever, because this is making Ephy unusable with some websites.
Comment 17 Adrian Perez 2016-11-26 19:21:50 UTC
(In reply to Carlos Garcia Campos from comment #15)
> (In reply to Carlos Garcia Campos from comment #13)
> > (In reply to Michael Catanzaro from comment #11)
> > > Review of attachment 340811 [details] [review] [review] [review] [review]:
> > > 
> > > The problem is we have no progress on WebKit content extensions in a long
> > > time. I'm actually planning to add EasyPrivacy support hopefully this cycle,
> > > and that really requires support for multiple filters in EphyUriTester. You
> > > couldn't have known this because I just finished adding it to the roadmap
> > > half an hour ago, but I've documented it there now:
> > > https://wiki.gnome.org/Apps/Web/Roadmap
> > 
> > I'm confident Adri will find the time.
> 
> Otherwise I'll do it myself.

I'll see what I can do :)
Comment 18 Michael Catanzaro 2016-11-26 19:49:55 UTC
(In reply to Carlos Garcia Campos from comment #16)
> Note that even if the Ui process is never blocked now, you can switch to
> another tab or close it, the https everywhere still blocks the web process,
> making scrolling quite laggish in some pages, especially pages that load
> contents while scrolling. And that happens when everything has been loaded.
> So, apart from making libhttpseverywhere faster if possible,

Could you please file a separate bug for this against libhttpseverywhere in GNOME Bugzilla?

> we should not
> try to rewrite subresources if the main resource was not rewritten, for
> example.

OK. Want to file a new bug for this too, against Epiphany? You can assign me if you don't want to work on it yourself.

> And we should definitely add a setting to disable it, or add UI to
> white list pages or whatever, because this is making ephy unusable with some
> websites.

I think the reason I didn't notice any performance problems is that I'm using debug WebKit with Ephy master and performance is atrocious always. :) Are you able to share a website that is broken by this, for testing?

I am hoping to avoid the need for a setting if possible. If you're seeing TLS errors, that should not be too hard to fix: that's bug #775146, we just need to remember that we rewrote. But I have a report that some unspecified website is rewritten to https:// but doesn't listen on 443 anymore at all; there will surely be more such sites. This is trickier; I guess we need to remember that we rewrote the URL and try http:// next if the first load fails. But if the connection hangs, then we have a big problem that I would not know how to solve; that would call into question whether we can enable HTTPS Everywhere by default at all. Ditto for poor performance after the page is loaded; I'm not sure what we can do about that besides just disable HTTPS Everywhere if we can't make it faster.
Comment 19 Michael Catanzaro 2016-11-26 20:37:29 UTC
Review of attachment 340813 [details] [review]:

Whew, this was a hard review. Thank you very much for working on this. It looks mostly excellent as usual, but there are some tricky cases that need to be considered.

There is a typo "examnple" -> "example" in the commit message.

::: embed/ephy-embed-shell.c
@@ +106,3 @@
+  if (priv->uri_tester_update_cancellable) {
+    g_cancellable_cancel (priv->uri_tester_update_cancellable);
+    g_object_unref (priv->uri_tester_update_cancellable);

This is dispose, not finalize. I see that it should never be used after it's cancelled, but still, let's not play with fire: play it safe and use g_clear_object here.

@@ +127,3 @@
   g_clear_object (&priv->dbus_server);
 
+  g_free (priv->adblock_data_dir);

And this belongs in finalize. Or you could use g_clear_pointer, but I much prefer to create finalize and move it there. I think you taught me this. :)
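
A sketch of the dispose/finalize split being asked for in this and the previous hunk, assuming the usual G_DEFINE_TYPE_WITH_PRIVATE boilerplate already in ephy-embed-shell.c; only the two fields quoted above are shown:

static void
ephy_embed_shell_dispose (GObject *object)
{
  EphyEmbedShellPrivate *priv = ephy_embed_shell_get_instance_private (EPHY_EMBED_SHELL (object));

  if (priv->uri_tester_update_cancellable) {
    g_cancellable_cancel (priv->uri_tester_update_cancellable);
    g_clear_object (&priv->uri_tester_update_cancellable); /* unref and NULL the pointer */
  }

  G_OBJECT_CLASS (ephy_embed_shell_parent_class)->dispose (object);
}

static void
ephy_embed_shell_finalize (GObject *object)
{
  EphyEmbedShellPrivate *priv = ephy_embed_shell_get_instance_private (EPHY_EMBED_SHELL (object));

  g_free (priv->adblock_data_dir); /* plain memory, not a GObject: free it in finalize */

  G_OBJECT_CLASS (ephy_embed_shell_parent_class)->finalize (object);
}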

@@ +678,3 @@
+
+  if (!g_file_copy_finish (src, result, &error)) {
+    LOG ("Error retrieving filter: %s\n", error->message);

Don't use LOG for this, that only prints in debug mode (non-default configure option). Use g_warning. Also be sure to check the error with g_error_matches and not print the warning if it's G_IO_ERROR_CANCELLED. It looks like you should also never call ephy_embed_shell_adblock_updated in that case, right?
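
A sketch of what that error path could look like; the callback name and the surrounding code are illustrative:

#include <gio/gio.h>

static void
filter_copy_cb (GObject *source, GAsyncResult *result, gpointer user_data)
{
  GFile *src = G_FILE (source);
  GError *error = NULL;

  if (!g_file_copy_finish (src, result, &error)) {
    /* Stay quiet if the operation was simply cancelled (e.g. at shutdown). */
    if (!g_error_matches (error, G_IO_ERROR, G_IO_ERROR_CANCELLED))
      g_warning ("Error retrieving filter: %s", error->message);
    g_error_free (error);
    return;
  }

  /* ... continue: e.g. notify the web processes that the filter is ready ... */
}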

@@ +739,3 @@
+                                     priv->uri_tester_update_cancellable,
+                                     (GAsyncReadyCallback)https_everywhere_update_cb,
+                                     shell);

What happens if the shell is disposed before this is called? Looks like we wind up looping through our EphyWebExtensionProxys in ephy_embed_shell_uri_tester_updated halfway through dispose... this looks risky, are you sure it's safe? Should we ref here and unref in the callback?
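
A sketch of the ref/unref pattern being suggested, with a hypothetical async call standing in for the real ruleset update:

#include <gio/gio.h>

/* Hypothetical async API standing in for the real HTTPS Everywhere update. */
static void update_rulesets_async (GCancellable        *cancellable,
                                   GAsyncReadyCallback  callback,
                                   gpointer             user_data);

static void
https_everywhere_update_cb (GObject *source, GAsyncResult *result, gpointer user_data)
{
  GObject *shell = G_OBJECT (user_data);

  /* ... handle the result, notify the web extensions, etc. ... */

  g_object_unref (shell); /* balances the ref taken when starting the update */
}

static void
start_update (GObject *shell, GCancellable *cancellable)
{
  update_rulesets_async (cancellable,
                         https_everywhere_update_cb,
                         g_object_ref (shell)); /* keep @shell alive until the callback runs */
}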

@@ +790,3 @@
+                             ephy_embed_shell_ensure_adblock_data_dir (shell),
+                             private_profile,
+                             priv->adblock_updated && priv->https_everywhere_updated);

OK, this is the one major problem I see in your patch. The HTTPSEverywhereUpdater just updates HTTPS filters from the Internet, which is not required at all to use the library. We don't want to block the web extension on this because it could take a very long time to download a large amount of data (~7-8 MiB, that's 5-6 seconds at 10 Mbps). You didn't notice this because ruleset updates are rare (once a month or so) but we don't want to block the web process that long when it occurs. What we do need to block on is the process of reading the existing filters from disk into memory (https_everywhere_context_init), but since that no longer happens in the UI process, we can simplify things here. There's no particular justification for running the HTTPSEverywhereUpdater when starting Epiphany; I just picked startup because it's one particular time for it to happen, the changes do not actually (currently) take effect until the next time you reopen Epiphany. And libhttpseverywhere writes the ruleset file atomically with g_file_set_contents, so looks like we don't need to worry about what happens if the update is being written while a web process is initializing libhttpseverywhere, it should be safe. So EphyEmbedShell should not use priv->https_everywhere_updated at all to decide if the UriTester is updated. In fact, you shouldn't need this variable at all. And that will simplify some of the code earlier in this file. We could wind up with some processes using an older ruleset than others, but that's not any problem IMO.

::: embed/ephy-uri-tester.c
@@ +572,2 @@
 static gboolean
 ephy_uri_tester_test_uri (EphyUriTester *tester,

Now would be a good time to rename this to should_block_uri or something like that.

@@ +611,2 @@
+  /* Note that if this were not fatal, we would need some way to ensure
+   * that future pending requests would not get stuck forever. */

This comment is stale, because we can't have pending requests after this patch. I guess anyone who might decide to remove the crash here in the future would think about the error path that would be needed, so I would just remove this comment.

@@ +613,2 @@
   if (error) {
+    if (!g_error_matches (error, G_IO_ERROR, G_IO_ERROR_CANCELLED))

Actually, since you don't pass a cancellable, you don't have to worry about this condition at all and should just remove it. That will save you from worrying about my next comment, which is that you try to use https_everywhere_context_rewrite even in the case where the operation is cancelled, which would cause a critical warning.

@@ +629,1 @@
+  context = g_main_context_new ();

You need to unref this.

@@ +785,3 @@
+
+  task = g_task_new (tester, NULL, NULL, NULL);
+  g_task_run_in_thread_sync (task, (GTaskThreadFunc)ephy_uri_tester_load_sync);

Why do you need to use GTask and a secondary thread here, instead of just calling the function directly? I do see that the goal is to perform two async operations at the same time without locking and roll them into a sync operation -- that makes sense to me -- but why can't you do it in the main thread instead of blocking the main thread on a secondary thread?

::: embed/web-extension/ephy-web-extension.c
@@ +211,3 @@
+    /* Wait for the UI process to finish the update */
+    while (!extension->uri_tester_updated)
+      g_main_context_iteration (NULL, TRUE);

Hm... the risk here is that WebKit probably expects that it "owns" the default main context in the web process, right? It could break assumptions in WebKit that code will not be called until a return to the main loop. I guess we just have to risk it, since I don't see another way, but it feels like a bad idea.

What happens if the UI process crashes before this point, how does the web process die? I think it's OK if it dies badly, just need to be confident it doesn't keep running forever churning CPU here; it's not obvious to me what will happen.

@@ +222,3 @@
+    }
+  } else if (!modified_uri) {
+    return FALSE;

I would add a comment here to indicate that this prevents the load; the return-by-bool semantics are not at all clear unless you're familiar with the WebKit API.

::: embed/ephy-uri-tester.h
@@ +25,3 @@
 G_BEGIN_DECLS
 
+#define ADBLOCK_FILTER_URL "https://easylist-downloads.adblockplus.org/easylist.txt"

Is it easy to update this patch to not remove support for multiple filters? Besides making my life easier to add EasyPrivacy, it'll also be easier to develop the user interface for filter configuration if we keep the backend support now. But since we don't have anything using that feature currently, I understand if you need to remove it for now....
Comment 20 Michael Catanzaro 2016-11-26 20:39:27 UTC
(In reply to Michael Catanzaro from comment #19)
> It looks like you should also never call ephy_embed_shell_adblock_updated in
> that case, right?

Sorry, ignore that; I see that it's wrong, I just forgot to remove it from my review.
Comment 21 Michael Catanzaro 2016-11-26 21:38:16 UTC
Comment on attachment 340811 [details] [review]
uri-tester: Stop pretending we support multiple filters

(Still not pleased with this, but I won't block it.)
Comment 22 Carlos Garcia Campos 2016-11-27 08:28:23 UTC
(In reply to Michael Catanzaro from comment #19)
> Review of attachment 340813 [details] [review] [review]:
> 
> Whew, this was a hard review. Thank you very much for working on this. It
> looks mostly excellent as usual, but there are some tricky cases that need
> to be considered.
> 
> There is a typo "examnple" -> "example" in the commit message.

Oops, only one typo this time, it's not that bad for a Saturday evening. :-)

> ::: embed/ephy-embed-shell.c
> @@ +106,3 @@
> +  if (priv->uri_tester_update_cancellable) {
> +    g_cancellable_cancel (priv->uri_tester_update_cancellable);
> +    g_object_unref (priv->uri_tester_update_cancellable);
> 
> This is dispose, not finalize. I see that it should never be used after it's
> cancelled, but still, let's not play with fire: play it safe and use
> g_clear_object here.

Right, need to null this.

> @@ +127,3 @@
>    g_clear_object (&priv->dbus_server);
>  
> +  g_free (priv->adblock_data_dir);
> 
> And this belongs in finalize. Or you could use g_clear_pointer, but I much
> prefer to create finalize and move it there. I think you taught me this. :)

Right again, I'll add a finalize.

> @@ +678,3 @@
> +
> +  if (!g_file_copy_finish (src, result, &error)) {
> +    LOG ("Error retrieving filter: %s\n", error->message);
> 
> Don't use LOG for this, that only prints in debug mode (non-default
> configure option). Use g_warning. Also be sure to check the error with
> g_error_matches and not print the warning if it's G_IO_ERROR_CANCELLED. It
> looks like you should also never call ephy_embed_shell_adblock_updated in
> that case, right?

I'm not actually using LOG, this is copy pasted, but yes, I'll change to use g_warning.

> @@ +739,3 @@
> +                                     priv->uri_tester_update_cancellable,
> +                                    
> (GAsyncReadyCallback)https_everywhere_update_cb,
> +                                     shell);
> 
> What happens if the shell is disposed before this is called? Looks like we
> wind up looping through our EphyWebExtensionProxys in
> ephy_embed_shell_uri_tester_updated halfway through dispose... this looks
> risky, are you sure it's safe? Should we ref here and unref in the callback?

I just forgot to ref/unref which is indeed what we do for the adblock filters too.

> @@ +790,3 @@
> +                             ephy_embed_shell_ensure_adblock_data_dir
> (shell),
> +                             private_profile,
> +                             priv->adblock_updated &&
> priv->https_everywhere_updated);
> 
> OK, this is the one major problem I see in your patch. The
> HTTPSEverywhereUpdater just updates HTTPS filters from the Internet, which
> is not required at all to use the library. We don't want to block the web
> extension on this because it could take a very long time to download a large
> amount of data (~7-8 MiB, that's 5-6 seconds at 10 Mbps). You didn't notice
> this because ruleset updates are rare (once a month or so) but we don't want
> to block the web process that long when it occurs. What we do need to block
> on is the process of reading the existing filters from disk into memory
> (https_everywhere_context_init), but since that no longer happens in the UI
> process, we can simplify things here. There's no particular justification
> for running the HTTPSEverywhereUpdater when starting Epiphany; I just picked
> startup because it's one particular time for it to happen, the changes do
> not actually (currently) take effect until the next time you reopen
> Epiphany. And libhttpseverywhere writes the ruleset file atomically with
> g_file_set_contents, so looks like we don't need to worry about what happens
> if the update is being written while a web process is initializing
> libhttpseverywhere, it should be safe. So EphyEmbedShell should not use
> priv->https_everywhere_updated at all to decide if the UriTester is updated.
> In fact, you shouldn't need this variable at all. And that will simplify
> some of the code earlier in this file. We could wind up with some processes
> using an older ruleset than others, but that's not any problem IMO.

I'm not sure I understand all this. The current code is deferring all requests and blocking the UI process until both update and init have happened. The patch does exactly the same, with the only difference being that the UI process does the update and the web process does the init. What happens the first time init is called and update hasn't been called yet? Does init do the update itself? Or do you mean we should move the update to the web process as well? Even if the file is written atomically, several web processes might start the update at the same time, blocking all the web processes for 7-10 seconds, even if there's no risk of data corruption. I think it makes sense to update once in the UI process like we do for the adblocker and let the web process just load the files from disk without having to worry about whether they exist or not.

> ::: embed/ephy-uri-tester.c
> @@ +572,2 @@
>  static gboolean
>  ephy_uri_tester_test_uri (EphyUriTester *tester,
> 
> Now would be a good time to rename this to should_block_uri or something
> like that.

Yes, indeed.

> @@ +611,2 @@
> +  /* Note that if this were not fatal, we would need some way to ensure
> +   * that future pending requests would not get stuck forever. */
> 
> This comment is stale, because we can't have pending requests after this
> patch. I guess anyone who might decide to remove the crash here in the
> future would think about the error path that would be needed, so I would
> just remove this comment.
> 
> @@ +613,2 @@
>    if (error) {
> +    if (!g_error_matches (error, G_IO_ERROR, G_IO_ERROR_CANCELLED))
> 
> Actually, since you don't pass a cancellable, you don't have to worry about
> this condition at all and should just remove it. That will save you from
> worrying about my next comment, which is that you try to use
> https_everywhere_context_rewrite even in the case where the operation is
> cancelled, which would cause a critical warning.

Will remove that.

> @@ +629,1 @@
> +  context = g_main_context_new ();
> 
> You need to unref this.

Indeed. Good catch.

> @@ +785,3 @@
> +
> +  task = g_task_new (tester, NULL, NULL, NULL);
> +  g_task_run_in_thread_sync (task,
> (GTaskThreadFunc)ephy_uri_tester_load_sync);
> 
> Why do you need to use GTask and a secondary thread here, instead of just
> calling the function directly? I do see that the goal is to perform two
> async operations at the same time without locking and roll them into a sync
> operation -- that makes sense to me -- but why can't you do it in the main
> thread instead of blocking the main thread on a secondary thread?

I don't know how to block the main thread and use it at the same time.

> ::: embed/web-extension/ephy-web-extension.c
> @@ +211,3 @@
> +    /* Wait for the UI process to finish the update */
> +    while (!extension->uri_tester_updated)
> +      g_main_context_iteration (NULL, TRUE);
> 
> Hm... the risk here is that WebKit probably expects that it "owns" the
> default main context in the web process, right? It could break assumptions
> in WebKit that code will not be called until a return to the main loop. I
> guess we just have to risk it, since I don't see another way, but it feels
> like a bad idea.

I don't know what you mean by "owns", but all WebKit operations are sources that will be dispatched anyway.

> What happens if the UI process crashes before this point, how does the web
> process die? I think it's OK if it dies badly, just need to be confident it
> doesn't keep running forever churning CPU here; it's not obvious to me what
> will happen.

In ConnectionUnix.cpp something will fail with a broken pipe, or a read will return 0 bytes because there's no other end, and Connection::connectionDidClose() is called. That invalidates the connection and notifies the client, which in this particular case is the WebProcess, so WebProcess::didClose() closes all pages, and when the last page is closed ChildProcess::enableTermination() starts the termination timer. WebProcess::didClose() also stops the main loop, but it doesn't actually finish. When the termination timer fires, WebProcess::shouldTerminate() is called, which tries to ask the UI process with a sync message Messages::WebProcessProxy::ShouldTerminate(). Since the UI process is dead, Connection::sendSyncMessage() will return early because it was invalidated, but before returning it calls Connection::didFailToSendSyncMessage(), which calls exit(0).

> @@ +222,3 @@
> +    }
> +  } else if (!modified_uri) {
> +    return FALSE;
> 
> I would add a comment here to indicate that this prevents the load; the
> return-by-bool semantics are not at all clear unless you're familiar with
> the WebKit API.

Yes, it must be confusing because it's the other way around: returning FALSE doesn't prevent the load. Note that this is not a semantic of the WebKit API but of GObject in general. WebKitWebPage::send-request is a signal using g_signal_accumulator_true_handled, which normally means: TRUE -> I've handled the signal, do not propagate further or do the default thing; FALSE -> I haven't handled the signal, propagate further or do the default thing. In this case the default is to just load the request normally.
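
A sketch of those semantics in a send-request handler; the two helper functions are hypothetical and stand in for the URI tester calls:

#include <webkit2/webkit-web-extension.h>

/* Hypothetical helpers standing in for the URI tester. */
static gboolean should_block_uri (const char *uri);
static char    *maybe_rewrite_to_https (const char *uri);

static gboolean
send_request_cb (WebKitWebPage     *web_page,
                 WebKitURIRequest  *request,
                 WebKitURIResponse *redirected_response,
                 gpointer           user_data)
{
  const char *uri = webkit_uri_request_get_uri (request);
  char *rewritten;

  if (should_block_uri (uri))
    return TRUE;  /* TRUE = handled: emission stops and the load is refused */

  rewritten = maybe_rewrite_to_https (uri);
  if (rewritten) {
    webkit_uri_request_set_uri (request, rewritten);
    g_free (rewritten);
  }

  return FALSE;   /* FALSE = not handled: the (possibly rewritten) request loads normally */
}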

> ::: embed/ephy-uri-tester.h
> @@ +25,3 @@
>  G_BEGIN_DECLS
>  
> +#define ADBLOCK_FILTER_URL
> "https://easylist-downloads.adblockplus.org/easylist.txt"
> 
> Is it easy to update this patch to not remove support for multiple filters?
> Besides making my life easier to add EasyPrivacy, it'll also be easier to
> develop the user interface for filter configuration if we keep the backend
> support now. But since we don't have anything using that feature currently,
> I understand if you need to remove it for now....

I have another question: EasyPrivacy will also use a single filter list, right? Because in that case I don't think we want the current code either; it is too generic. We don't need a file to keep the list of filters and read/write it every time, we can just download the EasyPrivacy file directly, the same way we do for the adblock one. Doing that would be very easy on top of these patches, I guess.
Comment 23 Carlos Garcia Campos 2016-11-27 09:23:08 UTC
(In reply to Michael Catanzaro from comment #18)
> (In reply to Carlos Garcia Campos from comment #16)
> > Note that even if the Ui process is never blocked now, you can switch to
> > another tab or close it, the https everywhere still blocks the web process,
> > making scrolling quite laggish in some pages, especially pages that load
> > contents while scrolling. And that happens when everything has been loaded.
> > So, apart from making libhttpseverywhere faster if possible,
> 
> Could you please file a separate bug for this against libhttpseverywhere in
> GNOME Bugzilla.
> 
> > we should not
> > try to rewrite subresources if the main resource was not rewritten, for
> > example.
> 
> OK. Want to file a new bug for this too, against Epiphany? You can assign me
> if you don't want to work on it yourself.
> 
> > And we should definitely add a setting to disable it, or add UI to
> > white list pages or whatever, because this is making ephy unusable with some
> > websites.
> 
> I think the reason I didn't notice any performance problems is that I'm
> using debug WebKit with Ephy master and performance is atrocious always. :)
> Are you able to share a website that is broken by this, for testing?

It happens with many sites not using https, but in particular it was very annoying to browse gear4music.es. I've taken some numbers:

 - https_everywhere_context_init usually takes around 0.8 seconds here.
 - adblock usually takes 0.4 seconds to load here.
 - https_everywhere_context_rewrite takes between 0.01 and 0.2 seconds. The times are quite different, but I would say it's an average of 0.06.
 - ephy_uri_tester_block_uri takes between 0.0001 and 0.0005.

Adblock loading is expected to be faster because the file is smaller, but I think https_everywhere_context_rewrite should be faster too. Are the regexps used by HTTPS Everywhere more complex than the ones used by the adblocker? Isn't it just a host lookup? I guess it's not that easy :-P Anyway, the major problem is probably that we keep trying the same host again and again.
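
For the record, numbers like these can be gathered by wrapping the call with g_get_monotonic_time(); this is only an illustrative sketch, not the actual instrumentation used:

#include <glib.h>

/* Hypothetical helper standing in for the rewrite call being timed. */
static char *do_rewrite (const char *uri);

static char *
timed_rewrite (const char *uri)
{
  gint64 start = g_get_monotonic_time ();
  char *result = do_rewrite (uri);
  double elapsed = (g_get_monotonic_time () - start) / (double) G_USEC_PER_SEC;

  g_debug ("rewrite of %s took %.4f s", uri, elapsed);
  return result;
}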

> I am hoping to avoid the need for a setting if possible. If you're seeing
> TLS errors, that should not be too hard to fix: that's bug #775146, we just
> need to remember that we rewrote. But I have a report that some unspecified
> website is rewritten to https:// but doesn't listen on 443 anymore at all;
> there will surely be more such sites. This is trickier; I guess we need to
> remember that we rewrote the URL and try http:// next if the first load
> fails. But if the connection hangs, then we have a big problem that I would
> not know how to solve; that would call into question whether we can enable
> HTTPS Everywhere by default at all. Ditto for poor performance after the
> page is loaded; I'm not sure what we can do about that besides just disable
> HTTPS Everywhere if we can't make it faster.

I'm not seeing TLS errors. I think all other browsers allow disabling it even without uninstalling the extension, so I think we should either add a setting or allow disabling it for particular websites.
Comment 24 Carlos Garcia Campos 2016-11-27 10:03:34 UTC
Created attachment 340835 [details] [review]
uri-tester: Never block the UI process main thread
Comment 25 Michael Catanzaro 2016-11-27 16:25:46 UTC
> I'm not sure I understand all this. Current code is deferring all requests
> and blocking the UI process until both update and init has happened.

No, you're misreading it: the current code only waits for init, not for update. Actually it processes the deferred requests one line above where it starts the update. :)

> I think it makes sense to update one in the UI
> process like we do for adblocker and let the web process just load the files
> from disk without having to worry about whether it exists or not.

Rulesets are guaranteed to exist on disk. A set comes packaged with libhttpseverywhere, and the updater just downloads a second copy to your home directory if it finds a newer one. So yes: it's better to do it in the UI process -- that way we don't have 50 web processes downloading the same file redundantly -- but tell the web process that the URI tester is updated only based on the adblock filters and not the HTTPS Everywhere update. As long as we don't do it in the web process, it doesn't matter when it happens. If we want to go crazy we could install a separate binary to just update the rulesets and call it from a systemd timer unit, and that'd be fine.

> > @@ +785,3 @@
> > +
> > +  task = g_task_new (tester, NULL, NULL, NULL);
> > +  g_task_run_in_thread_sync (task,
> > (GTaskThreadFunc)ephy_uri_tester_load_sync);
> > 
> > Why do you need to use GTask and a secondary thread here, instead of just
> > calling the function directly? I do see that the goal is to perform two
> > async operations at the same time without locking and roll them into a sync
> > operation -- that makes sense to me -- but why can't you do it in the main
> > thread instead of blocking the main thread on a secondary thread?
> 
> I don't know how to block the main thread and use it at the same time.

I mean: why not simply call ephy_uri_tester_load_sync and not use GTask at all? It looks unneeded...?

> > ::: embed/web-extension/ephy-web-extension.c
> > @@ +211,3 @@
> > +    /* Wait for the UI process to finish the update */
> > +    while (!extension->uri_tester_updated)
> > +      g_main_context_iteration (NULL, TRUE);
> > 
> > Hm... the risk here is that WebKit probably expects that it "owns" the
> > default main context in the web process, right? It could break assumptions
> > in WebKit that code will not be called until a return to the main loop. I
> > guess we just have to risk it, since I don't see another way, but it feels
> > like a bad idea.
> 
> I don't know what you mean by "owns", but all WebKit operations are sources
> that will be dispatched anyway.

Often applications have code like this:

 do_something();
 have_library_do_something();
 do_something_else();

where do_something_else() has the constraint that nothing must iterate the default main context between the call to do_something() and do_something_else(). It's hard to think of specific examples, but I have written code like this before, where the code is safe because I know random library function calls don't cause sources to unexpectedly dispatch elsewhere in my application. Now, reverse the semantics of application <-> library so that WebKit is the "application" in the web process and the web extension is the "library". That's what I mean when I say it could cause unpredictable bugs if WebKit is not prepared to handle random sources being dispatched during send_request. Anyway, I don't see any way to avoid this and I guess it will probably be fine; let's just hope for the best.

> > @@ +222,3 @@
> > +    }
> > +  } else if (!modified_uri) {
> > +    return FALSE;
> > 
> > I would add a comment here to indicate that this prevents the load; the
> > return-by-bool semantics are not at all clear unless you're familiar with
> > the WebKit API.
> 
> Yes, it must be confusing because it's the other way around, returning FALSE
> doesn't prevent the load.

Ooops. :) Looking more closely at this knowing that, I got even more confused and thought you had a mistake there. I do see that your code is right, but the order of the conditionals is confusing. You should move the conditional containing LOG ("Refused to load %s") up inside the conditional that tests (flags & EPHY_URI_TEST_ADBLOCK) || (flags & EPHY_URI_TEST_HTTPS_EVERYWHERE) because it can never be reached unless that condition is true.

> > ::: embed/ephy-uri-tester.h
> > @@ +25,3 @@
> >  G_BEGIN_DECLS
> >  
> > +#define ADBLOCK_FILTER_URL
> > "https://easylist-downloads.adblockplus.org/easylist.txt"
> > 
> > Is it easy to update this patch to not remove support for multiple filters?
> > Besides making my life easier to add EasyPrivacy, it'll also be easier to
> > develop the user interface for filter configuration if we keep the backend
> > support now. But since we don't have anything using that feature currently,
> > I understand if you need to remove it for now....
> 
> I have another question. The EasyPrivacy will use also a single filter list,
> right? Because in that case I don't think we want the current code either,
> that is too generic. We don't need a file to keep the list of filters and
> read/write it every time, we can just download the EasyPrivacy file directly
> the same way we do for the adblock one. Doing that would be very easy on top
> of these patches I guess.

Yes, that would indeed work for EasyPrivacy. I guess that's fine for now. But when we expose UI, we'd want to support an arbitrary number of filtersets. By default we'd want to add country-specific filters for the current locale and any other locales that are set in the preferences dialog.
Comment 26 Michael Catanzaro 2016-11-27 16:38:55 UTC
(In reply to Carlos Garcia Campos from comment #23)
> It happens with many sites not using https, but in particular it was very
> annoying to browse gear4music.es. I've taken some numbers:
> 
>  - https_everywhere_context_init usually takes around 0.8 seconds here.
>  - adblock usually takes 0.4 seconds to load here.
>  - https_everywhere_context_rewrite takes between 0.01 and 0.2 seconds. The
> times are quite different, but I would say it's an average of 0.06.
>  - ephy_uri_tester_block_uri takes between 0.0001 and 0.0005.
> 
> Adblock loading is expected to be faster because the file is smaller, but
> why I think https_everywhere_context_rewrite should be faster I think, are
> the regexp used by https everywhere mor complex than the ones used for the
> adblocker? Isn't just checking if a host look up? I guess it's not easy :-P
> Anyway, the major problem is probably that we keep trying for the same host
> again and again.

It's probably not optimized at all. I bet we could gain a good speedup by having the library cache recently-resolved hosts and skip all the regex parsing if it has a hit. My question is why does it affect performance even after the page has already been loaded? Is it sending a bunch of requests with JavaScript?

> I'm not seeing TLS errors. I think all other browser allow to disable it
> even without uninstalling the extension, so I think we should either add a
> setting or allow to disable it for particular websites.

I think:

 * We of course only need the preference if there is some gain from disabling it.
 * If the main problem right now is performance, then let's see if we can fix that and only add a preference if we really can't. +0.2s for each subresource is clearly way too much.
 * If we do need to have a preference, it probably should not be enabled by default.
Comment 27 Michael Catanzaro 2016-11-27 17:24:10 UTC
Ugh, this is why it's slow:

https://www.eff.org/https-everywhere/rulesets#rules-and-regular-expressions

It's not as simple as http:// -> https://. Actually that totally breaks our logic to avoid loops like planet.gnome.org, where we both assumed all it ever did was change http:// -> https://, but I guess any other breakage should be quite rare.

And much worse: here is why it's impossible to build a host cache or reduce the number of queries per host:

https://www.eff.org/https-everywhere/rulesets#exclusions

So we can't reduce calls to rewrite, because they really are per-URL and not per-host. This is really unfortunate. We've done a lot of work on HTTPS Everywhere to give up on it now, but this feels close to a total dead end, unless we can make rewrite() much faster.
Comment 28 Michael Catanzaro 2016-11-27 17:34:04 UTC
(In reply to Carlos Garcia Campos from comment #16)
> we should not
> try to rewrite subresources if the main resource was not rewritten, for
> example.

We certainly apply this rule at the Epiphany level, though. That would fix the performance problem only on websites that do not trigger use of libhttpseverywhere, which is the general case.

I guess we could rationalize that websites that need to be rewritten deserve to be slow, but not really: the HTTPS Everywhere rewrite of course occurs before websites have any chance to redirect to https themselves, so webpages could be penalized for the existence of legacy rulesets.
Comment 29 Carlos Garcia Campos 2016-11-28 07:20:28 UTC
(In reply to Michael Catanzaro from comment #25)
> > I'm not sure I understand all this. Current code is deferring all requests
> > and blocking the UI process until both update and init has happened.
> 
> No, you're misreading it: the current code only waits for init, not for
> update. Actually it processes the deferred requests one line above where it
> starts the update. :)
> 
> > I think it makes sense to update one in the UI
> > process like we do for adblocker and let the web process just load the files
> > from disk without having to worry about whether it exists or not.
> 
> Rulesets are guaranteed to exist on disk. A set comes packaged with
> libhttpseverywhere, and the updater just downloads a second copy to your
> home directory if it finds a newer one.

Ah! I didn't know that; that's why libhttpseverywhere iterates over several directories on init looking for rule files instead of just reading one file, I guess.

> So yes: it's better to do it in the
> UI process -- that way we don't have 50 web processes downloading the same
> file redundantly -- but tell the web process that the URI tester is updated
> only based on the adblock filters and not the HTTPS Everywhere update. As
> long as we don't do it in the web process, it doesn't matter when it
> happens. If we want to go crazy we could install a separate binary to just
> update the rulesets and call it from a systemd timer unit, and that'd be
> fine.

I understand it now :-) I guess, though, that a context already created will not be updated when the update finishes, so only new tabs will take the updated rules, right?

> > > @@ +785,3 @@
> > > +
> > > +  task = g_task_new (tester, NULL, NULL, NULL);
> > > +  g_task_run_in_thread_sync (task,
> > > (GTaskThreadFunc)ephy_uri_tester_load_sync);
> > > 
> > > Why do you need to use GTask and a secondary thread here, instead of just
> > > calling the function directly? I do see that the goal is to perform two
> > > async operations at the same time without locking and roll them into a sync
> > > operation -- that makes sense to me -- but why can't you do it in the main
> > > thread instead of blocking the main thread on a secondary thread?
> > 
> > I don't know how to block the main thread and use it at the same time.
> 
> I mean: why not simply call ephy_uri_tester_load_sync and not use GTask at
> all? It looks unneeded...?

GTask just makes it very convenient; otherwise I'd have to create the thread myself (better to use the GTask pool) and do the mutex/condition dance to wait for the thread to finish. GTask does all that for me.

> > > ::: embed/web-extension/ephy-web-extension.c
> > > @@ +211,3 @@
> > > +    /* Wait for the UI process to finish the update */
> > > +    while (!extension->uri_tester_updated)
> > > +      g_main_context_iteration (NULL, TRUE);
> > > 
> > > Hm... the risk here is that WebKit probably expects that it "owns" the
> > > default main context in the web process, right? It could break assumptions
> > > in WebKit that code will not be called until a return to the main loop. I
> > > guess we just have to risk it, since I don't see another way, but it feels
> > > like a bad idea.
> > 
> > I don't know what you mean by "owns", but all WebKit operations are sources
> > that will be dispatched anyway.
> 
> Often applications have code like this:
> 
>  do_something();
>  have_library_do_something();
>  do_something_else();
> 
> where do_something_else() has the constraint that nothing must iterate the
> default main context between the call to do_something() and
> do_something_else(). It's hard to think of specific examples, but I have
> written code like this before, where the code is safe because I know random
> library function calls don't cause sources to unexpectedly dispatch
> elsewhere in my application. Now, reverse the semantics of application <->
> library so that WebKit is the "application" in the web process and the web
> extension in the "library". That's what I mean when I say it could cause
> unpredictable bugs if WebKit is not prepared to handle random sources being
> dispatched during send_request. Anyway, I don't see any way to avoid this
> and I guess it will probably be fine, let's just hope for the best.

I don't think that's a problem in WebKit; there are a lot of sources that can be dispatched at any time, since that's what we use to send tasks from secondary threads and work queues to the main thread.

> > > @@ +222,3 @@
> > > +    }
> > > +  } else if (!modified_uri) {
> > > +    return FALSE;
> > > 
> > > I would add a comment here to indicate that this prevents the load; the
> > > return-by-bool semantics are not at all clear unless you're familiar with
> > > the WebKit API.
> > 
> > Yes, it must be confusing because it's the other way around, returning FALSE
> > doesn't prevent the load.
> 
> Ooops. :) Looking more closely at this knowing that, I got even more
> confused and thought you had a mistake there. I do see that your code is
> right, but the order of the conditionals is confusing. You should move the
> conditional containing LOG ("Refused to load %s") up inside the conditional
> that tests (flags & EPHY_URI_TEST_ADBLOCK) || (flags &
> EPHY_URI_TEST_HTTPS_EVERYWHERE) because it can never be reached unless that
> condition is true.

You are right.

> > > ::: embed/ephy-uri-tester.h
> > > @@ +25,3 @@
> > >  G_BEGIN_DECLS
> > >  
> > > +#define ADBLOCK_FILTER_URL
> > > "https://easylist-downloads.adblockplus.org/easylist.txt"
> > > 
> > > Is it easy to update this patch to not remove support for multiple filters?
> > > Besides making my life easier to add EasyPrivacy, it'll also be easier to
> > > develop the user interface for filter configuration if we keep the backend
> > > support now. But since we don't have anything using that feature currently,
> > > I understand if you need to remove it for now....
> > 
> > I have another question. The EasyPrivacy will use also a single filter list,
> > right? Because in that case I don't think we want the current code either,
> > that is too generic. We don't need a file to keep the list of filters and
> > read/write it every time, we can just download the EasyPrivacy file directly
> > the same way we do for the adblock one. Doing that would be very easy on top
> > of these patches I guess.
> 
> Yes, that would indeed work for EasyPrivacy. I guess that's fine for now.
> But when we expose UI, we'd want to support an arbitrary number of
> filtersets. By default we'd want to add country-specific filters for the
> current locale and any other locales that are set in the preferences dialog.

I don't doubt we might want that functionality in the future, but I don't think we want the current (dead) code for that.
Comment 30 Carlos Garcia Campos 2016-11-28 07:26:30 UTC
(In reply to Michael Catanzaro from comment #28)
> (In reply to Carlos Garcia Campos from comment #16)
> > we should not
> > try to rewrite subresources if the main resource was not rewritten, for
> > example.
> 
> We certainly apply this rule at the Epiphany level, though. That would fix
> the performance problem only on websites that do not trigger use of
> libhttpseverywhere, which is the general case.

Yes, for sites where the rule applies, the main resource is redirected to https and then all other resources are already https, so it's not a problem. There could be an external resource using http, but that's nothing like running the rules for every single subresource as we do for non-https sites.

> I guess we could rationalize that websites that need to be rewritten deserve
> to be slow, but not really: the HTTPS Everywhere rewrite of course occurs
> before websites have any chance to redirect to https themselves, so webpages
> could be penalized for the existence of legacy rulesets.
Comment 31 Carlos Garcia Campos 2016-11-28 07:31:52 UTC
(In reply to Michael Catanzaro from comment #27)
> Ugh, this is why it's slow:
> 
> https://www.eff.org/https-everywhere/rulesets#rules-and-regular-expressions
> 
> It's not as simple as http:// -> https://. Actually that totally breaks our
> logic to avoid loops like planet.gnome.org, where we both assumed all it
> ever did was change http:// -> https://, but I guess any other breakage
> should be quite rare.
> 
> And much worse: here is why it's impossible to build a host cache or reduce
> the number of queries per host:
> 
> https://www.eff.org/https-everywhere/rulesets#exclusions
> 
> So we can't reduce calls to rewrite, because they really are per-URL and not
> per-host. This is really unfortunate. We've done a lot of work on HTTPS
> Everywhere to give up on it now, but this feels close to a total dead end,
> unless we can make rewrite() much faster.

Reading the Chromium extension code (which is surprisingly easier to follow than the Vala code of libhttpseverywhere, even though I have no idea of JavaScript), it seems that there's a set of rules that can apply to a particular host, and you can at least cache that. See:

https://github.com/EFForg/https-everywhere/blob/master/chromium/rules.js#L241

If it's not so slow in Firefox and Chromium, we should be able to make it at least as fast as them. Maybe we can extend WebKit content extensions to do this in the future and move it to WebKit.
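
A rough sketch of that per-host caching idea, mirroring what the Chromium extension does; the RuleSet type and the lookup helper are hypothetical, and the per-URL exclusions would still have to be checked afterwards:

#include <glib.h>

typedef struct _RuleSet RuleSet;

/* Hypothetical: computes the (possibly empty) list of rulesets whose targets
 * match @host, which is the expensive part worth caching. */
static GList *compute_applicable_rulesets (const char *host);

static GHashTable *ruleset_cache; /* host (char *) -> GList of RuleSet* */

static GList *
get_applicable_rulesets (const char *host)
{
  GList *rulesets;

  if (!ruleset_cache)
    ruleset_cache = g_hash_table_new_full (g_str_hash, g_str_equal, g_free, NULL);

  /* lookup_extended distinguishes "cached as empty" from "not cached yet". */
  if (g_hash_table_lookup_extended (ruleset_cache, host, NULL, (gpointer *) &rulesets))
    return rulesets; /* cache hit: skip the target matching entirely */

  rulesets = compute_applicable_rulesets (host);
  g_hash_table_insert (ruleset_cache, g_strdup (host), rulesets);
  return rulesets;
}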
Comment 32 Carlos Garcia Campos 2016-11-28 07:36:26 UTC
(In reply to Michael Catanzaro from comment #26)
> (In reply to Carlos Garcia Campos from comment #23)
> > It happens with many sites not using https, but in particular it was very
> > annoying to browse gear4music.es. I've taken some numbers:
> > 
> >  - https_everywhere_context_init usually takes around 0.8 seconds here.
> >  - adblock usually takes 0.4 seconds to load here.
> >  - https_everywhere_context_rewrite takes between 0.01 and 0.2 seconds. The
> > times are quite different, but I would say it's an average of 0.06.
> >  - ephy_uri_tester_block_uri takes between 0.0001 and 0.0005.
> > 
> > Adblock loading is expected to be faster because the file is smaller, but
> > why I think https_everywhere_context_rewrite should be faster I think, are
> > the regexp used by https everywhere mor complex than the ones used for the
> > adblocker? Isn't just checking if a host look up? I guess it's not easy :-P
> > Anyway, the major problem is probably that we keep trying for the same host
> > again and again.
> 
> It's probably not optimized at all. I bet we could gain a good speedup by
> having the library cache recently-resolved hosts and skip all the regex
> parsing if it has a hit. My question is why does it affect performance even
> after the page has already been loaded? Is it sending a bunch of requests
> with JavaScript?

The main issue is when the page is first loaded and all subresources are loading at the same time. Once the page is loaded, the problem is when new content is loaded, mainly XMLHttpRequests. It's quite common nowadays for sites to load more content when you scroll, so you start scrolling and at some point the scrolling gets stuck for a while.

> > I'm not seeing TLS errors. I think all other browser allow to disable it
> > even without uninstalling the extension, so I think we should either add a
> > setting or allow to disable it for particular websites.
> 
> I think:
> 
>  * We of course only need the preference if there is some gain from
> disabling it.
>  * If the main problem right now is performance, then let's see if we can
> fix that and only add a preference if we really can't. +0.2s for each
> subresource is clearly way too much.
>  * If we do need to have a preference, it probably should not be enabled by
> default.

Right now I would disable it if I had a setting for that.
Comment 33 Michael Catanzaro 2016-11-28 12:58:27 UTC
(In reply to Carlos Garcia Campos from comment #29)
> Ah! I didn't know that, that's why httpseverywhere iterates several
> directories on init looking for rule files instead of just reading one file,
> I guess.

Yes.

> I understand it now :-) I guess, though, that a context already created will
> not be updated when the update finishes, so only new tabs will take the
> updated rules, right?

Yes, and that's not a problem.

> GTask just makes it very convenient, otherwise I'll have to create the
> thread myself (better use the GTask pool) and do the mutex/condition too to
> wait for the thread to finish. GTask does all that for me.

I still don't understand: why do you need a thread at all? Why not call ephy_uri_tester_load_sync() on the main thread? Can't you push/pop the thread-default context on the main thread just as well? I think it should still work, and then you can get rid of the GTask.

(In reply to Carlos Garcia Campos from comment #30)
> Yes, for sites that the rule applies, the main resource is redirected to
> https and then all other resources are already https, so it's not a problem.
> There could be a external resource using http, but it's not like running the
> rules for every single subresource like for non-https sites.

Hm. I bet many sites hardcode http:// URLs, which HTTPS Everywhere can "fix" if used, but maybe we could get away with only using it for the main resource. That would almost completely fix this performance problem, right? That might be our way out of this... I think we should do that for now and see how well it works; if a website isn't smart enough to load subresources with https:// then we are no worse off than without HTTPS Everywhere, it means we can't have any performance problem with subresources, and we won't need a setting to disable it.
Comment 34 Carlos Garcia Campos 2016-11-28 13:25:43 UTC
(In reply to Michael Catanzaro from comment #33)
> (In reply to Carlos Garcia Campos from comment #29)
> > Ah! I didn't know that, that's why httpseverywhere iterates several
> > directories on init looking for rule files instead of just reading one file,
> > I guess.
> 
> Yes.
> 
> > I understand it now :-) I guess, though, that a context already created will
> > not be updated when the update finishes, so only new tabs will take the
> > updated rules, right?
> 
> Yes, and that's not a problem.
> 
> > GTask just makes it very convenient, otherwise I'll have to create the
> > thread myself (better use the GTask pool) and do the mutex/condition too to
> > wait for the thread to finish. GTask does all that for me.
> 
> I still don't understand: why do you need a thread at all? Why not call
> ephy_uri_tester_load_sync() on the main thread? Can't you push/pop the
> thread-default context on the main thread just as well? I think it should
> still work, and then you can get rid of the GTask.

g_task_run_in_thread_sync is what makes ephy_uri_tester_load_sync synchronous. If we want to use the main thread, we would need to use a nested main loop to make it sync. It's much easier to use GTask.
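
For reference, the pattern is basically this (a minimal sketch with invented names, not the real ephy_uri_tester_load_sync):

#include <gio/gio.h>

/* Hypothetical sketch: run the blocking load in a worker thread from the
 * GTask pool and wait for it. Names are invented for illustration and do
 * not match the real ephy-uri-tester.c. */
static void
load_in_thread (GTask        *task,
                gpointer      source_object,
                gpointer      task_data,
                GCancellable *cancellable)
{
  /* The blocking work (reading and parsing the filter file) happens here,
   * in a worker thread. */
  g_task_return_boolean (task, TRUE);
}

static gboolean
my_tester_load_sync (void)
{
  GTask *task = g_task_new (NULL, NULL, NULL, NULL);
  gboolean ok;

  /* Blocks the calling thread until load_in_thread has returned a result;
   * GTask does the thread creation and the mutex/condition internally. */
  g_task_run_in_thread_sync (task, load_in_thread);
  ok = g_task_propagate_boolean (task, NULL);
  g_object_unref (task);

  return ok;
}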

> (In reply to Carlos Garcia Campos from comment #30)
> > Yes, for sites that the rule applies, the main resource is redirected to
> > https and then all other resources are already https, so it's not a problem.
> > There could be an external resource using http, but it's not like running the
> > rules for every single subresource like for non-https sites.
> 
> Hm. I bet many sites hardcode http:// URLs, which HTTPS Everywhere can "fix"
> if used, but maybe we could get away with only using it for the main
> resource. That would almost completely fix this performance problem, right?
> That might be our way out of this... I think we should do that for now and
> see how well it works; if the website isn't smart enough to load
> subresources with https:// then we are no worse off than without HTTPS
> Everywhere; it means we can't have any performance problem with subresources,
> and we won't need a setting to disable it.

I don't think so, we should probably cache the result of the main resource and apply it only for the subresources.
Comment 35 Michael Catanzaro 2016-11-28 14:12:31 UTC
(In reply to Carlos Garcia Campos from comment #34)
> I don't think so, we should probably cache the result of the main resource
> and apply it only for the subresources.

I don't think that will work in general: the policies are defined per-URL, and so we really can't rewrite any particular URL without testing it against the rules for that host.

I do see that it's possible to cache the rules for each host. That's good, though of course that has to be done in libhttpseverywhere.
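
Something like this, conceptually (sketched here in C with invented names just to illustrate; the real cache would have to live in libhttpseverywhere, which is Vala):

#include <glib.h>

/* Hypothetical per-host cache: remember which rulesets apply to a host so
 * the full ruleset scan and regex matching run once per host, not once per
 * resource. Not the actual libhttpseverywhere API. */
typedef struct {
  /* created with g_hash_table_new_full (g_str_hash, g_str_equal, g_free,
   *                                     (GDestroyNotify) g_ptr_array_unref) */
  GHashTable *rulesets_by_host;
} RulesetCache;

static GPtrArray *
ruleset_cache_lookup (RulesetCache *cache,
                      const char   *host,
                      GPtrArray *(*resolve_rulesets) (const char *host))
{
  GPtrArray *rulesets = g_hash_table_lookup (cache->rulesets_by_host, host);

  if (!rulesets) {
    /* Slow path: match the host against all rulesets once, then remember
     * the result for the next lookup. */
    rulesets = resolve_rulesets (host);
    g_hash_table_insert (cache->rulesets_by_host, g_strdup (host), rulesets);
  }

  return rulesets;
}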
Comment 36 Carlos Garcia Campos 2016-11-29 07:07:05 UTC
(In reply to Michael Catanzaro from comment #35)
> (In reply to Carlos Garcia Campos from comment #34)
> > I don't think so, we should probably cache the result of the main resource
> > and apply it only for the subresources.
> 
> I don't think that will work in general: the policies are defined per-URL,
> and so we really can't rewrite any particular URL without testing it against
> the rules for that host.

It will work because it will not be slow; I don't care if the http->https rewrite doesn't work in that particular case. What I'm proposing is that we assume that if http://www.foo.com didn't match any rule, http://www.foo.com/bar won't match it either. Of course, it could be wrong, but I prefer to take that risk. But if there's an external resource, then let's try to rewrite it.
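
Roughly, it would be something like this (a sketch only, with hypothetical names; not the actual uri-tester code):

#include <glib.h>
#include <libsoup/soup.h>

/* Hypothetical negative cache: hosts whose main resource matched no HTTPS
 * Everywhere rule. Created elsewhere with
 * g_hash_table_new_full (g_str_hash, g_str_equal, g_free, NULL). */
static GHashTable *hosts_without_rules;

static gboolean
should_try_rewrite (const char *uri)
{
  SoupURI *soup_uri = soup_uri_new (uri);
  gboolean try_rewrite = TRUE;

  if (soup_uri) {
    const char *host = soup_uri_get_host (soup_uri);

    /* Assumption: if http://host matched no rule, http://host/path won't
     * match either, so skip the regexps entirely for its subresources. */
    if (host && g_hash_table_contains (hosts_without_rules, host))
      try_rewrite = FALSE;

    soup_uri_free (soup_uri);
  }

  return try_rewrite;
}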

> I do see that it's possible to cache the rules for each host. That's good,
> though of course that has to be done in libhttpseverywhere.

Yes.
Comment 37 Michael Catanzaro 2016-11-29 12:37:56 UTC
> What I'm proposing is that we assume
> that if http://www.foo.com didn't match any rule, http://www.foo.com/bar
> won't match it either. Of course, it could be wrong, but I prefer to take
> that risk. But if there's an external resource, then let's try to rewrite it.

Yes, that's totally fine.

What we cannot do is assume that if http://www.foo.com is rewritten, then http://www.foo.com/bar should be too. We have to always check the rule (could be from a future hypothetical ruleset cache) in this case, else we will break sites as per [1].

BTW grindhold says he thinks he might be able to speed up rewrite by switching to some tree-style data structure, but I think we should pursue the ruleset cache idea separately.

[1] https://www.eff.org/https-everywhere/rulesets#exclusions
Comment 38 Carlos Garcia Campos 2016-11-29 13:09:52 UTC
Created attachment 340970 [details] [review]
uri-tester: Never block the UI process main thread

New patch. I realized that my previous patch was very racy: passing dynamic info as init params to extensions and using DBus to notify the web extensions were both bad ideas. I was not happy with the main context iteration either.

So, the problem was that when opening ephy with a lot of tabs, it could happen that the update hadn't finished when the web process was forked, and finished right after that, but before the web extension proxy had been created. So, we had tabs loading forever, iterating the main loop like crazy, but never receiving the expected DBus message.

So, following the httpseverywhere idea, I thought we don't really need to notify the extensions at all. The uri tester can simply use the adblock file if it exists (if it's an old copy and a new one is being updated, it could be using old rules, like with httpseverywhere, not a big deal). Only when the file doesn't exist does it monitor it, waiting for the UI process to create it. That way the extension doesn't need to know anything, because the uri tester load will block until it has a filter file to read and parse.
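
Conceptually it is just this (a rough sketch with invented names, not the actual patch):

#include <gio/gio.h>

static void
filter_file_changed_cb (GFileMonitor      *monitor,
                        GFile             *file,
                        GFile             *other_file,
                        GFileMonitorEvent  event_type,
                        gpointer           user_data)
{
  if (event_type == G_FILE_MONITOR_EVENT_CREATED ||
      event_type == G_FILE_MONITOR_EVENT_CHANGES_DONE_HINT) {
    /* The UI process has written the filter file: read and parse it now. */
  }
}

/* Hypothetical sketch of the "wait for the filter file" idea: use the file
 * if it is already there, otherwise monitor it until the UI process creates
 * it. In real code the monitor would be kept as a member and disconnected
 * once the file has been parsed. */
static void
ensure_filter_file (GFile *filter_file)
{
  GFileMonitor *monitor;

  if (g_file_query_exists (filter_file, NULL)) {
    /* An old copy is fine even if an update is in progress: parse it. */
    return;
  }

  monitor = g_file_monitor_file (filter_file, G_FILE_MONITOR_NONE, NULL, NULL);
  if (monitor)
    g_signal_connect (monitor, "changed",
                      G_CALLBACK (filter_file_changed_cb), NULL);
}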

I've also brought back the async dbus connection to the web extension; I hadn't noticed that it was made sync too :-/
Comment 39 Michael Catanzaro 2016-11-29 13:57:53 UTC
(In reply to Carlos Garcia Campos from comment #38)
> Created attachment 340970 [details] [review] [review]
> uri-tester: Never block the UI process main thread
> 
> New patch. I realized that my previous patch was very racy: passing dynamic
> info as init params to extensions and using DBus to notify the web
> extensions were both bad ideas. I was not happy with the main context
> iteration either. 
> 
> So, the problem was that when opening ephy with a lot of tabs, it could
> happen that the update hadn't finished when the web process was forked, and
> finished right after that, but before the web extension proxy had been
> created. So, we had tabs loading forever, iterating the main loop like
> crazy, but never receiving the expected DBus message. 

Oops, I should have noticed this too....

> So, following the httpseverywhere idea, I thought we don't really need to
> notify the extensions at all. The uri tester can simply use the adblock file
> if it exists (if it's an old copy and a new one is being updated, it could
> be using old rules, like with httpseverywhere, not a big deal). Only when
> the file doesn't exist does it monitor it, waiting for the UI process to
> create it. That way the extension doesn't need to know anything, because the
> uri tester load will block until it has a filter file to read and parse.

Good idea.

> I've also brought back the async dbus connection to the web extension; I
> hadn't noticed that it was made sync too :-/

Oops, forgot about that. Yes, with this patch we can switch that back now.
Comment 40 Michael Catanzaro 2016-11-29 13:58:38 UTC
Comment on attachment 340811 [details] [review]
uri-tester: Stop pretending we support multiple filters

First patch: hesitant OK
Comment 41 Michael Catanzaro 2016-11-29 13:59:12 UTC
Comment on attachment 340812 [details] [review]
uri-tester: Do not rewrite HTTPS urls

Second patch: either meld it into the third patch, or fix both cases of g_str_has_prefix misuse here even though the second case is removed in the third patch.
Comment 42 Michael Catanzaro 2016-11-29 14:31:57 UTC
Review of attachment 340970 [details] [review]:

Excellent, thanks. Accepted with minor changes:

::: embed/ephy-embed-shell.c
@@ +105,3 @@
+    g_cancellable_cancel (priv->uri_tester_update_cancellable);
+    g_object_unref (priv->uri_tester_update_cancellable);
+    priv->uri_tester_update_cancellable = NULL;

This is fine, but I would use g_clear_object here as that's what it's made for. I don't mind the condition being checked an extra time.
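
i.e. just (both calls are NULL-safe, and g_clear_object also clears the pointer):

  g_cancellable_cancel (priv->uri_tester_update_cancellable);
  g_clear_object (&priv->uri_tester_update_cancellable);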

@@ +587,3 @@
+
+  g_object_unref (updater);
+  g_clear_object (&priv->https_everywhere_context);

You don't use this anywhere except to pass to HTTPSEverywhereUpdater, so it shouldn't be a member of the priv struct at all. Just unref it right after you pass it to https_everywhere_updater_new.

@@ +588,3 @@
+  g_object_unref (updater);
+  g_clear_object (&priv->https_everywhere_context);
+  g_object_unref (shell);

And then you don't need the EphyEmbedShell in this function at all. You can get rid of the ref/unref I suggested you add earlier and not pass any user_data to this function.

@@ +691,3 @@
+
+  if (!g_file_copy_finish (src, result, &error) ||
+      !g_file_move (data->tmp_file, data->filter_file, G_FILE_COPY_OVERWRITE, NULL, NULL, NULL, &error)) {

You should use a second g_file_copy_async and then g_file_delete_async rather than blocking the UI process on g_file_move.
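
i.e. roughly this (hypothetical callback names, just to show the chaining):

#include <gio/gio.h>

/* Sketch: copy tmp_file over the filter file asynchronously, then delete
 * the tmp file asynchronously, instead of calling g_file_move. */
static void
tmp_file_deleted_cb (GObject *source, GAsyncResult *result, gpointer user_data)
{
  g_file_delete_finish (G_FILE (source), result, NULL);
}

static void
tmp_file_copied_cb (GObject *source, GAsyncResult *result, gpointer user_data)
{
  GFile *tmp_file = G_FILE (source);

  if (g_file_copy_finish (tmp_file, result, NULL))
    g_file_delete_async (tmp_file, G_PRIORITY_DEFAULT, NULL,
                         tmp_file_deleted_cb, NULL);
}

/* ... after the download finished:
 * g_file_copy_async (tmp_file, filter_file, G_FILE_COPY_OVERWRITE,
 *                    G_PRIORITY_DEFAULT, NULL, NULL, NULL,
 *                    tmp_file_copied_cb, NULL);
 */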

@@ +695,3 @@
+
+    /* If failed to retrieve, create an empty file if it doesn't exist to unblock extensions */
+    stream = g_file_create (data->filter_file, G_FILE_CREATE_NONE, NULL, NULL);

And here you can use g_file_create_async. If it's inconvenient to create callbacks for all of these, that's just a sign that we need a C migration plan. :)

::: embed/ephy-uri-tester.c
@@ +613,1 @@
     g_error_free (error);

Don't free this error, since this code follows g_error which never returns. Doesn't GCC give you an unreachable code warning?

@@ +658,3 @@
+       g_signal_connect (monitor, "changed", G_CALLBACK (adblock_file_monitor_changed), tester);
+    } else {
+      g_warning ("Filed to monitor adblock file: %s\n", error->message);

Filed -> Failed

::: embed/web-extension/ephy-web-extension.c
@@ +224,3 @@
   }
 
+  if (!modified_uri)

It's better than what you had before, but one more improvement: use "else if" here, for clarity, since it can never be reached otherwise.
Comment 43 Michael Catanzaro 2016-11-29 14:35:00 UTC
(In reply to Michael Catanzaro from comment #42)
> And here you can use g_file_create_async. If it's inconvenient to create
> callbacks for all of these, that's just a sign that we need a C migration
> plan. :)

By the way, to troll you, I will point out that this is really easy to do with Vala lambdas: you can nest them three levels deep to have all the code written nicely and sequentially, and it will generate correct C and chain all the async operations together properly.
Comment 44 Michael Catanzaro 2016-11-29 14:36:21 UTC
(In reply to Michael Catanzaro from comment #43)
> By the way, to troll you, I will point out that this is really easy to do
> with Vala lambdas: you can nest them three levels deep to have all the code
> written nicely and sequentially, and it will generate correct C and chain all
> the async operations together properly.

(Of course it allows arbitrary levels of nesting. I guess four is really the number needed here: the first to copy from the web, the second to copy from the tmp file to the destination, the third to delete the temp file, and the fourth to create the empty file.)
Comment 45 Carlos Garcia Campos 2016-11-30 07:47:51 UTC
(In reply to Michael Catanzaro from comment #42)
> Review of attachment 340970 [details] [review] [review]:
> 
> Excellent, thanks. Accepted with minor changes:
> 
> ::: embed/ephy-embed-shell.c
> @@ +105,3 @@
> +    g_cancellable_cancel (priv->uri_tester_update_cancellable);
> +    g_object_unref (priv->uri_tester_update_cancellable);
> +    priv->uri_tester_update_cancellable = NULL;
> 
> This is fine, but I would use g_clear_object here as that's what it's made
> for. I don't mind the condition being checked an extra time.

Ok, I don't normally use g_clear_object when I've already null-checked it, but it's ok anyway.

> @@ +587,3 @@
> +
> +  g_object_unref (updater);
> +  g_clear_object (&priv->https_everywhere_context);
> 
> You don't use this anywhere except to pass to HTTPSEverywhereUpdater, so it
> shouldn't be a member of the priv struct at all. Just unref it right after
> you pass it to https_everywhere_updater_new.

Indeed! I had to read the generated GObject code of libhttpseverywhere to check that the updater keeps a reference to the context, so yes, we don't really need to keep our ref since we don't really use the context (well, reading the generated code I didn't see what the updater uses the context for, apart from keeping it alive). I also checked the https_everywhere_updater_update code: it uses g_simple_async_result_new, which takes a ref of the given source object, and it also keeps its own extra reference in the HttpsEverywhereUpdaterUpdateData. So we don't need to keep our reference either; we can just trust that the object will stay alive for the whole async operation, like with any other GIO async op.

> @@ +588,3 @@
> +  g_object_unref (updater);
> +  g_clear_object (&priv->https_everywhere_context);
> +  g_object_unref (shell);
> 
> And then you don't need the EphyEmbedShell in this function at all. You can
> get rid of the ref/unref I suggested you add earlier and not pass any
> user_data to this function.
> 
> @@ +691,3 @@
> +
> +  if (!g_file_copy_finish (src, result, &error) ||
> +      !g_file_move (data->tmp_file, data->filter_file,
> G_FILE_COPY_OVERWRITE, NULL, NULL, NULL, &error)) {
> 
> You should use a second g_file_copy_async and then g_file_delete_async
> rather than blocking the UI process on g_file_move.

I don't think so. In this case we know both files are local files, and GLocalFile implements move which ends up being a g_rename(). There's no need to make this code more complex for that.

> @@ +695,3 @@
> +
> +    /* If failed to retrieve, create an empty file if it doesn't exist to
> unblock extensions */
> +    stream = g_file_create (data->filter_file, G_FILE_CREATE_NONE, NULL,
> NULL);
> 
> And here you can use g_file_create_async. If it's inconvenient to create
> callbacks for all of these, that's just a sign that we need a C migration
> plan. :)

We know filter_file is a local file and we are not going to use the returned stream, so g_file_create in this case is just a g_open(). I could make it more obvious in both cases by just getting the paths and using g_rename/g_open directly; in this second case we would also avoid creating a GFileOutputStream for nothing. But using async API in these cases just makes the code more complex for no reason.
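
For example, something like this (just a sketch, most error handling omitted):

#include <fcntl.h>
#include <unistd.h>
#include <gio/gio.h>
#include <glib/gstdio.h>

/* Sketch of the "both files are local" shortcut: operate on the paths
 * directly. Assumes tmp_file and filter_file are on the same filesystem,
 * so the rename is atomic. Names are illustrative. */
static void
publish_filter_file (GFile *tmp_file, GFile *filter_file)
{
  char *tmp_path = g_file_get_path (tmp_file);
  char *filter_path = g_file_get_path (filter_file);

  if (g_rename (tmp_path, filter_path) == -1) {
    /* If that failed, at least make sure an (empty) filter file exists so
     * the web extensions are unblocked. */
    int fd = g_open (filter_path, O_WRONLY | O_CREAT, 0644);
    if (fd != -1)
      close (fd);
  }

  g_free (tmp_path);
  g_free (filter_path);
}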

> ::: embed/ephy-uri-tester.c
> @@ +613,1 @@
>      g_error_free (error);
> 
> Don't free this error, since this code follows g_error which never returns.
> Doesn't GCC give you an unreachable code warning?

Ok, I don't know; I think I only see warnings about gee things we don't use that I guess came from httpseverywhere. I wonder why we make this fatal, though; a g_warning would be better, no?

> @@ +658,3 @@
> +       g_signal_connect (monitor, "changed", G_CALLBACK
> (adblock_file_monitor_changed), tester);
> +    } else {
> +      g_warning ("Filed to monitor adblock file: %s\n", error->message);
> 
> Filed -> Failed
> 
> ::: embed/web-extension/ephy-web-extension.c
> @@ +224,3 @@
>    }
>  
> +  if (!modified_uri)
> 
> It's better than what you had before, but one more improvement: use "else
> if" here, for clarity, since it can never be reached otherwise.

Ok.
Comment 46 Michael Catanzaro 2016-11-30 17:12:40 UTC
(In reply to Carlos Garcia Campos from comment #45) 
> Indeed! I had to read the generated GObject code of libhttpseverywhere to
> check that the updater keeps a reference of the context, so yes we don't
> really need to keep our ref since we don't really use the context

It works the same as you'd expect for a GObject API. If you use Vala more, you'll have to check generated code less. :)

> (well,
> reading the generated code I didn't see what the updater uses the context
> for, apart from keeping it alive)

Um... well, that's just a bug: it used to automatically reinitialize the rulesets after it completed the update, but we removed that and forgot that it doesn't need a Context anymore at all. Except it's part of the API, so we can't really change it because we decided to make it a shared library instead of a submodule, and we might want it in the future; I guess it's just good practice to use the library context object everywhere anyway, so let's not change it.
 
> I don't think so. In this case we know both files are local files, and
> GLocalFile implements move which ends up being a g_rename(). There's no need
> to make this code more complex for that.

OK....

> > ::: embed/ephy-uri-tester.c
> > @@ +613,1 @@
> >      g_error_free (error);
> > 
> > Don't free this error, since this code follows g_error which never returns.
> > Doesn't GCC give you an unreachable code warning?
> 
> Ok, I don't know, I think I only see warnings about gee things we don't use
> that I guess came from httpseverywhere. I wonder why we make this fatal,
> though, a g_warning would be better, no?

Um... I think HTTPS Everywhere init should never fail; I'd rather have a crash report for that. But if you want it to be nonfatal, I'm fine with that; just make sure the code works in the error path (change _init to always throw an error, and make sure Ephy still works).