After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 675663 - Don't do blocking calls on main threads when collecting stats
Don't do blocking calls on main threads when collecting stats
Status: RESOLVED FIXED
Product: gnome-boxes
Classification: Applications
Component: general
unspecified
Other Linux
: Normal normal
: --
Assigned To: GNOME Boxes maintainer(s)
GNOME Boxes maintainer(s)
Depends on:
Blocks:
 
 
Reported: 2012-05-08 10:11 UTC by Alexander Larsson
Modified: 2016-03-31 13:56 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Avoid blocking call in disk stats (1.81 KB, patch)
2012-05-08 10:12 UTC, Alexander Larsson
committed Details | Review
Avoid blocking call in network stats (1.28 KB, patch)
2012-05-08 10:12 UTC, Alexander Larsson
committed Details | Review
Don't unnecessarily get info sync in stats collection (1.98 KB, patch)
2012-05-08 10:12 UTC, Alexander Larsson
rejected Details | Review
Turn LibvirtMachine.state from property to a method (2.10 KB, patch)
2012-05-08 10:12 UTC, Alexander Larsson
rejected Details | Review
Track the running machine state rather than doing sync calls (2.19 KB, patch)
2012-05-08 10:12 UTC, Alexander Larsson
reviewed Details | Review
libvirt-machine: Track the storage path so that we don't have to get it every time (2.95 KB, patch)
2012-05-08 10:12 UTC, Alexander Larsson
committed Details | Review
Add Util.run_in_thread (1.27 KB, patch)
2012-05-08 10:12 UTC, Alexander Larsson
committed Details | Review
Do stats collection calls async (2.48 KB, patch)
2012-05-08 10:12 UTC, Alexander Larsson
committed Details | Review
Track the running machine state rather than doing sync calls (2.38 KB, patch)
2012-05-08 13:08 UTC, Alexander Larsson
none Details | Review
Make use of domain signals to keep track of it's state (2.61 KB, patch)
2012-05-08 13:21 UTC, Zeeshan Ali
none Details | Review
Alternative event patch (3.17 KB, patch)
2012-05-08 14:34 UTC, Alexander Larsson
none Details | Review
Add MachineState enum and property to Machine, implement is_running over it (5.54 KB, patch)
2012-05-08 15:04 UTC, Alexander Larsson
committed Details | Review
Init Machine.state in constructor, not later (3.20 KB, patch)
2012-05-09 12:45 UTC, Alexander Larsson
committed Details | Review

Description Alexander Larsson 2012-05-08 10:11:43 UTC
We should never do sync calls on the main thread as that may block the UI.
Unfortunately we do this a lot when capturing statistics, which is particularly bad as it runs constantly, triggering UI hickups. 

I even regularly get a hang where the main thread gets blocked for 30 seconds until it timeouts when getting the stats. Leaving the whole desktop more or less locked up due to the keyboard grab.

Attaching a patch series that fixes this.
Comment 1 Alexander Larsson 2012-05-08 10:12:17 UTC
Created attachment 213649 [details] [review]
Avoid blocking call in disk stats

domain.get_devices () does a blocking get_config() call, but we already have a
copy of the config. Use that instead.
Comment 2 Alexander Larsson 2012-05-08 10:12:20 UTC
Created attachment 213650 [details] [review]
Avoid blocking call in network stats

domain.get_devices () does a blocking get_config() call, but we already have a
copy of the config. Use that instead.
Comment 3 Alexander Larsson 2012-05-08 10:12:22 UTC
Created attachment 213651 [details] [review]
Don't unnecessarily get info sync in stats collection

The calls to is_running where causing a sync get_info() call even though
we already had that information since we did an async get_info call.
Comment 4 Alexander Larsson 2012-05-08 10:12:25 UTC
Created attachment 213652 [details] [review]
Turn LibvirtMachine.state from property to a method

This was getting called in multiple places as if it was a cheap method, while
in reality its a blocking rpc call. Sometimes it was even used multiple times
within a method.

The new method is called get_state_sync() to make it obvious that this is a
blocking call.
Comment 5 Alexander Larsson 2012-05-08 10:12:28 UTC
Created attachment 213653 [details] [review]
Track the running machine state rather than doing sync calls

This means we know locally if the machine is running, paused or
not running.
Comment 6 Alexander Larsson 2012-05-08 10:12:31 UTC
Created attachment 213654 [details] [review]
libvirt-machine: Track the storage path so that we don't have to get it every time

This is a blocking rpc call that we can easily avoid
Comment 7 Alexander Larsson 2012-05-08 10:12:34 UTC
Created attachment 213655 [details] [review]
Add Util.run_in_thread

This lets us easily run sync methods async
Comment 8 Alexander Larsson 2012-05-08 10:12:37 UTC
Created attachment 213656 [details] [review]
Do stats collection calls async
Comment 9 Marc-Andre Lureau 2012-05-08 10:35:38 UTC
Review of attachment 213653 [details] [review]:

::: src/libvirt-machine.vala
@@ +125,3 @@
+        domain.suspended.connect ( () => { running = false; paused = Signal.get_invocation_hint(domain).detail.to_string () == "paused"; } );
+        domain.resumed.connect ( () => { running = true; paused = false; } );
+        domain.stopped.connect ( () => { running = false; paused = false; } );

According to Zeeshan, this is problematic: https://bugzilla.gnome.org/show_bug.cgi?id=674352#c32
Comment 10 Marc-Andre Lureau 2012-05-08 10:55:29 UTC
Patches looks fine, but they need to be rebased after Zeeshan changes from yesterday.

- 1 & 2 are essentially the same fix, so it could be a single patch
- 3 is probably already applied
- 5 would need Zeeshan comment
- 4 & 6 & 7 & 8 ack, although why use pointers? is that a vala bug?
Comment 11 Zeeshan Ali 2012-05-08 11:54:21 UTC
Review of attachment 213649 [details] [review]:

ACK
Comment 12 Zeeshan Ali 2012-05-08 11:54:52 UTC
Review of attachment 213650 [details] [review]:

ACK
Comment 13 Zeeshan Ali 2012-05-08 11:56:25 UTC
Review of attachment 213651 [details] [review]:

Already fixed in trunk.
Comment 14 Zeeshan Ali 2012-05-08 11:58:40 UTC
Review of attachment 213652 [details] [review]:

This is also already taken care of.
Comment 15 Alexander Larsson 2012-05-08 12:09:08 UTC
Pointers are needed, because "ref" is not supported for async methods in vala.
Comment 16 Zeeshan Ali 2012-05-08 12:11:07 UTC
Review of attachment 213653 [details] [review]:

::: src/libvirt-machine.vala
@@ +125,3 @@
+        domain.suspended.connect ( () => { running = false; paused = Signal.get_invocation_hint(domain).detail.to_string () == "paused"; } );
+        domain.resumed.connect ( () => { running = true; paused = false; } );
+        domain.stopped.connect ( () => { running = false; paused = false; } );

Yeah, it will at least break boxes for me. :) Also, we don't need separate bool props as we can simply update existing state prop on state changes.
Comment 17 Zeeshan Ali 2012-05-08 12:15:13 UTC
Review of attachment 213654 [details] [review]:

ACK but please try to keep the commit short log < 50 chars
Comment 18 Zeeshan Ali 2012-05-08 12:18:06 UTC
Review of attachment 213655 [details] [review]:

Apart from this nitpicks, good stuff!

::: src/util.vala
@@ -438,0 +438,5 @@
+
+    public delegate void RunInThreadFunc () throws  GLib.Error;
+    public async void run_in_thread (RunInThreadFunc func) throws GLib.Error {
... 2 more ...

redundant space between 2 '('

@@ -438,0 +438,17 @@
+
+    public delegate void RunInThreadFunc () throws  GLib.Error;
+    public async void run_in_thread (RunInThreadFunc func) throws GLib.Error {
... 14 more ...

I'd put a newline before & after 'yield' here but its up to you.
Comment 19 Zeeshan Ali 2012-05-08 12:22:22 UTC
Review of attachment 213656 [details] [review]:

::: src/libvirt-machine.vala
@@ -157,3 +157,3 @@
     }
 
-    private void update_io_stat (DomainInfo info, ref MachineStat stat) {
+    private async void update_io_stat (DomainInfo info, MachineStat *stat) {

As elmarco, pointed out there is no need for use of pointers. A 'ref' param perhaps?

@@ -166,3 +166,3 @@
                 return;
 
-            stat.disk = disk.get_stats ();
+            yield run_in_thread ( () => {

I suggest to instead add async variants to these libvirt-glib calls. That would be consistent with how we have been solving these kind of issues and benefit other libvirt-glib apps too.
Comment 20 Alexander Larsson 2012-05-08 12:31:01 UTC
Review of attachment 213656 [details] [review]:

::: src/libvirt-machine.vala
@@ +157,3 @@
     }
 
+    private async void update_io_stat (DomainInfo info, MachineStat *stat) {

Ref isn't supported on async functions.

@@ +166,3 @@
                 return;
 
+            yield run_in_thread ( () => {

I agree that libvirt-glib should have these, but it should be easy to switch boxes to that when its supported.
Comment 21 Alexander Larsson 2012-05-08 13:08:23 UTC
Created attachment 213664 [details] [review]
Track the running machine state rather than doing sync calls

This means we know locally if the machine is running, paused or
not running.
Comment 22 Alexander Larsson 2012-05-08 13:09:36 UTC
Updated the event patch
Comment 23 Zeeshan Ali 2012-05-08 13:21:08 UTC
Created attachment 213665 [details] [review]
Make use of domain signals to keep track of it's state
Comment 24 Alexander Larsson 2012-05-08 14:34:17 UTC
Created attachment 213673 [details] [review]
Alternative event patch

This patch doesn't use DomainState as that is not what you get from following the events.
Comment 25 Zeeshan Ali 2012-05-08 14:43:55 UTC
Review of attachment 213673 [details] [review]:

Looks good but wonder if it would be a good idea to put this new enum and is_running impl to base Machine class?
Comment 26 Alexander Larsson 2012-05-08 15:04:13 UTC
Created attachment 213676 [details] [review]
Add MachineState enum and property to Machine, implement is_running over it

This means we can track the state without having to do sync calls.
Comment 27 Zeeshan Ali 2012-05-08 15:13:45 UTC
Review of attachment 213676 [details] [review]:

Looks good. Just one nitpick about commit summary again: Please move 'and implement is_running over it' part to description and 'and' -> '&' to fit summary in ~50 cols.
Comment 28 Alexander Larsson 2012-05-08 15:22:50 UTC
commited.
Comment 29 Marc-Andre Lureau 2012-05-09 00:52:03 UTC
there is a regression that has been introduced

the screenshots are grayed in initial startup even if the VM is running
Comment 30 Alexander Larsson 2012-05-09 08:22:13 UTC
Thats likely from is_running moving to events, as thats what triggers that. Will look into it.
Comment 31 Alexander Larsson 2012-05-09 12:45:40 UTC
Created attachment 213735 [details] [review]
Init Machine.state in constructor, not later

We need this early on to see whether to show the screenshot as running or not on startup.
Comment 32 Marc-Andre Lureau 2012-05-09 12:54:50 UTC
Review of attachment 213735 [details] [review]:

ack, thanks
Comment 33 Zeeshan Ali 2012-05-10 14:49:16 UTC
This bug is fixed now?
Comment 34 Zeeshan Ali 2012-05-14 21:47:16 UTC
(In reply to comment #33)
> This bug is fixed now?

I assume the answer is 'yes' since all patches have been committed and I dont see any pending issues.. Feel free to correct.