Bug 683473 – Wait for box saving completion before exiting

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 683473 - Wait for box saving completion before exiting


Summary:	Wait for box saving completion before exiting


Status:	RESOLVED FIXED

Product:	gnome-boxes
Classification:	Applications
Component:	general
Version:	unspecified
Hardware:	Other All

Importance:	Normal normal
Target Milestone:	--
Assigned To:	GNOME Boxes maintainer(s)
QA Contact:	GNOME Boxes maintainer(s)

URL:
Whiteboard:

Duplicates:	674826 (view as bug list)
Depends on:
Blocks:

Reported:	2012-09-06 09:46 UTC by Christophe Fergeau
Modified:	2016-03-31 14:03 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
Wait for box saving completion before exiting (1.29 KB, patch) 2012-09-06 09:46 UTC, Christophe Fergeau	needs-work	Details \| Review
Wait until all machines are suspended (2.23 KB, patch) 2012-09-07 15:46 UTC, Marc-Andre Lureau	committed	Details \| Review
proposed changes (2.09 KB, patch) 2012-09-11 17:19 UTC, Christophe Fergeau	needs-work	Details \| Review
Add Machine::suspend_at_exit (1.66 KB, patch) 2012-09-14 13:22 UTC, Christophe Fergeau	committed	Details \| Review
Wait until all machines are suspended (3.03 KB, patch) 2012-09-14 13:22 UTC, Christophe Fergeau	none	Details \| Review
Wait until all machines are suspended (2.65 KB, patch) 2012-09-14 14:00 UTC, Christophe Fergeau	none	Details \| Review
Wait until all machines are suspended (2.93 KB, patch) 2012-09-14 14:44 UTC, Christophe Fergeau	committed	Details \| Review

Description Christophe Fergeau 2012-09-06 09:46:40 UTC

The way we save running boxes when exiting Boxes is racy as we
start an async save of the VMs and then we complete the exit process
without making sure the various saves have completed.
It has happened a few times to be that after exiting Boxes with running
VMs, some of them are still up instead of being saved after I restart
Boxes.

Comment 1 Christophe Fergeau 2012-09-06 09:46:42 UTC

Created attachment 223617 [details] [review]
Wait for box saving completion before exiting

Comment 2 Marc-Andre Lureau 2012-09-06 10:44:59 UTC

Review of attachment 223617 [details] [review]:

what happens when you start gnome-boxes again while it is saving VMs? does it reopen the window/application?

Comment 3 Christophe Fergeau 2012-09-06 10:57:49 UTC

Review of attachment 223617 [details] [review]:

Hmm, segfault, I guess that's not the answer you want ;)

Comment 4 Alexander Larsson 2012-09-06 11:12:11 UTC

Maybe we should let the GApplication go and hold on to the process some other way, rather than deal with the complexities of reviving the GApp.

Comment 5 Marc-Andre Lureau 2012-09-07 15:46:53 UTC

Created attachment 223769 [details] [review]
Wait until all machines are suspended

We need to wait until all suspend() calls are finished to be sure
that the call went through. We shouldn't hold the app though, so
it needs to be done in a seperate loop when the program exits.

Comment 6 Christophe Fergeau 2012-09-07 16:41:22 UTC

Review of attachment 223769 [details] [review]:

Ah great, thanks for taking over this patch, looks good!

::: src/app.vala
@@ +566,3 @@
 
+    public async void suspend_machines () {
+        var waiting_counter = 0;

Can you add a debug ("suspending all machines"); message?

@@ +578,3 @@
 
+        while (waiting_counter > 0) {
+            yield;

Are we 100% sure yield; will only wait for the async methods started by this method, ie the .suspend.begin () calls?

::: src/main.vala
@@ +138,3 @@
+        return false;
+    });
+    loop.run ();

maybe it would be slightly better to hook that clean-up on the Application::shutdown signal which seems to be meant for that purpose?

Comment 7 Marc-Andre Lureau 2012-09-07 16:44:22 UTC

Review of attachment 223769 [details] [review]:

::: src/app.vala
@@ +566,3 @@
 
+    public async void suspend_machines () {
+        var waiting_counter = 0;

sure

@@ +578,3 @@
 
+        while (waiting_counter > 0) {
+            yield;

there is only one place calling suspend_machines.callback, so yes.

::: src/main.vala
@@ +138,3 @@
+        return false;
+    });
+    loop.run ();

after app.run () returns, the application is already ::shutdown, and the patch doesn't attempt to recycle it, since that would lead to the problem we had with the first approach.

Comment 8 Marc-Andre Lureau 2012-09-07 16:46:12 UTC

Review of attachment 223769 [details] [review]:

ok to commit with the debug() ?

Comment 9 Christophe Fergeau 2012-09-07 16:59:33 UTC

Review of attachment 223769 [details] [review]:

::: src/app.vala
@@ +578,3 @@
 
+        while (waiting_counter > 0) {
+            yield;

But there's nothing linking this yield; call with the machine.suspend.begin(); call, is there?
This is probably just me being confused by vala ;)

::: src/main.vala
@@ +138,3 @@
+        return false;
+    });
+    loop.run ();

What I mean is to have
app.shutdown.connect (() => { // your code doing the suspend });
return app.run ();

The way it is done means we'll create a mainloop and run suspend_machines in secondary instances I think, which is useless, and could even cause bugs when we do more cleanup at shutdown time.

Comment 10 Marc-Andre Lureau 2012-09-07 17:12:52 UTC

Review of attachment 223769 [details] [review]:

::: src/app.vala
@@ +578,3 @@
 
+        while (waiting_counter > 0) {
+            yield;

the suspend_machines.callback () is the link, resuming where the function suspend_machines left with the yield;

::: src/main.vala
@@ +138,3 @@
+        return false;
+    });
+    loop.run ();

The ::shutdown signal is emitted only on the registered primary instance immediately after the main loop terminates.

We need the loop to keep running, but the application to not hold.

The simplest way is to create our own loop for the final cleanup. I don't see why it would cause bugs, even when we do more cleanup, either at App.run() time or after.

Comment 11 Christophe Fergeau 2012-09-10 10:14:13 UTC

Review of attachment 223769 [details] [review]:

::: src/app.vala
@@ +578,3 @@
 
+        while (waiting_counter > 0) {
+            yield;

Ah right, thanks, now I'm understanding your first comment, thanks for explaining again :)

::: src/main.vala
@@ +138,3 @@
+        return false;
+    });
+    loop.run ();

Oh my suggestion is not about trying to reuse the App mainloop or anything like that, I'm fine with creating our own mainloop.
It just feels slightly cleaner to do cleanup that is specific to the primary instance only in the primary instance, ie from 'shutdown'. What could cause bugs is secondary instances trying to access members from App that have not been initialized.
But if you don't think that is worth it, it's not really important.

Comment 12 Christophe Fergeau 2012-09-11 17:19:53 UTC

Created attachment 224033 [details] [review]
proposed changes

They go on top of the existing patch, and implement my suggestions.
Feel free to squash in, or to drop ;)

Comment 13 Alexander Larsson 2012-09-12 12:13:39 UTC

Review of attachment 224033 [details] [review]:

::: src/app.vala
@@ +162,3 @@
+                });
+            loop.run ();
+        });

I don't think this is right. Shutdown is run immediately after the primary mainloop in GApplication is exited, while dbus deregistration happens in GApplication finalizer. So, this mainloop will serve the main context and thus the GApplication, meaning that late arriving dbus requests still get handled.

In fact, this means that the original patch is not quite right either.

Can we finalize the app before suspending the machines, or does the suspend code reference App.app?

Maybe we can use a different main context in the final loop in order to avoid serving the dbus calls for the GApplication? Does that work with e.g. the async libvirt calls during shutdown?

Comment 14 Christophe Fergeau 2012-09-13 16:14:20 UTC

Attachment 223769 [details] pushed as 524265e - Wait until all machines are suspended

Comment 15 Christophe Fergeau 2012-09-13 16:14:54 UTC

Arg, missed this review and pushed the patch already

Comment 16 Christophe Fergeau 2012-09-13 16:16:50 UTC

I've reverted the patch for now.

Comment 17 Christophe Fergeau 2012-09-14 11:12:27 UTC

I've tried to fix this patch, but I haven't had much success so far.

(In reply to comment #13)
> Can we finalize the app before suspending the machines, or does the suspend
> code reference App.app?

There is no need for anything from App.app except for GenericArray<Boxes.CollectionItem> machines = app.collection.items; which we can get a reference on before finalizing the app. However, I didn't manage to finalize the app cleanly as various piece of code seems to still have a ref on it even after it shutdown (even after cleaning the App.app ref, and the one held by main()). After recreating the mainloop, unwanted callbacks keep running...


> Maybe we can use a different main context in the final loop in order to avoid
> serving the dbus calls for the GApplication? Does that work with e.g. the async
> libvirt calls during shutdown?

The async libvirt calls are using a regular g_simple_async_result_run_in_thread ( http://libvirt.org/git/?p=libvirt-glib.git;a=blob;f=libvirt-gobject/libvirt-gobject-domain.c;h=bcfad2a62e856a1bf5374824461522f0ef59e77f;hb=HEAD#l1183 ).
I don't know if the libvirt/glib mainloop integration code is used there, hopefully not.. ( http://libvirt.org/git/?p=libvirt-glib.git;a=blob;f=libvirt-glib/libvirt-glib-event.c;h=2a9ee23521dcd3f9c04b70926f42306e8e5bdce5;hb=HEAD )

Comment 18 Christophe Fergeau 2012-09-14 13:22:33 UTC

Created attachment 224315 [details] [review]
Add Machine::suspend_at_exit

This indicates whether we must try to suspend the box when exiting
Boxes or not. For now this is only set for libvirt machines using
the default box connection.

Comment 19 Christophe Fergeau 2012-09-14 13:22:36 UTC

Created attachment 224316 [details] [review]
Wait until all machines are suspended

We need to wait until all suspend() calls are finished to be sure
that the call went through. We shouldn't hold the app though, so
it needs to be done in a seperate loop when the program exits.

Based on a patch from Marc-André Lureau <marcandre.lureau@gmail.com>

Comment 20 Alexander Larsson 2012-09-14 13:33:00 UTC

Review of attachment 224316 [details] [review]:

::: src/app.vala
@@ +580,3 @@
+
+        var idle = new IdleSource ();
+        idle.set_callback (() => {

I don't understand why this has to be in an idle?
It should be able to be called normally, as long as its is inside the push_thread_default(), as then the
async suspend calls will pick up this maincontext.

Comment 21 Christophe Fergeau 2012-09-14 13:47:48 UTC

Review of attachment 224316 [details] [review]:

::: src/app.vala
@@ +580,3 @@
+
+        var idle = new IdleSource ();
+        idle.set_callback (() => {

No idea either, this was initially this way so I kept it, but it's probably unneeded now, I'll try without.

Comment 22 Christophe Fergeau 2012-09-14 14:00:01 UTC

Created attachment 224318 [details] [review]
Wait until all machines are suspended

We need to wait until all suspend() calls are finished to be sure
that the call went through. We shouldn't hold the app though, so
it needs to be done in a seperate loop when the program exits.

Based on a patch from Marc-André Lureau <marcandre.lureau@gmail.com>

Comment 23 Alexander Larsson 2012-09-14 14:05:05 UTC

Looks good to me.

Comment 24 Christophe Fergeau 2012-09-14 14:44:13 UTC

Created attachment 224332 [details] [review]
Wait until all machines are suspended

We need to wait until all suspend() calls are finished to be sure
that the call went through. We shouldn't hold the app though, so
it needs to be done in a seperate loop when the program exits.

Based on a patch from Marc-André Lureau <marcandre.lureau@gmail.com>

Comment 25 Christophe Fergeau 2012-09-14 14:58:56 UTC

Attachment 224315 [details] pushed as 55c03cb - Add Machine::suspend_at_exit
Attachment 224332 [details] pushed as 0c46399 - Wait until all machines are suspended

Comment 26 Christophe Fergeau 2012-09-21 13:25:00 UTC

*** Bug 674826 has been marked as a duplicate of this bug. ***