Bug 674209 – Support VM creation on host w/o KVM

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 674209 - Support VM creation on host w/o KVM


Summary:	Support VM creation on host w/o KVM


Status:	RESOLVED FIXED

Product:	gnome-boxes
Classification:	Applications
Component:	installer
Version:	3.5.x (unsupported)
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	--
Assigned To:	GNOME Boxes maintainer(s)
QA Contact:	GNOME Boxes maintainer(s)

URL:
Whiteboard:

Duplicates:	679238 (view as bug list)
Depends on:
Blocks:	676256 678874

Reported:	2012-04-16 15:46 UTC by Daniel P. Berrange
Modified:	2016-03-31 13:54 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
Create VM based on host capabilities (9.05 KB, patch) 2012-05-17 03:36 UTC, Zeeshan Ali	none	Details \| Review
Create VM before 'review' page (6.37 KB, patch) 2012-05-17 03:36 UTC, Zeeshan Ali	none	Details \| Review
Warn about unavailability of KVM (2.13 KB, patch) 2012-05-17 03:36 UTC, Zeeshan Ali	none	Details \| Review
Create VM before 'review' page (7.14 KB, patch) 2012-05-19 01:51 UTC, Zeeshan Ali	none	Details \| Review
Revert "add --checks option to troubleshot for vt/kvm" (4.41 KB, patch) 2012-05-19 01:53 UTC, Zeeshan Ali	rejected	Details \| Review
Create VM based on host capabilities (9.31 KB, patch) 2012-06-01 03:10 UTC, Zeeshan Ali	none	Details \| Review
Create VM before 'review' page (7.19 KB, patch) 2012-06-01 03:11 UTC, Zeeshan Ali	none	Details \| Review
Warn about unavailability of KVM (2.11 KB, patch) 2012-06-01 03:11 UTC, Zeeshan Ali	none	Details \| Review
Create VM based on host capabilities (9.31 KB, patch) 2012-06-06 02:30 UTC, Zeeshan Ali	none	Details \| Review
Create VM before 'review' page (7.59 KB, patch) 2012-06-06 02:31 UTC, Zeeshan Ali	none	Details \| Review
Warn about unavailability of KVM (2.11 KB, patch) 2012-06-06 02:31 UTC, Zeeshan Ali	none	Details \| Review
Refactor VMConfigurator.set_os_config() (2.59 KB, patch) 2012-06-06 23:54 UTC, Zeeshan Ali	needs-work	Details \| Review
Create VM based on host capabilities (7.74 KB, patch) 2012-06-06 23:54 UTC, Zeeshan Ali	none	Details \| Review
Create VM before 'review' page (8.04 KB, patch) 2012-06-06 23:54 UTC, Zeeshan Ali	none	Details \| Review
Warn about unavailability of KVM (2.12 KB, patch) 2012-06-06 23:54 UTC, Zeeshan Ali	accepted-commit_now	Details \| Review
Create VM based on host capabilities (7.39 KB, patch) 2012-06-07 13:48 UTC, Zeeshan Ali	accepted-commit_now	Details \| Review
Refactor VMConfigurator.set_os_config() (2.23 KB, patch) 2012-06-16 02:35 UTC, Zeeshan Ali	reviewed	Details \| Review
Create VM before 'review' page (8.36 KB, patch) 2012-06-16 02:39 UTC, Zeeshan Ali	none	Details \| Review
More robust installer media clean-up (2.05 KB, patch) 2012-06-16 02:40 UTC, Zeeshan Ali	none	Details \| Review
Separate methods for creation & launch of VM (3.32 KB, patch) 2012-06-19 01:52 UTC, Zeeshan Ali	reviewed	Details \| Review
Create VM before 'review' page (6.15 KB, patch) 2012-06-19 01:52 UTC, Zeeshan Ali	none	Details \| Review
Error out on failure to create LibvirtMachine for new domains (2.55 KB, patch) 2012-06-20 16:21 UTC, Zeeshan Ali	reviewed	Details \| Review
More robust installer media clean-up (2.33 KB, patch) 2012-06-20 16:32 UTC, Zeeshan Ali	reviewed	Details \| Review
Create VM before 'review' page (6.68 KB, patch) 2012-06-21 19:17 UTC, Zeeshan Ali	reviewed	Details \| Review
Create VM before 'review' page (7.97 KB, patch) 2012-06-25 19:16 UTC, Zeeshan Ali	none	Details \| Review
Error out on failure to create LibvirtMachine for new domains (2.49 KB, patch) 2012-06-26 12:56 UTC, Zeeshan Ali	accepted-commit_now	Details \| Review
Refactor VMConfigurator.set_os_config() (2.23 KB, patch) 2012-06-26 13:35 UTC, Zeeshan Ali	committed	Details \| Review
Create VM based on host capabilities (7.26 KB, patch) 2012-06-26 13:35 UTC, Zeeshan Ali	committed	Details \| Review
Separate methods for creation & launch of VM (3.32 KB, patch) 2012-06-26 13:35 UTC, Zeeshan Ali	committed	Details \| Review
Create VM before 'review' page (7.97 KB, patch) 2012-06-26 13:35 UTC, Zeeshan Ali	committed	Details \| Review
Warn about unavailability of KVM (2.12 KB, patch) 2012-06-26 13:36 UTC, Zeeshan Ali	committed	Details \| Review
More robust installer media clean-up (2.28 KB, patch) 2012-06-26 13:37 UTC, Zeeshan Ali	committed	Details \| Review
Error out on failure to create LibvirtMachine for new domains (2.49 KB, patch) 2012-06-26 13:37 UTC, Zeeshan Ali	committed	Details \| Review

Description Daniel P. Berrange 2012-04-16 15:46:08 UTC

Fedora 17, 3.4.0.1-2.fc17

I have a laptop which does not support KVM. For fun I decided to try Boxes anyway, to see if anything bad happened.

The new VM wizard failed at the 'Preparation' step with a very unhelpful popup message "Box creation failed".

On the console I discovered

(gnome-boxes:2934): Boxes-WARNING **: wizard.vala:155: Unable to start domain: unsupported configuration: Domain requires KVM, but it is not available. Check that virtualization is enabled in the host BIOS, and host configuration is setup to load the kvm modules.

This is clearly not appropriate as a log warning message, since it is in fact fatal to the creation attempt.

Either boxes should automatically fallback to using (slow) QEMU emulation, or the error message popup should tell me that I require KVM. If we go with the latter approach, then it should tell me I require KVM before I even start going through the wizard to save me wasting my time selecting ISO images,e tc.

Comment 1 Zeeshan Ali 2012-04-16 15:59:05 UTC

I thought you told me that libvirtd automatically inserts the kvm module? At least it really should do that for us. However, you still have a point since user also needs to enable this in BIOS.

Comment 2 Daniel P. Berrange 2012-04-16 16:01:09 UTC

I think you missed the part where I wrote:

> I have a laptop which does not support KVM.

It does not have *any* VT support whatsoever, so the question of libvirtd loading KVM modules automatically is irrelevant.

Comment 3 Zeeshan Ali 2012-04-17 13:38:53 UTC

(In reply to comment #2)
> I think you missed the part where I wrote:
> 
> > I have a laptop which does not support KVM.

Indeed I did, sorry! This is kinda important so I intend to:

1. Add Capabilities API to libvirt-glib
2. Use this API in Boxes to figure which type of virtualization to use.
3. If KVM is not available, warn the user at 'review' step that VM will be slow.

Comment 4 Daniel P. Berrange 2012-04-17 13:42:35 UTC

> 1. Add Capabilities API to libvirt-glib
> 2. Use this API in Boxes to figure which type of virtualization to use.
> 3. If KVM is not available, warn the user at 'review' step that VM will be slow.

That sounds like the correct approach. While doing this you should also be sure to query the KVM for the arch that matches the host arch. I tried boxes on my i686 Mac Mini which /does/ support KVM, pointing at an i686 guest operating system ISO, but boxes tried to create an x86_64 guest and obviously failed.

Comment 5 Zeeshan Ali 2012-05-17 03:36:17 UTC

Created attachment 214225 [details] [review]
Create VM based on host capabilities

Choose the virtualization type (KVM or Qemu), architecture and features
(acpi, apic and pae) of the created domain based on host capabilities and
installer media.

Comment 6 Zeeshan Ali 2012-05-17 03:36:23 UTC

Created attachment 214226 [details] [review]
Create VM before 'review' page

This means that we now move most of the failure points before arriving to
review page. If user hits 'create' on the review page, the VM is simply
launched. OTOH if user hits 'back' or 'cancel', the created VM is deleted
along with its volume (cheap operation).

Comment 7 Zeeshan Ali 2012-05-17 03:36:26 UTC

Created attachment 214227 [details] [review]
Warn about unavailability of KVM

Comment 8 Zeeshan Ali 2012-05-17 03:39:19 UTC

(In reply to comment #5)
> Created an attachment (id=214225) [details] [review]
(In reply to comment #6)
> Created an attachment (id=214226) [details] [review]
(In reply to comment #7)
> Created an attachment (id=214227) [details] [review]

So finally I managed to put together some patches that I'm not to ashamed to share. :) These patches require my latest libvirt-glib patches from git master.

Comment 9 Christophe Fergeau 2012-05-17 08:49:50 UTC

(In reply to comment #6)
> Created an attachment (id=214226) [details] [review]
> Create VM before 'review' page
> 
> This means that we now move most of the failure points before arriving to
> review page. If user hits 'create' on the review page, the VM is simply
> launched. OTOH if user hits 'back' or 'cancel', the created VM is deleted
> along with its volume (cheap operation).

What are the concrete benefits of doing that? Why did you want to change it? (not saying it's a bad idea, I just have no idea why doing this is better than what was done before ;). This would be good to make it explicit in the commit log. In case of errors, won't it be a bit unexpected for the user to get a "VM Creation failed" message before it tells Boxes to go ahead after the review page?

Comment 10 Zeeshan Ali 2012-05-18 01:57:06 UTC

(In reply to comment #9)
> (In reply to comment #6)
> > Created an attachment (id=214226) [details] [review] [details] [review]
> > Create VM before 'review' page
> > 
> > This means that we now move most of the failure points before arriving to
> > review page. If user hits 'create' on the review page, the VM is simply
> > launched. OTOH if user hits 'back' or 'cancel', the created VM is deleted
> > along with its volume (cheap operation).
> 
> What are the concrete benefits of doing that? Why did you want to change it?
> (not saying it's a bad idea, I just have no idea why doing this is better than
> what was done before ;).

If you look at the later commit, you'll notice that we at least need to create the configuration(s) before 'review' since we need information about the new domain. Now its an interesting question whether we should only create configuration beforehand or the whole machine.

The reason I decided to go with the latter approach was that I recalled mccann saying that we should ensure that we have everything ready at the 'review' page. I agree with that since its better to error out during preparation and setup rather than after we told user that everything is ready and she/he clicks create and "boom".

> This would be good to make it explicit in the commit
> log.

Sure.

> In case of errors, won't it be a bit unexpected for the user to get a "VM
> Creation failed" message before it tells Boxes to go ahead after the review
> page?

Of course but we are not moving that error message to review. :) From user's POV the machine is created after he/she hits 'create' button.

Comment 11 Zeeshan Ali 2012-05-19 01:51:00 UTC

Created attachment 214430 [details] [review]
Create VM before 'review' page

This means that we now move most of the failure points before arriving to
review page. If user hits 'create' on the review page, the VM is simply
launched. OTOH if user hits 'back' or 'cancel', the created VM is deleted
along with its volume (cheap operation).

Why do we need this change?

If you look at the later commit, you'll notice that we at least need to
create the configuration(s) before 'review' since we need information
about the new domain to display to user. Now its an interesting question
whether we should only create configuration before hand or the whole
machine.

The reason I decided to go with the latter approach was that I recalled
mccann saying that we should ensure that we have everything ready at the
'review' page. I agree with that since its better to error out during
preparation and setup rather than after we told user that everything is
ready, she/he clicks create and we go *boom*.

Also IMHO it makes for a much better end-user experience if user has to
wait less on each step rather than wait a long time on one step.

Comment 12 Zeeshan Ali 2012-05-19 01:53:51 UTC

Created attachment 214431 [details] [review]
Revert "add --checks option to troubleshot for vt/kvm"

This reverts commit ba236c46e4120c7c177339846ef74c999a658666.

Conflicts:

	src/util.vala

Comment 13 Zeeshan Ali 2012-05-19 01:57:18 UTC

(In reply to comment #12)
> Created an attachment (id=214431) [details] [review]
> Revert "add --checks option to troubleshot for vt/kvm"
> 
> This reverts commit ba236c46e4120c7c177339846ef74c999a658666.

I guess with these changes, we don't need these checks any more? If for some reason we want to keep these checks, they need to be ported to make use of libvirt's capability API as anyone can easily modify our default connection to use a remote libvirt and these checks will break in that scenario.

Comment 14 Marc-Andre Lureau 2012-05-19 11:37:24 UTC

(In reply to comment #13)
> (In reply to comment #12)
> > Created an attachment (id=214431) [details] [review] [details] [review]
> > Revert "add --checks option to troubleshot for vt/kvm"
> > 
> > This reverts commit ba236c46e4120c7c177339846ef74c999a658666.
> 
> I guess with these changes, we don't need these checks any more? If for some
> reason we want to keep these checks, they need to be ported to make use of
> libvirt's capability API as anyone can easily modify our default connection to
> use a remote libvirt and these checks will break in that scenario.

I would rather extend them to make use of libvirt. Keep what we have and add more checks. that was my plan. The more checks, the better.

Comment 15 Marc-Andre Lureau 2012-05-19 11:39:07 UTC

Review of attachment 214431 [details] [review]:

these checks are lower level than libvirt and can supplement libvirt checks

Comment 16 Christophe Fergeau 2012-05-19 18:46:23 UTC

Quick note about stuff that we should check:
- complain if qemu or qemu-kvm are not available (this patch probably does this)
- check if it has spice support (not sure this can be done with the capabilities API)
Just had someone on IRC with these issues, writing them down before I forget ;)

Comment 17 Zeeshan Ali 2012-05-23 13:37:23 UTC

(In reply to comment #15)
> Review of attachment 214431 [details] [review]:
> 
> these checks are lower level than libvirt and can supplement libvirt checks

They are also local-only so they'll fail if anyone uses remove libvirt connection (a usecase we do support already AFAIK) as default. Thats why I mentioned porting them to libvirt's caps API.

Comment 18 Marc-Andre Lureau 2012-05-23 13:47:37 UTC

(In reply to comment #17)
> They are also local-only so they'll fail if anyone uses remove libvirt
> connection (a usecase we do support already AFAIK) as default. Thats why I
> mentioned porting them to libvirt's caps API.

We won't support VM creation in any near future. So the local check still stands.

Comment 19 Marc-Andre Lureau 2012-05-23 13:48:06 UTC

^ +remote

Comment 20 Zeeshan Ali 2012-05-23 13:50:42 UTC

(In reply to comment #14)
> (In reply to comment #13)
> > (In reply to comment #12)
> > > Created an attachment (id=214431) [details] [review] [details] [review] [details] [review]
> > > Revert "add --checks option to troubleshot for vt/kvm"
> > > 
> > > This reverts commit ba236c46e4120c7c177339846ef74c999a658666.
> > 
> > I guess with these changes, we don't need these checks any more? If for some
> > reason we want to keep these checks, they need to be ported to make use of
> > libvirt's capability API as anyone can easily modify our default connection to
> > use a remote libvirt and these checks will break in that scenario.
> 
> I would rather extend them to make use of libvirt. Keep what we have and add
> more checks. that was my plan. The more checks, the better.

Although I do agree with 'the more checks, the better' approach in general but I wouldn't want to have redundant checks. The absence of KVM cap on libvirt connection is already enough to know that either kvm-module isn't loaded, host is incapable of VT or VT isn't enabled. IMO the first one is something that libvirt is supposed to take care of, not us (and we don't have any means to check on remote connection) and my patches take care of the other two.

Comment 21 Zeeshan Ali 2012-05-23 13:51:28 UTC

(In reply to comment #18)
> (In reply to comment #17)
> > They are also local-only so they'll fail if anyone uses remove libvirt
> > connection (a usecase we do support already AFAIK) as default. Thats why I
> > mentioned porting them to libvirt's caps API.
> 
> We won't support VM creation in any near future. So the local check still
> stands.

True, forgot that.

Comment 22 Marc-Andre Lureau 2012-05-23 13:56:41 UTC

(In reply to comment #20)
> Although I do agree with 'the more checks, the better' approach in general but
> I wouldn't want to have redundant checks. The absence of KVM cap on libvirt
> connection is already enough to know that either kvm-module isn't loaded, host
> is incapable of VT or VT isn't enabled. IMO the first one is something that
> libvirt is supposed to take care of, not us (and we don't have any means to
> check on remote connection) and my patches take care of the other two.

I actually think we could have some "redundant" checks there, to help us pin-point the problem more quickly. If no package dependency forces you to have qemu-kvm installed, you might get libvirt saying no KVM, ok, but where is the problem exactly? That's what I wish we could help with --checks. May be this could be improved in libvirt, though I don't think they are focusing on this kind of user-level situations.

For example, I think it's worth checking filesystem permission too (instead of failing later on without any clear reason) etc..

Comment 23 Zeeshan Ali 2012-05-23 19:05:13 UTC

(In reply to comment #22)
> (In reply to comment #20)
> > Although I do agree with 'the more checks, the better' approach in general but
> > I wouldn't want to have redundant checks. The absence of KVM cap on libvirt
> > connection is already enough to know that either kvm-module isn't loaded, host
> > is incapable of VT or VT isn't enabled. IMO the first one is something that
> > libvirt is supposed to take care of, not us (and we don't have any means to
> > check on remote connection) and my patches take care of the other two.
> 
> I actually think we could have some "redundant" checks there, to help us
> pin-point the problem more quickly. If no package dependency forces you to have
> qemu-kvm installed, you might get libvirt saying no KVM, ok, but where is the
> problem exactly? That's what I wish we could help with --checks. May be this
> could be improved in libvirt, though I don't think they are focusing on this
> kind of user-level situations.

But with my other patches, we don't require 'kvm' anymore so we don't need to check as part of installation but yes, we can keep the '--checks' part for diagnosing problems.

> For example, I think it's worth checking filesystem permission too (instead of
> failing later on without any clear reason) etc..

filesystem permission on installer media? Sure!

Comment 24 Zeeshan Ali 2012-05-23 19:05:50 UTC

Oh and what do you think about the other patches?

Comment 25 Christophe Fergeau 2012-05-23 19:29:22 UTC

(In reply to comment #23)
> > I actually think we could have some "redundant" checks there, to help us
> > pin-point the problem more quickly. If no package dependency forces you to have
> > qemu-kvm installed, you might get libvirt saying no KVM, ok, but where is the
> > problem exactly? That's what I wish we could help with --checks. May be this
> > could be improved in libvirt, though I don't think they are focusing on this
> > kind of user-level situations.
> 
> But with my other patches, we don't require 'kvm' anymore so we don't need to
> check as part of installation but yes, we can keep the '--checks' part for
> diagnosing problems.

What if you have no qemu at all  installed ? ;) though the right solution in this case may be to use packagekit to get qemu.

Comment 26 Zeeshan Ali 2012-05-23 21:26:56 UTC

(In reply to comment #25)
> (In reply to comment #23)
> > > I actually think we could have some "redundant" checks there, to help us
> > > pin-point the problem more quickly. If no package dependency forces you to have
> > > qemu-kvm installed, you might get libvirt saying no KVM, ok, but where is the
> > > problem exactly? That's what I wish we could help with --checks. May be this
> > > could be improved in libvirt, though I don't think they are focusing on this
> > > kind of user-level situations.
> > 
> > But with my other patches, we don't require 'kvm' anymore so we don't need to
> > check as part of installation but yes, we can keep the '--checks' part for
> > diagnosing problems.
> 
> What if you have no qemu at all  installed ? ;) though the right solution in
> this case may be to use packagekit to get qemu.

I didn't know we want to support anything other than qemu, your answer to your question is very much correct. :)

Comment 27 Christophe Fergeau 2012-05-24 08:44:52 UTC

(In reply to comment #26)
> 
> I didn't know we want to support anything other than qemu, your answer to your
> question is very much correct. :)

Just thinking of very broken installations with no qemu installed at all, would be nice to fail gracefully when this happens.

Comment 28 Zeeshan Ali 2012-05-24 17:59:03 UTC

(In reply to comment #26)
> (In reply to comment #25)
> > (In reply to comment #23)
> > > > I actually think we could have some "redundant" checks there, to help us
> > > > pin-point the problem more quickly. If no package dependency forces you to have
> > > > qemu-kvm installed, you might get libvirt saying no KVM, ok, but where is the
> > > > problem exactly? That's what I wish we could help with --checks. May be this
> > > > could be improved in libvirt, though I don't think they are focusing on this
> > > > kind of user-level situations.
> > > 
> > > But with my other patches, we don't require 'kvm' anymore so we don't need to
> > > check as part of installation but yes, we can keep the '--checks' part for
> > > diagnosing problems.
> > 
> > What if you have no qemu at all  installed ? ;) though the right solution in
> > this case may be to use packagekit to get qemu.
> 
> I didn't know we want to support anything other than qemu, your answer to your
> question is very much correct. :)

Well, we could add a graceful failure with an error message in that case.

Comment 29 Zeeshan Ali 2012-06-01 03:10:52 UTC

Created attachment 215379 [details] [review]
Create VM based on host capabilities

Choose the virtualization type (KVM or Qemu), architecture and features
(acpi, apic and pae) of the created domain based on host capabilities and
installer media.

V2:

* Rebased on git master
* Error out on unavailability of Qemu as par comment#27.

Comment 30 Zeeshan Ali 2012-06-01 03:11:41 UTC

Created attachment 215380 [details] [review]
Create VM before 'review' page

This means that we now move most of the failure points before arriving to
review page. If user hits 'create' on the review page, the VM is simply
launched. OTOH if user hits 'back' or 'cancel', the created VM is deleted
along with its volume (cheap operation).

Why do we need this change?

If you look at the later commit, you'll notice that we at least need to
create the configuration(s) before 'review' since we need information
about the new domain to display to user. Now its an interesting question
whether we should only create configuration before hand or the whole
machine.

The reason I decided to go with the latter approach was that I recalled
mccann saying that we should ensure that we have everything ready at the
'review' page. I agree with that since its better to error out during
preparation and setup rather than after we told user that everything is
ready, she/he clicks create and we go *boom*.

Also IMHO it makes for a much better end-user experience if user has to
wait less on each step rather than wait a long time on one step.

V2: Rebased on git master

Comment 31 Zeeshan Ali 2012-06-01 03:11:52 UTC

Created attachment 215381 [details] [review]
Warn about unavailability of KVM

V2: Rebased on git master

Comment 32 Alexander Larsson 2012-06-05 10:08:08 UTC

Review of attachment 215379 [details] [review]:

::: src/vm-configurator.vala
@@ +47,3 @@
+        if (guest_supports_feature (best_caps, "pae"))
+            features += "pae";
+        domain.set_features (features);

There are some other interesting features that at least I have not gotten enabled in the VM I created with boxes:

sep - fast sysenter/leave
rdtscp - faster timer
constant_tsc - seems to be useful as per http://docs.fedoraproject.org/en-US/Fedora/13/html/Virtualization_Guide/chap-Virtualization-KVM_guest_timing_management.html
sse3, sse4_1, sse4_2 - These might be important for e.g. the software fallback gnome-shell
avx - another sse kinda thing, has support in pixman

Also, its not reporting the same cpu model in the guest as in the host. 
In general, does it really make sense to hardcode all these features here. Should we not be able to just tell libvirt that we want a VM that is as efficient as possible that will only run on this machine?

Comment 33 Christophe Fergeau 2012-06-05 10:21:23 UTC

(In reply to comment #32)
> Also, its not reporting the same cpu model in the guest as in the host. 
> In general, does it really make sense to hardcode all these features here.
> Should we not be able to just tell libvirt that we want a VM that is as
> efficient as possible that will only run on this machine?

I think this is what http://www.redhat.com/archives/libvir-list/2012-January/msg00232.html was about, so this should be possible with some magic libvirt XML.

Comment 34 Alexander Larsson 2012-06-05 11:42:53 UTC

> - guest CPU should be a copy of host CPU as advertised by capabilities
>  XML (this is a short cut for manually copying host CPU specification
>  from capabilities to domain XML):
>    <cpu mode='host-model'/>

Sounds like this is what we want?

Comment 35 Christophe Fergeau 2012-06-05 11:54:29 UTC

(In reply to comment #34)
> > - guest CPU should be a copy of host CPU as advertised by capabilities
> >  XML (this is a short cut for manually copying host CPU specification
> >  from capabilities to domain XML):
> >    <cpu mode='host-model'/>
> 
> Sounds like this is what we want?

Yes, I wasn't sure at first if we wanted 'host-model' or 'host-passthrough', after rereading this is what we want.

Comment 36 Zeeshan Ali 2012-06-05 14:04:38 UTC

(In reply to comment #35)
> (In reply to comment #34)
> > > - guest CPU should be a copy of host CPU as advertised by capabilities
> > >  XML (this is a short cut for manually copying host CPU specification
> > >  from capabilities to domain XML):
> > >    <cpu mode='host-model'/>
> > 
> > Sounds like this is what we want?
> 
> Yes, I wasn't sure at first if we wanted 'host-model' or 'host-passthrough',
> after rereading this is what we want.

Interesting, wouldn't this also solve bug#676029?

Comment 37 Zeeshan Ali 2012-06-06 02:30:47 UTC

Created attachment 215704 [details] [review]
Create VM based on host capabilities

V3: Rebased on git master

Comment 38 Zeeshan Ali 2012-06-06 02:31:14 UTC

Created attachment 215705 [details] [review]
Create VM before 'review' page

V3: Rebased on git master

Comment 39 Zeeshan Ali 2012-06-06 02:31:32 UTC

Created attachment 215706 [details] [review]
Warn about unavailability of KVM

V3: Rebased on git master

Comment 40 Christophe Fergeau 2012-06-06 12:38:32 UTC

Comment on attachment 215704 [details] [review]
Create VM based on host capabilities

>From a20425dfece35087389af050ffd427f6568e8e56 Mon Sep 17 00:00:00 2001
>From: "Zeeshan Ali (Khattak)" <zeeshanak@gnome.org>
>Date: Thu, 3 May 2012 07:34:42 +0300
>Subject: [PATCH] Create VM based on host capabilities
>
>Choose the virtualization type (KVM or Qemu), architecture and features
>(acpi, apic and pae) of the created domain based on host capabilities and
>installer media.
>
>https://bugzilla.gnome.org/show_bug.cgi?id=674209
>---
> configure.ac             |    2 +-
> src/installer-media.vala |    8 ++++
> src/vm-configurator.vala |  111 ++++++++++++++++++++++++++++++++++++++++------
> src/vm-creator.vala      |    3 +-
> 4 files changed, 109 insertions(+), 15 deletions(-)
>
>diff --git a/configure.ac b/configure.ac
>index 238ed20..1d263c5 100644
>--- a/configure.ac
>+++ b/configure.ac
>@@ -46,7 +46,7 @@ GOBJECT_INTROSPECTION_MIN_VERSION=0.9.6
> GTK_MIN_VERSION=3.3.5
> GTK_VNC_MIN_VERSION=0.4.4
> LIBVIRT_GLIB_MIN_VERSION=0.0.8
>-LIBVIRT_GCONFIG_MIN_VERSION=0.0.8
>+LIBVIRT_GCONFIG_MIN_VERSION=0.0.9
> LIBXML2_MIN_VERSION=2.7.8
> SPICE_GTK_MIN_VERSION=0.9
> GUDEV_MIN_VERSION=165
>diff --git a/src/installer-media.vala b/src/installer-media.vala
>index c912f9c..a726161 100644
>--- a/src/installer-media.vala
>+++ b/src/installer-media.vala
>@@ -67,6 +67,14 @@ private class Boxes.InstallerMedia : Object {
>         resources = media_manager.os_db.get_resources_for_os (os, architecture);
>     }
> 
>+    public bool is_architecture_compatible (string architecture) {
>+        return os_media == null || // Unknown media
>+               os_media.architecture == architecture ||
>+               (os_media.architecture == "i386" && architecture == "i686") ||
>+               (os_media.architecture == "i386" && architecture == "x86_64") ||
>+               (os_media.architecture == "i686" && architecture == "x86_64");
>+    }
>+
>     private async GUdev.Device? get_device_from_path (string path, Client client, Cancellable? cancellable) {
>         try {
>             var mount_dir = File.new_for_commandline_arg (path);
>diff --git a/src/vm-configurator.vala b/src/vm-configurator.vala
>index d9335f5..cc9efc9 100644
>--- a/src/vm-configurator.vala
>+++ b/src/vm-configurator.vala
>@@ -3,6 +3,10 @@
> using Osinfo;
> using GVirConfig;
> 
>+private errordomain Boxes.VMConfiguratorError {
>+    NO_GUEST_CAPS,
>+}
>+
> private class Boxes.VMConfigurator {
>     private const string BOXES_NS = "boxes";
>     private const string BOXES_NS_URI = "http://live.gnome.org/Boxes/";
>@@ -13,7 +17,10 @@ private class Boxes.VMConfigurator {
>     private const string INSTALLATION_XML = "<os-state>" + INSTALLATION_STATE + "</os-state>";
>     private const string INSTALLED_XML = "<os-state>" + INSTALLED_STATE + "</os-state>";
> 
>-    public Domain create_domain_config (InstallerMedia install_media, string name, string target_path) {
>+    public Domain create_domain_config (InstallerMedia install_media,
>+                                        string         name,
>+                                        string         target_path,
>+                                        Capabilities   caps) throws VMConfiguratorError {
>         var domain = new Domain ();
>         domain.name = name;
> 
>@@ -23,13 +30,24 @@ private class Boxes.VMConfigurator {
>             domain.set_custom_xml (xml, BOXES_NS, BOXES_NS_URI);
>         } catch (GLib.Error error) { assert_not_reached (); /* We are so screwed if this happens */ }
> 
>+        var best_caps = get_best_guest_caps (caps, install_media);
>         domain.memory = install_media.resources.ram / KIBIBYTES;
>         domain.vcpu = install_media.resources.n_cpus;
>-        domain.set_virt_type (DomainVirtType.KVM);
> 
>-        set_os_config (domain, install_media);
>+        var virt_type = guest_kvm_enabled (best_caps) ? DomainVirtType.KVM : DomainVirtType.QEMU;
>+        domain.set_virt_type (virt_type);
>+
>+        set_os_config (domain, install_media, best_caps);
>+
>+        string[] features = {};
>+        if (guest_supports_feature (best_caps, "acpi"))
>+            features += "acpi";
>+        if (guest_supports_feature (best_caps, "apic"))
>+            features += "apic";
>+        if (guest_supports_feature (best_caps, "pae"))
>+            features += "pae";

As suggested by Alex, we should just tell libvirt to use the best capabilities it can. This can be done in an additional patch

>+        domain.set_features (features);
> 
>-        domain.set_features ({ "acpi", "apic", "pae" });
>         var clock = new DomainClock ();
>         if (install_media.os != null && install_media.os.short_id.contains ("win"))
>             clock.set_offset (DomainClockOffset.LOCALTIME);
>@@ -90,7 +108,7 @@ private class Boxes.VMConfigurator {
>         try {
>             domain.set_custom_xml (INSTALLED_XML, BOXES_NS, BOXES_NS_URI);
>         } catch (GLib.Error error) { assert_not_reached (); /* We are so screwed if this happens */ }
>-        set_os_config (domain);
>+        set_post_install_os_config (domain);
>         domain.set_lifecycle (DomainLifecycleEvent.ON_REBOOT, DomainLifecycleAction.RESTART);
> 
>         // Make source (installer/live media) optional
>@@ -106,8 +124,6 @@ private class Boxes.VMConfigurator {
>             if (disk_type == DomainDiskGuestDeviceType.CDROM || disk_type == DomainDiskGuestDeviceType.FLOPPY)
>                 disk.set_startup_policy (DomainDiskStartupPolicy.OPTIONAL);
>         }
>-
>-        domain.set_devices (devices);
>     }
> 
>     public bool is_install_config (Domain domain) {
>@@ -201,16 +217,26 @@ private class Boxes.VMConfigurator {
>         domain.add_device (disk);
>     }
> 
>-    private void set_os_config (Domain domain, InstallerMedia? install_media = null) {
>+    private void set_post_install_os_config (Domain domain) {
>+        var os = domain.get_os ();
>+
>+        os.set_kernel (null);
>+        os.set_ramdisk (null);
>+        os.set_cmdline (null);
>+
>+        var boot_devices = os.get_boot_devices ();
>+        boot_devices.remove (DomainOsBootDevice.CDROM);
>+        os.set_boot_devices (boot_devices);
>+    }

Isn't this whole function unrelated to this change? 

>+
>+    private void set_os_config (Domain domain, InstallerMedia install_media, CapabilitiesGuest guest_caps) {
>         var os = new DomainOs ();
>         os.set_os_type (DomainOsType.HVM);
>-        os.set_arch ("x86_64");
>+        os.set_arch (guest_caps.get_arch ().get_name ());
> 
>         var boot_devices = new GLib.List<DomainOsBootDevice> ();
>-        if (install_media != null) {
>-            set_direct_boot_params (os, install_media);
>-            boot_devices.append (DomainOsBootDevice.CDROM);
>-        }
>+        set_direct_boot_params (os, install_media);
>+        boot_devices.append (DomainOsBootDevice.CDROM);
>         boot_devices.append (DomainOsBootDevice.HD);
>         os.set_boot_devices (boot_devices);
> 
>@@ -281,4 +307,63 @@ private class Boxes.VMConfigurator {
> 
>         return null;
>     }
>+
>+    private CapabilitiesGuest get_best_guest_caps (Capabilities   caps,
>+                                                   InstallerMedia install_media) throws VMConfiguratorError {
>+        var guests_caps = caps.get_guests ();
>+
>+        // First find all compatible guest caps
>+        var compat_guests_caps = new GLib.List<CapabilitiesGuest> ();
>+        foreach (var guest_caps in guests_caps) {
>+            var guest_arch = guest_caps.get_arch ().get_name ();
>+
>+            if (install_media.is_architecture_compatible (guest_arch))
>+                compat_guests_caps.append (guest_caps);
>+        }
>+
>+        // Now lets see if there is any KVM-enabled guest caps
>+        foreach (var guest_caps in compat_guests_caps)
>+            if (guest_kvm_enabled (guest_caps))
>+                return guest_caps;
>+
>+        // No KVM-enabled guest caps :( We at least need Qemu
>+        foreach (var guest_caps in compat_guests_caps)
>+            if (guest_is_qemu (guest_caps))
>+                return guest_caps;
>+

The 3 loops above can be simplified:

var compat_guests_caps = new GLib.List<CapabilitiesGuest> ();
Bleh qemu_caps;
foreach (var guest_caps in guests_caps) {
    var guest_arch = guest_caps.get_arch ().get_name ();
    if (!install_media.is_architecture_compatible (guest_arch))
        continue;
    if (guest_kvm_enabled (guest_caps))
        return guest_caps; 
    if (guest_is_qemu (guest_caps))
        qemu_caps = guest_caps;
}
if (!qemu_caps)
    //error
else
    return qemu_caps


>+    private static bool guest_kvm_enabled (CapabilitiesGuest guest_caps) {

The naming comes directly from libvirt XML, but associating kvm with guest surprises me a bit, I tend to associate this with the hypervisor, not the guest.

>+        var arch = guest_caps.get_arch ();
>+        foreach (var domain in arch.get_domains ())
>+            if (domain.get_virt_type () == DomainVirtType.KVM)
>+                return true;
>+
>+        return false;
>+    }
>+
>+    private static bool guest_is_qemu (CapabilitiesGuest guest_caps) {
>+        var arch = guest_caps.get_arch ();
>+        foreach (var domain in arch.get_domains ())
>+            if (domain.get_virt_type () == DomainVirtType.QEMU)
>+                return true;
>+
>+        return false;
>+    }
>+
>+    private static bool guest_supports_feature (CapabilitiesGuest guest_caps, string feature_name) {
>+        var supports = false;
>+
>+        foreach (var feature in guest_caps.get_features ())
>+            if (feature_name == feature.get_name ()) {
>+                supports = true;
>+
>+                break;
>+            }
>+
>+        return supports;
>+    }

Would definitely be nicer to tell libvirt to just use the host caps ;)

> }
>diff --git a/src/vm-creator.vala b/src/vm-creator.vala
>index 8eaa8ba..8905901 100644
>--- a/src/vm-creator.vala
>+++ b/src/vm-creator.vala
>@@ -65,7 +65,8 @@ private class Boxes.VMCreator {
>         }
> 
>         var volume = yield create_target_volume (name, install_media.resources.storage);
>-        var config = configurator.create_domain_config (install_media, name, volume.get_path ());
>+        var caps = yield connection.get_capabilities_async (cancellable);
>+        var config = configurator.create_domain_config (install_media, name, volume.get_path (), caps);
> 
>         var domain = connection.create_domain (config);
>         domain.start (0);
>-- 
>1.7.10.2

Comment 41 Christophe Fergeau 2012-06-06 13:14:48 UTC

Comment on attachment 215705 [details] [review]
Create VM before 'review' page

>From a5c61d45a4f7b87e5ff176c67fa65784f87fa086 Mon Sep 17 00:00:00 2001
>From: "Zeeshan Ali (Khattak)" <zeeshanak@gnome.org>
>Date: Wed, 16 May 2012 16:46:00 +0300
>Subject: [PATCH] Create VM before 'review' page

>The reason I decided to go with the latter approach was that I recalled
>mccann saying that we should ensure that we have everything ready at the
>'review' page. I agree with that since its better to error out during
>preparation and setup rather than after we told user that everything is
>ready, she/he clicks create and we go *boom*.

As long as we make 100% sure (and I mean 100% not 99%) that the VM will be fully destroyed/cleaned up (including storage, ...) when there is a failure, why not. Speaking of which, are things properly cleaned up if the user closes Boxes while the wizard is running? I couldn't really spot this in the code. One way to handle this could be to have some temporary object which does the VM destruction/cleanup in its destructor, and pass that around in the wizard. Then when we are 100% sure the VM will start at the end of the wizard, we can steal the VM from this object, otherwise we let the object die, and regardless of how this happens (unexpected exception, ...), the cleanup is still done. 

>
>Also IMHO it makes for a much better end-user experience if user has to
>wait less on each step rather than wait a long time on one step.
>
>https://bugzilla.gnome.org/show_bug.cgi?id=674209
>---
> src/vm-creator.vala |   20 +++++++++-----------
> src/wizard.vala     |   48 +++++++++++++++++++++++++++++++++---------------
> 2 files changed, 42 insertions(+), 26 deletions(-)
>
>diff --git a/src/vm-creator.vala b/src/vm-creator.vala
>index 8905901..597d156 100644
>--- a/src/vm-creator.vala
>+++ b/src/vm-creator.vala
>@@ -42,12 +42,12 @@ private class Boxes.VMCreator {
>         }
>     }
> 
>-    public async void create_and_launch_vm (InstallerMedia install_media, Cancellable? cancellable) throws GLib.Error {
>+    public async LibvirtMachine create_vm (InstallerMedia install_media, Cancellable? cancellable) throws GLib.Error {
>         if (connection == null) {
>             // Wait for needed libvirt connection
>             ulong handler = 0;
>             handler = app.notify["default-connection"].connect (() => {
>-                create_and_launch_vm.callback ();
>+                create_vm.callback ();
>                 app.disconnect (handler);
>             });
> 
>@@ -55,25 +55,23 @@ private class Boxes.VMCreator {
>         }
> 
>         var name = yield create_domain_name_from_media (install_media);
>-        var fullscreen = true;
>         if (install_media is UnattendedInstaller) {
>-            var unattended = install_media as UnattendedInstaller;
>-
>             var hostname = name.replace (" ", "-");
>-            yield unattended.setup (hostname, cancellable);
>-            fullscreen = !unattended.express_install;
>+            yield (install_media as UnattendedInstaller).setup (hostname, cancellable);

code testing types at runtime and behaving differently depending on the result is evil. Here it really looks like you're emulating a virtual method call by hand.

Comment 42 Christophe Fergeau 2012-06-06 13:20:24 UTC

Comment on attachment 215706 [details] [review]
Warn about unavailability of KVM

>From 60900dc4a6406855697c380fd6909175d008274f Mon Sep 17 00:00:00 2001
>From: "Zeeshan Ali (Khattak)" <zeeshanak@gnome.org>
>Date: Wed, 16 May 2012 19:49:29 +0300
>Subject: [PATCH] Warn about unavailability of KVM

>@@ -486,6 +489,11 @@ private class Boxes.Wizard: Boxes.UI {
> 
>         summary = new WizardSummary ();
>         vbox.pack_start (summary.widget, true, true);
>+        nokvm_label = new Gtk.Label (_("Virtualization extensions are unavailable on your host system. Expect this box to be extremely slow. If your host system is recent enough (made in or after 2008), these extensions are probably available on your system and only need to be enabled from BIOS setup."));

Patch looks good, I'd "s/host system/system" in the message, and change the end to "You may need to enable them in your computer BIOS" or something less affirmative than the current message. Dunno if the gory bits of this message should be hidden in a "details" section. 

>+        nokvm_label.get_style_context ().add_class ("boxes-logo-notice-label");
>+        nokvm_label.wrap = true;
>+        nokvm_label.max_width_chars = 50;
>+        vbox.pack_start (nokvm_label, false, false);
>         vbox.show_all ();
> 
>         /* topbar */
>-- 
>1.7.10.2

Comment 43 Zeeshan Ali 2012-06-06 15:20:34 UTC

(In reply to comment #40)
> (From update of attachment 215704 [details] [review])
> >-    private void set_os_config (Domain domain, InstallerMedia? install_media = null) {
> >+    private void set_post_install_os_config (Domain domain) {
> >+        var os = domain.get_os ();
> >+
> >+        os.set_kernel (null);
> >+        os.set_ramdisk (null);
> >+        os.set_cmdline (null);
> >+
> >+        var boot_devices = os.get_boot_devices ();
> >+        boot_devices.remove (DomainOsBootDevice.CDROM);
> >+        os.set_boot_devices (boot_devices);
> >+    }
> 
> Isn't this whole function unrelated to this change? 

Not exactly, set_os_config() would then need to have yet another 'if' for the guest_caps being passed or not so better refactor this now rather than complicate it even more with this change.
 
> >+    private CapabilitiesGuest get_best_guest_caps (Capabilities   caps,
> >+                                                   InstallerMedia install_media) throws VMConfiguratorError {
> >+        var guests_caps = caps.get_guests ();
> >+
> >+        // First find all compatible guest caps
> >+        var compat_guests_caps = new GLib.List<CapabilitiesGuest> ();
> >+        foreach (var guest_caps in guests_caps) {
> >+            var guest_arch = guest_caps.get_arch ().get_name ();
> >+
> >+            if (install_media.is_architecture_compatible (guest_arch))
> >+                compat_guests_caps.append (guest_caps);
> >+        }
> >+
> >+        // Now lets see if there is any KVM-enabled guest caps
> >+        foreach (var guest_caps in compat_guests_caps)
> >+            if (guest_kvm_enabled (guest_caps))
> >+                return guest_caps;
> >+
> >+        // No KVM-enabled guest caps :( We at least need Qemu
> >+        foreach (var guest_caps in compat_guests_caps)
> >+            if (guest_is_qemu (guest_caps))
> >+                return guest_caps;
> >+
> 
> The 3 loops above can be simplified:

I see it as more efficient, rather than simplified. Since we don't usually have tons of capabilities, I'd go for readability rather than efficiency. Anyways, no strong feelings so I'd go for you suggestion if there wasn't one small issue:

> var compat_guests_caps = new GLib.List<CapabilitiesGuest> ();
> Bleh qemu_caps;
> foreach (var guest_caps in guests_caps) {
>     var guest_arch = guest_caps.get_arch ().get_name ();
>     if (!install_media.is_architecture_compatible (guest_arch))
>         continue;
>     if (guest_kvm_enabled (guest_caps))
>         return guest_caps; 
>     if (guest_is_qemu (guest_caps))
>         qemu_caps = guest_caps;

Since we don't break here, we'll end-up choosing the last qemu caps in the list. I think we want to take the first qemu caps available when there is no kvm caps.

> >+    private static bool guest_kvm_enabled (CapabilitiesGuest guest_caps) {
> 
> The naming comes directly from libvirt XML, but associating kvm with guest
> surprises me a bit, I tend to associate this with the hypervisor, not the
> guest.
> 
> >+        var arch = guest_caps.get_arch ();
> >+        foreach (var domain in arch.get_domains ())
> >+            if (domain.get_virt_type () == DomainVirtType.KVM)
> >+                return true;
> >+
> >+        return false;
> >+    }
> >+
> >+    private static bool guest_is_qemu (CapabilitiesGuest guest_caps) {
> >+        var arch = guest_caps.get_arch ();
> >+        foreach (var domain in arch.get_domains ())
> >+            if (domain.get_virt_type () == DomainVirtType.QEMU)
> >+                return true;
> >+
> >+        return false;
> >+    }
> >+
> >+    private static bool guest_supports_feature (CapabilitiesGuest guest_caps, string feature_name) {
> >+        var supports = false;
> >+
> >+        foreach (var feature in guest_caps.get_features ())
> >+            if (feature_name == feature.get_name ()) {
> >+                supports = true;
> >+
> >+                break;
> >+            }
> >+
> >+        return supports;
> >+    }
> 
> Would definitely be nicer to tell libvirt to just use the host caps ;)

Sure, we'll need to add more libvirt-gconfig api for that '<cpu mode=../>'.

Comment 44 Zeeshan Ali 2012-06-06 15:31:22 UTC

(In reply to comment #41)
> (From update of attachment 215705 [details] [review])
> >From a5c61d45a4f7b87e5ff176c67fa65784f87fa086 Mon Sep 17 00:00:00 2001
> >From: "Zeeshan Ali (Khattak)" <zeeshanak@gnome.org>
> >Date: Wed, 16 May 2012 16:46:00 +0300
> >Subject: [PATCH] Create VM before 'review' page
> 
> >The reason I decided to go with the latter approach was that I recalled
> >mccann saying that we should ensure that we have everything ready at the
> >'review' page. I agree with that since its better to error out during
> >preparation and setup rather than after we told user that everything is
> >ready, she/he clicks create and we go *boom*.
> 
> As long as we make 100% sure (and I mean 100% not 99%) that the VM will be
> fully destroyed/cleaned up (including storage, ...) when there is a failure,
> why not. Speaking of which, are things properly cleaned up if the user closes
> Boxes while the wizard is running? I couldn't really spot this in the code.

Yeah:
         cancel.clicked.connect (() => {
+            if (machine != null) {
+                machine.delete ();

Also done when going backwards from 'Review' page.

> One
> way to handle this could be to have some temporary object which does the VM
> destruction/cleanup in its destructor, and pass that around in the wizard. Then
> when we are 100% sure the VM will start at the end of the wizard, we can steal
> the VM from this object, otherwise we let the object die, and regardless of how
> this happens (unexpected exception, ...), the cleanup is still done. 

I don't see the advantage to that approach since to delete the VM, we'll need to unref this holder object so if we'll remember to do that, whey not just remember to call machine.delete?

> >         var name = yield create_domain_name_from_media (install_media);
> >-        var fullscreen = true;
> >         if (install_media is UnattendedInstaller) {
> >-            var unattended = install_media as UnattendedInstaller;
> >-
> >             var hostname = name.replace (" ", "-");
> >-            yield unattended.setup (hostname, cancellable);
> >-            fullscreen = !unattended.express_install;
> >+            yield (install_media as UnattendedInstaller).setup (hostname, cancellable);
> 
> code testing types at runtime and behaving differently depending on the result
> is evil. Here it really looks like you're emulating a virtual method call by
> hand.

Agreed! Though since the code was already like this, I better fix this in a separate patch/bug.

Comment 45 Christophe Fergeau 2012-06-06 15:42:24 UTC

(In reply to comment #44)
> > As long as we make 100% sure (and I mean 100% not 99%) that the VM will be
> > fully destroyed/cleaned up (including storage, ...) when there is a failure,
> > why not. Speaking of which, are things properly cleaned up if the user closes
> > Boxes while the wizard is running? I couldn't really spot this in the code.
> 
> Yeah:
>          cancel.clicked.connect (() => {
> +            if (machine != null) {
> +                machine.delete ();
> 
> Also done when going backwards from 'Review' page.
> 

If the user click on the "close" WM button, a click on the 'cancel' button is triggered ??

> > One
> > way to handle this could be to have some temporary object which does the VM
> > destruction/cleanup in its destructor, and pass that around in the wizard. Then
> > when we are 100% sure the VM will start at the end of the wizard, we can steal
> > the VM from this object, otherwise we let the object die, and regardless of how
> > this happens (unexpected exception, ...), the cleanup is still done. 
> 
> I don't see the advantage to that approach since to delete the VM, we'll need
> to unref this holder object so if we'll remember to do that, whey not just
> remember to call machine.delete?

unref'ing is automatic, and will get done even in code paths we did not think about, calling machine.delete ourselves is not automatic, and it's always possible to miss one code path where it should get called.

Comment 46 Christophe Fergeau 2012-06-06 15:45:40 UTC

(In reply to comment #43)
> (In reply to comment #40)
> > Isn't this whole function unrelated to this change? 
> 
> Not exactly, set_os_config() would then need to have yet another 'if' for the
> guest_caps being passed or not so better refactor this now rather than
> complicate it even more with this change.
> 

I didn't get why it was needed, but if this is needed to make this patch simpler then the refactor can be done before this patch

Comment 47 Zeeshan Ali 2012-06-06 17:00:14 UTC

(In reply to comment #45)
> (In reply to comment #44)
> > > As long as we make 100% sure (and I mean 100% not 99%) that the VM will be
> > > fully destroyed/cleaned up (including storage, ...) when there is a failure,
> > > why not. Speaking of which, are things properly cleaned up if the user closes
> > > Boxes while the wizard is running? I couldn't really spot this in the code.
> > 
> > Yeah:
> >          cancel.clicked.connect (() => {
> > +            if (machine != null) {
> > +                machine.delete ();
> > 
> > Also done when going backwards from 'Review' page.
> > 
> 
> If the user click on the "close" WM button, a click on the 'cancel' button is
> triggered ??

Nope. Good point, I'll fix that. :)

> > > One
> > > way to handle this could be to have some temporary object which does the VM
> > > destruction/cleanup in its destructor, and pass that around in the wizard. Then
> > > when we are 100% sure the VM will start at the end of the wizard, we can steal
> > > the VM from this object, otherwise we let the object die, and regardless of how
> > > this happens (unexpected exception, ...), the cleanup is still done. 
> > 
> > I don't see the advantage to that approach since to delete the VM, we'll need
> > to unref this holder object so if we'll remember to do that, whey not just
> > remember to call machine.delete?
> 
> unref'ing is automatic, and will get done even in code paths we did not think
> about, calling machine.delete ourselves is not automatic, and it's always
> possible to miss one code path where it should get called.

If we keep the reference to object as a field/prop of wizard, we'll have to explicitly call 'object = null' to loose the reference. As we currently have cyclic references (which I wanted to get rid of in the beginning but was discouraged from doing), wizard object itself doesn't get destroyed on exit of boxes so this approach wouldn't work.

Comment 48 Zeeshan Ali 2012-06-06 23:54:10 UTC

Created attachment 215810 [details] [review]
Refactor VMConfigurator.set_os_config()

Comment 49 Zeeshan Ali 2012-06-06 23:54:27 UTC

Created attachment 215811 [details] [review]
Create VM based on host capabilities

Choose the virtualization type (KVM or Qemu), architecture and features
(acpi, apic and pae) of the created domain based on host capabilities and
installer media.

Comment 50 Zeeshan Ali 2012-06-06 23:54:45 UTC

Created attachment 215812 [details] [review]
Create VM before 'review' page

Comment 51 Zeeshan Ali 2012-06-06 23:54:55 UTC

Created attachment 215813 [details] [review]
Warn about unavailability of KVM

Comment 52 Zeeshan Ali 2012-06-07 03:40:25 UTC

(In reply to comment #32)
> Review of attachment 215379 [details] [review]:
> 
> ::: src/vm-configurator.vala
> @@ +47,3 @@
> +        if (guest_supports_feature (best_caps, "pae"))
> +            features += "pae";
> +        domain.set_features (features);
> 
> There are some other interesting features that at least I have not gotten
> enabled in the VM I created with boxes:
> 
> sep - fast sysenter/leave
> rdtscp - faster timer
> constant_tsc - seems to be useful as per
> http://docs.fedoraproject.org/en-US/Fedora/13/html/Virtualization_Guide/chap-Virtualization-KVM_guest_timing_management.html
> sse3, sse4_1, sse4_2 - These might be important for e.g. the software fallback
> gnome-shell
> avx - another sse kinda thing, has support in pixman

Looking at the output of `virsh capabilities` on my machine and libvirt XML docs, I don't think we need to explicitly enable these features. As far as I can tell this 'features' elements is only for those features that can be turned off and on:

http://libvirt.org/formatdomain.html#elementsFeatures

Also there seems to be a toplevel 'features' element (the one we use already) and 'feature' elements under 'cpu' node. Don't know whats the diff but that 'host-model' deals with the latter.

Comment 53 Christophe Fergeau 2012-06-07 12:14:02 UTC

Comment on attachment 215810 [details] [review]
Refactor VMConfigurator.set_os_config()


> 
>-    private void set_os_config (Domain domain, InstallerMedia? install_media = null) {
>+    private void set_post_install_os_config (Domain domain) {
>+        var os = domain.get_os ();
>+
>+        os.set_kernel (null);
>+        os.set_ramdisk (null);
>+        os.set_cmdline (null);

These 3 things belong to fedora-installer.vala since they are added by fedora-installer.vala
Looks good otherwise.

Comment 54 Christophe Fergeau 2012-06-07 12:28:14 UTC

Comment on attachment 215811 [details] [review]
Create VM based on host capabilities


>diff --git a/src/vm-configurator.vala b/src/vm-configurator.vala
>index 1f74734..ff367d6 100644
>--- a/src/vm-configurator.vala
>+++ b/src/vm-configurator.vala
>@@ -106,8 +124,6 @@ private class Boxes.VMConfigurator {
>             if (disk_type == DomainDiskGuestDeviceType.CDROM || disk_type == DomainDiskGuestDeviceType.FLOPPY)
>                 disk.set_startup_policy (DomainDiskStartupPolicy.OPTIONAL);
>         }
>-
>-        domain.set_devices (devices);
>     }

As far as I can tell this is unrelated to this patch. Looks good otherwise.

Comment 55 Christophe Fergeau 2012-06-07 12:29:27 UTC

Comment on attachment 215813 [details] [review]
Warn about unavailability of KVM

>From 57e6ac0d145922c0fb7549b7a2988e685763beaf Mon Sep 17 00:00:00 2001
>From: "Zeeshan Ali (Khattak)" <zeeshanak@gnome.org>
>Date: Wed, 16 May 2012 19:49:29 +0300
>Subject: [PATCH] Warn about unavailability of KVM

Looks good

Comment 56 Zeeshan Ali 2012-06-07 13:41:23 UTC

(In reply to comment #53)
> (From update of attachment 215810 [details] [review])
> 
> > 
> >-    private void set_os_config (Domain domain, InstallerMedia? install_media = null) {
> >+    private void set_post_install_os_config (Domain domain) {
> >+        var os = domain.get_os ();
> >+
> >+        os.set_kernel (null);
> >+        os.set_ramdisk (null);
> >+        os.set_cmdline (null);
> 
> These 3 things belong to fedora-installer.vala since they are added by
> fedora-installer.vala
> Looks good otherwise.

True but since these were already here, pushing them to fedora-installer.vala would be a separate patch..

Comment 57 Zeeshan Ali 2012-06-07 13:48:07 UTC

Created attachment 215851 [details] [review]
Create VM based on host capabilities

Choose the virtualization type (KVM or Qemu), architecture and features
(acpi, apic and pae) of the created domain based on host capabilities and
installer media.

Comment 58 Zeeshan Ali 2012-06-07 13:51:22 UTC

(In reply to comment #56)
> (In reply to comment #53)
> > (From update of attachment 215810 [details] [review] [details])
> > 
> > > 
> > >-    private void set_os_config (Domain domain, InstallerMedia? install_media = null) {
> > >+    private void set_post_install_os_config (Domain domain) {
> > >+        var os = domain.get_os ();
> > >+
> > >+        os.set_kernel (null);
> > >+        os.set_ramdisk (null);
> > >+        os.set_cmdline (null);
> > 
> > These 3 things belong to fedora-installer.vala since they are added by
> > fedora-installer.vala
> > Looks good otherwise.
> 
> True but since these were already here, pushing them to fedora-installer.vala
> would be a separate patch..

Moreover, this is not so trivial as we don't have access to installer_media anymore at this point.

Comment 59 Christophe Fergeau 2012-06-07 13:58:30 UTC

(In reply to comment #56) 
> True but since these were already here, pushing them to fedora-installer.vala
> would be a separate patch..

No they were not already there, this patch adds them.

Comment 60 Zeeshan Ali 2012-06-07 14:05:36 UTC

(In reply to comment #59)
> (In reply to comment #56) 
> > True but since these were already here, pushing them to fedora-installer.vala
> > would be a separate patch..
> 
> No they were not already there, this patch adds them.

Ah yes, they were not needed before. You know my patch better than me it seems. :) Anyway, do you think it totally not belongs here so its worth it to address this right now?

Comment 61 Christophe Fergeau 2012-06-07 14:21:25 UTC

(In reply to comment #60)
> (In reply to comment #59)
> > (In reply to comment #56) 
> > > True but since these were already here, pushing them to fedora-installer.vala
> > > would be a separate patch..
> > 
> > No they were not already there, this patch adds them.
> 
> Ah yes, they were not needed before. You know my patch better than me it seems.
> :) Anyway, do you think it totally not belongs here so its worth it to address
> this right now?

Given that fedora is the only one to use this, yes this looks totally out of place. However, it would probably be fine to just drop these 3 lines? Or are they causing some problems?

Comment 62 Zeeshan Ali 2012-06-07 14:23:14 UTC

(In reply to comment #61)
> (In reply to comment #60)
> > (In reply to comment #59)
> > > (In reply to comment #56) 
> > > > True but since these were already here, pushing them to fedora-installer.vala
> > > > would be a separate patch..
> > > 
> > > No they were not already there, this patch adds them.
> > 
> > Ah yes, they were not needed before. You know my patch better than me it seems.
> > :) Anyway, do you think it totally not belongs here so its worth it to address
> > this right now?
> 
> Given that fedora is the only one to use this, yes this looks totally out of
> place. However, it would probably be fine to just drop these 3 lines? Or are
> they causing some problems?

Of course they'll cause problems since after installation we don't want to use direct boot for passing kickstart parameters.

Comment 63 Zeeshan Ali 2012-06-16 02:35:49 UTC

Created attachment 216547 [details] [review]
Refactor VMConfigurator.set_os_config()

This version addresses the issue pointed out in comment#53.

Comment 64 Zeeshan Ali 2012-06-16 02:39:49 UTC

Created attachment 216548 [details] [review]
Create VM before 'review' page

This version ensures 'create' button isn't enabled before everything is ready for 'review' page.

Comment 65 Zeeshan Ali 2012-06-16 02:40:31 UTC

Created attachment 216549 [details] [review]
More robust installer media clean-up

* Never mind if files to be deleted don't exist.
* Unset some fields at clean-up to make InstallerMedia more re-entrant.

Comment 66 Christophe Fergeau 2012-06-18 10:40:24 UTC

Review of attachment 216547 [details] [review]:

::: src/vm-configurator.vala
@@ +208,3 @@
+        var boot_devices = old_os.get_boot_devices ();
+        boot_devices.remove (DomainOsBootDevice.CDROM);
+        os.set_boot_devices (boot_devices);

Any specific reason for doing this rather than creating a new list with just a DomainOsBootDevice.HD element?
Looks good otherwise, and this question is just curiousity at this point ;)

Comment 67 Christophe Fergeau 2012-06-18 10:42:29 UTC

Review of attachment 216549 [details] [review]:

::: src/fedora-installer.vala
@@ +87,3 @@
+                debug ("Removed '%s'.", kernel_path);
+            } catch (IOError.NOT_FOUND e) {}
+

Please add some logs saying the removal of the file failed

@@ +96,3 @@
+                initrd_file.delete ();
+                debug ("Removed '%s'.", initrd_path);
+            } catch (IOError.NOT_FOUND e) {}

same here, looks good otherwise

Comment 68 Christophe Fergeau 2012-06-18 11:27:50 UTC

Review of attachment 216548 [details] [review]:

> Now its an interesting question whether we should only create configuration before hand or the whole machine.

Since it's apparently not possible to have the new VM be automatically cleaned up when an error or something unexpected happen, I'd prefer if we only went with creating the configuration beforehand. This avoids the need to sprinkle VM deletion all over the wizard code, and this makes it less likely that we'll leak a VM. Given the number of scenarios when one can get "Boxes creation failed" which we are only starting to get under control, I don't think I'm being too pessimistic when I think with this change, we'll need to add more cleanup work.

::: src/vm-creator.vala
@@ -77,3 @@
             ulong signal_id = 0;
 
             signal_id = App.app.notify["ui-state"].connect (() => {

All of the changes in this file could go in a preparation commit splitting create_and_lauch_vm into create_vm and lauch_vm but calling these from the same place as before. This would make the gist of this commit (moving VM creation) easier to review

::: src/wizard.vala
@@ +65,3 @@
                 case WizardPage.REVIEW:
+                    review.begin ((source, result) => {
+                        if (!review.end (result))

Shouldn't wizard.cleanup() be called when review fails?

@@ +84,3 @@
+                        machine.delete ();
+                        machine = null;
+                    }

wizard.cleanup()?

@@ +149,3 @@
     }
 
+    public void cleanup () {

I'd prefer a more explicit name (destroy_machine?) so that future eyes looking at this code know what this is cleaning up.

@@ +534,3 @@
+                machine.delete ();
+                machine = null;
+            }

wizard.cleanup()?

Comment 69 Zeeshan Ali 2012-06-18 18:59:19 UTC

(In reply to comment #68)
> Review of attachment 216548 [details] [review]:
> 
> > Now its an interesting question whether we should only create configuration before hand or the whole machine.
> 
> Since it's apparently not possible to have the new VM be automatically cleaned
> up when an error or something unexpected happen, I'd prefer if we only went
> with creating the configuration beforehand. This avoids the need to sprinkle VM
> deletion all over the wizard code, and this makes it less likely that we'll
> leak a VM. Given the number of scenarios when one can get "Boxes creation
> failed" which we are only starting to get under control, I don't think I'm
> being too pessimistic when I think with this change, we'll need to add more
> cleanup work.

There is only 4 scenerios where we need to do the cleanup (launch failed, 'back' from review, cancellation and quit of boxes) and we have them all covered already.

I wouldn't worry too much about the scenario where we accidently leak the VM as its very easy to delete the VM if that happens. Not saying its not a bug but only that its not a big one to worry too much about.

BTW, in the commit log I did provide arguments in favor of my decision on your request.

> ::: src/vm-creator.vala
> @@ -77,3 @@
>              ulong signal_id = 0;
> 
>              signal_id = App.app.notify["ui-state"].connect (() => {
> 
> All of the changes in this file could go in a preparation commit splitting
> create_and_lauch_vm into create_vm and lauch_vm but calling these from the same
> place as before. This would make the gist of this commit (moving VM creation)
> easier to review

Ok, i'll try to split..

> ::: src/wizard.vala
> @@ +65,3 @@
>                  case WizardPage.REVIEW:
> +                    review.begin ((source, result) => {
> +                        if (!review.end (result))
> 
> Shouldn't wizard.cleanup() be called when review fails?

Currently there is no need since review only fails if vm creation failed but yeah, we should do that to be future-safe.
 
> @@ +84,3 @@
> +                        machine.delete ();
> +                        machine = null;
> +                    }
> 
> wizard.cleanup()?
> 
> @@ +149,3 @@
>      }
> 
> +    public void cleanup () {
> 
> I'd prefer a more explicit name (destroy_machine?) so that future eyes looking
> at this code know what this is cleaning up.

My point with cleanup was that it could be a generic public cleanup function that we could add other stuff to in future if we need to do any other cleanups. Now that I think of it, i think i should have a private destroy_machine function and call that from cleanup and other places where we need to delete the machine that you pointed to.

Comment 70 Zeeshan Ali 2012-06-18 19:12:23 UTC

(In reply to comment #67)
> Review of attachment 216549 [details] [review]:
> 
> ::: src/fedora-installer.vala
> @@ +87,3 @@
> +                debug ("Removed '%s'.", kernel_path);
> +            } catch (IOError.NOT_FOUND e) {}
> +
> 
> Please add some logs saying the removal of the file failed

Its not a problem if file was already removed. This is the same as '-f' option of 'rm' command.

Comment 71 Zeeshan Ali 2012-06-19 01:48:42 UTC

(In reply to comment #69)
> (In reply to comment #68)
> > Review of attachment 216548 [details] [review] [details]:
> > 
> > ::: src/wizard.vala
> > @@ +65,3 @@
> >                  case WizardPage.REVIEW:
> > +                    review.begin ((source, result) => {
> > +                        if (!review.end (result))
> > 
> > Shouldn't wizard.cleanup() be called when review fails?
> 
> Currently there is no need since review only fails if vm creation failed but
> yeah, we should do that to be future-safe.

While I was about to do that, I realized that clean-up *is* already done in this case as we go backwards when review returns false and that trigers the following code below, just like when user hits 'back' button:


           if (forwards) {
           ...
           } else {
                switch (page) {
                case WizardPage.REVIEW:
                    destroy_machine ();
                    break;
                }
           }

Comment 72 Zeeshan Ali 2012-06-19 01:52:17 UTC

Created attachment 216712 [details] [review]
Separate methods for creation & launch of VM

Comment 73 Zeeshan Ali 2012-06-19 01:52:51 UTC

Created attachment 216713 [details] [review]
Create VM before 'review' page

Comment 74 Christophe Fergeau 2012-06-19 08:20:28 UTC

(In reply to comment #71)
> While I was about to do that, I realized that clean-up *is* already done in
> this case as we go backwards when review returns false and that trigers the
> following code below, just like when user hits 'back' button:
> 

This just feels very fragile and will break as soon as this code gets changed :(

Comment 75 Christophe Fergeau 2012-06-19 08:57:05 UTC

(In reply to comment #69)
> There is only 4 scenerios where we need to do the cleanup (launch failed,
> 'back' from review, cancellation and quit of boxes) and we have them all
> covered already.

We have them all covered *now*, no idea how the code will be in 6 months from now. Having to do this explicit cleanup adds maintainance overhead. Moreover, the fact that at least one cleanup path was initially missing makes me worried that maybe there are more forgotten paths like this one.
Moreover, this has to be contrasted with the fact that the current code does not add have these maintainance issues.


> 
> I wouldn't worry too much about the scenario where we accidently leak the VM as
> its very easy to delete the VM if that happens. Not saying its not a bug but
> only that its not a big one to worry too much about.

I beg to differ, it's a very visible bug since the non functional VM will show up in the VM list. Once again, this cannot happen with the current code (or far less).

> 
> BTW, in the commit log I did provide arguments in favor of my decision on your
> request.

Yes thanks for that. To be honest,  "it makes for a much better end-user experience if user has to wait less on each step rather than wait a long time on one step" seems bogus to me, booting the guest OS will be much slower than creating the VM, and the user will expect this step to take a long time, so it's better to be slow one time there rather than having various wizard pages lag because we are doing various work

"I agree with that since its better to error out during
preparation and setup rather than after we told user that everything is
ready, she/he clicks create and we go *boom*."

While I can agree with the general idea, I don't expect the VM to be created before I tell it to go ahead on the review page. To me the review page means "Here what will get created if you go ahead". So an error message telling me the VM failed to be created before I click "next" in the review page would be very surprising.
Won't this also make things more complicated when we add the possibility to change disk size from the review page?

In short, my position is that this patch makes the code more complex and error-prone, and this doesn't come with huge benefits, that's why I'm rather unenthusiastic about it :-/

> 
> > ::: src/vm-creator.vala
> > @@ -77,3 @@
> >              ulong signal_id = 0;
> > 
> >              signal_id = App.app.notify["ui-state"].connect (() => {
> > 
> > All of the changes in this file could go in a preparation commit splitting
> > create_and_lauch_vm into create_vm and lauch_vm but calling these from the same
> > place as before. This would make the gist of this commit (moving VM creation)
> > easier to review
> 
> Ok, i'll try to split..

Oh I used a conditional, I didn't mean that it had to be done, it was mainly meant as something to keep in mind next time you push patches for review ;) Thanks for doing the work!

Comment 76 Christophe Fergeau 2012-06-19 09:39:26 UTC

Review of attachment 216712 [details] [review]:

Looks good apart from a few comments.

::: src/vm-creator.vala
@@ +74,1 @@
+        if (!(install_media is UnattendedInstaller) || !(install_media as UnattendedInstaller).express_install) {

InstallerMedia could probably offer a 'start_in_fullscreen' property (just a comment, you can change it if you want but you don't have to)

@@ -76,1 @@
-        if (machine != null && fullscreen) {

We lost this null test after the patch, is it intended?

Comment 77 Zeeshan Ali 2012-06-19 13:22:44 UTC

(In reply to comment #75)
> (In reply to comment #69)
> > There is only 4 scenerios where we need to do the cleanup (launch failed,
> > 'back' from review, cancellation and quit of boxes) and we have them all
> > covered already.
> 
> We have them all covered *now*, no idea how the code will be in 6 months from
> now. Having to do this explicit cleanup adds maintainance overhead. Moreover,
> the fact that at least one cleanup path was initially missing makes me worried
> that maybe there are more forgotten paths like this one.
> Moreover, this has to be contrasted with the fact that the current code does
> not add have these maintainance issues.

A solution to that would be to come-up with a test plan for QA teams for all these scenarios so we don't introduce any such bug in an release cycle.
 
> > I wouldn't worry too much about the scenario where we accidently leak the VM as
> > its very easy to delete the VM if that happens. Not saying its not a bug but
> > only that its not a big one to worry too much about.
> 
> I beg to differ, it's a very visible bug since the non functional VM will show
> up in the VM list. Once again, this cannot happen with the current code (or far
> less).

We really should ensure that VM creation shouldn't fail in the first place. If that scenario isn't rare enough, we have failed miserably in our goals and having a nonfunctional VM (that could be easily deleted) shown in the UI is nothing compared to that failure from us.

> > BTW, in the commit log I did provide arguments in favor of my decision on your
> > request.
> 
> Yes thanks for that. To be honest,  "it makes for a much better end-user
> experience if user has to wait less on each step rather than wait a long time
> on one step" seems bogus to me, booting the guest OS will be much slower than
> creating the VM, and the user will expect this step to take a long time, so
> it's better to be slow one time there rather than having various wizard pages
> lag because we are doing various work

Why would user expect this step to take long? Why would setup phase expected to be quick?

> "I agree with that since its better to error out during
> preparation and setup rather than after we told user that everything is
> ready, she/he clicks create and we go *boom*."
> 
> While I can agree with the general idea, I don't expect the VM to be created
> before I tell it to go ahead on the review page.

And its not getting created from the user's POV. Not every user (like you) would know whats actually happening behind the scene. :)

> To me the review page means
> "Here what will get created if you go ahead". So an error message telling me
> the VM failed to be created before I click "next" in the review page would be
> very surprising.

You can ask the designers about this but to me its obvious that after all the steps when we tell user that "everything is ready", it really should be. All chances of failures should be moved away from from 'create' button actions as much as possible.

> Won't this also make things more complicated when we add the possibility to
> change disk size from the review page?

Why would it? VM is not launched yet so you can change its config as much as you like and there is no need to reboot or anything for them to take any effect.
 
> In short, my position is that this patch makes the code more complex and
> error-prone, and this doesn't come with huge benefits, that's why I'm rather
> unenthusiastic about it :-/

Sorry that I failed to convince you. :(

Comment 78 Zeeshan Ali 2012-06-19 13:42:00 UTC

(In reply to comment #76)
> Review of attachment 216712 [details] [review]:
> 
> Looks good apart from a few comments.
> 
> ::: src/vm-creator.vala
> @@ +74,1 @@
> +        if (!(install_media is UnattendedInstaller) || !(install_media as
> UnattendedInstaller).express_install) {
> 
> InstallerMedia could probably offer a 'start_in_fullscreen' property (just a
> comment, you can change it if you want but you don't have to)
> 
> @@ -76,1 @@
> -        if (machine != null && fullscreen) {
> 
> We lost this null test after the patch, is it intended?

While the removal of null check in the location was intended (the param isn't nullable), I see that we are kindof silently ignore the error if creation of 'machine' fails in app.vala. I'll fix that as part of this patch..

Comment 79 Christophe Fergeau 2012-06-19 15:17:59 UTC

(In reply to comment #77)
> (In reply to comment #75)

> A solution to that would be to come-up with a test plan for QA teams for all
> these scenarios so we don't introduce any such bug in an release cycle.

Uh? You can write test plans to make sure we don't regress, I don't see how you can write test plans taking into account unknown future changes to this code.
And once again, with the current code, we don't have to be worried about this!

> We really should ensure that VM creation shouldn't fail in the first place. If
> that scenario isn't rare enough, we have failed miserably in our goals and
> having a nonfunctional VM (that could be easily deleted) shown in the UI is
> nothing compared to that failure from us.

We do even better now and don't have ghosts VMs in the UI even when things fail, which still happen for some of our testers.

> 
> Why would user expect this step to take long? Why would setup phase expected to
> be quick?

You can move backward/forward in the setup phase, so I'd rather not have perceptible lag when someone does this. On the other hand, when you go forward on the review page, you can no longer go backward, and you are booting an OS, which will take time anyway, so delays are more acceptable here.


> 
> And its not getting created from the user's POV. Not every user (like you)
> would know whats actually happening behind the scene. :)

The user won't know until there is some kind of failure and he gets an error message about it, or a leaked VM in the UI... That's why I specifically mentioned error messages.

> 
> 
> You can ask the designers about this but to me its obvious that after all the
> steps when we tell user that "everything is ready", it really should be. All
> chances of failures should be moved away from from 'create' button actions as
> much as possible.

In your own words "We really should ensure that VM creation shouldn't fail in the first place. If that scenario isn't rare enough, we have failed miserably in our goals and having [...] is nothing compared to that failure from us."



> > Won't this also make things more complicated when we add the possibility to
> > change disk size from the review page?
> 
> Why would it? VM is not launched yet so you can change its config as much as
> you like and there is no need to reboot or anything for them to take any
> effect.

"more complicated" = code to create the image with the default size + code to resize an already created disk image because we created it too early instead of creating an image directly with the right size...

Comment 80 Christophe Fergeau 2012-06-20 12:20:18 UTC

(In reply to comment #70)
> (In reply to comment #67)
> > Review of attachment 216549 [details] [review] [details]:
> > 
> > ::: src/fedora-installer.vala
> > @@ +87,3 @@
> > +                debug ("Removed '%s'.", kernel_path);
> > +            } catch (IOError.NOT_FOUND e) {}
> > +
> > 
> > Please add some logs saying the removal of the file failed
> 
> Its not a problem if file was already removed. This is the same as '-f' option
> of 'rm' command.

Is it something that we expect to happen? If this can happen because of the way the code is written then I agree with you. If we are not supposed to get there with no file to remove, then a g_debug or a g_message would give us a hint that something unexpected happened.

Comment 81 Zeeshan Ali 2012-06-20 16:21:36 UTC

Created attachment 216851 [details] [review]
Error out on failure to create LibvirtMachine for new domains

Comment 82 Zeeshan Ali 2012-06-20 16:32:18 UTC

Created attachment 216852 [details] [review]
More robust installer media clean-up

* Never mind if files to be deleted don't exist.
* Unset some fields at clean-up to make InstallerMedia more re-entrant.

Comment 83 Zeeshan Ali 2012-06-20 16:43:30 UTC

So for the record, I did try to change "Create VM before 'review' page" to only create config before review page rather than the whole VM. The problem is that we can't create complete domain configuration before creation of main storage volume as we can only reliably get the path of the volume *after* the volume is created. So we'll either have to have a partial domain config at review and we add the volume's path at creation time (seems pretty ugly to me) or we create the volume before review page (which brings us to the same potential issues as with pre-creating the VM).

Comment 84 Zeeshan Ali 2012-06-21 19:17:13 UTC

Created attachment 216956 [details] [review]
Create VM before 'review' page

Add to comment that alternative was tried and what issues it had.

Comment 85 Zeeshan Ali 2012-06-21 19:18:00 UTC

(In reply to comment #84)
> Created an attachment (id=216956) [details] [review]
> Create VM before 'review' page
> 
> Add to comment that alternative was tried and what issues it had.

comment -> commit log message.

Comment 86 Christophe Fergeau 2012-06-22 10:08:02 UTC

Review of attachment 216851 [details] [review]:

Looks good apart from one comment

::: src/app.vala
@@ +189,3 @@
+    }
+
+    private LibvirtMachine? try_add_domain (CollectionSource source, GVir.Connection connection, GVir.Domain domain) {

The return value doesn't seem to be used

Comment 87 Christophe Fergeau 2012-06-22 10:09:56 UTC

Review of attachment 216852 [details] [review]:

Just one comment

::: src/util.vala
@@ +546,3 @@
     }
+
+    public void delete_file (File file) throws GLib.Error {

It doesn't throw, does it?

Comment 88 Christophe Fergeau 2012-06-22 10:16:44 UTC

Review of attachment 216956 [details] [review]:

I still think this is a wrong move from a robustness/maintainability point of view, but you want to get that in nonetheless. Doing a careful review doesn't seem very useful.

Comment 89 Marc-Andre Lureau 2012-06-22 10:27:53 UTC

Review of attachment 216956 [details] [review]:

nak for now

::: src/wizard.vala
@@ +66,3 @@
+                    review.begin ((source, result) => {
+                        if (!review.end (result))
+                            page = page - 1;

This is racy, we rather use sync code to avoid having "zombie" machine around because the user clicked prev/next too quickly.

Another solution is to change the page setter to an async method, and disable next/prev as long as it is not finished.

I think making the wizard page tuning sync isn't going to ruin user experience...

Comment 90 Marc-Andre Lureau 2012-06-22 10:36:15 UTC

Comment on attachment 216956 [details] [review]
Create VM before 'review' page

nack from me

> 
>                 case WizardPage.REVIEW:
>-                    if (!review ())
>-                        return;
>+                    review.begin ((source, result) => {
>+                        if (!review.end (result))
>+                            page = page - 1;
>+                    });

This is racy, we rather use sync code to avoid having "zombie" machine around because the user clicked prev/next too quickly.
Another solution is to change the page setter to an async method, and disable next/prev as long as it is not finished.
I think making the wizard page tuning sync isn't going to ruin user experience...

Comment 91 Zeeshan Ali 2012-06-22 13:28:33 UTC

(In reply to comment #87)
> Review of attachment 216852 [details] [review]:
> 
> Just one comment
> 
> ::: src/util.vala
> @@ +546,3 @@
>      }
> +
> +    public void delete_file (File file) throws GLib.Error {
> 
> It doesn't throw, does it?

It does, it only catches a very particular error and rest of errors are forwarded to the caller..

Comment 92 Zeeshan Ali 2012-06-22 13:35:48 UTC

(In reply to comment #89)
> Review of attachment 216956 [details] [review]:
> 
> nak for now
> 
> ::: src/wizard.vala
> @@ +66,3 @@
> +                    review.begin ((source, result) => {
> +                        if (!review.end (result))
> +                            page = page - 1;
> 
> This is racy, we rather use sync code to avoid having "zombie" machine around
> because the user clicked prev/next too quickly.

review method sets the sensitivity of the next_button at the beginning and end. Perhaps it should also set the sensitivity of prev_button?

Comment 93 Marc-Andre Lureau 2012-06-22 13:54:54 UTC

(In reply to comment #92)
> > This is racy, we rather use sync code to avoid having "zombie" machine around
> > because the user clicked prev/next too quickly.
> 
> review method sets the sensitivity of the next_button at the beginning and end.
> Perhaps it should also set the sensitivity of prev_button?

Even then, when clicking cancel, the operation would not be cancelled. You essentially need to handle this like a synchronous operation, so if you don't want to update the page setter yet to be async, keep the review() sync please.

Comment 94 Zeeshan Ali 2012-06-22 14:11:26 UTC

(In reply to comment #93)
> (In reply to comment #92)
> > > This is racy, we rather use sync code to avoid having "zombie" machine around
> > > because the user clicked prev/next too quickly.
> > 
> > review method sets the sensitivity of the next_button at the beginning and end.
> > Perhaps it should also set the sensitivity of prev_button?
> 
> Even then, when clicking cancel, the operation would not be cancelled. You
> essentially need to handle this like a synchronous operation, so if you don't
> want to update the page setter yet to be async, keep the review() sync please.

Well, review cant be sync with rest of the changes so I guess I'll change the page setting aysnc..

Comment 95 Zeeshan Ali 2012-06-24 18:52:50 UTC

(In reply to comment #94)
> (In reply to comment #93)
> > (In reply to comment #92)
> > > > This is racy, we rather use sync code to avoid having "zombie" machine around
> > > > because the user clicked prev/next too quickly.
> > > 
> > > review method sets the sensitivity of the next_button at the beginning and end.
> > > Perhaps it should also set the sensitivity of prev_button?
> > 
> > Even then, when clicking cancel, the operation would not be cancelled. You
> > essentially need to handle this like a synchronous operation, so if you don't
> > want to update the page setter yet to be async, keep the review() sync please.
> 
> Well, review cant be sync with rest of the changes so I guess I'll change the
> page setting aysnc..

While I was busy changing page setting to async, I realized that won't solve anything as we can't know how long will the async operation take and since UI will be responsive, the only way to deal with this is to disable the control button and set them to previous value again after async operation is done.

BTW, this issue isnt being introduce by my patch but it already exists with the async 'create' method. My patch only moves the issue to previous page AFAICT.

Comment 96 Zeeshan Ali 2012-06-25 19:16:05 UTC

Created attachment 217235 [details] [review]
Create VM before 'review' page

This version temporarily disables the controls while async review is in progress..

Comment 97 Marc-Andre Lureau 2012-06-25 19:26:12 UTC

Review of attachment 217235 [details] [review]:

::: src/wizard.vala
@@ +78,3 @@
+                        back_button.sensitive = true;
+                        next_button.sensitive = true;
+                        cancel_button.sensitive = true;

it should be possible to cancel operation, no?

Comment 98 Zeeshan Ali 2012-06-25 19:59:23 UTC

(In reply to comment #97)
> Review of attachment 217235 [details] [review]:
> 
> ::: src/wizard.vala
> @@ +78,3 @@
> +                        back_button.sensitive = true;
> +                        next_button.sensitive = true;
> +                        cancel_button.sensitive = true;
> 
> it should be possible to cancel operation, no?

Yeah, same for create method in current code. I can have a look into that afterwards but I don't see the need to do it in this patch (series). Will need the patch from bug#675788.

Comment 99 Zeeshan Ali 2012-06-26 12:56:13 UTC

Created attachment 217284 [details] [review]
Error out on failure to create LibvirtMachine for new domains

Comment 100 Marc-Andre Lureau 2012-06-26 13:19:07 UTC

Review of attachment 217284 [details] [review]:

ack

Comment 101 Zeeshan Ali 2012-06-26 13:35:13 UTC

Created attachment 217285 [details] [review]
Refactor VMConfigurator.set_os_config()

Comment 102 Zeeshan Ali 2012-06-26 13:35:26 UTC

Created attachment 217286 [details] [review]
Create VM based on host capabilities

Comment 103 Zeeshan Ali 2012-06-26 13:35:37 UTC

Created attachment 217287 [details] [review]
Separate methods for creation & launch of VM

Comment 104 Zeeshan Ali 2012-06-26 13:35:55 UTC

Created attachment 217288 [details] [review]
Create VM before 'review' page

Comment 105 Zeeshan Ali 2012-06-26 13:36:19 UTC

Created attachment 217289 [details] [review]
Warn about unavailability of KVM

Comment 106 Zeeshan Ali 2012-06-26 13:37:10 UTC

Created attachment 217290 [details] [review]
More robust installer media clean-up

Comment 107 Zeeshan Ali 2012-06-26 13:37:34 UTC

Created attachment 217291 [details] [review]
Error out on failure to create LibvirtMachine for new domains

Comment 108 Marc-Andre Lureau 2012-06-26 14:24:39 UTC

Review of attachment 217285 [details] [review]:

ack

Comment 109 Marc-Andre Lureau 2012-06-26 14:24:44 UTC

Review of attachment 217286 [details] [review]:

ack

Comment 110 Marc-Andre Lureau 2012-06-26 14:24:50 UTC

Review of attachment 217287 [details] [review]:

ack

Comment 111 Marc-Andre Lureau 2012-06-26 14:27:58 UTC

Review of attachment 217288 [details] [review]:

ack (with some reluctancy)

Comment 112 Marc-Andre Lureau 2012-06-26 14:28:04 UTC

Review of attachment 217289 [details] [review]:

ack

Comment 113 Marc-Andre Lureau 2012-06-26 14:29:08 UTC

Review of attachment 217290 [details] [review]:

I don't think delete_file() is a helper worth splitting in util.vala, since it isn't shared and is not something we could consider moving elsewhere

ack otherwise

Comment 114 Marc-Andre Lureau 2012-06-26 14:29:23 UTC

Review of attachment 217291 [details] [review]:

ack

Comment 115 Zeeshan Ali 2012-06-26 14:35:45 UTC

Attachment 217285 [details] pushed as dbce1e3 - Refactor VMConfigurator.set_os_config()
Attachment 217286 [details] pushed as 0daf0ad - Create VM based on host capabilities
Attachment 217287 [details] pushed as 452a35e - Separate methods for creation & launch of VM
Attachment 217288 [details] pushed as 13bca42 - Create VM before 'review' page
Attachment 217289 [details] pushed as c706873 - Warn about unavailability of KVM
Attachment 217290 [details] pushed as 8c467bc - More robust installer media clean-up
Attachment 217291 [details] pushed as 989cc8e - Error out on failure to create LibvirtMachine for new domains

Comment 116 Christophe Fergeau 2012-06-27 15:10:22 UTC

"452a35e - Separate methods for creation & launch of VM" breaks starting a VM from a LiveCD here.
There are at least 4 regressions caused by this series while before this basic Boxes use was starting to work well for most people. I'd be in favour of reverting this given that my concerns about this code robustness have been proven true, and I still don't think this change is a huge improvement.

Comment 117 Zeeshan Ali 2012-06-27 16:10:56 UTC

(In reply to comment #116)
> "452a35e - Separate methods for creation & launch of VM" breaks starting a VM
> from a LiveCD here.
> There are at least 4 regressions caused by this series while before this basic
> Boxes use was starting to work well for most people. 

I'm fixing those regressions. Its ok for things to sometimes break in git master as long as they are fixed at latest before a release.

> I'd be in favour of
> reverting this

Of course you are.

> given that my concerns about this code robustness have been
> proven true,

Your concerns were very much addressed and LOT of time was spent on that so I dont see any way you are going to be happy. About your concerns being true, I already myself filed bugs and provided patches to fix 2 of the issues.

Morever, as I wrote in one of the patches we *at least* need to create configuration before review. I implemented that and informed about the issues with only doing that. You were given a chance to decide if that approach should still be taken despite those issues.

There is still time to implement that so please let me know what you think of that approach.

> and I still don't think this change is a huge improvement.

We need this change warning user about unavailability of KVM. Later we'll also need this to allow changing of props at review page.. I must stress yet again, the real need is to have the config ready before review.

Comment 118 Marc-Andre Lureau 2012-06-27 16:36:30 UTC

(In reply to comment #117)
> I'm fixing those regressions. Its ok for things to sometimes break in git
> master as long as they are fixed at latest before a release.

It's not ok imho, but it's hard to avoid sometime, simply because it's hard to see them all before.

Comment 119 Christophe Fergeau 2012-06-27 16:40:43 UTC

(In reply to comment #117)
> I'm fixing those regressions. Its ok for things to sometimes break in git
> master as long as they are fixed at latest before a release.

I'm quite sure there are more to come... 
 
> Your concerns were very much addressed

Yeah, not by making changes, but by saying that all cases were covered, that it wasn't possible to leak VMs, yet there was one leak found during review, and you found an additional one after committing the patches. Sorry, but this does not make me confident at all that this is not a huge step backward stability-wise.

Comment 120 Zeeshan Ali 2012-06-27 16:58:29 UTC

(In reply to comment #118)
> (In reply to comment #117)
> > I'm fixing those regressions. Its ok for things to sometimes break in git
> > master as long as they are fixed at latest before a release.
> 
> It's not ok imho, but it's hard to avoid sometime, simply because it's hard to
> see them all before.

I didn't word it right, I meant to say what you said. :)

(In reply to comment #119)
> (In reply to comment #117)
> > I'm fixing those regressions. Its ok for things to sometimes break in git
> > master as long as they are fixed at latest before a release.
> 
> I'm quite sure there are more to come... 
> 
> > Your concerns were very much addressed
> 
> Yeah, not by making changes, but by saying that all cases were covered, that it
> wasn't possible to leak VMs, yet there was one leak found during review, and
> you found an additional one after committing the patches.

The one I found afterwards is NOT a VM leak but UI item for it. As I said in the bug about that, I'm pretty sure I tested those changes several times so its a mystery to me how I missed it in the end. I should have tested once again before pushing..

Comment 121 Zeeshan Ali 2012-07-02 16:30:22 UTC

*** Bug 679238 has been marked as a duplicate of this bug. ***