After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 781468 - Errors on deleting a VM
Errors on deleting a VM
Status: RESOLVED OBSOLETE
Product: gnome-boxes
Classification: Applications
Component: general
3.24.x
Other Linux
: Normal normal
: --
Assigned To: GNOME Boxes maintainer(s)
GNOME Boxes maintainer(s)
Depends on:
Blocks:
 
 
Reported: 2017-04-18 18:04 UTC by Fabiano Fidêncio
Modified: 2018-01-11 10:52 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Remove Source (1.01 KB, patch)
2017-07-14 20:19 UTC, Thiago
none Details | Review
delete a domain domain and remove a notification (4.09 KB, patch)
2017-07-18 19:49 UTC, Thiago
none Details | Review
Remove Signal Timeout (2.31 KB, patch)
2017-07-20 20:33 UTC, Thiago
committed Details | Review
To delete a domain (2.90 KB, patch)
2017-07-20 20:34 UTC, Thiago
reviewed Details | Review

Description Fabiano Fidêncio 2017-04-18 18:04:36 UTC
Reproducer:
- Execute Boxes from a terminal
- Delete a VM
- Keep an eye on the terminal

You'll see messages like:
(gnome-boxes:7910): Boxes-WARNING **: libvirt-machine.vala:483: Unable to delete domain: Domain not found: no domain with matching uuid '5e1cb90c-b6ef-4dfb-a0c4-24684d6571a7' (sles12sp2)

(gnome-boxes:7910): Boxes-WARNING **: libvirt-machine.vala:498: Unable to delete storage volume: Storage volume not found: no storage vol with matching name 'sles12sp2'


In case you try to install a new VM (may be the same system, may be other) the installation will fail unless you shut Boxes down, re-open it and be sure the removed VM is actually removed and there are no warnings on the command line.
Comment 1 Thiago 2017-07-14 20:19:40 UTC
Created attachment 355627 [details] [review]
Remove Source

Wondering if this will not break any other part of gnome-boxes. (I don't beliave ti will - help me here)

Need some help with this fix and patch!

I'm new here (coming from newcomers) and I was checking the code. It seems like the timeout for a Boxes.Notification never stops running. In this case every notification will stay running and executing the callback function until you close the gnome-boxes. 

 When you ask gnome-boxes to delete a virtual-machine it will create a Boxes.Notificatian and this Object will call the right function based on the user input. When the user confirm the delete by clicking on close or ignoring the message (timeout) this notification runs forever it will keep calling the delete function which will print the error messages until you close the gnome-boxes

With this patch this should be fixed. When the timeout expires it should be removed (https://valadoc.org/glib-2.0/GLib.SourceFunc.html)

Please, correct me if I'm wrong and if you have free time help me with this bug :)

Cheers,
Thiago
Comment 2 Felipe Borges 2017-07-17 10:59:42 UTC
Review of attachment 355627 [details] [review]:

(In reply to Thiago Mendes from comment #1)
> Created attachment 355627 [details] [review] [review]
> Remove Source
> 
> Wondering if this will not break any other part of gnome-boxes. (I don't
> beliave ti will - help me here)

No, it won't.

> I'm new here (coming from newcomers) and I was checking the code. It seems
> like the timeout for a Boxes.Notification never stops running. In this case
> every notification will stay running and executing the callback function
> until you close the gnome-boxes.

Correct. That's a bug.

>  When you ask gnome-boxes to delete a virtual-machine it will create a
> Boxes.Notificatian and this Object will call the right function based on the
> user input. When the user confirm the delete by clicking on close or
> ignoring the message (timeout) this notification runs forever it will keep
> calling the delete function which will print the error messages until you
> close the gnome-boxes
> 
> With this patch this should be fixed. When the timeout expires it should be
> removed (https://valadoc.org/glib-2.0/GLib.SourceFunc.html)

Yes, you are right. The callback is called infinitely.

(In reply to Fabiano Fidêncio from comment #0)
> In case you try to install a new VM (may be the same system, may be other)
> the installation will fail unless you shut Boxes down, re-open it and be
> sure the removed VM is actually removed and there are no warnings on the
> command line.


I am not sure yet whether it fixes Fabiano's issue. But anyway, the patch is desirable and welcome.

Please, address the commit message:

  notification: Remove signal source on timeout

  $WHAT_IS_THE_BUG?

  $WHAT_IS_THE_SOLUTION?

  https://bugzilla.gnome.org/show_bug.cgi?id=781468


If you have any doubt, see https://wiki.gnome.org/Git/CommitMessages

Thanks for the patch!

::: src/notification.vala
@@ +36,3 @@
         Timeout.add_seconds (this.timeout, () => {
             dismiss ();
+			/* After the timeout expires this Timeout should be removed */

no need for this comment here. The code is quite self-explanatory.
Comment 3 Thiago 2017-07-18 01:47:15 UTC
> I am not sure yet whether it fixes Fabiano's issue. But anyway, the patch is desirable and welcome.

It seems like the messages are a product of the callback being called after the domain is already deleted and it happens because the notification never dies.

After the patch I still getting a message saying it can't change the state of the domain (the one to be deleted) but that would be another bug I guess.
Comment 4 Thiago 2017-07-18 19:49:52 UTC
Created attachment 355879 [details] [review]
delete a domain domain and remove a notification

I have a question about the code to finish fixing this bug.

When I delete a domain boxes calls machine.delete this function...it also calls libvirt-machine.delete and there, it calls the machine.delete function again using: base.delete. Is this necessarie?
Comment 5 Felipe Borges 2017-07-20 13:08:17 UTC
Review of attachment 355879 [details] [review]:

> The stop function has to be called before the delete function

really? To "undefine" a domain isn't enough?

Also, the commit message should be about the file/component touched.

::: src/notification.vala
@@ +33,3 @@
         set_reveal_child (true);
+        bool ok_pressed = false;
+        bool close_pressed = false;

You should be able to make assumptions of the button state based on its availability.

If the notification was dismissed, there's no state for the button. This way, I don't think this is necessary.

@@ -40,3 @@
         });
 
-        bool ok_pressed = false;

any reason for moving this?
Comment 6 Felipe Borges 2017-07-20 13:09:45 UTC
the one-liner version of the patch that just replaced the "return true" instruction by Source.REMOVE seem to work just fine in order to prevent the multiple calls of the callback, didn't it?
Comment 7 Thiago 2017-07-20 13:52:23 UTC
(In reply to Felipe Borges from comment #6)
> the one-liner version of the patch that just replaced the "return true"
> instruction by Source.REMOVE seem to work just fine in order to prevent the
> multiple calls of the callback, didn't it?

Hi Felipe. If wait the info (undo) dialog to expire and do not click at the close button the REMOVE works just file but if you do click on the close button it does not because in that case the notification will still running until it expires and when it does it will not call the close_function again but it will run the dismiss() function for the second time and that is why I followed the same approach boxes had for the ok_pressed but now for the close_pressed.
Comment 8 Thiago 2017-07-20 14:04:37 UTC
(In reply to Felipe Borges from comment #5)
> Review of attachment 355879 [details] [review] [review]:
> 
> > The stop function has to be called before the delete function
> 
> really? To "undefine" a domain isn't enough?

the delete function was being called before the stop function and because of that libvirt wasn't able to find the domain to stop it.

Sorry because I lack on terms here (still reading about boxes) but what do you mean by: "undefine" a domain?
 
> 
> Also, the commit message should be about the file/component touched.
> 
> ::: src/notification.vala
> @@ +33,3 @@
>          set_reveal_child (true);
> +        bool ok_pressed = false;
> +        bool close_pressed = false;
> 
> You should be able to make assumptions of the button state based on its
> availability.
> 
> If the notification was dismissed, there's no state for the button. This
> way, I don't think this is necessary.
> 
> @@ -40,3 @@
>          });
>  
> -        bool ok_pressed = false;
> 
> any reason for moving this?

wrong move moving the ok_pressed up. I did because I'm used to have all the variables declared at the begining of the block but since it has nothing to do with the problem and the patch I'm moving it back for the next patch. sorry about that!
Comment 9 Felipe Borges 2017-07-20 14:11:02 UTC
(In reply to Thiago Mendes from comment #7)
> (In reply to Felipe Borges from comment #6)
> > the one-liner version of the patch that just replaced the "return true"
> > instruction by Source.REMOVE seem to work just fine in order to prevent the
> > multiple calls of the callback, didn't it?
> 
> Hi Felipe. If wait the info (undo) dialog to expire and do not click at the
> close button the REMOVE works just file but if you do click on the close
> button it does not because in that case the notification will still running
> until it expires and when it does it will not call the close_function again
> but it will run the dismiss() function for the second time and that is why I
> followed the same approach boxes had for the ok_pressed but now for the
> close_pressed.

Ok, so a cleaner way to do it would be:

  notification_timeout_id = Timeout.add_seconds (this.timeout, () => {
  ...
  ...

  close_button.clicked.connect (() => {
      dismiss ();

      Source.remove (notification_timeout_id);
  });

What do you think?
Comment 10 Felipe Borges 2017-07-20 14:15:15 UTC
(In reply to Thiago Mendes from comment #8)
> (In reply to Felipe Borges from comment #5)
> > Review of attachment 355879 [details] [review] [review] [review]:
> > 
> > > The stop function has to be called before the delete function
> > 
> > really? To "undefine" a domain isn't enough?
> 
> the delete function was being called before the stop function and because of
> that libvirt wasn't able to find the domain to stop it.
> 
> Sorry because I lack on terms here (still reading about boxes) but what do
> you mean by: "undefine" a domain?

To "undefine" in libvirt is pretty much "delete", and the case where the machine is running should be handled already, see https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Virtualization_Deployment_and_Administration_Guide/sect-virsh-delete.html
Comment 11 Thiago 2017-07-20 14:16:36 UTC
(In reply to Felipe Borges from comment #9)
> (In reply to Thiago Mendes from comment #7)
> > (In reply to Felipe Borges from comment #6)
> > > the one-liner version of the patch that just replaced the "return true"
> > > instruction by Source.REMOVE seem to work just fine in order to prevent the
> > > multiple calls of the callback, didn't it?
> > 
> > Hi Felipe. If wait the info (undo) dialog to expire and do not click at the
> > close button the REMOVE works just file but if you do click on the close
> > button it does not because in that case the notification will still running
> > until it expires and when it does it will not call the close_function again
> > but it will run the dismiss() function for the second time and that is why I
> > followed the same approach boxes had for the ok_pressed but now for the
> > close_pressed.
> 
> Ok, so a cleaner way to do it would be:
> 
>   notification_timeout_id = Timeout.add_seconds (this.timeout, () => {
>   ...
>   ...
> 
>   close_button.clicked.connect (() => {
>       dismiss ();
> 
>       Source.remove (notification_timeout_id);
>   });
> 
> What do you think?

I was looking for that solution Felipe. Don't really like the other way. Thanks a lot :D...gonna test as soon as I'm back to my computer!
Comment 12 Thiago 2017-07-20 20:33:44 UTC
Created attachment 356068 [details] [review]
Remove Signal Timeout
Comment 13 Thiago 2017-07-20 20:34:13 UTC
Created attachment 356069 [details] [review]
To delete a domain
Comment 14 Felipe Borges 2017-07-25 11:51:55 UTC
Review of attachment 356068 [details] [review]:

Thanks for your patch. lgtm
Comment 15 Felipe Borges 2017-07-25 11:53:47 UTC
Comment on attachment 356068 [details] [review]
Remove Signal Timeout

I took the liberty to took the patch a bit and push it.
Comment 16 Felipe Borges 2017-07-25 12:00:12 UTC
Review of attachment 356069 [details] [review]:

domain.delete () will stop the domain before it gets deleted, won't it?

::: src/libvirt-machine.vala
@@ +492,3 @@
+                    if (storage_volume != null) {
+                        try {
+                            storage_volume.delete (0);

Can you elaborate more about why do you think this change is necessary?
Comment 17 Thiago 2017-07-25 13:25:43 UTC
(In reply to Felipe Borges from comment #16)
> Review of attachment 356069 [details] [review] [review]:
> 
> domain.delete () will stop the domain before it gets deleted, won't it?

Asked this question at the #virt channel and people told me it won't! delete is the same as virsh undefined

> 
> ::: src/libvirt-machine.vala
> @@ +492,3 @@
> +                    if (storage_volume != null) {
> +                        try {
> +                            storage_volume.delete (0);
> 
> Can you elaborate more about why do you think this change is necessary?

What do you mean by that? Do you mean why did I move it to the try block?
Comment 18 Felipe Borges 2017-07-25 13:31:39 UTC
(In reply to Thiago Mendes from comment #17)
> Asked this question at the #virt channel and people told me it won't! delete
> is the same as virsh undefined

According to https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Virtualization_Deployment_and_Administration_Guide/sect-virsh-delete.html

"If domain is inactive, the configuration is removed completely. If the domain is active (running), it is converted to a transient domain. When the guest virtual machine becomes inactive, the configuration is removed completely."
Comment 19 GNOME Infrastructure Team 2018-01-11 10:52:40 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/gnome-boxes/issues/137.