GNOME Bugzilla – Bug 781468
Errors on deleting a VM
Last modified: 2018-01-11 10:52:40 UTC
Reproducer: - Execute Boxes from a terminal - Delete a VM - Keep an eye on the terminal You'll see messages like: (gnome-boxes:7910): Boxes-WARNING **: libvirt-machine.vala:483: Unable to delete domain: Domain not found: no domain with matching uuid '5e1cb90c-b6ef-4dfb-a0c4-24684d6571a7' (sles12sp2) (gnome-boxes:7910): Boxes-WARNING **: libvirt-machine.vala:498: Unable to delete storage volume: Storage volume not found: no storage vol with matching name 'sles12sp2' In case you try to install a new VM (may be the same system, may be other) the installation will fail unless you shut Boxes down, re-open it and be sure the removed VM is actually removed and there are no warnings on the command line.
Created attachment 355627 [details] [review] Remove Source Wondering if this will not break any other part of gnome-boxes. (I don't beliave ti will - help me here) Need some help with this fix and patch! I'm new here (coming from newcomers) and I was checking the code. It seems like the timeout for a Boxes.Notification never stops running. In this case every notification will stay running and executing the callback function until you close the gnome-boxes. When you ask gnome-boxes to delete a virtual-machine it will create a Boxes.Notificatian and this Object will call the right function based on the user input. When the user confirm the delete by clicking on close or ignoring the message (timeout) this notification runs forever it will keep calling the delete function which will print the error messages until you close the gnome-boxes With this patch this should be fixed. When the timeout expires it should be removed (https://valadoc.org/glib-2.0/GLib.SourceFunc.html) Please, correct me if I'm wrong and if you have free time help me with this bug :) Cheers, Thiago
Review of attachment 355627 [details] [review]: (In reply to Thiago Mendes from comment #1) > Created attachment 355627 [details] [review] [review] > Remove Source > > Wondering if this will not break any other part of gnome-boxes. (I don't > beliave ti will - help me here) No, it won't. > I'm new here (coming from newcomers) and I was checking the code. It seems > like the timeout for a Boxes.Notification never stops running. In this case > every notification will stay running and executing the callback function > until you close the gnome-boxes. Correct. That's a bug. > When you ask gnome-boxes to delete a virtual-machine it will create a > Boxes.Notificatian and this Object will call the right function based on the > user input. When the user confirm the delete by clicking on close or > ignoring the message (timeout) this notification runs forever it will keep > calling the delete function which will print the error messages until you > close the gnome-boxes > > With this patch this should be fixed. When the timeout expires it should be > removed (https://valadoc.org/glib-2.0/GLib.SourceFunc.html) Yes, you are right. The callback is called infinitely. (In reply to Fabiano Fidêncio from comment #0) > In case you try to install a new VM (may be the same system, may be other) > the installation will fail unless you shut Boxes down, re-open it and be > sure the removed VM is actually removed and there are no warnings on the > command line. I am not sure yet whether it fixes Fabiano's issue. But anyway, the patch is desirable and welcome. Please, address the commit message: notification: Remove signal source on timeout $WHAT_IS_THE_BUG? $WHAT_IS_THE_SOLUTION? https://bugzilla.gnome.org/show_bug.cgi?id=781468 If you have any doubt, see https://wiki.gnome.org/Git/CommitMessages Thanks for the patch! ::: src/notification.vala @@ +36,3 @@ Timeout.add_seconds (this.timeout, () => { dismiss (); + /* After the timeout expires this Timeout should be removed */ no need for this comment here. The code is quite self-explanatory.
> I am not sure yet whether it fixes Fabiano's issue. But anyway, the patch is desirable and welcome. It seems like the messages are a product of the callback being called after the domain is already deleted and it happens because the notification never dies. After the patch I still getting a message saying it can't change the state of the domain (the one to be deleted) but that would be another bug I guess.
Created attachment 355879 [details] [review] delete a domain domain and remove a notification I have a question about the code to finish fixing this bug. When I delete a domain boxes calls machine.delete this function...it also calls libvirt-machine.delete and there, it calls the machine.delete function again using: base.delete. Is this necessarie?
Review of attachment 355879 [details] [review]: > The stop function has to be called before the delete function really? To "undefine" a domain isn't enough? Also, the commit message should be about the file/component touched. ::: src/notification.vala @@ +33,3 @@ set_reveal_child (true); + bool ok_pressed = false; + bool close_pressed = false; You should be able to make assumptions of the button state based on its availability. If the notification was dismissed, there's no state for the button. This way, I don't think this is necessary. @@ -40,3 @@ }); - bool ok_pressed = false; any reason for moving this?
the one-liner version of the patch that just replaced the "return true" instruction by Source.REMOVE seem to work just fine in order to prevent the multiple calls of the callback, didn't it?
(In reply to Felipe Borges from comment #6) > the one-liner version of the patch that just replaced the "return true" > instruction by Source.REMOVE seem to work just fine in order to prevent the > multiple calls of the callback, didn't it? Hi Felipe. If wait the info (undo) dialog to expire and do not click at the close button the REMOVE works just file but if you do click on the close button it does not because in that case the notification will still running until it expires and when it does it will not call the close_function again but it will run the dismiss() function for the second time and that is why I followed the same approach boxes had for the ok_pressed but now for the close_pressed.
(In reply to Felipe Borges from comment #5) > Review of attachment 355879 [details] [review] [review]: > > > The stop function has to be called before the delete function > > really? To "undefine" a domain isn't enough? the delete function was being called before the stop function and because of that libvirt wasn't able to find the domain to stop it. Sorry because I lack on terms here (still reading about boxes) but what do you mean by: "undefine" a domain? > > Also, the commit message should be about the file/component touched. > > ::: src/notification.vala > @@ +33,3 @@ > set_reveal_child (true); > + bool ok_pressed = false; > + bool close_pressed = false; > > You should be able to make assumptions of the button state based on its > availability. > > If the notification was dismissed, there's no state for the button. This > way, I don't think this is necessary. > > @@ -40,3 @@ > }); > > - bool ok_pressed = false; > > any reason for moving this? wrong move moving the ok_pressed up. I did because I'm used to have all the variables declared at the begining of the block but since it has nothing to do with the problem and the patch I'm moving it back for the next patch. sorry about that!
(In reply to Thiago Mendes from comment #7) > (In reply to Felipe Borges from comment #6) > > the one-liner version of the patch that just replaced the "return true" > > instruction by Source.REMOVE seem to work just fine in order to prevent the > > multiple calls of the callback, didn't it? > > Hi Felipe. If wait the info (undo) dialog to expire and do not click at the > close button the REMOVE works just file but if you do click on the close > button it does not because in that case the notification will still running > until it expires and when it does it will not call the close_function again > but it will run the dismiss() function for the second time and that is why I > followed the same approach boxes had for the ok_pressed but now for the > close_pressed. Ok, so a cleaner way to do it would be: notification_timeout_id = Timeout.add_seconds (this.timeout, () => { ... ... close_button.clicked.connect (() => { dismiss (); Source.remove (notification_timeout_id); }); What do you think?
(In reply to Thiago Mendes from comment #8) > (In reply to Felipe Borges from comment #5) > > Review of attachment 355879 [details] [review] [review] [review]: > > > > > The stop function has to be called before the delete function > > > > really? To "undefine" a domain isn't enough? > > the delete function was being called before the stop function and because of > that libvirt wasn't able to find the domain to stop it. > > Sorry because I lack on terms here (still reading about boxes) but what do > you mean by: "undefine" a domain? To "undefine" in libvirt is pretty much "delete", and the case where the machine is running should be handled already, see https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Virtualization_Deployment_and_Administration_Guide/sect-virsh-delete.html
(In reply to Felipe Borges from comment #9) > (In reply to Thiago Mendes from comment #7) > > (In reply to Felipe Borges from comment #6) > > > the one-liner version of the patch that just replaced the "return true" > > > instruction by Source.REMOVE seem to work just fine in order to prevent the > > > multiple calls of the callback, didn't it? > > > > Hi Felipe. If wait the info (undo) dialog to expire and do not click at the > > close button the REMOVE works just file but if you do click on the close > > button it does not because in that case the notification will still running > > until it expires and when it does it will not call the close_function again > > but it will run the dismiss() function for the second time and that is why I > > followed the same approach boxes had for the ok_pressed but now for the > > close_pressed. > > Ok, so a cleaner way to do it would be: > > notification_timeout_id = Timeout.add_seconds (this.timeout, () => { > ... > ... > > close_button.clicked.connect (() => { > dismiss (); > > Source.remove (notification_timeout_id); > }); > > What do you think? I was looking for that solution Felipe. Don't really like the other way. Thanks a lot :D...gonna test as soon as I'm back to my computer!
Created attachment 356068 [details] [review] Remove Signal Timeout
Created attachment 356069 [details] [review] To delete a domain
Review of attachment 356068 [details] [review]: Thanks for your patch. lgtm
Comment on attachment 356068 [details] [review] Remove Signal Timeout I took the liberty to took the patch a bit and push it.
Review of attachment 356069 [details] [review]: domain.delete () will stop the domain before it gets deleted, won't it? ::: src/libvirt-machine.vala @@ +492,3 @@ + if (storage_volume != null) { + try { + storage_volume.delete (0); Can you elaborate more about why do you think this change is necessary?
(In reply to Felipe Borges from comment #16) > Review of attachment 356069 [details] [review] [review]: > > domain.delete () will stop the domain before it gets deleted, won't it? Asked this question at the #virt channel and people told me it won't! delete is the same as virsh undefined > > ::: src/libvirt-machine.vala > @@ +492,3 @@ > + if (storage_volume != null) { > + try { > + storage_volume.delete (0); > > Can you elaborate more about why do you think this change is necessary? What do you mean by that? Do you mean why did I move it to the try block?
(In reply to Thiago Mendes from comment #17) > Asked this question at the #virt channel and people told me it won't! delete > is the same as virsh undefined According to https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Virtualization_Deployment_and_Administration_Guide/sect-virsh-delete.html "If domain is inactive, the configuration is removed completely. If the domain is active (running), it is converted to a transient domain. When the guest virtual machine becomes inactive, the configuration is removed completely."
-- GitLab Migration Automatic Message -- This bug has been migrated to GNOME's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/gnome-boxes/issues/137.