GNOME Bugzilla – Bug 624624
Closures are likely to cause memleaks
Last modified: 2018-05-28 11:47:56 UTC
Since closures will take a reference on "this" object if the lambda uses anything from the object, they are very likely to cause a memleak, consider this simple class: public class Foo : Object { public signal void sig (); private int my_i; public Foo (int i) { my_i = i; sig.connect (() => { message ("i was %d", i); }); } ~Foo () { debug ("Foo destroyed"); } } int main (string[] args) { var a = new Foo (0); return 0; } The closure in this example will have reference on the Foo object, which will cause it to not be destroyed. Vala already has the "unowned" keyword which helps with references on variables, so I think there should be a possibility to specify that the closure shouldn't take the reference somehow. One of the ways this could be done could be for example: sig.connect ((unowned Foo) => { ... });
*** Bug 630987 has been marked as a duplicate of this bug. ***
We should support something similar to g_signal_connect_object for closures, i.e., always use a weak reference to avoid reference cycles.
*** Bug 624652 has been marked as a duplicate of this bug. ***
I don't think changing closures to behave like g_signal_connect_object is a good idea, there really should be possibility for the developer to pick the desired behaviour - take for example g_idle_add function - in some most cases it's fine that it takes a reference on the object it operates on, otherwise it would be very crash-prone, but if you're using it in a way where you store the returned source id and manually remove the source in object destructor, it can't take a reference, otherwise the destructor will never be called.
*** Bug 639976 has been marked as a duplicate of this bug. ***
Bug #636782 is also a duplicate of this one (though "indirectly", through bug #639976).
Created attachment 191865 [details] [review] Incomplete patch Attaching an incomplete patch that tries to fix this issue, unfortunately the test case mentioned in the first post isn't fixed (but perhaps someone is able to complete it), though it does fix code like this > class Foo: Object { construct { bar = new Bar (); int i = 10; bar.notify["prop2"].connect (() => { this.tell (); debug ("%d", i); }); } ~Foo () { debug ("Foo died"); } public Bar bar; public void tell () { debug ("Foo told"); } } class Bar: Object { public int prop2 { get { return 0; } set { debug ("prop2 set"); } } ~Bar () { debug ("Bar died"); } } int main () { var a = new Foo (); var b = a.bar; b.prop2 = 3; a = null; debug ("b still alive"); b.prop2 = 1; return 0; }
Created attachment 192932 [details] [review] Patch This patch finishes the former - closures no longer take an extra reference of the object itself, and moreover signal connections now use g_signal_connect_closure together with g_object_watch_closure to prevent signal invocations on dead objects.
Adding some comments from review done on IRC: <juergbi> mhr3_: this will also affect closures not used with signals which i don't think we want here <juergbi> mhr3_: the fix for the bug should be to handle closures the same way as regular instance delegates. the patch achieves that in the case of signal handlers but not in other cases, afaict (in other cases, nothing should be changed atm, iirc)
Now that bug #639054 is fixed, this may be fixable. Also, I'm moving bug #639976's blockers onto this bug (since we marked the former as a duplicate of this bug).
Created attachment 283853 [details] Test vala program I just ran into this bug and have a different approach. Consider this test program (attachment) similar to the earlier examples. As expected of this bug, the output is: > ** Message: test.vala:15: Foo.sig: x = 1 > ** Message: test.vala:29: Foo.sig: y = 2 The objects are never disposed because of the reference cycle created by the signal handler.
Created attachment 283854 [details] c code generated by test.vala
Created attachment 283855 [details] proposed code that should be generated to fix the bug My proposal to fix this bug is to have vala generate the code in test.c.fixed instead of what it currently generates is test.c.broken. Run them through diff to see what has changed. In the explanation below, handler-owner (bar and bar2 in test.vala) refers to the object that has a lambda expression that results in a closure in the generated code and signal-owner (foo in test.vala) refers to the object that has the signal that handler-owner is connected to. First, as required to break the reference cycle, I have got rid of the lines the ref and unref the hander-owner. Of course, the problem then is what if someone calls the signal on signal-owner after the hander-owner has been finalized? To take care of that, I an using g_object_weak_ref () to hook the finalization of hander-owner. Just before the hander-owner is finalized, it disconnects the signal handler. What about the case when the signal-owner is finalized first? We call g_object_weak_unref () so that when the handler-owner is finalized, we don't try to disconnect the signal from the already finalized signal-owner. Thoughts? Did I miss any cases? If this is an acceptable solution, how would I add this to valac (i.e. which files should I be looking at)?
I almost forgot, the output of the fixed version is: >** Message: block1_weak_notify >** Message: test.vala:19: Bar finalized >** Message: test.vala:29: Foo.sig: y = 2 >** Message: block2_weak_notify >** Message: test.vala:33: Bar2 finalized >** Message: test.vala:5: Foo finalized which is what you would expect.
Created attachment 369918 [details] [review] codegen: Stop taking explicit references on 'this' for captured blocks Use g_signal_connect_closure() and g_object_watch_closure() in GObject environments to control their life-cycles. Based on patch by Michal Hruby
I recently resurrected Geary's custom reference tracking and it shows we are leaking a lot of objects. I fixed a bunch that were our fault, but couldn't track down a number of them. After building master with this patch most of the remaining ones have gone away. I'll keep running it to see if there's any regressions, but it built fine and seems to be running fine. Would be very keen to see this get released!
Attachment 369918 [details] pushed as 75e61cf - codegen: Stop taking explicit references on 'this' for captured blocks
*** Bug 794985 has been marked as a duplicate of this bug. ***
I now have a segmentation fault on application shutdown in gnome-latex. The problem comes from valac, I've done a git bisect, the result is: ``` 75e61cfbadb3d98f44835665d25fa3b836cbceb5 is the first bad commit commit 75e61cfbadb3d98f44835665d25fa3b836cbceb5 Author: Michal Hruby <michal.mhr@gmail.com> Date: Tue Mar 20 16:50:14 2018 +0100 codegen: Stop taking explicit references on 'this' for captured blocks Use g_signal_connect_closure() and g_object_watch_closure() in GObject environments to control their life-cycles. https://bugzilla.gnome.org/show_bug.cgi?id=624624 ``` The gnome-latex bug, with a backtrace: https://gitlab.gnome.org/GNOME/gnome-latex/issues/57
It's probably a bug in gnome-latex that is now uncovered, sorry for the noise.
commit 75e61cfbadb3d98f44835665d25fa3b836cbceb5 causes several crashes in gnome-latex. It's difficult for me to know how to properly fix the issues when reading the Vala code, I don't know what's wrong. I'm sure it causes crashes in other apps, I don't think gnome-latex is an exception. Some Linux distributions like Debian re-compile the Vala code with valac, even if the tarball already contains the generated C code. So it's a recipe for disaster, this will maybe causes a lot of crashes in various Vala applications. It would be nice if valac detects the problems, and refuses to compile the code, showing an error message to explain what's wrong and how to fix the code.
I wanted to add: what about apps that are no longer maintained, or in low-maintenance mode? The developer will not have enough time/energy to investigate the problems. Or apps that have a different development cycle than Vala/GNOME. A new version of the app that fixes the crashes is not guaranteed to be released in time, before Debian re-compiles the Vala code…
@swilmet: Please use the issue tracker on gitlab: https://gitlab.gnome.org/GNOME/vala/issues