GNOME Bugzilla – Bug 546802
Pygtk destroys cycle too early
Last modified: 2017-10-11 09:45:54 UTC
Please describe the problem: The following test case creates a reference cycle between a Gtk widget and an event handler. Python's GC believes the cycle is no longer externally referenced and destroys it. However, it is still referenced, and the event handler fails when Gtk invokes it. This occurs on Ubuntu hardy, which has Pygtk 2.12.1. Steps to reproduce: The following program reproduces the problem: import gc import gtk import gobject def make_hyperlink(): event_box = gtk.EventBox() def on_hyperlink_mapped(*args): event_box.window.set_cursor(gtk.gdk.Cursor(gtk.gdk.HAND2)) event_box.connect("map-event", on_hyperlink_mapped) return event_box window = gtk.Window() window.add(make_hyperlink()) gc.collect() window.show_all() # Need to enter the Gtk main loop to trigger this. # Can do gtk.main(), but the following will exit after triggering the exception. for i in range(10): gobject.main_context_default().iteration() Actual results: It gives the following output: Traceback (most recent call last):
+ Trace 204803
event_box.window.set_cursor(gtk.gdk.Cursor(gtk.gdk.HAND2))
Expected results: No traceback is expected. Does this happen every time? Yes. Other information: The traceback does not happen on Ubuntu feisty, which has Pygtk 2.10.4.
This is interesting, but I am more inclined to believe it to be a Python bug rather than a PyGObject one, pending further investigation.
My question was going to be: How does the C GObject keep a refcount on the Python GObject wrapper so that that wrapper (and the cycle it is involved in with the Python callback) does not get freed? I found that the answer is: It uses g_object_add_toggle_ref() so that PyGObject gets notified when the only remaining refcount to the GObject is the one it holds. However, g_object_add_toggle_ref() is only used conditionally. See the calls to pygobject_switch_to_toggle_ref(): the calls are only made if the PyGObject's instance dictionary is used, presumably on the grounds that if the PyGObject does not contain any state it can be discarded. (The GObject docs say "you should only ever use a toggle reference if there is important state in the proxy object".) This logic does not take into account the Python callbacks, whose refcounts are owned by the PyGObject rather than by the underlying GObject. I can trigger a call to pygobject_switch_to_toggle_ref() by adding the line: event_box.x = 1 (which calls pygobject_setattro()) or event_box.__dict__ (which calls pygobject_get_dict()) So if I change make_hyperlink() to the following, the problem goes away: def make_hyperlink(): event_box = gtk.EventBox() event_box.__dict__ def on_hyperlink_mapped(*args): event_box.window.set_cursor(gtk.gdk.Cursor(gtk.gdk.HAND2)) event_box.connect("map-event", on_hyperlink_mapped) return event_box
So you discovered that the problem does not happen if the wrapper is using a toggle reference. This means only means that the GObject keeps a reference back to the wrapper. Without the toggle ref, the wrapper may die while the GObject stays alive, in case the wrapper refcount reaches zero. However, in this particular case the python function closure (fun_closure attribute in the function object) should be holding a reference to the wrapper, thus ensuring the wrapper stays alive even though the GObject does not reference it anymore. In my tests I printed the func_closure before and during invocation. A closure contains a list of "cells", each cell is a variable to bind in the function namespace. Before the invocation, the cells are all there and correct. But during invocation the problematic cell containing a reference to our object appears 'empty'. Why was the python function closure cell cleared I have no idea, but I know it shouldn't have happened. That's why I suspect a Python bug. I could be wrong, but it's the only explanation I have for the time being...
Ah.. I think I know where the problem is: pygobject_traverse. We are always declaring to the Python GC that the wrapper holds a reference to the GClosures. Then, when tp_traverse is being called on the wrapper, Python sees a reference cycle: wrapper => closure->callback => func_closures => wrapper However, the mistake here is that Python GC believes that by clearing wrapper, the python function object is unreffed as a consequence, but that is only true if the wrapper is holding the last reference to the object.
Created attachment 116163 [details] [review] patch (pygobject-2-14 branch) This should fix the problem.
This only goes into trunk right? Or should we really do a follow up 2.14 release?
I think it's far from critical bug. Not really worth making a 2.14 release for it IMHO.
I notice that your patch appears to be equivalent to a patch that Arjan J. Molenaar proposed for fixing Bug 92955 (see comments 6 and 7: http://bugzilla.gnome.org/show_bug.cgi?id=92955#c7). I was going to propose this myself until I discovered the use of the toggle refs mechanism, which appeared to be a replacement for looking at the GObject's refcount. As I understand it now (based on Bug 320428), the reason for not using toggle refs all the time is that it would prevent the object from being freed if it were wrapped by two sets of language bindings.
Gustavo: are you confident enough about the patch to include it into the trunk (would-be-2.16)?
I am confident about the patch core. The unit testing part could use more love though, as it opens a window which is temporarily visible.
Created attachment 117599 [details] another test I don't really understand this, but here is my attempt at a test. However, it fails weirdly both without and with your fix. Is that another bug?
(In reply to comment #11) > Created an attachment (id=117599) [edit] > another test > > I don't really understand this, but here is my attempt at a test. However, it > fails weirdly both without and with your fix. Is that another bug? > Please. First this isn't a unit test, it is not clear what conditions should be evaluated to check if it "passes" or not. Also you are using gio which had reference counting errors last time (a few weeks ago) I checked (python complains if you have a python built with --with-pydebug --without-pymalloc).
Created attachment 117640 [details] [review] better now?
Also, if cancellable (push|pop|get)_current are inherently buggy in our wrappers they must be either fixed or removed altogether.
About the patch itself: looks like the condition cannot change inside the loop, but I doubt compiler will spot this. Won't it be better to move it outside the loop then? Also, a comment would help, doesn't look like an obvious fix. Finally, a test for memory leaks (to be sure this patch doesn't suddenly make PyGObject leak lots of stuff) would be really helpful.
> This should fix the problem. Sadly no. I got Nokia to move this patch into the recent version of pymaemo. Here's what I'm getting though: If you create a closure with a lambda expression over 'self', the patch works great now. But if you create a closure with a lambda expression over a local variable, pygtk is still deleting the variable early, causing Bad Things to happen. This didn't happen in much earlier revisions. Example: <in a method> self.split.connect("destroy", lambda widget, data=None : setPersistenceData(name, self.split.get_position())) 'name' is a local variable holding a string. Inside the lambda, 'self' is now not getting released, but 'name' is still getting released, converting into <none>.
That's weird; the patch should be fixing both cases. Any chance you can come up with a minimal test program to reproduce the problem? Just to make sure we don't misunderstand the issue...
In fact the minimal test is fundamental because I at least have been unable to reproduce the problem based on your description.
(In reply to comment #18) > In fact the minimal test is fundamental because I at least have been unable to > reproduce the problem based on your description. A minimal test may be difficult to construct. I do have a two-file python application which exhibits the behavior but it'd be too long to post here. It's about 1800 lines. Too big to rummage through?
This bug has been causing problems for me as well. Did a fix ever go into pygobject? I seem to be hitting the bug on at least pygobject 2.14.
Also, I think this rates as higher than 'minor'. I'm looking at porting a lot of code that ran fine on an older pygobject and it uses closures freely.
I'm having the same issue in gobject 2.14-2 using the original syntax from Mark Seaborn. NameError: free variable 'event_box' referenced before assignment in enclosing scope
Is this still an issue? I tried the test code and it works fine here.
yes it is still an issue... I tested it today on a computer at work. - (debian, stable) - python 2.5.2 - gtk 2.14.2 Traceback (most recent call last):
+ Trace 223863
Thanks for testing. What version of pygobject is used? I have 2.21.1. Might have been fixed in later versions.
(In reply to comment #25) > Thanks for testing. What version of pygobject is used? I have 2.21.1. Might > have been fixed in later versions. That seems like the case. I am using an older version than yourself. In order to upgrade in debian, I believe I would need to move to unstable, it would be nice to have a patch instead as upgrading is just not possible with the current situation. debian: python-gobject 2.14.2-2 -python- In [1]: import gobject In [2]: gobject.pygobject_version Out[2]: (2, 14, 2)
If it is important to you I would file this downstream with Debian. They can add the patch to their packages and push to stable. It seems to be fixed in the current code base so I'm going to close the bug.
The problem still exists in pygobject 2.28.6 and 3.13.1. I don't seem to have access to reopen this bug; could someone please do it? The reproducer above is not quite correct: the map-event never fires at all, because it has not been added to the window's event mask. If window.add_events(gtk.gdk.SUBSTRUCTURE_MASK) is called immediately after creating the gtk.Window, the error will be triggered: Traceback (most recent call last):
+ Trace 233663
This thread from 2008 has more discussion and another reproducer: http://www.daa.com.au/pipermail/pygtk/2008-August/015767.html The patch in comment 13 fixes the problem for me. Here is a port of the reproducer to pygobject 3: === import gc from gi.repository import Gdk, GLib, Gtk def make_hyperlink(): event_box = Gtk.EventBox() def on_hyperlink_mapped(*args): event_box.get_window().set_cursor(Gdk.Cursor(Gdk.CursorType.HAND2)) event_box.connect("map-event", on_hyperlink_mapped) return event_box window = Gtk.Window() window.add_events(Gdk.EventMask.SUBSTRUCTURE_MASK) window.add(make_hyperlink()) gc.collect() window.show_all() # Need to enter the Gtk main loop to trigger this. # Can do Gtk.main(), but the following will exit after triggering the # exception. for i in range(10): GLib.main_context_default().iteration() === And here is a reproducer that doesn't use Python closures (based on the patch in comment 13): === import gc from gi.repository import Gio, GLib, Gtk def on_add(container, widget): print container print 'refcount', container.__grefcount__ def setup_frame(loop): frame = Gtk.Frame() # This works #frame.connect('add', on_add) # This should be equivalent, but fails frame.connect_object('add', on_add, frame) # This is a dummy object to avoid keeping a Python reference to the # frame (the only reference will be held from C). x = Gio.FileInfo() x.set_attribute_object('frame', frame) GLib.idle_add(lambda: x.get_attribute_object('frame').add(Gtk.Label(label='foo'))) GLib.idle_add(lambda: loop.quit()) #gc.set_debug(gc.DEBUG_OBJECTS | gc.DEBUG_COLLECTABLE) loop = GLib.MainLoop() setup_frame(loop) gc.collect() loop.run() ===