After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 303266 - uncollectable refount cycle
uncollectable refount cycle
Status: RESOLVED OBSOLETE
Product: pygobject
Classification: Bindings
Component: gobject
2.9.0
Other Linux
: Normal normal
: ---
Assigned To: Nobody's working on this now (help wanted and appreciated)
Python bindings maintainers
Depends on: 303482 303483
Blocks:
 
 
Reported: 2005-05-06 16:26 UTC by Benjamin Otte (Company)
Modified: 2012-02-10 08:18 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
consume all memory. Run with ./test.py bigimage (815 bytes, application/x-python)
2005-05-06 16:29 UTC, Benjamin Otte (Company)
  Details
diagram showing what is happening (1.91 KB, application/x-dia-diagram)
2005-05-06 22:49 UTC, Gustavo Carneiro
  Details
improved diagram showing why python cyclic GC fails (3.14 KB, application/x-dia-diagram)
2005-05-07 11:49 UTC, Gustavo Carneiro
  Details
improved diagram showing why python cyclic GC fails (PNG) (39.53 KB, image/png)
2005-05-07 11:50 UTC, Gustavo Carneiro
  Details
alternate test case (796 bytes, text/x-python)
2005-05-07 20:58 UTC, Gustavo Carneiro
  Details
a fix using g_object_traverse (1.82 KB, patch)
2005-05-08 17:28 UTC, Gustavo Carneiro
needs-work Details | Review
example of collectable GStreamer element (2.84 KB, application/x-python)
2005-05-10 10:48 UTC, Benjamin Otte (Company)
  Details
example of uncollectable GStreamer element (2.84 KB, application/x-python)
2005-05-10 10:50 UTC, Benjamin Otte (Company)
  Details
example of collectable GStreamer element (2.85 KB, text/plain)
2005-05-10 14:20 UTC, Benjamin Otte (Company)
  Details
adaptation of test program showing number and new objects (1.88 KB, text/plain)
2008-09-09 20:53 UTC, Paul Pogonyshev
  Details

Description Benjamin Otte (Company) 2005-05-06 16:26:42 UTC
The attached program shows an uncollectable cycle in pygtk. Whenever you click
the button, one more such cycle is added to the consumed memory. Feel free to
click the button until your whole computer happily swaps.
Comment 1 Benjamin Otte (Company) 2005-05-06 16:29:55 UTC
Created attachment 46098 [details]
consume all memory. Run with ./test.py bigimage

The cycle here is this:
Parent: glib-refs child
child: python-refs signal
signal: python-refs parent
Comment 2 Gustavo Carneiro 2005-05-06 22:06:42 UTC
Minor observation: python tells me it is leaking 12 python object refs per
button press.
Comment 3 Gustavo Carneiro 2005-05-06 22:21:22 UTC
The following patch fixes the memory leak.  Of course it is an absolutely
incorrect patch (may collect other objects too soon); it's just a small
experiment.  It confirms the theory that the cycle
python(parent)->gobject(parent)->gobject(child)->python(parent) is causing  the
gobject refcount on the child gobject to be == 2 (one from parent gobject,
another from python wrapper), and refcount == 2 makes the pygobject deceive the
python GC to stay alive, although in this case erroneously.

--- pygobject.c 27 Nov 2004 19:12:34 -0000      1.39
+++ pygobject.c 6 May 2005 22:14:25 -0000
@@ -506,7 +506,7 @@ pygobject_traverse(PyGObject *self, visi
        if (ret != 0) return ret;
     }

-    if (self->obj && self->obj->ref_count == 1)
+    if (self->obj && self->obj->ref_count <= 2)
        ret = visit((PyObject *)self, arg);
     if (ret != 0) return ret;
Comment 4 Gustavo Carneiro 2005-05-06 22:49:25 UTC
Created attachment 46111 [details]
diagram showing what is happening
Comment 5 Gustavo Carneiro 2005-05-06 23:40:53 UTC
One possible solution that comes to mind would be to, when creating a
PyGClosure, check if the callable is a python bound method, and if it is,
instead of keeping a reference to the bound method, keep a reference to the
corresponding unbound method plus a _weak reference_ to the method instance.
Comment 6 Gustavo Carneiro 2005-05-06 23:42:25 UTC
Bah.. never mind, it's a stupid idea.  It fixes this scenario, but introduces
other unwanted side-effects.  I need sleep :-P
Comment 7 Gustavo Carneiro 2005-05-07 11:49:49 UTC
Created attachment 46126 [details]
improved diagram showing why python cyclic GC fails
Comment 8 Gustavo Carneiro 2005-05-07 11:50:34 UTC
Created attachment 46127 [details]
improved diagram showing why python cyclic GC fails (PNG)
Comment 9 Gustavo Carneiro 2005-05-07 20:58:54 UTC
Created attachment 46145 [details]
alternate test case

A modified test case; it shows that the problem is not specific to signal
handlers.  In this case, the signal handler is a plain function, but a cycle is
created this time using a simple member variable pointing from the button to
the box.
Comment 10 John Ehresman 2005-05-08 04:59:12 UTC
I don't see the pygtk bug -- every gtk.Widget that is created must be destroyed
and the box is left undestroyed because it is removed from the window before the
window is destroyed.  Therefore it can leak memory.  This is unfortunate and can
lead to long debugging sessions to try and figure out where the leak is, but it
is not solvable without modifying the C gtk code to add something like Python's
cyclic GC.
Comment 11 Benjamin Otte (Company) 2005-05-08 07:10:56 UTC
No, this is not true. At least in the C code it is not necessary to call
destroy() at all.

Also note the documentation for gtk_object/widget_destroy, which says that this
function is only used to explicitly break references.
And requiring to explicitly remove references in Python sounds like a bad idea
to me.
Comment 12 Gustavo Carneiro 2005-05-08 13:23:47 UTC
John: Note that while it is true that GtkWindows should be destroy()ed, that's
only because gtk always holds the last reference to any GtkWindow, so unreffing
it from user code will not destroy it.  For any other widget, removing all
references causes the widget to be destroy()ed automatically.

Anyway, GtkWidget is only an example, but this could happen too for other
GObject types.  Any GObject that can contain/own other GObjects potentially
sufferes from this bug.  All you need is for the python wrapper for the
contained object to keep some reference to the python wrapper of the container
object.  The reason why destroy() in GtkWidget fixes the leak is because it 1)
destroys signal handlers, and 2) unlinks containers from its children.

I also agree we shouldn't make the programmer depend on some magic
destroy()-like method for GC to work.  For one thing, it's complicated enough
having to remember which objects need destroy() and which don't.  I think
gtk.Window() needing to be destroy()ed as an exception is acceptable, otherwise
it's too complicated for python.

I have not proposed any solution, but here are a couple of proposals:

  1. Stop relying on Python's cyclic GC, and instead try to use the new "toggle
references" being proposed now in glib;

  2. Try to push into GObject a new tp_traverse-like virtual method, so that a
GObject may report which child GObjects it "owns".

Unfortunately neither is easy to implement... :|
Comment 13 Gustavo Carneiro 2005-05-08 17:28:40 UTC
Created attachment 46176 [details] [review]
a fix using g_object_traverse
Comment 14 John Ehresman 2005-05-08 21:49:56 UTC
I agree it would be nice to solve this -- I've spent more time than I want to
remember getting my custom tree models to not leak.  My point was really that it
can't be done in pygtk alone and changes to gobject and many other places would
be required.  I don't think the toggle references will solve this; I believe
Owen identified this as one of the cases that toggle references would not solve
in his initial proposal.
Comment 15 James Henstridge 2005-05-09 09:50:06 UTC
Note that there are reference cycles in certain GTK widgets that are only broken
with gtk_object_destroy(), independent of the Python bindings.

The simplest forming with GtkMenuItem and GtkAccelLabel:
 1. the menu item holds a reference to the label contained inside
 2. accel labels hold a reference to the widget they are tracking the shortcut
    key combo for in ->accel_widget.

In the general case of a menu item holding an accel label that watches it, you
have a fairly tight reference loop that is only broken when gtk_object_destroy()
is called on one of the widgets.

If I write a C function that removes such a menu item, destroys the window and
retains no external references to the menu item, I'll have a leak too.

Expecting arbitrary GTK programs to be leak free without destroy() would
probably require adding functionality like the Python cycle GC, which seems out
of the scope for pygtk.
Comment 16 Gustavo Carneiro 2005-05-09 10:13:22 UTC
OK, but I'd rather have scenarios where leaks happen without destroy() as
exceptions and not the rule.

Although the test case in this bug report is gtk based, I think it was opened
because of a similar program in gst-python, right Benjamin?  Is there a
destroy() equivalent in the GStreamer objects that can fix the leak?
Comment 17 Benjamin Otte (Company) 2005-05-10 10:40:39 UTC
Yes, the problem originally comes from GStreamer. And no, GStreamer elements do
not have a destroy function, and it is not necessary to do anything but release
all references to GStreamer objects via g_object_unref.
You could start requiring to call g_object_run_dispose or similar on all
elements, but that doesn't sound like a good idea.

(GStreamer internals following)
I discovered the problem when setting pad functions like
gst_pad_set_link/chain/get/getcaps_function. These are implemented as closures
in gst-python, so the pad's PyGObject takes care of them. However, most of the
time you are going to set element functions on the pad, so you end up with a
cycle like this:
element =gref=> pad =pyref=> function =pyref=> element


On a sidenote, I'd consider refcount cycles in Gtk a bug and would always have
expected that one reference in that cycle is a weak reference. (In the example
case I'd have made the GtkAccelLabel weakref the accel_widget) Gtk should
probably prominently state that all widgets require an explicit destroy.
Comment 18 Benjamin Otte (Company) 2005-05-10 10:48:42 UTC
Created attachment 46276 [details]
example of collectable GStreamer element
Comment 19 Benjamin Otte (Company) 2005-05-10 10:50:53 UTC
Created attachment 46277 [details]
example of uncollectable GStreamer element

The attached files nicely show the difference between a collectable and an
uncollectable GStreamer element: Move the chain function out of the element
class. The elements are otherwise identical.
Comment 20 Benjamin Otte (Company) 2005-05-10 14:20:25 UTC
Created attachment 46289 [details]
example of collectable GStreamer element

Now here's the real one. And it's attached as text/plain so poor Epiphany users
don't get their security threatened.
Comment 21 Gustavo Carneiro 2005-07-10 11:03:48 UTC
Unfortunately the required GObjectClass API didn't make it to glib 2.8 API
freeze :-(

Benjamin, if you have any suggestions, even hacks, now is the time to bring them
forward :|  I guess later when/if the new glib API is accepted we can deprecate
any hacks we introduce now and use the real thing instead.
Comment 22 Johan (not receiving bugmail) Dahlin 2005-10-19 11:47:53 UTC
Comment on attachment 46098 [details]
consume all memory. Run with ./test.py bigimage

If you add a del self.pixbuf in the close() method this will stop leaking
memory.
Comment 23 Gustavo Carneiro 2005-10-19 12:17:14 UTC
The question is not whether or not memory can be freed by help from the
programmer.  The issue is that the memory should be freed _automatically_,
without resorting to del or destroy().  And the pixbuf is there only to
"amplify" the memory leak; removing the pixbuf doesn't mean there's no leak,
only that the ammount of memory leaked is too small be easily noticed by a
resource monitor.
Comment 24 Andy Wingo 2006-01-13 16:32:13 UTC
Ping?
Comment 25 Gustavo Carneiro 2006-01-14 17:00:35 UTC
Alas.. this needs API from GObject, that's why it hasn't been fixed yet :(
Of course we could also add a pygobject specific hack... letting 3rd party modules register 'traverse' functions..
Comment 26 John Ehresman 2006-01-14 17:43:04 UTC
Just my $.02, but it took quite a while to get the bugs out of Python's GC (and there still might be some remaining) where it is easier to traverse most objects can just their traverse dictionaries or lists.  A traverse for gobject would mean that nearly every object implementation would need a function that visited everything it holds a reference to.

The better solution may be to design api's that don't rely on ref counting to release resources, e.g. destroy in GtkObject.
Comment 27 Tim Janik 2006-10-02 12:14:30 UTC
the main point in having g_object_dispose() is to have a function at the very base of all object implementations to break all object references.
this was introduced precisely to help language bindings to break reference cycles. that's because implementing a traverse function that reveals all object references an object implemetnation uses was found to be much too hard to implement (especially with custom non-gtk objects and in the presence of *multiple* language bindings in the same process) and because relying solely on ref/unref simply isn't going to work to break cycles. i.e. Benjamin's assertion that in a cycle of references, one reference would always be weak isn't implementable in practice (and may even be undecidable in terms of which reference canonically had to be weak).

trying to clear/break/collect cycles by bypassing g_object_dispose() will definitely fail, because no single entity (libgobject, gtk, python bindings, scheme bindings, etc...) can track *all* references that possibly form cycles. so use g_object_dispose() rigorously to get your issues fixed (which is also why gtk always demands _destroy() which is just the GtkObject name for _dispose()).
Comment 28 Gustavo Carneiro 2006-10-02 12:40:34 UTC
GObject.dispose is not currently exposed in pygobject.  But if we were to wrap it, we'd also need to wrap GObjectClass->dispose, calling e.g. __gdispose__ in a python class.  Then, for each class we had in python that contains other objects, it would need to define a __gdispose__ which would call dispose() on each contained object reference.

I have no problem with that, but this starts to look a lot like C, not Python!  Even Python developers are talking about getting rid of __del__ in Python 3; should we be thinking of adding dispose/__gdispose__ in pygobject?

I wonder what do C#/Perl/lisp/whatever bindings do; do they wrap dispose?
Comment 29 Tim Janik 2006-10-02 12:44:26 UTC
well g_object_dispose() is usually used by just obejct implementations to get their destroy/destruct/delete/dispose/breakrefs/whateveryoucallit semantics straight. so for e.g. Gtk+, you'd not wrap and offer dispose, but destroy. For other object hierarchies this may vary, the respective hierarchy authors should know though.
Comment 30 Paul Pogonyshev 2008-09-09 20:53:44 UTC
Created attachment 118387 [details]
adaptation of test program showing number and new objects

According to this program, Python (2.5) leaks 5 objects per button press (6 new and garbage-collects 1 back).  I just enhanced attachment 46145 [details] a bit.
Comment 31 Martin Pitt 2012-02-10 08:18:42 UTC
codegen and the static bindings are obsolete since pygobject 3.0, which is GI only.

Thanks!