After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 92955 - gc.collect() destroys __dict__?
gc.collect() destroys __dict__?
Status: RESOLVED FIXED
Product: gnome-python
Classification: Deprecated
Component: general
1.99.x
Other Linux
: Normal normal
: ---
Assigned To: Python bindings maintainers
Python bindings maintainers
: 107697 (view as bug list)
Depends on:
Blocks: 693111
 
 
Reported: 2002-09-10 17:36 UTC by Thomas Leonard
Modified: 2013-02-03 21:40 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Test case exposing the "__dict__ is empty" bug. (4.39 KB, text/plain)
2003-02-10 07:34 UTC, Arjan J. Molenaar
  Details
Fix. (667 bytes, patch)
2003-02-10 20:16 UTC, Arjan J. Molenaar
none Details | Review
patch with my proposed solution (7.84 KB, patch)
2003-03-05 15:42 UTC, James Henstridge
none Details | Review
fixed patch (8.49 KB, patch)
2003-03-06 13:34 UTC, James Henstridge
none Details | Review
fix up the dealloc routine so we ignore the tp_free slot again (8.03 KB, patch)
2003-03-07 01:24 UTC, James Henstridge
none Details | Review

Description Thomas Leonard 2002-09-10 17:36:42 UTC
Using today's CVS version. This test prog should display a window
and print 42 when the window is closed.
This works, unless we do a gc.collect() in which case it prints an error
("AttributeError: 'T' object has no attribute 'a'").
This error happens in more complex programs without gc.collect().

import pygtk; pygtk.require('2.0')
import gtk
import gc

class T(gtk.Window):
	def __init__(self):
		gtk.Window.__init__(self)
		self.a = 42
		self.connect('destroy', self.destroy)

	def destroy(self, w):
		print self.a
		gtk.mainquit()

T().show()

print "gc:", gc.collect()

gtk.mainloop()
Comment 1 James Henstridge 2002-11-20 13:41:49 UTC
mass reassign of open pygtk and gnome-python bugs.
Comment 2 Johan (not receiving bugmail) Dahlin 2002-12-23 16:28:23 UTC
Instead of doing:

  T().show()

do:
  t = T()
  t.show()

T().show() seems to have the same effect as t = (); t.show(); del t.

Is this really a problem with PyGtk and not the way python handle
references to instances?
Comment 3 Thomas Leonard 2002-12-27 15:56:38 UTC
Obviously if you keep an extra reference to the object in a global
variable then you can hide the reference counting bug. But it should
be fixed; a reference to an object via a signal callback is just as
valid as any other.

Thanks,
Comment 4 Arjan J. Molenaar 2003-01-06 08:48:29 UTC
I have the same kind of problem (I think). It happens with my
DiaCanvas2  python bindings: when I create a new instance of an object
(a python object, which is inherited from a DiaCanvas2 class written
in C). I add some variables to the item and add the item to the canvas
(to which I hold a reference). I release the reference to the item
(but the item is referenced by the canvas). After a gc.collect() the
dictionary of the item seems empty.

Note that the item is still referenced by the canvas, so it's not
deleted itself. I'll try to catch the problem in a test case.
Comment 5 Arjan J. Molenaar 2003-02-10 07:34:32 UTC
Created attachment 14232 [details]
Test case exposing the "__dict__ is empty" bug.
Comment 6 Arjan J. Molenaar 2003-02-10 07:40:18 UTC
I've added a small test case which exposes the behavior. I've been
adding quite some documentation to the test case.

In short: it exposes itself when there are no (direct) references from
the python main environment (read: your code) to a PyGObject object
and the PyGObject owns a reference to your GObject.

The problem can be described as:
- we have a GObject 'win' and a GObject 'l', and their
  Python wrappers.
- 'win' and 'l' are connected to each other on a GObject level
  (such as gtk_container_add() for example).
- Now, we create a cyclic reference which includes 'l' (the GC
  is used to
  unwrap cyclic references). In this example this is done with
  the simple Python instance 'a' of class A.
- Now we can remove our references to 'l' and 'a'. Since 'l'
  references 'a' and visa versa, there is no need to need to
  add an extra reference on 'l'..
- The garbage collector examines our objects and sees 'l' and
  'a' being connected to each other and cyclic. No relation is
  found with class 'win'..
- 'l', 'a' and their dictionaries (l.__dict__) are set for
  garbage collection.
  Their 'tp_clear' functions are invoked:
  - 'l' clears its connected signals
  - 'a' has nothing to do
  - the dictionaries of both 'a' and 'l' are cleared (causing 'a'
    to be destroyed)

Possible solution:
- Do not traverse (tp_traverse) if the GObject has a refcount > 1.
  (note that the GC only clear cyclic references).
Comment 7 Arjan J. Molenaar 2003-02-10 20:16:44 UTC
Created attachment 14241 [details] [review]
Fix.
Comment 8 Arjan J. Molenaar 2003-02-10 20:19:05 UTC
... There is one 'but' about this approach: GObjects should not be
involved in cyclic references which are part GObject, part Python
objects... I also updated the test case, showing proper output.
Comment 9 James Henstridge 2003-03-04 04:37:10 UTC
Here is my take on the cycles caused by the example given in the
initial comment.  Lets call the unnamed instance 't':

Calling tp_traverse() on t after the T().show() call will traverse to
the following objects:
  - t's __dict__
  - the bound method object 't.destroy', which is used as a signal
    handler (code for this added in bug 71435).
  - an empty tuple as arguments to the signal.

Calling tp_traverse() on 't.destroy' will traverse to:
  - t (the instance this method is bound to)
  - T.destroy (the unbound method).

So the main cycle we have is t -> t.destroy -> t.

The following references to 't' are held:
  - t.destroy's "im_self" pointer

Because this reference isn't held, the hack in pygobject.c to keep the
python wrapper alive doesn't trigger, so the C level GObject doesn't
grab a reference.

When the cycle GC runs, it sees that the only references to 't' are
held by cycles, so calls the tp_clear() slot on 't', and other objects
implementing cycle GC that are only alive because of 't'.

This kills off the dict.  (I am a bit surprised it didn't kill the
signal handler connection too, as tp_clear() on 't' invalidates all
the watched closures).

Maybe we need to rethink the entire "single wrapper per GObject"
implementation so that it looks less like a big hack.  From talks with
Owen a few years back, he described how the Perl bindings handle this.
  - each LB wrapper holds a reference to its corresponding
    GObject.
  - each GObject holds a reference to its corresponding LB
    wrapper (through g_object_set_data())
  - maintain a linked list of Python wrappers
  - In a timeout hanlder, go through the list looking for wrappers
    with a reference count of 1 (which means that the ref is owned
    by the GObject).  If it finds such a wrapper, it breaks the link
    between the wrapper and GObject so that they can both be released.

This of course won't work quite right with Python cycle GC (it would
represent a cycle that it doesn't know about, so it would mean that
wrappers wouldn't be able to participate in cycle GC (however, it
would get rid of the hackish tp_dealloc implementation we currently have).

I wonder if it would be possible to inject knowledge that there is a
"wrapper -> gobject -> wrapper" cycle, but the gobject has other
references, into the Python cycle GC, as that would be the other way
to handle it.
Comment 10 James Henstridge 2003-03-05 11:16:40 UTC
Been thinking a bit more about this problem, and I think I have come
up with a possible solution.  It uses some of Arjan's ideas, and
should simplify things greatly.

1. since we want a single wrapper per GObject, it makes sense to have
   the wrapper hold a ref to the GObject and vice versa (creating a
   cycle).  We know the following:
     refcount(wrapper)==1 -> no references to wrapper other than
                             GObject.
     refcount(GObject)==1 -> no references to GObject other than
                             wrapper.
     refcount(wrapper)==refcount(GObject)==1 -> wrapper and object
                                                should be released.

2. As the GObject is not a PyObject, we need some way to feed
   information about references to the GObject into Python's cycle
   GC.  I think this can be represented by presence or absence of a
   direct cycle wrapper -> wrapper:
     if GObject has external references, don't add cycle
     if GObject has refcount==1, add cycle

   This way, the cycle GC will consider the reference the GObject
   holds on the wrapper a cycle if the GObject has no other
   references.

So the plan would be to:

- change things so that when setting up a wrapper for a GObject, the
  GObject gets a ref to the PyObject and vice versa
- cut all the wrapper resurrection stuff out of pygobject_dealloc()
- Add the following code to pygobject_traverse():
     if (self->obj->ref_count == 1)
         ret = visit(self, arg);
     if (ret != 0) return ret;
- Add code to pygobject_clear() to break the cycle between the
  GObject and wrapper.
- Change pygobject_new() so that it does the following:
     if the GObject has no wrapper associated with it:
         create the wrapper (with reference cycle), and
         associate it with the GObject.
     Incref the wrapper associated with the GObject and return it.

I'll need to think about this a bit more, but I think this should work
quite well, and be less likely to break with new versions of Python.

It does mean that PyGTK programs will require the cycle GC in order to
not leak, but I don't think that is a problem.  The default build for
Python 2.2.x is with the GC, and Python 2.3 will require the cycle GC
as well.
Comment 11 James Henstridge 2003-03-05 15:42:26 UTC
Created attachment 14788 [details] [review]
patch with my proposed solution
Comment 12 James Henstridge 2003-03-05 15:48:36 UTC
The above patch implements my proposed solution.  It would be useful
if some people want to test this.  I have only done a little testing,
but will do some more.

The patch reduces the number of lines of code, which is good:
 ChangeLog   |   15 +++++
 pygobject.c |  127 ++++++++++++++++++++------------------------------
 pygobject.h |    1
 3 files changed, 66 insertions(+), 77 deletions(-)

Some of Arjan's older unit tests will probably not pass with this
modification, but that is not necessarily a bad thing.  In the usual
case, the wrapper will have one extra reference.
Comment 13 James Henstridge 2003-03-06 02:44:48 UTC
Still debugging this one.  I am using a modified version of Thomas'
original example with a bit of instrumentation.  When the gc runs
(with debugging stats turned on), it prints:
  gc: collecting generation 2...
  gc: objects in each generation: 0 0 4364
  gc: done, 2 unreachable, 0 uncollectable.
The two objects are the wrapper and its instance dictionary.  Neither
is classed as uncollectable.

For some reason, it doesn't seem to be calling the tp_clear() on my
wrapper though, which means things aren't getting freed.  However, the
'a' attribute in the instance dict goes away, so the dictionary's
tp_clear() method must be getting called.

If I turn on DEBUG_SAVEALL, the instance finds its way into gc.garbage
list though.
Comment 14 James Henstridge 2003-03-06 05:55:27 UTC
*** Bug 107697 has been marked as a duplicate of this bug. ***
Comment 15 James Henstridge 2003-03-06 13:34:30 UTC
Created attachment 14812 [details] [review]
fixed patch
Comment 16 James Henstridge 2003-03-06 14:04:44 UTC
New patch that actually works.  pygobject_register_class() was not
propagating the tp_clear() implementation for subclasses of GObject. 
This is why the tp_clear() function was not being called :)

With the above patch, the garbage collector successfully freed the object.

One difference with this patch applied is that subclasses with __del__
methods won't get freed.  This is simply a limitation of the cycle GC.
 I am not sure how much of a problem this will be in practice though ...
Comment 17 Jon Trowbridge 2003-03-06 17:07:51 UTC
I'm getting segfaults when testing jamesh's patch on a largeish app
where we've seen the disappearing __dict__ bug.

The beginning of the (very long) backtrace looks like:
  • #0 chunk_free
    from /lib/libc.so.6
  • #1 free
    from /lib/libc.so.6
  • #2 pygobject_dealloc
    at pygobject.c line 230
  • #3 collect
    at Modules/gcmodule.c line 374
  • #4 collect_generations
    at Modules/gcmodule.c line 518
  • #5 _PyObject_GC_New
    at Modules/gcmodule.c line 879
  • #6 PyDict_New
    at Objects/dictobject.c line 161

I tried running under valgrind, but didn't get much more useful
information... just many, many warnings about an invalid free at the
point where the above segfault occurred.

The bit of code in pygobject_dealloc that is causing the problem
says:

    /* the following causes problems with subclassed types */
    if (self->ob_type->tp_free)
	self->ob_type->tp_free((PyObject *)self);
    else
	PyObject_GC_Del(self);

Since our app is full of subclassed objects, the comment seems to be
accurate. :)
Comment 18 James Henstridge 2003-03-07 01:24:31 UTC
Created attachment 14829 [details] [review]
fix up the dealloc routine so we ignore the tp_free slot again
Comment 19 James Henstridge 2003-03-07 01:28:41 UTC
The above patch reverts that small part of the patch.

John: I am not quite sure why it would cause problems for you unless
the tp_free implementation for your subclass was incorrect.  With the
Python cycle GC, an extra header is added to the start of container
objects, so  calling the wrong free function will cause problems. 
Maybe the function you had in the tp_free slot was doing the wrong
type of free?
Comment 20 Arjan J. Molenaar 2003-03-07 08:21:44 UTC
How can tp_free be overriden? There's no field tp_free created in the
TypeObject (by codegen.py). Is it default NULL then?
Comment 21 James Henstridge 2003-03-08 04:17:18 UTC
Checked the patch into CVS, so resolving this bug.
Comment 22 Jon Trowbridge 2003-03-08 14:47:41 UTC
I've tested the fix in CVS and it takes care of the problem I'd been
seeing.  Thanks James.