After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 683599 - Rework void pointer field assignments on boxed types
Rework void pointer field assignments on boxed types
Status: RESOLVED FIXED
Product: pygobject
Classification: Bindings
Component: gobject
unspecified
Other Linux
: Normal normal
: GNOME 3.8
Assigned To: Simon Feltman
Python bindings maintainers
Depends on:
Blocks: 682933 699435
 
 
Reported: 2012-09-07 23:11 UTC by Simon Feltman
Modified: 2013-05-02 02:11 UTC
See Also:
GNOME target: ---
GNOME version: 3.7/3.8


Attachments
Add deprecation warning when setting gpointers to anything other than int (1.34 KB, patch)
2012-09-14 04:02 UTC, Simon Feltman
none Details | Review
Changed UserWarning to RuntimeWarning (1.34 KB, patch)
2012-09-14 04:45 UTC, Simon Feltman
committed Details | Review
Clean up deprecation message for assigning gpointers to objects (1.24 KB, patch)
2012-09-20 02:15 UTC, Simon Feltman
committed Details | Review
Deprecate void pointer fields as general PyObject storage. (9.45 KB, patch)
2012-10-14 22:02 UTC, Simon Feltman
none Details | Review
Rebased patch (9.33 KB, patch)
2012-10-16 22:33 UTC, Simon Feltman
committed Details | Review
Deprecate void pointer fields as general PyObject storage. (9.33 KB, patch)
2012-10-23 10:32 UTC, Simon Feltman
committed Details | Review

Description Simon Feltman 2012-09-07 23:11:29 UTC
Currently when assigning a void pointer field on a boxed type, the previously held value will be blindly decrefed as a python object. This is followed by the pointer of the python object being assigned and stored on the void pointer and increfed. Essentially, pygobject attempts to treat assignment and retrieval of void pointers as python object storage. This somewhat works, however, there are no callbacks for when the boxed types are freed, copied or destroyed to help manage the reference count of these python object pointers being held. It can be difficult to impossible avoid reference leaks and also makes it possible for crashes with this strategy.

Ideally it would be nice if we did not allow void pointers as general python object storage and instead restrict it to direct int values. Only objects which can convert to an int would be assignable and only the converted int value stored. This is essentially how ctypes deals with void pointers, any object which cannot convert to int will raise an exception upon assignment. The solution for referencing general objects is then left up to the application or layer on top of pygobject and removes pygobject from trying to make a general assumption about how to deal with python objects stored on void pointers which will surely be wrong some of the time.

This mostly seems to apply to GtkTreeIters as there are three "user_data" void pointer fields which can currently be used for general storage of python object pointers. To utilize this with the proposed strategy, GtkTreeModel implementations can manage a pool of user_data which has been assigned to the void pointer fields. The fields themself would hold the id() of the python object and the pool would be a dict using the id as the key for retrieving the actual object. This removes potential pitfalls in regards to reference leaks and crashes.

I realize the current behavior is a relatively new submission to pygobject so it might not be that difficult to change if people are in agreement. But we could also deprecate the current behavior by type checking assignments to void pointers and if the input is not convertible to an int, issue a deprecation warning.

To work around this I must use ctypes to modify the user_data fields directly as int values so that the pointers to python int objects are not assigned and ref counted. I think this should be the rule not the exception.
Comment 1 Simon Feltman 2012-09-07 23:15:47 UTC
Cédric,
I've added you to the CC list as it would be great to get your feedback to my specific comments as well as hear about your use case in regards to the original bug you filed and patched (https://bugzilla.gnome.org/show_bug.cgi?id=668356)

Thanks
Comment 2 Simon Feltman 2012-09-14 04:02:39 UTC
Created attachment 224299 [details] [review]
Add deprecation warning when setting gpointers to anything other than int

This is a first pass which does not change anything except add a warning when anything other than an int is set on a gpointer on a boxed type.

I opted to use the python warning system instead of g_warning as it does not repeat the same warnings.

It would be great to get another devs input on this as a whole and if possible get the deprecation warning into the upcomming release if others agree with it?
Comment 3 Simon Feltman 2012-09-14 04:45:44 UTC
Created attachment 224300 [details] [review]
Changed UserWarning to RuntimeWarning

Changed to RuntimeWarning as it seems a better fit. I would have preferred DeprecationWarning but for some reason python stopped printing these by default in Python 2.7 which is not what I would expect.
Comment 4 Martin Pitt 2012-09-17 06:46:40 UTC
Comment on attachment 224300 [details] [review]
Changed UserWarning to RuntimeWarning

Yes, this makes sense to me. At this point of the release cycle, the runtime warning avoids breaking behaviour and is not intrusive, so please go ahead and commit.
Comment 5 Simon Feltman 2012-09-17 09:26:19 UTC
Re-opened as the submitted patch was just the first part of the deprecation.
Comment 6 Benny Malengier 2012-09-19 13:33:40 UTC
We hit this deprecation warning in our application. Our database keys we store in the userdata of a TreeIter, for use in a TreeView/ListView. The database keys are strings, not integer (bsddb database keys). 

Is it not possible to allow for string? Limiting to int is problematic, how can we store the database key in the TreeIter then? Or some other workaround for TreeIter?
Comment 7 Simon Feltman 2012-09-19 20:29:32 UTC
Hi Benny,

I assume you are also implementing a custom TreeModel? It is a little extra work but it can be done using a pool of user_data localized to the model. You can use accessor methods on the model to get and set the data properly. I know it can be a drag to hit a deprecation like this, but it is the only way to avoid reference leaks and potential instability.

class CustomModel(GObject.GObject, Gtk.TreeModel):
    def __init__(self):
        super(CustomModel, self).__init__()
        self.pool = {}

    def set_user_data(self, it, user_data):
        data_id = id(user_data)
        it.user_data = data_id
        self.pool[data_id] = user_data

    def get_user_data(self, it):
        return self.pool[it.user_data]

    def create_iter(self, user_data):
        it = Gtk.TreeIter()
        self.set_user_data(it, user_data)
        return it

Any time you create an iter, get, or set its data you would use these accessor methods. There is also a "GenericTreeModel" implementation in the works which will take care of this all for you transparently and is compatible with what was available in pygtk. I will submit an additional patch for this later today if you are interested in trying it out: 
https://bugzilla.gnome.org/show_bug.cgi?id=682933
Comment 8 Benny Malengier 2012-09-19 21:07:22 UTC
Hi Simon.
We are already fully ported to GTK3, not released yet, so going back to a GenericTreeModel would be quite some work. We want to support pygobject 3.3.2 anyway. 
I immediately thought about using a dictionary as workaround, but obviously we use a custom treemodel because the liststore and such are (were) too slow when last tested out (we want to support somewhat reasonable performance on 10000+ lists). Everything extra that can be avoided is then a plus off course.

We trow away the iters now, only need to find the path, so for a flat list the user data could probably be the path I suppose. We already have a map from database key to path anyway. 
For treeviews, the path could be used to construct an integer, if p is number larger than biggest path number, then for path (a,b,c), use integer a p^2+ bp+c. I'm not into the C code of GTK, but looking at it like that, a bit strange the iter could be used to save the path integer.
Comment 9 Simon Feltman 2012-09-19 23:08:12 UTC
So it sounds like you can get by using integers? There are also a few more options. If you only have a maximum depth of three (a,b,c) for the path, you could store each of those individually on the iter as there are three user_data fields (additionally user_data2 and user_data3).

Another approach would be to manage assignments of raw pointers to the user data manually using ctypes. This puts the burden of leaking refs on the application developer. I have some examples of doing this if you are interest, actually the GenericTreeModel implementation will need to do this to work around the current behavior that is being deprecated.

If you are trying to release soon and don't want to change too much in the immediate term, assignment of any object type still works it just prints the warning. You could suppress the warning globally using something like:
>>> import warnings
>>> warnings.filterwarnings('ignore', 'Usage of gpointers.*', RuntimeWarning)

For more info, see: http://docs.python.org/library/warnings.html
Comment 10 Simon Feltman 2012-09-20 02:15:11 UTC
Created attachment 224798 [details] [review]
Clean up deprecation message for assigning gpointers to objects

The previous deprecation message was worded as if the deprecation had already occurred and it has not. Rather it should just be a warning that the deprecation will occur in the future.
Comment 11 Martin Pitt 2012-09-20 04:39:07 UTC
Comment on attachment 224798 [details] [review]
Clean up deprecation message for assigning gpointers to objects

Thanks! Looks fine.
Comment 12 Benny Malengier 2012-09-20 08:19:59 UTC
Ok, I will see how converting to int works out, and if problems, post back. I don't see any leaks at the moment using strings though (with gc module). Perhaps gc does not show this issue ...

As to GenericTreeModel, I don't see the point of it. Working with a class that inherits TreeModel works perfectly, so what would be the benefit of GenericTreeModel. In our application, the TreeModel implementations are:

http://gramps.svn.sourceforge.net/viewvc/gramps/trunk/src/gui/views/treemodels/flatbasemodel.py?view=log
and
http://gramps.svn.sourceforge.net/viewvc/gramps/trunk/src/gui/views/treemodels/treebasemodel.py?view=log

Now that in python TreeModel can be inherited, an extra layer of GenericTreeModel seems just another piece of code to maintain. I'm actually already worried about the fact that overrides/Gtk.py already contains so much code. Things start to divert from what you would expect from looking at the C code documentation.
Comment 13 Simon Feltman 2012-09-20 10:05:32 UTC
Any assignment to TreeIter.user_data will increment the count of the object being assigned without a decref when the iter becomes unused by the Gtk internals. How were you using gc to analyse this?

A brief look through the code and it looks like only the FlatBaseModel will need modification? The TreeBaseModel already uses the id() of the node for the user_data so that should already be working without warning. As for FlatBaseModel it seems like using the index/path instead of the handle for the user_data should work.

The goal of GenericTreeModel is to help porting from pygtk with an implementation that contains better reference management than the original. It will also help in the case of custom models which use objects as iterators without having to think about id/object mapping. For the case of the models I just looked at, it doesn't make much sense as they are already work with TreeModel and they already manage custom mappings.
Comment 14 Benny Malengier 2012-09-20 10:16:23 UTC
Simon, 
gc.garbage
We have a built in tool to analyse leaks:
http://gramps.svn.sourceforge.net/viewvc/gramps/trunk/src/plugins/tool/leak.py?revision=20424&view=markup

See the display method there. However, looking now at the docs of gc, only objects with a __del__, so I suppose pure strings are indeed not shown.
Comment 15 Martin Pitt 2012-09-24 07:52:41 UTC
Comment on attachment 224798 [details] [review]
Clean up deprecation message for assigning gpointers to objects

Committed the cleaned up deprecation warning (will be in 3.4.1).
Comment 16 Simon Feltman 2012-10-14 22:02:50 UTC
Created attachment 226427 [details] [review]
Deprecate void pointer fields as general PyObject storage.

Complete deprecation of gpointer struct fields as general
PyObject storage. Only int types are now allowed.
Assignment of anything other than an int or None raises
a TypeError stating the error and  associated bug URL.
Comment 17 Simon Feltman 2012-10-16 22:33:45 UTC
Created attachment 226592 [details] [review]
Rebased patch

Complete deprecation of gpointer struct fields as general
PyObject storage. Only int types are now allowed.
Assignment of anything other than an int or None raises
a TypeError stating the error and  associated bug URL.
Comment 18 Martin Pitt 2012-10-23 07:32:16 UTC
Comment on attachment 226592 [details] [review]
Rebased patch

Looks good to me, thanks! Please get it in now, so that we can test it during the whole 3.7.x period.
Comment 19 Simon Feltman 2012-10-23 10:32:08 UTC
The following fix has been pushed:
326218a Deprecate void pointer fields as general PyObject storage.
Comment 20 Simon Feltman 2012-10-23 10:32:13 UTC
Created attachment 227046 [details] [review]
Deprecate void pointer fields as general PyObject storage.

Complete deprecation of gpointer struct fields as general
PyObject storage. Only int types are now allowed.
Assignment of anything other than an int or None raises
a TypeError stating the error and  associated bug URL.