GNOME Bugzilla – Bug 653151
seg fault after get_value in GTK, unable to inherit Gtk.TreeModel
Last modified: 2012-06-01 10:37:57 UTC
Created attachment 190411 [details] example code of inheriting Gtk.TreeModel Inheriting from Gtk.TreeModel in python seems to work up to a certain point. As soon as get_value has returned, a segmentation error occurs on my box. I would foremost be interested if this is due to my install or not, so if somebody could just run the python file in attachment to see if get_value can be passed, that would be great. This is a blocker for porting Gramps to GTK3, as the different views into the database are implemented using custom Treemodels. The file treeviewgramps.py in attachment is my trial code showing the error and printing what is called to terminal. The code is due to this bug only debugged up to the point where do_get_value is called. Any pointers on how to proceed to port this part of my application to GTK3 are welcome.
Created attachment 190447 [details] backtrace with gdb Uploaded backtrace. The error is in type check in setting the attribute of a cell renderer for the column. I'll investigate my column building code.
The error is in if (node->data && NODE_REFCOUNT (node) > 0 && node->data->common.value_table->value_init) of the function type_check_is_value_type_U, gtype.c I see node is a TypeNode, and that data there is volatile, so I assume the pointer is no longer valid, so the test node->data is not catching this?? I fail to see how it can become invalid, I'm afraid my gobject C knowledge only goes that far. My guess would be that for ListStore the caller apply_cell_attributes in http://git.gnome.org/browse/gtk+/tree/gtk/gtkcellarea.c?id=3.0.8 sets GValue value = { 0, }; and ListStore get_value being pure C just updates this, while my python code creates a new GValue in python which the introspection layer then must convert and something goes wrongs there.... Hope somebody has time to look at this or my 1Mb patch for gtk3 upgrade is worthless :-(
Is it possible that the annotation for gtk_tree_model_get_value is not ok: * @tree_model: a #GtkTreeModel * @iter: the #GtkTreeIter * @column: the column to lookup the value at * @value: (out) (transfer none): an empty #GValue to set Should @value really be out, transfer none? If I look at how it is used in gtk_list_store_get_value, the GValue obtained is filled with data, no care is given about passing data of a gtktreedatalist. Nevertheless, g_value_unset will be called immediately after use, eg in gtkcellarea. Where and how can I find how the conversion happens between python call and gobject? Is that http://git.gnome.org/browse/pygobject/tree/gi/pygi-argument.c ? If so, the release code contains around line 1977: if ( (direction == GI_DIRECTION_IN && transfer == GI_TRANSFER_NOTHING) || (direction == GI_DIRECTION_OUT && transfer != GI_TRANSFER_NOTHING)) { g_slice_free (GValue, value); } Which would mean the GValue is freed, causing possibly this crash?
Annotation is correct. Those are out caller-allocated, which means the caller allocates the storage for the out value. We do that automatically and then return the actual value, freeing the GValue. There is a chance we get it wrong somewhere. I am currently tracking a similar issue down in my invoke-rewrite branch which uses different code for the same concept.
Comment on attachment 190411 [details] example code of inheriting Gtk.TreeModel Some comments inline: >#------------------------------------------------------------------------- ># ># FlatBaseModel ># >#------------------------------------------------------------------------- >##TODO GTK3: no more GenericTreemodel to inherit from! >class FlatBaseModel(GObject.GObject, Gtk.TreeModel): Shouldn't that be GObject.Object? > """ > The base class for all flat treeview models. > It keeps a FlatNodeMap, and obtains data from database as needed > """ > __gtype_name__ = 'FlatBaseModel' > > fmap = [ > lambda x: x.is_fixed, > lambda x: x.number, > lambda x: x.severity, > lambda x: x.description, > lambda x: 0, > lambda x: 'add', > lambda x: False, > lambda x: False, > ] > > COLTYPE = [ > GObject.TYPE_BOOLEAN, > GObject.TYPE_INT, > GObject.TYPE_STRING, > GObject.TYPE_STRING, > GObject.TYPE_INT, > GObject.TYPE_STRING, > GObject.TYPE_BOOLEAN, > GObject.TYPE_BOOLEAN > ] > > def __init__(self): > cput = time.clock() > GObject.GObject.__init__(self) I'm not sure but you may need to initialize TreeModel also even if it is just an interface since Python doesn't have interfaces. Better yet, just use super to handle the inheritance.
I'll try the changes to my test file this evening when back at my devel machine. If I remember correctly, doing init on TreeModel gives a not implemented error, but I did not use super.
Object and super give the same segfault. I tried to install latest pygobject 2.28.6 so as to have a stab at seeing where the passing of the GValue argument goes wrong, but failed with problems cannot install `_glib.la' to a directory not ending in /usr/local/lib/python2.7/site-packages/glib If I install in default location without --prefix, the default libs take precedence.
Created attachment 205634 [details] shorter test file that gives seg fault Ok, based on a recent mail of OLPC ( http://mail.gnome.org/archives/python-hackers-list/2011-December/msg00010.html ) I created a much shorter test code that also gives the seg fault on my box. First, it seems I can't get values to pass to the model. If you run the code you see the output: do get_value <GtkTreeIter at 0x2459320> 0 /usr/lib/python2.7/dist-packages/gi/types.py:43: Warning: unable to set property `text' of type `gchararray' from value of type `(null)' So perhaps this later on causes issues. You see that otherwise the code follows quite nicely up to the segfault.
Created attachment 205802 [details] cleaned up the short test file to have correct return values
Created attachment 206178 [details] Minimal test case I also have a segfault with do_get_value. The python script is a really short testcase. But the backtrace is different (I use debian debug packages so line number may be a bit different)
+ Trace 229521
Can someone of the pygobject hackers at least run the short example code and put this bug finally on Confirmed?
I confirm that Benny's test program in comment 9 still triggers the segfault. @Nicolas: With your script and pygobject 3.1.0 I get ** WARNING **: (/build/buildd/pygobject-3.1.0/gi/pygi-argument.c:717):_pygi_argument_from_object: runtime check failed: (transfer == GI_TRANSFER_NOTHING) Traceback (most recent call last):
+ Trace 229671
iter_.user_data = t
and your backtrace. Accessing gpointer struct values was recently fixed in trunk: http://git.gnome.org/browse/pygobject/commit/?id=4aeb27efc43e131de5d0bc so in jhbuild I now get $ jhbuild run python do_get_value_bug.py do_get_iter 0 140260265615752 do_get_value Traceback (most recent call last): File "do_get_value_bug.py", line 37, in <module> main() File "do_get_value_bug.py", line 33, in main print store.get_value(it, 0) File "/home/martin-scratch/gnome/lib/python2.7/site-packages/gi/types.py", line 43, in function return info.invoke(*args, **kwargs) TypeError: unknown type (null) and no segfault any more. This looks like a different bug than Benny's, so it would be less confusing if you could report this separately. Thanks!
I changed the test case by Benny a little bit by putting the following code into do_get_value method: value_container = GObject.Value() value_container.init(GObject.TYPE_STRING) value_container.set_string('test') return value_container If you first set a break point at gtk_tree_model_get_value and once that has been reached at g_value_set_string you will see something similar to Breakpoint 2, g_value_set_string (value=0xad9100, v_string=0xb0b770 "test") at gvaluetypes.c:1035 Here, 'value' will eventually look like $2 = {g_type = 64, data = {{v_int = 11581328, v_uint = 11581328, v_long = 11581328, v_ulong = 11581328, v_int64 = 11581328, v_uint64 = 11581328, v_float = 1.62288971e-38, v_double = 5.7219362980193122e-317, v_pointer = 0xb0b790}, {v_int = 0, v_uint = 0, v_long = 0, v_ulong = 0, v_int64 = 0, v_uint64 = 0, v_float = 0, v_double = 0, v_pointer = 0x0}}} after continuing execution you will reach the SIGSEGV in type_check_is_value_type_U. After jumping to frame #3 apply_cell_attributes you can print the GValue that was obtained by gtk_tree_model_get_value: $3 = {g_type = 11374816, data = {{v_int = 0, v_uint = 0, v_long = 0, v_ulong = 0, v_int64 = 0, v_uint64 = 0, v_float = 0, v_double = 0, v_pointer = 0x0}, {v_int = 0, v_uint = 0, v_long = 0, v_ulong = 0, v_int64 = 0, v_uint64 = 0, v_float = 0, v_double = 0, v_pointer = 0x0}}} Obviously, that is not the GValue that was initially created. In fact, the correct GValue is located at address 0xad9100 = 11374816 which is used as the g_type here. My guess is that we are dereferencing the GValue pointer which we shouldn't. Where exactly this is happening needs more investigation, though.
I did some more digging. The wrong value gets assigned in _pygi_closure_assign_pyobj_to_out_argument (default case). arg.v_pointer points to the correct GValue. The function is called in _pygi_closure_assign_pyobj_to_out_argument where the void pointer of GIArgument.v_pointer is provided as argument. I'm not familiar with this part of the code, we might have to ping J5.
I had a look at this and it is pretty in depth. I think the issue is that caller allocates hasn't been tested going the other way and we might be reassigning with a new gvalue or copying the same gvalue into itself and then freeing the gvalue we think we caller allocated. I don't have time to look into it until the weekend and even then I'm not sure if this is something I have time to figure out. In any case we shouldn't be returning anything from do_get_value. The caller allocated gvalue should be set in user code and we shouldn't be touching it in the pygobject code. If you want to make it transparent to the user you wouldn't even send in the gvalue. You would take the return object and marshal it to the gvalue based on its type. The user shouldn't have to interact with the gvalue directly. This should all be handled by the vfunc/callback closure code. The basic idea would be for any out parameter there should be a returned python object. When marshalling those objects if it is caller-allocated, you memcpy the actual value from the python object to the caller allocated structure. However if it is a caller-allocated gvalue (and this most likely is the same for variants though I don't know if we caller-allocate those), you must check the gvalue's type and correctly marshall the contents of the python object to that. I'm pretty sure we have a helper function for that in the closure code.
Created attachment 214260 [details] Updated testcase The testcase in comment 9 had a couple of gotchas that fooled GtkTreeModel itself, this updated testcase gets those out of the equation
Created attachment 214261 [details] [review] patch to fix the issue From what I investigated, the trouble seems to happen on structs passed by reference as out parameters, as _pygi_closure_assign_pyobj_to_out_argument() doesn't handle copying the result into the location passed by the native caller, the model gets fed with weirdly filled in GtkTreeIters and GValues. The attached patch fixes this issue.
Created attachment 214263 [details] a more complete testcase, implementing all vfuncs
This is good news indeed! In the meanwhile, I've started porting my program to Vala, so I don't really have the ability to test this anymore. Or more precisely, it'll take a while before I'll get the ability to test it again. :) My issue is/was very similar to Benny's, so I'm looking forward to seeing responses to this. No doubt, it'll be highly useful. I'm very grateful for your work, Mr. Garnacho!
If the fix can be in the autumn releases of the big distributions, then we can do a migrate to pygobject in the next full release in spring 2013. If it could be in the bug fix releases earlier, even better.
Thanks for the patch Carlos! Your test program does display correctly now for me. As well the one from Nicolas #c10 does not produce a TypeError anymore for get_value and does return the correct value.
I am currently trying to write a minimal test case for this using only the G-I test classes and GLib/GObject. The smallest reduction of the Gtk.TreeModel case that I can get to is: ----------- 8< --------------- from gi.repository import Gtk, GObject class Model(GObject.GObject, Gtk.TreeModel): def do_get_value(self, iter_, col_idx): print("do_get_value", iter_, col_idx) return 1 store = Model() print(store.get_value(Gtk.TreeIter(), 0)) ----------- 8< --------------- While working on that, I added a test which tests a caller-allocated GValue parameter in a C function: http://git.gnome.org/browse/gobject-introspection/commit/?id=784b60e85561a3b4a98e33df159a77ae2daef1c7 http://git.gnome.org/browse/pygobject/commit/?id=e1aaf4a48453be0e69e7f3a70a2e7a790871a4d2 This does not trigger this bug, as it is from C, not from Python (thus the other way round), but I think it cannot hurt to ensure that this keeps working as well.
I now committed an API to g-i's marshalling tests which matches this bug, i. e. a vfunc taking a caller-allocated GValue argument: http://git.gnome.org/browse/gobject-introspection/commit/?id=932bf2a700368e46b30b58ec9b6beb3ff09a081c With that, it's trivial to reproduce. I committed your patch together with a test case: http://git.gnome.org/browse/pygobject/commit/?id=853e6a71234ebd66af5a64dfb296e323c2c905a6 While I was at it, I also added some more checks for other types of vfuncs, as GIMarshallingTestsObject already provides them: http://git.gnome.org/browse/pygobject/commit/?id=bac9d526f6a9774821d1c9c0e7b35cc6db942975 Thank you for figuring this out!