After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 653151 - seg fault after get_value in GTK, unable to inherit Gtk.TreeModel
seg fault after get_value in GTK, unable to inherit Gtk.TreeModel
Status: RESOLVED FIXED
Product: pygobject
Classification: Bindings
Component: introspection
unspecified
Other Linux
: Normal major
: ---
Assigned To: Nobody's working on this now (help wanted and appreciated)
Python bindings maintainers
Depends on:
Blocks:
 
 
Reported: 2011-06-22 07:46 UTC by Benny Malengier
Modified: 2012-06-01 10:37 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
example code of inheriting Gtk.TreeModel (30.82 KB, text/plain)
2011-06-22 07:46 UTC, Benny Malengier
  Details
backtrace with gdb (17.06 KB, text/plain)
2011-06-22 15:31 UTC, Benny Malengier
  Details
shorter test file that gives seg fault (2.67 KB, text/x-python)
2012-01-19 16:04 UTC, Benny Malengier
  Details
cleaned up the short test file to have correct return values (3.21 KB, text/x-python)
2012-01-22 11:02 UTC, Benny Malengier
  Details
Minimal test case (751 bytes, text/plain)
2012-01-26 12:39 UTC, Nicolas Évrard
  Details
Updated testcase (3.21 KB, text/plain)
2012-05-17 15:31 UTC, Carlos Garnacho
  Details
patch to fix the issue (1.49 KB, patch)
2012-05-17 15:38 UTC, Carlos Garnacho
none Details | Review
a more complete testcase, implementing all vfuncs (2.83 KB, text/plain)
2012-05-17 15:40 UTC, Carlos Garnacho
  Details

Description Benny Malengier 2011-06-22 07:46:35 UTC
Created attachment 190411 [details]
example code of inheriting Gtk.TreeModel

Inheriting from Gtk.TreeModel in python seems to work up to a certain point. 
As soon as get_value has returned, a segmentation error occurs on my box. 

I would foremost be interested if this is due to my install or not, so if somebody could just run the python file in attachment to see if get_value can be passed, that would be great.

This is a blocker for porting Gramps to GTK3, as the different views into the database are implemented using custom Treemodels.

The file treeviewgramps.py in attachment is my trial code showing the error and printing what is called to terminal. The code is due to this bug only debugged up to the point where do_get_value is called.

Any pointers on how to proceed to port this part of my application to GTK3 are welcome.
Comment 1 Benny Malengier 2011-06-22 15:31:48 UTC
Created attachment 190447 [details]
backtrace with gdb

Uploaded backtrace. The error is in type check in setting the attribute of a cell renderer for the column.
I'll investigate my column building code.
Comment 2 Benny Malengier 2011-06-22 20:05:47 UTC
The error is in 
      if (node->data && NODE_REFCOUNT (node) > 0 &&
	  node->data->common.value_table->value_init)
of the function type_check_is_value_type_U, gtype.c
I see node is a TypeNode, and that data there is volatile, so I assume the pointer is no longer valid, so the test node->data is not catching this?? 

I fail to see how it can become invalid, I'm afraid my gobject C knowledge only goes that far. My guess would be that for ListStore the caller apply_cell_attributes in http://git.gnome.org/browse/gtk+/tree/gtk/gtkcellarea.c?id=3.0.8 sets
GValue         value = { 0, };
and ListStore get_value being pure C just updates this, while my python code creates a new GValue in python which the introspection layer then must convert and something goes wrongs there.... 

Hope somebody has time to look at this or my 1Mb patch for gtk3 upgrade is worthless :-(
Comment 3 Benny Malengier 2011-06-23 14:47:39 UTC
Is it possible that the annotation for gtk_tree_model_get_value is not ok:
 * @tree_model: a #GtkTreeModel
 * @iter: the #GtkTreeIter
 * @column: the column to lookup the value at
 * @value: (out) (transfer none): an empty #GValue to set

Should @value really be out, transfer none? If I look at how it is used in gtk_list_store_get_value, the GValue obtained is filled with data, no care is given about passing data of a gtktreedatalist. 
Nevertheless, g_value_unset will be called immediately after use, eg in gtkcellarea.

Where and how can I find how the conversion happens between python call and gobject? Is that http://git.gnome.org/browse/pygobject/tree/gi/pygi-argument.c ?

If so, the release code contains around line 1977:

   if ( (direction == GI_DIRECTION_IN && transfer == GI_TRANSFER_NOTHING)
          || (direction == GI_DIRECTION_OUT && transfer != GI_TRANSFER_NOTHING)) {
       g_slice_free (GValue, value);
   }

Which would mean the GValue is freed, causing possibly this crash?
Comment 4 johnp 2011-06-24 04:54:46 UTC
Annotation is correct.  Those are out caller-allocated, which means the caller allocates the storage for the out value.  We do that automatically and then return the actual value, freeing the GValue.  There is a chance we get it wrong somewhere.  I am currently tracking a similar issue down in my invoke-rewrite branch which uses different code for the same concept.
Comment 5 johnp 2011-06-24 04:59:59 UTC
Comment on attachment 190411 [details]
example code of inheriting Gtk.TreeModel

Some comments inline:

>#-------------------------------------------------------------------------
>#
># FlatBaseModel
>#
>#-------------------------------------------------------------------------
>##TODO GTK3: no more GenericTreemodel to inherit from!
>class FlatBaseModel(GObject.GObject, Gtk.TreeModel):

Shouldn't that be GObject.Object?

>    """
>    The base class for all flat treeview models. 
>    It keeps a FlatNodeMap, and obtains data from database as needed
>    """
>    __gtype_name__ = 'FlatBaseModel'
>    
>    fmap = [
>        lambda x: x.is_fixed,
>        lambda x: x.number,
>        lambda x: x.severity,
>        lambda x: x.description,
>        lambda x: 0,
>        lambda x: 'add',
>        lambda x: False,
>        lambda x: False,
>        ]
>    
>    COLTYPE = [
>                GObject.TYPE_BOOLEAN,
>                GObject.TYPE_INT,
>                GObject.TYPE_STRING,
>                GObject.TYPE_STRING,
>                GObject.TYPE_INT,
>                GObject.TYPE_STRING,
>                GObject.TYPE_BOOLEAN,
>                GObject.TYPE_BOOLEAN
>                ]
>
>    def __init__(self):
>        cput = time.clock()
>        GObject.GObject.__init__(self)

I'm not sure but you may need to initialize TreeModel also even if it is just an interface since Python doesn't have interfaces.  Better yet, just use super to handle the inheritance.
Comment 6 Benny Malengier 2011-06-24 07:14:11 UTC
I'll try the changes to my test file this evening when back at my devel machine. If I remember correctly, doing init on TreeModel gives a not implemented error, but I did not use super.
Comment 7 Benny Malengier 2011-06-25 10:01:32 UTC
Object and super give the same segfault. 
I tried to install latest pygobject 2.28.6 so as to have a stab at seeing where the passing of the GValue argument goes wrong, but failed with problems 
cannot install `_glib.la' to a directory not ending in /usr/local/lib/python2.7/site-packages/glib
If I install in default location without --prefix, the default libs take precedence.
Comment 8 Benny Malengier 2012-01-19 16:04:26 UTC
Created attachment 205634 [details]
shorter test file that gives seg fault

Ok, based on a recent mail of OLPC ( http://mail.gnome.org/archives/python-hackers-list/2011-December/msg00010.html ) I created a much shorter test code that also gives the seg fault on my box. 

First, it seems I can't get values to pass to the model. If you run the code you see the output:

do get_value <GtkTreeIter at 0x2459320> 0
/usr/lib/python2.7/dist-packages/gi/types.py:43: Warning: unable to set property `text' of type `gchararray' from value of type `(null)'

So perhaps this later on causes issues. You see that otherwise the code follows quite nicely up to the segfault.
Comment 9 Benny Malengier 2012-01-22 11:02:40 UTC
Created attachment 205802 [details]
cleaned up the short test file to have correct return values
Comment 10 Nicolas Évrard 2012-01-26 12:39:17 UTC
Created attachment 206178 [details]
Minimal test case

I also have a segfault with do_get_value. The python script is a really short testcase.
But the backtrace is different (I use debian debug packages so line number may be a bit different)

  • #0 type_get_qdata_L
    at /tmp/buildd/glib2.0-2.30.2/./gobject/gtype.c line 3630
  • #1 g_type_get_qdata
    at /tmp/buildd/glib2.0-2.30.2/./gobject/gtype.c line 3670
  • #2 pyg_type_lookup
    at /home/nicoe/build/pygobject-3.0.3/gi/_gobject/pygtype.c line 630
  • #3 pyg_value_as_pyobject
    at /home/nicoe/build/pygobject-3.0.3/gi/_gobject/pygtype.c line 1150
  • #4 _pygi_marshal_to_py_interface_struct
    at /home/nicoe/build/pygobject-3.0.3/gi/pygi-marshal-to-py.c line 698
  • #5 _invoke_marshal_out_args
    at /home/nicoe/build/pygobject-3.0.3/gi/pygi-invoke.c line 572
  • #6 _wrap_g_callable_info_invoke
    at /home/nicoe/build/pygobject-3.0.3/gi/pygi-invoke.c line 645
  • #7 PyEval_EvalFrameEx
  • #8 PyEval_EvalCodeEx
  • #9 PyEval_EvalFrameEx
  • #10 PyEval_EvalFrameEx
  • #11 PyEval_EvalCodeEx
  • #12 PyRun_FileExFlags
  • #13 PyRun_SimpleFileExFlags
  • #14 Py_Main
  • #15 __libc_start_main
    from /lib/x86_64-linux-gnu/libc.so.6
  • #16 _start

Comment 11 Benny Malengier 2012-02-14 09:58:16 UTC
Can someone of the pygobject hackers at least run the short example code and put this bug finally on Confirmed?
Comment 12 Martin Pitt 2012-02-15 07:23:58 UTC
I confirm that Benny's test program in comment 9 still triggers the segfault.


@Nicolas: With your script and pygobject 3.1.0 I get

** WARNING **: (/build/buildd/pygobject-3.1.0/gi/pygi-argument.c:717):_pygi_argument_from_object: runtime check failed: (transfer == GI_TRANSFER_NOTHING)
Traceback (most recent call last):
  • File "do_get_value_bug.py", line 18 in do_get_iter
    iter_.user_data = t
RuntimeError: unable to set value for field

and your backtrace. Accessing gpointer struct values was recently fixed in trunk:
http://git.gnome.org/browse/pygobject/commit/?id=4aeb27efc43e131de5d0bc

so in jhbuild I now get

$ jhbuild run python do_get_value_bug.py 
do_get_iter 0 140260265615752
do_get_value
Traceback (most recent call last):
  File "do_get_value_bug.py", line 37, in <module>
    main()
  File "do_get_value_bug.py", line 33, in main
    print store.get_value(it, 0)
  File "/home/martin-scratch/gnome/lib/python2.7/site-packages/gi/types.py", line 43, in function
    return info.invoke(*args, **kwargs)
TypeError: unknown type (null)

and no segfault any more. This looks like a different bug than Benny's, so it would be less confusing if you could report this separately.

Thanks!
Comment 13 Sebastian Pölsterl 2012-02-17 20:29:30 UTC
I changed the test case by Benny a little bit by putting the following code into do_get_value method:

    value_container = GObject.Value()
    value_container.init(GObject.TYPE_STRING)
    value_container.set_string('test')

    return value_container

If you first set a break point at gtk_tree_model_get_value and once that has been reached at g_value_set_string you will see something similar to

Breakpoint 2, g_value_set_string (value=0xad9100, v_string=0xb0b770 "test")
    at gvaluetypes.c:1035

Here, 'value' will eventually look like

$2 = {g_type = 64, data = {{v_int = 11581328, v_uint = 11581328, 
      v_long = 11581328, v_ulong = 11581328, v_int64 = 11581328, 
      v_uint64 = 11581328, v_float = 1.62288971e-38, 
      v_double = 5.7219362980193122e-317, v_pointer = 0xb0b790}, {v_int = 0, 
      v_uint = 0, v_long = 0, v_ulong = 0, v_int64 = 0, v_uint64 = 0, 
      v_float = 0, v_double = 0, v_pointer = 0x0}}}

after continuing execution you will reach the SIGSEGV in type_check_is_value_type_U. After jumping to frame #3 apply_cell_attributes you can print the GValue that was obtained by gtk_tree_model_get_value:

$3 = {g_type = 11374816, data = {{v_int = 0, v_uint = 0, v_long = 0, 
      v_ulong = 0, v_int64 = 0, v_uint64 = 0, v_float = 0, v_double = 0, 
      v_pointer = 0x0}, {v_int = 0, v_uint = 0, v_long = 0, v_ulong = 0, 
      v_int64 = 0, v_uint64 = 0, v_float = 0, v_double = 0, v_pointer = 0x0}}}

Obviously, that is not the GValue that was initially created. In fact, the correct GValue is located at address 0xad9100 = 11374816 which is used as the g_type here. My guess is that we are dereferencing the GValue pointer which we shouldn't. Where exactly this is happening needs more investigation, though.
Comment 14 Sebastian Pölsterl 2012-02-17 23:37:11 UTC
I did some more digging. The wrong value gets assigned in _pygi_closure_assign_pyobj_to_out_argument (default case). arg.v_pointer points to the correct GValue. The function is called in _pygi_closure_assign_pyobj_to_out_argument where the void pointer of GIArgument.v_pointer is provided as argument. I'm not familiar with this part of the code, we might have to ping J5.
Comment 15 johnp 2012-02-20 18:01:40 UTC
I had a look at this and it is pretty in depth.  I think the issue is that caller allocates hasn't been tested going the other way and we might be reassigning with a new gvalue or copying the same gvalue into itself and then freeing the gvalue we think we caller allocated.  

I don't have time to look into it until the weekend and even then I'm not sure if this is something I have time to figure out.  In any case we shouldn't be returning anything from do_get_value.  The caller allocated gvalue should be set in user code and we shouldn't be touching it in the pygobject code.

If you want to make it transparent to the user you wouldn't even send in the gvalue.  You would take the return object and marshal it to the gvalue based on its type.  The user shouldn't have to interact with the gvalue directly.

This should all be handled by the vfunc/callback closure code.  The basic idea would be for any out parameter there should be a returned python object.  When marshalling those objects if it is caller-allocated, you memcpy the actual value from the python object to the caller allocated structure.  However if it is a caller-allocated gvalue (and this most likely is the same for variants though I don't know if we caller-allocate those), you must check the gvalue's type and correctly marshall the contents of the python object to that.  I'm pretty sure we have a helper function for that in the closure code.
Comment 16 Carlos Garnacho 2012-05-17 15:31:24 UTC
Created attachment 214260 [details]
Updated testcase

The testcase in comment 9 had a couple of gotchas that fooled GtkTreeModel itself, this updated testcase gets those out of the equation
Comment 17 Carlos Garnacho 2012-05-17 15:38:10 UTC
Created attachment 214261 [details] [review]
patch to fix the issue

From what I investigated, the trouble seems to happen on structs passed by reference as out parameters, as _pygi_closure_assign_pyobj_to_out_argument() doesn't handle copying the result into the location passed by the native caller, the model gets fed with weirdly filled in GtkTreeIters and GValues. The attached patch fixes this issue.
Comment 18 Carlos Garnacho 2012-05-17 15:40:50 UTC
Created attachment 214263 [details]
a more complete testcase, implementing all vfuncs
Comment 19 Jo-Erlend Schinstad 2012-05-17 15:59:43 UTC
This is good news indeed! In the meanwhile, I've started porting my program to Vala, so I don't really have the ability to test this anymore. Or more precisely, it'll take a while before I'll get the ability to test it again. :)

My issue is/was very similar to Benny's, so I'm looking forward to seeing responses to this. No doubt, it'll be highly useful. I'm very grateful for your work, Mr. Garnacho!
Comment 20 Benny Malengier 2012-05-20 16:27:28 UTC
If the fix can be in the autumn releases of the big distributions, then we can do a migrate to pygobject in the next full release in spring 2013. If it could be in the bug fix releases earlier, even better.
Comment 21 Simon Schampijer 2012-05-24 12:18:55 UTC
Thanks for the patch Carlos! Your test program does display correctly now for me. As well the one from Nicolas #c10 does not produce a TypeError anymore for get_value and does return the correct value.
Comment 22 Martin Pitt 2012-06-01 10:06:28 UTC
I am currently trying to write a minimal test case for this using only the G-I test classes and GLib/GObject. The smallest reduction of the Gtk.TreeModel case that I can get to is:

----------- 8< ---------------
from gi.repository import Gtk, GObject
    
class Model(GObject.GObject, Gtk.TreeModel):
    def do_get_value(self, iter_, col_idx):
        print("do_get_value", iter_, col_idx)
        return 1
        
            
store = Model()
print(store.get_value(Gtk.TreeIter(), 0))
----------- 8< ---------------

While working on that, I added a test which tests a caller-allocated GValue parameter in a C function:

http://git.gnome.org/browse/gobject-introspection/commit/?id=784b60e85561a3b4a98e33df159a77ae2daef1c7
http://git.gnome.org/browse/pygobject/commit/?id=e1aaf4a48453be0e69e7f3a70a2e7a790871a4d2

This does not trigger this bug, as it is from C, not from Python (thus the other way round), but I think it cannot hurt to ensure that this keeps working as well.
Comment 23 Martin Pitt 2012-06-01 10:36:42 UTC
I now committed an API to g-i's marshalling tests which matches this bug, i. e. a vfunc taking a caller-allocated GValue argument:

http://git.gnome.org/browse/gobject-introspection/commit/?id=932bf2a700368e46b30b58ec9b6beb3ff09a081c

With that, it's trivial to reproduce. I committed your patch together with a test case:

http://git.gnome.org/browse/pygobject/commit/?id=853e6a71234ebd66af5a64dfb296e323c2c905a6

While I was at it, I also added some more checks for other types of vfuncs, as GIMarshallingTestsObject already provides them:

http://git.gnome.org/browse/pygobject/commit/?id=bac9d526f6a9774821d1c9c0e7b35cc6db942975

Thank you for figuring this out!
Comment 24 Martin Pitt 2012-06-01 10:37:57 UTC
I now committed an API to g-i's marshalling tests which matches this bug, i. e. a vfunc taking a caller-allocated GValue argument:

http://git.gnome.org/browse/gobject-introspection/commit/?id=932bf2a700368e46b30b58ec9b6beb3ff09a081c

With that, it's trivial to reproduce. I committed your patch together with a test case:

http://git.gnome.org/browse/pygobject/commit/?id=853e6a71234ebd66af5a64dfb296e323c2c905a6

While I was at it, I also added some more checks for other types of vfuncs, as GIMarshallingTestsObject already provides them:

http://git.gnome.org/browse/pygobject/commit/?id=bac9d526f6a9774821d1c9c0e7b35cc6db942975

Thank you for figuring this out!