After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 638915 - Array of GVariants causes segfault
Array of GVariants causes segfault
Status: RESOLVED FIXED
Product: pygobject
Classification: Bindings
Component: introspection
Git master
Other Linux
: Normal normal
: ---
Assigned To: Nobody's working on this now (help wanted and appreciated)
Python bindings maintainers
Depends on: 638929
Blocks:
 
 
Reported: 2011-01-07 15:18 UTC by Mikkel Kamstrup Erlandsen
Modified: 2011-10-25 07:18 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Patch against gobject-introspection adding extra test func (1.59 KB, patch)
2011-01-07 15:18 UTC, Mikkel Kamstrup Erlandsen
committed Details | Review
Patch against pygobject to add test test_gi.TestArray.test_array_gvariant_in (1.07 KB, patch)
2011-01-07 15:21 UTC, Mikkel Kamstrup Erlandsen
none Details | Review
Try to fix marshalling of GVariant*[] (2.30 KB, patch)
2011-01-24 08:23 UTC, Tomeu Vizoso
none Details | Review
Stack trace of deadlock as described (16.27 KB, text/plain)
2011-01-24 12:51 UTC, Mikkel Kamstrup Erlandsen
  Details
Updated trace of segfault against today's master (92.99 KB, text/plain)
2011-09-28 10:55 UTC, Mikkel Kamstrup Erlandsen
  Details
Patch against pygobject to add test test_gi.TestArray.test_array_gvariant_in (v2) (1.11 KB, patch)
2011-09-28 10:58 UTC, Mikkel Kamstrup Erlandsen
none Details | Review
WIP, fumbling about trying to get this working :-) (1.40 KB, patch)
2011-09-28 14:50 UTC, Mikkel Kamstrup Erlandsen
none Details | Review
Still WIP patch, but cleaned up. SIGABRT in test case (1.28 KB, patch)
2011-10-03 17:46 UTC, Mikkel Kamstrup Erlandsen
none Details | Review
Patch for gobject-introspection fixing return type annotations for gi_marshalling_tests_array_gvariant_in() (914 bytes, patch)
2011-10-04 10:05 UTC, Mikkel Kamstrup Erlandsen
none Details | Review
Patch for pygobject adding testcase and fix for marshaling of arrays of GVariants (4.46 KB, patch)
2011-10-04 10:34 UTC, Mikkel Kamstrup Erlandsen
accepted-commit_after_freeze Details | Review
Patch for gobject-introspection adding and fixing tests for arrays of GVariants (4.19 KB, patch)
2011-10-05 12:53 UTC, Mikkel Kamstrup Erlandsen
committed Details | Review
Patch for pygobject adding testcases and fixes for marshaling of arrays of GVariants (9.85 KB, patch)
2011-10-05 12:56 UTC, Mikkel Kamstrup Erlandsen
committed Details | Review

Description Mikkel Kamstrup Erlandsen 2011-01-07 15:18:02 UTC
Created attachment 177755 [details] [review]
Patch against gobject-introspection adding extra test func

If I pass in a tuple or list of GLib.Variants() to C function expecting a and array of GVariants (eg. GVariant**) then the C function receives a NULL disregarding the Python arguments.

Attaching a patch to gobject-introspection to add a new test function. Will follow with a failing test case for pygobject.
Comment 1 Mikkel Kamstrup Erlandsen 2011-01-07 15:21:04 UTC
Created attachment 177756 [details] [review]
Patch against pygobject to add test test_gi.TestArray.test_array_gvariant_in
Comment 2 Martin Pitt 2011-01-17 11:02:19 UTC
I committed the g-i test part to trunk.

I confirm that I get the same problem on pygobject git head.
Comment 3 Tomeu Vizoso 2011-01-17 12:48:05 UTC
See this ticket in glib/introspection for the root cause. PyGObject at least should raise an exception if it cannot figure out the size for the array elements.

https://bugzilla.gnome.org/show_bug.cgi?id=638929
Comment 4 Mikkel Kamstrup Erlandsen 2011-01-21 15:25:06 UTC
I guess the test for pygobject can be merged as well now that https://bugzilla.gnome.org/show_bug.cgi?id=638929 has been resolved?
Comment 5 Martin Pitt 2011-01-24 07:19:51 UTC
Mikkel,

it fails differently now, though:

test_array_gvariant_in (test_gi.TestArray) ... **
GLib:ERROR:/build/buildd/glib2.0-2.27.91/glib/gvarianttypeinfo.c:165:g_variant_type_info_check: assertion failed: (info->alignment == 0 || info->alignment == 1 || info->alignment == 3 || info->alignment == 7)

Perhaps you have some time to look into this? Tomeu said that we shouldn't commit tests which are known-failing, as it disturbs jhbuild, and packages which run the test suite during package build.
Comment 6 Tomeu Vizoso 2011-01-24 08:23:07 UTC
Created attachment 179135 [details] [review]
Try to fix marshalling of GVariant*[]

Last week gave it a try but haven't managed yet to either find the exact cause
nor figure out a fix.

I'm attaching it just in case it will help whoever tries to fix it, as I'm not
likely to have time for this in the near future.
Comment 7 Mikkel Kamstrup Erlandsen 2011-01-24 12:51:37 UTC
Created attachment 179158 [details]
Stack trace of deadlock as described

If I run 'make check TEST_NAMES=test_gi' I can reproduce Martin's error. Most worryingly I can trigger a deadlock inside GVariant if I run it like 'make check TEST_NAMES=test_gi.TestArray' instead...

If I use the check.gdb target and Ctrl-C on the deadlock I can see the attached stack trace. I'll be looking a bit more into this...
Comment 8 Mikkel Kamstrup Erlandsen 2011-01-28 10:23:09 UTC
After more digging, it seems that the deadlock is due to bad ref counting on the gvariants. See the bug I filed against gvariant here for details: https://bugzilla.gnome.org/show_bug.cgi?id=640807
Comment 9 Mikkel Kamstrup Erlandsen 2011-01-28 10:35:51 UTC
(In reply to comment #8)
> After more digging, it seems that the deadlock is due to bad ref counting on
> the gvariants. See the bug I filed against gvariant here for details:
> https://bugzilla.gnome.org/show_bug.cgi?id=640807

To clarify: There is an issue with pygi ref counting on the variants. Then there is a separate issue with gvariants being this susceptible to deadlocks (the latter being the glib bug I pointed to).
Comment 10 Tomeu Vizoso 2011-01-28 11:35:02 UTC
(In reply to comment #9)
> (In reply to comment #8)
> > After more digging, it seems that the deadlock is due to bad ref counting on
> > the gvariants. See the bug I filed against gvariant here for details:
> > https://bugzilla.gnome.org/show_bug.cgi?id=640807
> 
> To clarify: There is an issue with pygi ref counting on the variants. Then
> there is a separate issue with gvariants being this susceptible to deadlocks
> (the latter being the glib bug I pointed to).

So chances are that this line is wrong, or that function is being called at the wrong moment:

http://git.gnome.org/browse/pygobject/tree/gi/pygi-foreign-gvariant.c#n51
Comment 11 Mikkel Kamstrup Erlandsen 2011-09-28 10:45:06 UTC
This bug changed a bit in pygobject 3.0, but is still relevant.

With pygobject3.0 I get this when writing lenses for Unity. The original test case I attached, test_gi.TestArray.test_array_gvariant_in(), still reproduces the segfault.

I'll update stack traces and patches to pygobject master.
Comment 12 Mikkel Kamstrup Erlandsen 2011-09-28 10:55:18 UTC
Created attachment 197647 [details]
Updated trace of segfault against today's master
Comment 13 Mikkel Kamstrup Erlandsen 2011-09-28 10:58:18 UTC
Created attachment 197648 [details] [review]
Patch against pygobject to add test test_gi.TestArray.test_array_gvariant_in (v2)
Comment 14 Mikkel Kamstrup Erlandsen 2011-09-28 13:31:00 UTC
In _pygi_marshal_from_py_array() lines 776-785 in pygi-marshal-from-py.c I think there is some suspicious logic:

                    gboolean is_boxed = g_type_is_a (item_iface_cache->g_type, G_TYPE_BOXED);
                    gboolean is_gvalue = item_iface_cache->g_type == G_TYPE_VALUE;
// FIXME: GVariant is neither boxed nor gvalue, and you definitely can't just memcpy it
                    if (!is_boxed || is_gvalue) {
                        memcpy (array_->data + (i * item_size), item.v_pointer, item_size);
                        if (from_py_cleanup)
                            from_py_cleanup (state, item_arg_cache, item.v_pointer, TRUE);
                    } else {
                        g_array_insert_val (array_, i, item);
                    }

Since the GVariant sequence members are not boxed types or GValues (I would have thought GVariants were boxed, but apparently they are not) this means we do a memcpy() to write their values into the destination array. I don't think this works.

And indeed, if I run it through gdb and check just before we return from _pygi_marshal_from_py_array():

  (gdb) print g_variant_print (((GVariant**)arg->v_pointer)[0], 1)

Then I get a segfault.
Comment 15 Mikkel Kamstrup Erlandsen 2011-09-28 13:35:18 UTC
Sorry bugzilla really butchered that snippet :-) And for the record, that FIXME statement I added, I am not 100% sure of either ;-)
Comment 16 Mikkel Kamstrup Erlandsen 2011-09-28 14:50:01 UTC
Created attachment 197677 [details] [review]
WIP, fumbling about trying to get this working :-)

Here's a WIP patch. It doesn't make the test pass, but it does demonstrate how to add the variants to the array.

With this patch the test now fails with:

** (process:9589): DEBUG: -------- VARIANT: 27
** (process:9589): DEBUG: -------- VARIANT: 'Hello'
**
GLib:ERROR:/build/buildd/glib2.0-2.30.0/./glib/gvarianttypeinfo.c:165:g_variant_type_info_check: assertion failed: (info->alignment == 0 || info->alignment == 1 || info->alignment == 3 || info->alignment == 7)
Comment 17 johnp 2011-09-29 17:26:17 UTC
Comment on attachment 197677 [details] [review]
WIP, fumbling about trying to get this working :-)


y
>diff --git a/gi/pygi-marshal-from-py.c b/gi/pygi-marshal-from-py.c
>index fa74a22..fe5db9b 100644
>--- a/gi/pygi-marshal-from-py.c
>+++ b/gi/pygi-marshal-from-py.c
>@@ -775,8 +775,15 @@ _pygi_marshal_from_py_array (PyGIInvokeState   *state,
>                     PyGIMarshalCleanupFunc from_py_cleanup = item_arg_cache->from_py_cleanup;
>                     gboolean is_boxed = g_type_is_a (item_iface_cache->g_type, G_TYPE_BOXED);
>                     gboolean is_gvalue = item_iface_cache->g_type == G_TYPE_VALUE;
>-
>-                    if (!is_boxed || is_gvalue) {
>+                    gboolean is_pointer = g_type_is_a (item_iface_cache->g_type, G_TYPE_POINTER) ||
>+                                          g_type_is_a (item_iface_cache->g_type, G_TYPE_VARIANT) ||
>+                                          item_iface_cache->g_type  == G_TYPE_NONE;


I don't like this is_pointer stuff.  First is_pointer is total borked/undefined as a GI concept that I get a creepy feeling every time I see it.  Second, why isn't it done the same as the is_boxed and is_gvalue?  Third, why are we checking for G_TYPE_NONE and G_TYPE_POINTER.  These are not part of the bug and should be added as a separate patch once we fix GVariants.  This is all edge case code that should be removed once GI gets the ability to determine level of indirection of arrays.  Because it passes tests only means we don't have coverage for all the edge cases so I would rather only have one fixed at a time.
                   
>+                    // FIXME: Foreign types?
>+                    if (is_pointer) {
>+                        g_array_insert_val (array_, i, item.v_pointer);
>+                        g_debug ("-------- VARIANT: %s", g_variant_print (((GVariant**)array_->data)[i], 1));

The array was created with the cached item_size.  You need to make certain what _pygi_g_type_info_size (pygi-info.c) is sending back is the size of a pointer on both 32bit and 64bit systems when a GVariant info is passed to it.

>+                    } else if (!is_boxed || is_gvalue) {
>                         memcpy (array_->data + (i * item_size), item.v_pointer, item_size);
>                         if (from_py_cleanup)
>                             from_py_cleanup (state, item_arg_cache, item.v_pointer, TRUE);
Comment 18 Mikkel Kamstrup Erlandsen 2011-09-30 07:01:32 UTC
Agreed on all accounts John. This was mostly a WIP hack. If I can get something working I'll make sure that it's in line with your comments.
Comment 19 Mikkel Kamstrup Erlandsen 2011-10-03 17:46:59 UTC
Created attachment 198126 [details] [review]
Still WIP patch, but cleaned up. SIGABRT in test case

Still haven't found the root cause of the problem, but I cleaned up the patch. Stepping through the marshaling code it looks as if everything is just right, but in reality not so much because I still get the SIGABRT later on.
Comment 20 Mikkel Kamstrup Erlandsen 2011-10-03 17:51:23 UTC
The assertions in gi_marshalling_tests_array_gvariant_in() actually does pass. So it would seem the SIGABRT is either on the return or some ref counting snafu on the variants
Comment 21 Mikkel Kamstrup Erlandsen 2011-10-04 10:05:22 UTC
Created attachment 198206 [details] [review]
Patch for gobject-introspection fixing return type annotations for gi_marshalling_tests_array_gvariant_in()

Ok, got it now. Here's a first patch needed against gobject-introspection. My initial test case missed an annotation on the return type.

Attaching patch for pygobject to go with it in a sec.
Comment 22 Mikkel Kamstrup Erlandsen 2011-10-04 10:34:09 UTC
Created attachment 198208 [details] [review]
Patch for pygobject adding testcase and fix for marshaling of arrays of GVariants

I have the unit test working with this patch. I have squashed the testcase into this commit in order not to break git bisect.

There are two fixmes in the patch. Which may or may not need resolution before being committed.

(and note that the unit test requires the patch to gobject-introspection to fix the annotations on gi_marshalling_tests_array_gvariant_in())
Comment 23 Tomeu Vizoso 2011-10-04 12:17:15 UTC
Review of attachment 198208 [details] [review]:

Patch looks great to me, just have some minor comments.

> Requires latest gobject-introspection.

Then should we increase the required version in configure.ac?

We may need to branch before this can land in master, please ping J5 or Nacho about it.

::: gi/pygi-marshal-from-py.c
@@ +781,3 @@
+                        /* Item size will always be that of a pointer,
+                         * since GVariants are opaque hence always passed by ref */
+                        g_assert (item_size == sizeof (item.v_pointer));

I think sizeof (gpointer) is more common.

::: gi/pygi-marshal-to-py.c
@@ +294,3 @@
         }
+        
+        // FIXME: leak array_->data ?

If we are adding FIXMEs to the sources, I think we should have a corresponding bug for each. And reference the ticket number in the FIXME and use C comments.

@@ +333,3 @@
                 } else if (item_arg_cache->type_tag == GI_TYPE_TAG_INTERFACE) {
                     PyGIInterfaceCache *iface_cache = (PyGIInterfaceCache *) item_arg_cache;
+                    gboolean is_gvariant = iface_cache->g_type == G_TYPE_VARIANT;

I would move it down to the block where it's used.

@@ +337,1 @@
+                    // FIXME: This probably doesn't work with boxed types or gvalues. See fx. _pygi_marshal_from_py_array()

Same comment as above.

@@ +340,3 @@
+                            if (is_gvariant) {
+                              g_assert (item_size == sizeof (gpointer));
+                              if (arg_cache->transfer == GI_TRANSFER_EVERYTHING)

Would be good to have a test for each transfer mode.
Comment 24 Mikkel Kamstrup Erlandsen 2011-10-04 14:45:15 UTC
Thanks for the review Tomeu. I added the extra tests for the different transfer modes as you requested, but it has opened what seems like a can of worms. I noticed some peculiarities in the cleanup and tracked it down to something that looks like pygobject has been using the wrong cleanup function for arrays al along.

I made this change to rectify it, but that but it was not without fallout that I am trying to clean up before I resubmit the patch:

--- a/gi/pygi-cache.c
+++ b/gi/pygi-cache.c
@@ -498,7 +498,7 @@ _arg_cache_from_py_array_setup (PyGIArgCache *arg_cache,
         callable_cache->args_cache[seq_cache->len_arg_index] = child_cache;
     }
 
-    arg_cache->from_py_cleanup = _pygi_marshal_cleanup_to_py_array;
+    arg_cache->from_py_cleanup = _pygi_marshal_cleanup_from_py_array;
Comment 25 Mikkel Kamstrup Erlandsen 2011-10-05 12:53:04 UTC
Created attachment 198327 [details] [review]
Patch for gobject-introspection adding and fixing tests for arrays of GVariants

Fix some of the tests for arrays of gvariants in gobject-introspection and add some more tests for different transfer modes.
Comment 26 Mikkel Kamstrup Erlandsen 2011-10-05 12:56:01 UTC
Created attachment 198331 [details] [review]
Patch for pygobject adding testcases and fixes for marshaling of arrays of GVariants

Updated patch against pygobject. This became a bigger task than anticipated because of some related bugs in the array cleanup code. All tests pass here now at least.

I tried simplifying and unifying some of the array cleanup code a bit, which may have grown the diff somewhat, but I think the resulting code is easier for humans to parse :-)
Comment 27 Mikkel Kamstrup Erlandsen 2011-10-19 07:25:20 UTC
Just a note: We've had a few people testing the patch for some time now and they are all happy. No leaks or crashes reported so far.
Comment 28 johnp 2011-10-21 18:30:34 UTC
Ok, we should branch, make a release and then add this to master to see who it breaks, if anyone.  Thanks for following up and getting this to work Mikkel, it wasn't trivial.  Now if only gobject-introspection could get the indirection markups.
Comment 29 Martin Pitt 2011-10-25 06:48:00 UTC
Comment on attachment 198327 [details] [review]
Patch for gobject-introspection adding and fixing tests for arrays of GVariants

The gobject-introspection test case updates look straightforward, and as pygobject's test suite does not currently use them, it does not break either. Committed to master.
Comment 30 Martin Pitt 2011-10-25 07:18:39 UTC
Comment on attachment 198331 [details] [review]
Patch for pygobject adding testcases and fixes for marshaling of arrays of GVariants

The patch has proven to work for a lot of people, was reviewed by John, and we just did a 3.0.2 release, so it has some time to bake in jhbuild. Committed, with agreement of Tomeu.
Comment 31 Martin Pitt 2011-10-25 07:18:54 UTC
Thanks Mikkel!