After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 678663 - support for array field defined via a length field, for GstMapInfo
support for array field defined via a length field, for GstMapInfo
Status: RESOLVED OBSOLETE
Product: gobject-introspection
Classification: Platform
Component: general
unspecified
Other All
: Normal normal
: ---
Assigned To: gobject-introspection Maintainer(s)
gobject-introspection Maintainer(s)
: 687550 (view as bug list)
Depends on: 709976 710561
Blocks: 688792
 
 
Reported: 2012-06-23 11:59 UTC by Jason Gerard DeRose
Modified: 2018-06-05 12:45 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
explicitely forbid array length for non callable paramters in gi and avoid discarding array conversion (1.71 KB, patch)
2012-06-29 08:49 UTC, Alban Browaeys
none Details | Review
gi support for array field defined via a length field (7.63 KB, patch)
2012-07-25 03:49 UTC, Alban Browaeys
none Details | Review
pygobject support for array field size defined via another field (1.76 KB, patch)
2012-07-25 03:50 UTC, Alban Browaeys
none Details | Review
vala support for array field size defined via another field (4.94 KB, patch)
2012-07-25 03:51 UTC, Alban Browaeys
none Details | Review
gi tests for array field defined via size field (2.72 KB, patch)
2012-07-25 21:25 UTC, Alban Browaeys
none Details | Review
Add gst_map_info_to_bytes() (1.36 KB, patch)
2012-08-06 15:32 UTC, Colin Walters
none Details | Review
gi support for array field defined via a length field (v2) (7.87 KB, patch)
2012-08-29 21:02 UTC, Alban Browaeys
none Details | Review
pygobject: give mutable access to the initial array (5.93 KB, patch)
2012-08-29 21:09 UTC, Alban Browaeys
none Details | Review
gstreamer map-as-bytes patch, v2 (3.83 KB, patch)
2012-09-10 21:48 UTC, Colin Walters
none Details | Review
pygobject: give mutable access to the initial array (v2) (3.68 KB, patch)
2012-09-28 23:39 UTC, Alban Browaeys
needs-work Details | Review
pygobject support for array field size defined via another field (v3) (7.35 KB, patch)
2012-09-28 23:41 UTC, Alban Browaeys
none Details | Review
the read test for support for array field size defined via another field . Slight modification of tim code (1.55 KB, text/plain)
2012-09-28 23:42 UTC, Alban Browaeys
  Details
wite testcase (requires memoryview and array field defined via alength field) (2.15 KB, text/plain)
2012-09-29 14:10 UTC, Alban Browaeys
  Details
buffer: add gst_buffer_map_read_as_bytes() API for bindings (4.58 KB, patch)
2012-12-21 00:48 UTC, Tim-Philipp Müller
none Details | Review
gi support for array field defined via a length field (v3) (8.47 KB, patch)
2013-03-27 17:06 UTC, Alban Browaeys
none Details | Review
gi tests for array field defined via size field (v3) (2.71 KB, patch)
2013-03-27 17:07 UTC, Alban Browaeys
none Details | Review
Add zero copy buffer management to Gst.MapInfo (5.18 KB, patch)
2016-10-23 09:01 UTC, Simon Feltman
reviewed Details | Review

Description Jason Gerard DeRose 2012-06-23 11:59:18 UTC
(05:52:36 AM) __tim: we can always just add a gst_map_info_get_data() if there's no other way
(05:52:41 AM) __tim: please file a bug about it

In PyGST, you could get the buffer data like this:

def on_preroll_handoff(element, buf, pad):
    data = buf.data

But there doesn't seem to be a way to do this with GStreamer 1.0 and PyGI. More awesome would be had if you could. BTW, you can get the MapInfo from the Buffer like this:

def on_preroll_handoff(element, buf, pad):
    map_info = buf.map_range(0, -1, Gst.MapFlags.READ)[1]
Comment 1 Tim-Philipp Müller 2012-06-23 12:11:51 UTC
As far as I can tell, the problem is that g-i doesn't recognise GstMapInfo.data as an array (even with appropriate annotation).
Comment 2 Jason Gerard DeRose 2012-06-23 12:13:31 UTC
This idea is way above my introspection and PyGI pay-grade... but could the
Python Buffer Protocol be used as a way to expose the Buffer.data as a `bytes`
object without copying it first?

http://docs.python.org/py3k/c-api/buffer.html#bufferobjects
Comment 3 Sebastian Dröge (slomo) 2012-06-27 10:20:01 UTC
As discussed on IRC, a possible solution would be

gboolean gst_map_info_get_bytes (mapinfo, &data, &size, GST_MAP_READ | GST_MAP_WRITE);

this would fail whenever the mapping is done with incompatible flags and otherwise provide the data and size.
Comment 4 Alban Browaeys 2012-06-29 08:49:16 UTC
Created attachment 217589 [details] [review]
explicitely forbid array length for non callable paramters in gi and avoid discarding array conversion

this is a patch for gobject-introspection though the annotation is wrong for MapInfo.data (length=<field> is not supported by gi and breaks the conversion to an array).
So the issue is twofolds:
- make gi tells what wrong (and do not abort the routine, simply warn)
- the GstMapInfo annotation for data field needs to get fixed
Comment 5 Tim-Philipp Müller 2012-06-29 09:07:37 UTC
> though the annotation is wrong for
> MapInfo.data (length=<field> is not supported by gi and breaks the conversion
> to an array).
> So the issue is twofolds:
> - make gi tells what wrong (and do not abort the routine, simply warn)
> - the GstMapInfo annotation for data field needs to get fixed

What is the correct annotation? Are you sure the length=size bit actually "breaks" the array conversion? Because I'm pretty sure I tried just (array) first, and it didn't make any difference at all. Besides, how would it determine the length of the array otherwise?
Comment 6 Alban Browaeys 2012-06-30 21:39:08 UTC
I am confident length=<not a callbale parameter> fails because the code that handle this in gobject-introspection:
giscanner/maintransformer.py
_apply_annotations_array
(....)
        length = array_values.get(OPT_ARRAY_LENGTH)
        if length is not None:
             paramname = self._get_validate_parameter_name(parent, length, node)
             if paramname:
                 param = parent.get_parameter(paramname)

where _apply_annotations_array is  called from _adjust_container_type itself called from _apply_annotations_field.( _pass_read_annotations :
if isinstance(node, (ast.Class, ast.Interface, ast.Record, ast.Union)):
            block = self._get_block(node)
            for field in node.fields:
                self._apply_annotations_field(node, block, field)
            name = self._get_annotation_name(node)
            section_name = 'SECTION:' + name.lower()
            block = self._blocks.get(section_name)
            if block:
                node.doc = block.comment
)
> here parent.get_parameter is a method that exists only for parent == Callable. , ie a callable parameter.

I tried
 * @data: (array): a pointer to the mapped data
annotation and it works fine. I wonder if the bug was fixed by above patch or a change in gobject-introspection master, still it does the job.
Comment 7 Tim-Philipp Müller 2012-06-30 22:18:28 UTC
How did you check/determine that it works fine?
Comment 8 Tim-Philipp Müller 2012-06-30 22:20:16 UTC
And even if it does actually work fine in this case, I am fairly sure I can easily come up with a structure where it would fail without additional annotations where to get the size of the array from, unless I'm missing something.
Comment 9 Alban Browaeys 2012-07-01 07:02:00 UTC
Sorry I should have told it translate fine into a array in the gir.
     <field name="data" writable="1">
        <array zero-terminated="0" c:type="guint8*">
          <type name="guint8" c:type="guint8"/>
        </array>
      </field>
instead of:
      <field name="data" writable="1">
        <type name="guint8" c:type="guint8*"/>
      </field>
which indeed does not mean it "works".
Comment 10 Jason Gerard DeRose 2012-07-24 03:09:37 UTC
I'm a bit confused as to the current status of this. Does this require both a change in gstreamer and a change in gi to work?
Comment 11 Alban Browaeys 2012-07-24 06:51:42 UTC
Good question : the above patch produces fixes the gir generation but as tim told it might not "work" (I believe at runtime).
For g-i one thing is for sure it currently does not let one define an array size from a record to a record (struct { .size, . array} . This feature is only available to functions/methods func (size, array). I cannot tell if this is a bug or if there is no way to support the record to record at runtime (so no use to let one set it).
Comment 12 Tim-Philipp Müller 2012-07-24 09:57:12 UTC
Migh be easy to test: if you map a buffer of 100 bytes, and then access map.data[200] will you get an appropriate exception?
Comment 13 Alban Browaeys 2012-07-25 03:49:08 UTC
Created attachment 219613 [details] [review]
gi support for array field defined via a length field
Comment 14 Alban Browaeys 2012-07-25 03:50:11 UTC
Created attachment 219614 [details] [review]
pygobject support for array field size defined via another field
Comment 15 Alban Browaeys 2012-07-25 03:51:00 UTC
Created attachment 219615 [details] [review]
vala support for array field size defined via another field
Comment 16 Johan (not receiving bugmail) Dahlin 2012-07-25 19:32:01 UTC
(In reply to comment #13)
> Created an attachment (id=219613) [details] [review]
> gi support for array field defined via a length field

This needs at a minimum a .c/.h/gir test.
Comment 17 Alban Browaeys 2012-07-25 21:25:22 UTC
Created attachment 219661 [details] [review]
gi tests for array field defined via size field
Comment 18 Tim-Philipp Müller 2012-08-03 19:06:43 UTC
Does someone feel like testing and pushing the g-i parts already?

Martin, Jürg: any comments on the pygobject/vala bits?


<tpm>: jdahlin, tomeu, walters: how about https://bugzilla.gnome.org/show_bug.cgi?id=678663 ? Does that look ok as well? :)
<jdahlin> tpm: the g-i parts look fine
<tpm> and the other stuff is not fine, or for someone else to review? :)
<jdahlin> someone else
<tpm> cool, thanks
<tpm> can the g-i parts go in already ?
<tpm> who would be good to ask to look at the other stuff?
<jdahlin> all good to me, but I'm not pushing
<jdahlin> for pygobject pitti
<jdahlin> for vala, juergbi
<tpm> why are you not pushing? (ooc)
<jdahlin> I don't have a development environment setup, I cannot run the testsuite
<tpm> ah ok
<tpm> thanks for having a look
Comment 19 Colin Walters 2012-08-06 15:32:52 UTC
Created attachment 220458 [details] [review]
Add gst_map_info_to_bytes()

This is usable by bindings.
Comment 20 Colin Walters 2012-08-06 15:33:55 UTC
How about this (untested) patch which adds API to convert GstMapInfo to a GBytes?

GBytes should be handled by bindings in a natural way, it's a boxed type, and we know it's a byte array.
Comment 21 Sebastian Dröge (slomo) 2012-08-06 21:54:50 UTC
GBytes is immutable though
Comment 22 Jason Gerard DeRose 2012-08-14 09:10:39 UTC
I tried the gst_map_info_to_bytes() patch (slightly tweaked, doesn't currently apply against git master), and it seems to work fine.

From PyGI I can do this:

map_info = buf.map_range(0, -1, Gst.MapFlags.READ)[1]
data = map_info.to_bytes()

And data will be a `GLib.Bytes` instance. This still doesn't get me Python `bytes`, though, or at least I haven't figured that out yet.

I'm running Ubuntu Precise ATM, perhaps I need a newer Glib?
Comment 23 Jason Gerard DeRose 2012-08-14 09:18:07 UTC
Ah, Glib.Bytes.unref_to_array() does the trick:

map_info = buf.map_range(0, -1, Gst.MapFlags.READ)[1]
data = map_info.to_bytes().unref_to_array()

Now *data* is a Python3 `bytes` instance. So that gst_map_info_to_bytes() patch indeed makes the use case I was originally concerned with possible.
Comment 24 Tim-Philipp Müller 2012-08-14 09:37:32 UTC
This (GBytes indirection) does not strike me as a nice solution - is there any reason not to make things work as they should? I was surprised that didn't work already tbh, but we have patches here, so why settle for something else when we can just fix it properly?

Martin, Jürg: any chance you could have a look at the pygobject/vala bits of the patch?
Comment 25 Colin Walters 2012-08-14 13:46:02 UTC
(In reply to comment #24)
> This (GBytes indirection) does not strike me as a nice solution - is there any
> reason not to make things work as they should? 

Well, this approach will work for older versions of gobject-introspection too.  That doesn't mean that we shouldn't improve g-i too; in this particular case it does seem like a potentially useful feature elsewhere.

But in general fixing problems in two ways is often a good idea.

I'll look at the g-i patch.
Comment 26 Tim-Philipp Müller 2012-08-23 12:35:18 UTC
> Well, this approach will work for older versions of gobject-introspection too. 

If a copy is made of the data in the struct, it won't actually work properly though, as I understand it. The data needs to be modified in place, or a READWRITE or WRITE mapping won't work properly, since you'd be modifying the copy, which is not really useful.


> That doesn't mean that we shouldn't improve g-i too; in this particular case it
> does seem like a potentially useful feature elsewhere.
> 
> But in general fixing problems in two ways is often a good idea.
> 
> I'll look at the g-i patch.

Have you had a chance to look at this yet ? This is one of the single most important issues for GStreamer, since without it you can't access buffer data in applications.
Comment 27 Colin Walters 2012-08-24 21:48:19 UTC
(In reply to comment #26)
> > Well, this approach will work for older versions of gobject-introspection too. 
> 
> If a copy is made of the data in the struct, it won't actually work properly
> though, as I understand it. The data needs to be modified in place, or a
> READWRITE or WRITE mapping won't work properly, since you'd be modifying the
> copy, which is not really useful.

Ok.  I guess as https://bugzilla.gnome.org/show_bug.cgi?id=678663#c21 says, then GBytes may not be appropriate.

But...is it really going to work to mutate binary array data from language bindings?  When pygobject gets an array of uint8, it will pass through writes rather than copying the data?

> Have you had a chance to look at this yet ? This is one of the single most
> important issues for GStreamer, since without it you can't access buffer data
> in applications.

For 3.6?  I know I'm really behind on g-i reviews and patches.  Um...it's possible.

So these patches actually solve the problem?  Can I see the GStreamer python test case?
Comment 28 Tim-Philipp Müller 2012-08-24 22:53:51 UTC
> So these patches actually solve the problem?  Can I see the GStreamer python
> test case?

I made a (slightly convoluted) test here: http://people.freedesktop.org/~tpm/code/map-buffer-test.py - I *think* that's what Alban was using as well. I don't know if writing has been tested.
Comment 29 Colin Walters 2012-08-25 17:21:40 UTC
(In reply to comment #28)
> > So these patches actually solve the problem?  Can I see the GStreamer python
> > test case?
> 
> I made a (slightly convoluted) test here:
> http://people.freedesktop.org/~tpm/code/map-buffer-test.py - I *think* that's
> what Alban was using as well. I don't know if writing has been tested.

Thanks for the test.  But it's writing to the buffer that's the main point of these patches, right?  Otherwise one could just use _get_as_bytes().

Note gjs from what I can tell always duplicates the byte array memory.  This is unfortunate.  So if GBytes is the immutable GLib byte array type, clearly GByteArray is the mutable version.  We could adjust bindings to ensure they don't copy in that case.

"git grep GByteArray" in pygobject shows nothing at the moment now though...

So I'm skeptical that these patches allow writability without more work there.
Comment 30 Tim-Philipp Müller 2012-08-26 09:31:17 UTC
The immediate goal is to make the data readable, which covers 90% of application use cases (to get cover art, thumbnails, get data out of a pipeline etc.).

I'm reluctant to add API to GStreamer that covers the reading case in one way, and then requires different API later for the writing case.

If these patches make things work read-only now and need additional work later for writing support, but it can be done in a way that's API-neutral to applications, that's fine with me. The reading case is more urgent. It's just that the semantics need to cover both cases in principle.

I also think it's highly desirable for bindings to be able to use the same API/access patterns as C code in this case (meaning: direct use of mapinfo.data[]). Would be a bit funny if bindings people had to jump through another set of hoops on top of the already existing buffer.map() dance, no?
Comment 31 Colin Walters 2012-08-26 16:22:26 UTC
(In reply to comment #30)
> The immediate goal is to make the data readable, which covers 90% of
> application use cases (to get cover art, thumbnails, get data out of a pipeline
> etc.).
> 
> I'm reluctant to add API to GStreamer that covers the reading case in one way,
> and then requires different API later for the writing case.

Right, but that's my worry - that we'll add this ability to read the bytes, and then it turns out that pygobject copies the data, but we can't change that because it would break some application.

> I also think it's highly desirable for bindings to be able to use the same
> API/access patterns as C code in this case (meaning: direct use of
> mapinfo.data[]). Would be a bit funny if bindings people had to jump through
> another set of hoops on top of the already existing buffer.map() dance, no?

Yes...but two things:

1) How "nice" an API needs to be is a function of how often it's used; for example, packing GtkWidgets into a GtkContainer should be easy and obvious.  It's not clear to me how often applications map buffers, but it's probably not that often, right?

2) Predictability and consistency is more important to me than minimizing characters.  If for example we were forced to add some special annotation later like (array mutable), that's a whole other type of thing that binding consumers need to understand.

We may in fact be forced to do something like 2), because from my read of the pygobject source code, it also creates copies of any arrays it sees.  Just grep for GI_TYPE_TAG_ARRAY.

Maybe what we should do though is improve pygobject so when it sees arrays as field members it keeps them mutable.

Again though, the point is these patches don't help for writing without corresponding, uncertain pygobject patches.  So my recommendation for applications that want to *read* the data to add an accessor function to GStreamer now, like the GBytes one I proposed.

We can then look at mutable arrays for 3.7?
Comment 32 Alban Browaeys 2012-08-29 21:02:08 UTC
Created attachment 222848 [details] [review]
gi support for array field defined via a length field (v2)

make check works better if make clean is done before ... fix the patch therafter
Comment 33 Alban Browaeys 2012-08-29 21:09:43 UTC
Created attachment 222863 [details] [review]
pygobject: give mutable access to the initial array
Comment 34 Colin Walters 2012-09-10 21:48:01 UTC
Created attachment 223961 [details] [review]
gstreamer map-as-bytes patch, v2

Attaching a totally untested patch to add a better read-as-GBytes option to Gstreamer.

Often it makes sense to fix something in two places - GStreamer having a GBytes-based API could work in parallel with g-i supporting array reads (and possibly writes).
Comment 35 Alban Browaeys 2012-09-27 15:06:59 UTC
ie comment 33 and attach answer the does pygobejct copy arrays (ie it does currently) and provide a way to avoid this copy by patching pygobject.
I am more looking after a reply than a commit . Ie knowing this will not be applied ever , I will know there is no point in the  maintainence of this patch. 
Also it will answer the point of does pygobject will ever support writing to MapInfo.data in the way it is currently made available by gstreamer 1.0.
Thus we will know if there is a point in providing a read only getter for this data. Or if we should strive for read/write.
Comment 36 Alban Browaeys 2012-09-28 23:39:14 UTC
Created attachment 225363 [details] [review]
pygobject: give mutable access to the initial array (v2)

I made an update v2 which apply first (ie no need for the "pygobject support for array field size defined via another field" patch to be applied to try this one out.
The only issue I encountered so far was one pitivi line doing get_data().split(... . Ie code taking for granted that a guint8[] is a string. It requred a change aka "get_data().tobytes().split()" ie meoryview .tobytes() .
Comment 37 Alban Browaeys 2012-09-28 23:41:17 UTC
Created attachment 225364 [details] [review]
pygobject support for array field size defined via another field (v3)

This one requires the memory leak fix from bug 685082 to get applied first (ie the code interleaves).
Comment 38 Alban Browaeys 2012-09-28 23:42:41 UTC
Created attachment 225365 [details]
the read test for support for array field size defined via another field . Slight modification of tim code
Comment 39 Alban Browaeys 2012-09-29 14:10:13 UTC
Created attachment 225382 [details]
wite testcase (requires memoryview and array field defined via alength field)
Comment 40 Colin Walters 2012-11-14 20:05:56 UTC
(In reply to comment #34)
> Created an attachment (id=223961) [details] [review]
> gstreamer map-as-bytes patch, v2
> 
> Attaching a totally untested patch to add a better read-as-GBytes option to
> Gstreamer.
> 
> Often it makes sense to fix something in two places - GStreamer having a
> GBytes-based API could work in parallel with g-i supporting array reads (and
> possibly writes).

Hi Tim,

Any opinions about patches in this direction for GStreamer?
Comment 41 Tim-Philipp Müller 2012-11-14 20:55:55 UTC
(In reply to comment #40)
> Any opinions about patches in this direction for GStreamer?

I don't mind something like that (though perhaps for GstBuffer rather than GstMemory, since that's the most-used API, GstMemory is much more low-level), but if I'm not mistaken it's only a work-around for part of the problem, e.g. covers only reading, not writing?
Comment 42 Colin Walters 2012-11-14 21:46:48 UTC
(In reply to comment #41)
> (In reply to comment #40)
> > Any opinions about patches in this direction for GStreamer?
> 
> I don't mind something like that (though perhaps for GstBuffer rather than
> GstMemory, since that's the most-used API, GstMemory is much more low-level),
> but if I'm not mistaken it's only a work-around for part of the problem, e.g.
> covers only reading, not writing?

Right.  So...there's basically no generic way to do this in introspection right now.  We like just recently got done supporting GBytes in a nicer way in pygobject, and I'm still working on the gjs patches ( https://bugzilla.gnome.org/show_bug.cgi?id=685431 ).

*Ideally* if GStreamer had an API that returned a GByteArray*, bindings wouldn't take a copy.  Unfortunately at the moment, gjs does.  I'm not sure whether that's an oversight or intentional.  It's a little worrying to change it now, but on the other hand, anyone returning a GByteArray* from public API is kind of explicitly opening up themselves for external mutation.  So maybe we can change it in gjs.

From what I can tell reading the pygobject source...it seems to treat GByteArray as a generic boxed type.  But let me investigate more.
Comment 43 Colin Walters 2012-11-14 21:55:19 UTC
Martin, can you comment from the pygobject side here?  I am not really parsing what Alban is saying or what's going on with his patches unfortunately...
Comment 44 Martin Pitt 2012-11-19 06:47:31 UTC
Quick overview about GByteArray support in PyGObject:

- I just recently fixed the handling of GByteArrays arguments (this fix is in master and also in 3.4.2). That direction does not copy the data (as far as I know), it simply sets the GArray pointer to the GIArgument data.

- Constructing GByteArrays through GLib's API has been fixed in GLib 2.34.2 through the annotations.

- Returning GByteArrays has already worked for some time, but the originally returned GByteArray pointer would not be mutable, as PyGObject calls PyBytes_FromStringAndSize() (see http://docs.python.org/3.0/c-api/bytes.html) to turn a C byte array into a Python object; this function always takes a copy; I wouldn't know how to tell Python to merely wrap an existing byte array, or whether that would even work with Python's internal logic.

Now, here we do not have any of these three cases, but we read the GByteArray from a struct field. However, currently the .gir for GstMapInfo looks wrong:

    <record name="MapInfo" c:type="GstMapInfo">
[...]
      <field name="data" writable="1">
        <type name="guint8" c:type="guint8*"/>
      </field>
      <field name="size" writable="1">
        <type name="gsize" c:type="gsize"/>
      </field>
[...]

So "'data" is not even marked as an array, which is supposedly what "gi support for array field defined via a length field (v2)" fixes?

As for "pygobject: give mutable access to the initial array (v2)", I haven't tried this yet (as it is hard to test right now), but I do like the idea. If this works, I'm happy to get this into PyGObject, so that at least uint8 array struct members become writable.

The second PyGObject patch "Add support for length of an array field defined from  another field" looks ok at first sight, but it would need tests. Once GI annotations for array struct members start to work, I'm happy to create some tests in PyGObject (presumably GI would add such a case to Regress.gir, to ensure that the annotations come out correctly; we can then use that test struct in the pygobject tests).

Thanks!
Comment 45 Simon Feltman 2012-11-20 10:58:27 UTC
I ran into the same problems described here and logged: https://bugzilla.gnome.org/show_bug.cgi?id=687550

The problematic gir generation was also logged as: https://bugzilla.gnome.org/show_bug.cgi?id=687545
Comment 46 Simon Feltman 2012-11-20 10:59:38 UTC
*** Bug 687550 has been marked as a duplicate of this bug. ***
Comment 47 Simon Feltman 2012-11-21 02:45:40 UTC
Review of attachment 225363 [details] [review]:

I was unable to apply the patch for testing as I think it needs to be re-based.

::: gi/pygi-argument.c
@@ +1141,2 @@
             if (g_type_info_get_tag (item_type_info) == GI_TYPE_TAG_UINT8 &&
+                PyMemoryView_Check(object)) {

The docs say MemeoryView is only available in 2.7+
http://docs.python.org/2/c-api/buffer.html#memoryview-objects

If this is indeed true, we would either need to drop support for 2.6 in pygobject or figure something else out. (this isn't the first time 2.6 support has been a source of contention)

@@ +1679,3 @@
+                res = PyBuffer_FillInfo(&pybuffer, 0, array->data, array->len, 0, PyBUF_CONTIG);
+                if (res == -1) {
+                    g_critical ("Failure to fill buffer from array for %u items", array->len);

I don't think the g_critical is needed as python is already setting an exception. This will just become extra noise and I think as a general guideline we should be sticking with python errors/warnings in the python bindings. Otherwise python programmers need to use two different APIs (sometimes simultaneously) for catching or working with errors and warnings.
Comment 48 Simon Feltman 2012-11-21 05:50:53 UTC
Review of attachment 222848 [details] [review]:

I could not verify this patch fixes the problem I was seeing in bug 687545. The following error is printed when building gobject-introspection during "GISCAN GObject-2.0.gir":
ERROR: Failed to re-parse gir file; scanned='/tmp/tmptN6WZg.gir' passthrough='/tmp/tmpxSC0fq.gir'
Comment 49 Simon Feltman 2012-11-21 05:54:09 UTC
Review of attachment 225364 [details] [review]:

Verified this patch fixes bug 687550. With a hacked gir and typelib installed I now get:

$ python3 -c "import gi.module; print(gi.module.get_introspection_module('GObject').signal_query(1).param_types)"
[<GType GParam (76)>]

Instead of:
<GType invalid (28749744)>

Thanks!
Comment 50 Martin Pitt 2012-11-21 06:19:06 UTC
For the record, I wouldn't be unhappy about dropping 2.6 support (I already mentioned that in the last GNOME cycle). It's already being tested poorly (I don't really test it at all), and 2.7 was released over two years ago. This would also allow us to drop some shims to make stuff work on 2.6.
Comment 51 Colin Walters 2012-11-22 00:02:22 UTC
(In reply to comment #44)
>
> I wouldn't know how to tell Python to merely wrap an existing byte array, or
> whether that would even work with Python's internal logic.

That should totally be possible as far as I understand it, if you provide an object which implements the "sequence protocol":
http://docs.python.org/3.3/c-api/sequence.html

The question I have is whether or not it's a compatible change for pygobject to suddenly start writing to the underlying array rather than copying.  For methods that return a GByteArray, I argued in
https://bugzilla.gnome.org/show_bug.cgi?id=678663#c42
that it's "compatible enough".  

I think I'd like to try making gjs and pygobject behave the same here, so that C authors can return a GByteArray and expect bindngs to *not* copy.
Comment 52 Colin Walters 2012-11-22 00:04:04 UTC
Also note my focus so far has been on method call and type semantics - we can obviously do something different for structure fields, so that things work without changing GStreamer.

But given the above, I think it's worth a GStreamer patch which returns a GByteArray.
Comment 53 Simon Feltman 2012-11-22 03:32:01 UTC
(In reply to comment #51)
> (In reply to comment #44)
> >
> > I wouldn't know how to tell Python to merely wrap an existing byte array, or
> > whether that would even work with Python's internal logic.
> 
> That should totally be possible as far as I understand it, if you provide an
> object which implements the "sequence protocol":
> http://docs.python.org/3.3/c-api/sequence.html

This is what the PyMemoryView stuff Alban added allows for.

> The question I have is whether or not it's a compatible change for pygobject to
> suddenly start writing to the underlying array rather than copying.  For
> methods that return a GByteArray, I argued in
> https://bugzilla.gnome.org/show_bug.cgi?id=678663#c42
> that it's "compatible enough".  
> 

> I think I'd like to try making gjs and pygobject behave the same here, so that
> C authors can return a GByteArray and expect bindngs to *not* copy.

Good point. Overall I think it is a good idea to avoid the copies if we can, simply for efficiency reasons, but this ends up opening the doors for more problems as you've hinted at. Looking at the patch a bit more, a number of additional issues have popped up. I will follow up with another patch review.
Comment 54 Simon Feltman 2012-11-22 06:59:29 UTC
(In reply to comment #52)
> Also note my focus so far has been on method call and type semantics - we can
> obviously do something different for structure fields, so that things work
> without changing GStreamer.
> 
> But given the above, I think it's worth a GStreamer patch which returns a
> GByteArray.

Confusingly my last comment was written in regards to struct field access. However, what is funny is returning a GByteArray seems like it would get us right back in the same pickle because a GByteArray is a struct containing a "data" array field. Accessing GByteArray.data would make a copy based on the same struct field access code path. So we either have to fix that or add a static binding for GByteArray which also implements the sequence protocol.
Comment 55 Simon Feltman 2012-11-22 22:37:27 UTC
Attempting to clarify things a bit. There are currently two (related) options on the table for adding mutable array support to pygobject.

1) Expose array fields as a shared mutable chunk of memory (MemoryView). The issue with this (in addition to only working in py2.7) is it opens the door for memory problems which I don't think would be acceptable. For instance, if someone held onto the memory view, we could get into a situation where the struct holding it is collected and the Gst API frees it. For example:

def on_preroll_handoff(self, element, buf, pad):
    self.data = buf.data

The above callback is retaining the memory view and an attempt to access it later could result in an invalid memory read or write. I propose adding an explicit context managing accessor for getting back a memory view instead:

def on_preroll_handoff(self, element, buf, pad):
    with buf.get_memory_view('data') as data:
        # do something with data

This way we can leave the existing field array code as is and basic attribute access maintains safety by making managed copies. I think the context manager provides clearer semantics as to what is going on with the memory.

2) Add a new Gst API which returns GByteArray. This would require changing pygobject to remove the automatic coercion of a GByteArray into an immutable copy as PyBytes. GByteArray would need to implement the mutable sequence protocol. This can be achieved with either static bindings or implementing the proposed "get_memory_view" method described above along with a python override. For example:

class ByteArray(GLib.ByteArray):
    def __iter__(self):
        with self.get_memory_view('data') as data:
            return iter(data)
    def __setitem__(self, index, value):
        with self.get_memory_view('data') as data:
            data[index] = value

However, at this point the same could just as easily be done with a GstMapInfo and the additional Gst API returning GByteArray would be unnecessary. What do people think of the explicit "get_memory_view" method in general?
Comment 56 Simon Feltman 2012-12-07 09:38:13 UTC
In an effort to move forward with some of the patches here before they get stale. Does it make sense to break pygobject related parts into separate tickets or is it common to have a single ticket and patch set which affects multiple projects?
Comment 57 Tim-Philipp Müller 2012-12-21 00:48:47 UTC
Created attachment 232018 [details] [review]
buffer: add gst_buffer_map_read_as_bytes() API for bindings

Updated patch for GStreamer that maps a buffer and returns the data wrapped inside a GBytes. (Colin's original patch was for GstMemory, which is less-important in practice, and was also missing some bits: didn't keep ref to memory in structure, leaked helper structure if memory couldn't be mapped; unit test was missing the end marker so didn't compile, and wasn't added to the lists of tests, so presumably never run; unit test also leaked the GBytes and the memory).

In any case, now I get a GBytes in python, but how do I get to the actual data now? Also, all the comments in this bug refer to GByteArray not GBytes, so which one is prefered for this in the end? It all doesn't really feel right, we'll end up with something more convoluted for bindings than the original C API, which is quite an achievement :)


Did anyone have comments on Simon's comment #55 ? It seems we're so close...
Comment 58 Colin Walters 2013-01-07 23:00:32 UTC
(In reply to comment #57)
> 
> In any case, now I get a GBytes in python, but how do I get to the actual data
> now? 

Looks like bytes.get_data() will return it.

> Also, all the comments in this bug refer to GByteArray not GBytes, so
> which one is prefered for this in the end?

For an immutable buffer, GBytes is right.  If it's mutable, the question is more flexible.  

> It all doesn't really feel right,
> we'll end up with something more convoluted for bindings than the original C
> API, which is quite an achievement :)

Yeah...not all cases are going to be better.

> Did anyone have comments on Simon's comment #55 ? It seems we're so close...

I am not an expert on the Python side - it seems to me that option 2) is nicer there.
Comment 59 Colin Walters 2013-01-07 23:02:04 UTC
I'll take a look at having gjs take a reference on byte arrays rather than copying.
Comment 60 Alban Browaeys 2013-03-27 17:06:31 UTC
Created attachment 239976 [details] [review]
gi support for array field defined via a length field (v3)

Update vs today gobject-introspection
Comment 61 Alban Browaeys 2013-03-27 17:07:48 UTC
Created attachment 239977 [details] [review]
gi tests for array field defined via size field (v3)
Comment 62 Alban Browaeys 2013-03-27 17:09:22 UTC
Is there a sample of a context managing accessor (for to implement  get_memory_view one) ?
Comment 63 Olivier Crête 2013-03-27 17:19:04 UTC
Using GBytes won't work here, because you have no guarantee of when the _unref() is really called, it could be from a garbage collector later. GBytes.get_data() is also broken from pygi, so not so useful right now.

My current proposed solution is to have an object implemented in each bindings that would basically wrap the GstMapInfo, but do language-appropriate exceptions if you try to access it after an _unmap(). I guess it could be something like GByteArray with an added _invalidate() ?

Short of that, I think that the only safe way to do it right now is to have a _extract_dup() function (as the current gst_buffer_extract() is not g-i friendly). So I'm adding that right now to gst. But having a better API would be good..
Comment 64 Simon Feltman 2013-03-28 10:01:40 UTC
(In reply to comment #62)
> Is there a sample of a context managing accessor (for to implement 
> get_memory_view one) ?

This commit shows a context manager written in C being replaced by one written in Python:
https://git.gnome.org/browse/pygobject/commit/?id=463f660cd6bb7

See comments below as I think there is an overall safer solution to all this without using a context manager.

(In reply to comment #63)
> Using GBytes won't work here, because you have no guarantee of when the
> _unref() is really called, it could be from a garbage collector later.
> GBytes.get_data() is also broken from pygi, so not so useful right now.

After looking at the Python buffer APIs a bit more, I was hoping we could tie the lifetime of the struct wrapper to the resulting memory view returned from one of its fields. Because PyBuffer_FillInfo function accepts and "exporter" PyObject which it will manage for the duration of the buffers lifetime. However, PyMemoryView_FromBuffer seems broken in Python 3.3 in that it doesn't actually manage the given buffers "obj" field for you.

Here you can see PyMemoryView_FromBuffer actually nulls out the held object instead of managing it like I expected:
http://hg.python.org/cpython/file/8bcda4f2b3a1/Objects/memoryobject.c#l748

I think we can work around this by creating a simple sub-class of PyMemoryView which  holds a strong reference to our struct wrapper. This would guarantee the lifetime of the wrapped GBytes or GstMapInfo as long as the memory view exists.

Additionally I would like to do a bunch of re-factoring to the PyGObject marshaling code before adding new features like this as it will only compound the current mess. See bug 693405. I will specifically try to consolidate array marshaling as much as can happen so this can move forward.
Comment 65 Simon Feltman 2013-10-23 03:02:26 UTC
Review of attachment 217589 [details] [review]:

Note that this is also fixed with the patch in bug 710561
Comment 66 Simon Feltman 2013-10-28 21:50:12 UTC
Comment on attachment 217589 [details] [review]
explicitely forbid array length for non callable paramters in gi and avoid discarding array conversion

Marking as obsolete now that the patches in bug 710561 have landed.
Comment 67 Flavian 2014-07-01 15:43:31 UTC
So, in the end, how do you do it?
Good job on the topic though, lots of ideas!
Comment 68 Simon Feltman 2014-08-31 06:30:26 UTC
Comment on attachment 225364 [details] [review]
pygobject support for array field size defined via another field (v3)

This was fixed with commits:
https://git.gnome.org/browse/pygobject/commit/?id=c55d029
https://git.gnome.org/browse/pygobject/commit/?id=4e130d7
Comment 69 Simon Feltman 2014-08-31 06:33:01 UTC
Comment on attachment 239976 [details] [review]
gi support for array field defined via a length field (v3)

Fixed in bug 710561
Comment 70 Simon Feltman 2014-08-31 06:35:26 UTC
Note that PyGI supports read access to the buffer:

def on_preroll_handoff(element, buf, pad):
    data = buf.data


But you still cannot write to it. Hoping to solve this in bug 709976.
Comment 71 Tim-Philipp Müller 2014-09-01 14:49:28 UTC
buf.data sounds like GStreamer 0.10 ?
Comment 72 Simon Feltman 2014-09-02 08:29:51 UTC
(In reply to comment #71)
> buf.data sounds like GStreamer 0.10 ?

Well it's what the OP was asking about and what is used in your test from comment #28 (which now works).
Comment 73 Will Manley 2014-10-20 19:27:42 UTC
I have a workaround which I've been using since porting stb-tester to GStreamer 1.0.  It's not pretty and I'm not proud of it but it does work:  I call `gst_buffer_map` using `ctypes`[1].  The data can then be read/written in-place with the contextmanager map_gst_buffer.

The implementation can be found in the appropriately named `gst_hacks.py` in the stb-tester sources.

[1]: https://github.com/drothlis/stb-tester/blob/592e5dfefa83b4b60d4ad40e2ec2c785b38ebb92/_stbt/gst_hacks.py#L41

Some demonstrative test cases from that same file:

    def test_map_buffer_reading_data():
        Gst.init([])

        b = Gst.Buffer.new_wrapped("hello")
        with map_gst_buffer(b, Gst.MapFlags.READ) as a:
            assert 'hello' == ''.join(chr(x) for x in a)


    def test_map_buffer_modifying_data():
        Gst.init([])

        b = Gst.Buffer.new_wrapped("hello")
        with map_gst_buffer(b, Gst.MapFlags.WRITE | Gst.MapFlags.READ) as a:
            a[2] = 1

        assert b.extract_dup(0, 5) == "he\x01lo"

I'm sorry.
Comment 74 André Klapper 2015-02-07 17:17:55 UTC
[Mass-moving gobject-introspection tickets to its own Bugzilla product - see bug 708029. Mass-filter your bugmail for this message: introspection20150207 ]
Comment 75 Simon Feltman 2016-10-23 08:57:11 UTC
Comment on attachment 225363 [details] [review]
pygobject: give mutable access to the initial array (v2)

Marking as obsolete in preparation for _pygi_getbufferinfo_ support.
Comment 76 Simon Feltman 2016-10-23 09:01:49 UTC
Created attachment 338281 [details] [review]
Add zero copy buffer management to Gst.MapInfo

This is a patch for gst-python, putting it in this ticket for now. Although, it looks like the original intent of this ticket has been solved (mapinfo.data seems to work via. buffer copy but is not writable).
Comment 77 Thibault Saunier 2016-11-03 11:04:55 UTC
Review of attachment 232018 [details] [review]:

Any reason for not getting that in?
Comment 78 Thibault Saunier 2016-11-03 11:05:01 UTC
Review of attachment 232018 [details] [review]:

Any reason for not getting that in?
Comment 79 Thibault Saunier 2016-11-04 14:42:32 UTC
Review of attachment 338281 [details] [review]:

Sounds nice, but you should make it so it is added only if gi.version_info > 2.23 I think
Comment 80 Thibault Saunier 2016-11-04 14:42:38 UTC
Review of attachment 338281 [details] [review]:

Sounds nice, but you should make it so it is added only if gi.version_info > 2.23 I think
Comment 81 GNOME Infrastructure Team 2018-02-08 12:17:52 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/gobject-introspection/issues/69.