GNOME Bugzilla – Bug 688242
g_bytes_unref_to_*() are unsafe for introspection
Last modified: 2015-02-07 17:01:31 UTC
In current pygobject I have some tests that call GLib.bytes.unref_to_array(). However, this causes double-free errors under some conditions which are unfortunately not that easy to reproduce. The problem seems to be that unref_to_array() and unref_to_data() act like a destructor as they deallocate the boxed Bytes object, but GI clients would not know about that and then try to deallocate it again during garbage collection. I propose to just skip these methods to avoid that. GI clients can/should use g_bytes_get_data(), which is safe and works fine (I have tests for that in pygobject, too).
Created attachment 228877 [details] [review] GBytes: skip g_bytes_unref_to_*() Does that seem ok to you?
The only alternative is that we have an annotation for a method that consumes the reference of its instance. We do have a few of those...
Review of attachment 228877 [details] [review]: Well, they act as a regular unref, and conceptually it's the same issue as someone calling g_object_unref() directly from a binding - you can do it if you also hold a ref. Dunno. We don't (skip) g_object_unref either; someone actually made it work explicitly by adding (type GObject.Object) so we wouldn't see the "gpointer" and bail. But on the other hand, these are always actively wrong to use from a binding because the runtime must hold a ref, and if the app holds a ref, it's just the same as doing a copy, which they could do by just directly calling GLib.ByteArray.new_from_bytes(bytes) - if that constructor existed, which it doesn't =/ So...dunno =)
(In reply to comment #2) > The only alternative is that we have an annotation for a method that consumes > the reference of its instance. We do have a few of those... Yeah...you were originally supposed to be able to say (transfer full) on input arguments. Binding support here is relatively iffy though.
So GI clients could/should interpret a (transfer full) self as "this is actually a destructor", and invalidate the object afterwards with only freeing the GI metadata about it, but not the struct/object itself?
(In reply to comment #5) > So GI clients could/should interpret a (transfer full) self as "this is > actually a destructor", and invalidate the object afterwards with only freeing > the GI metadata about it, but not the struct/object itself? For a non-refcounted type it implies that the value will be destroyed, but for a refcounted type, it just means the function will steal your ref (and so you should add an extra ref before calling this function if you want your pointer to it to remain valid). In the non-refcounted case, I would say that memory-managed languages should copy the value first, and pass in the copy rather than the original. (Allowing the copy to be destroyed and leaving the original untouched.) It doesn't make sense to try to kludge up a representation for "this value has already been freed even though you might still have a pointer to it". Or: | (in) | (out) / Returns: ----------------+--------------------------+------------------------- (transfer none) | - | ref/copy after return (transfer full) | ref/copy before calling | -
(In reply to comment #6) > For a non-refcounted type it implies that the value will be destroyed, but for > a refcounted type, it just means the function will steal your ref (and so you > should add an extra ref before calling this function if you want your pointer > to it to remain valid). Right, but the whole point of g_bytes_unref_to_array() is to destroy the original GBytes, i. e. act as a destructor. This is more efficient in C. But if the GI client ups the ref count before, then in essence the method does nothing else than just a g_bytes_get_data(), and it would be rather confusing why the method doesn't do what it says on the tin?
Yes, any method whose behavior explicitly depends on manual memory management is not going to work right in automatically-memory-managed bindings.
Are there (or is it possible there will ever be) gi language bindings that use manual memory management? If not, it seems reasonable to skip this. Otherwise the method will need to be explicitly overridden to raise an error and avoid potential crashes. While this isn't necessarily a big deal, attempting to nip the problem in the bud will also help other bindings. If we can skip this, I hope we can also add reasoning as to why it is being skipped near the annotation itself. A link to this ticket or comment can be very helpful to future readers of the code base. It can be fairly annoying to come across a needed method that is marked as skip and then have to go through all the trials and tribulations that the person who marked it as skip already went through.
Also I don't think we should be skipping all ref/unref methods as a general rule as they may come in useful in binding layers. While it was not exactly a ref/unref method, I was able to fix a leak by explicitly calling GValue.unset in the python overrides __del__ method (bug 688137). Perhaps rather than skipping these methods we could expose them with an underscore (_unref) or some other annotation so the problem can be solved generically and still allow access?
It seems as though the [transfer full] annotation of a methods instance argument (self) is enough information to generalize avoidance of these methods in pygobject itself. So basically when pygobject is building methods, if the self arg is set to transfer full, we skip it or name it with an underscore.
(In reply to comment #11) > It seems as though the [transfer full] annotation of a methods instance > argument (self) is enough information to generalize avoidance of these methods > in pygobject itself. So basically when pygobject is building methods, if the > self arg is set to transfer full, we skip it or name it with an underscore. I agree, this sounds like a good idea, and more general than trying to chase all cases of these through annotations. So I'm reassigning to PyGObject; if the GLib maintainers disagree and would rather see this (skip)ed, I'm still fine with that as well, so please feel free to reassign back.
Actually, we can't. The "self" argument is not part of the .gir, so GI clients cannot actually see the transfer mode of the self argument. So if we want to fix this in a more generic way, we would need to expose self in the .gir.
(In reply to comment #13) > Actually, we can't. The "self" argument is not part of the .gir, so GI clients > cannot actually see the transfer mode of the self argument. So if we want to > fix this in a more generic way, we would need to expose self in the .gir. Note this actually was done: https://git.gnome.org/browse/gobject-introspection/commit/?id=9c0bd59b021e163eae191db95ce6beeb389cbd6f
Marking as fixed now that bug 729662 has landed and we have an API for instance argument transfer annotations. Also created bug 735076 for implementing support in pygi.
[Mass-moving gobject-introspection tickets to its own Bugzilla product - see bug 708029. Mass-filter your bugmail for this message: introspection20150207 ]