GNOME Bugzilla – Bug 678619
Provide a generic reference counting API
Last modified: 2018-05-24 14:18:12 UTC
As bindings for languages with automatic memory management become more popular, the limited availability of reference-counted types outside of GObject is becoming more of an issue. While adding reference counting to many of the types built into glib over the past several release cycles has definitely been a change for the better, in my opinion glib should make a stronger effort to facilitate creation and usage of reference counted types. While GObject has a lot to offer, I think most everyone would agree that creating GObject-derived types in C can be a bit cumbersome in terms of the amount of (mostly boilerplate) code required. Furthermore, GObjects provide many features which are simply unnecessary for many users, and some of those features can add significant execution time. Using the atomic functions provided by glib has, unfortunately, not proven to be very popular. There are examples of reference counted types outside of glib which are not derived from GObject, but they are rather few and far between. What glib needs, in my opinion, is a generic reference counting API. Something which is light and easy to use.
Created attachment 217014 [details] Possible implementation

The attached code is my idea of what such an API could look like. The public API is quite small (actually, the whole thing is quite small), but I believe it is quite useful, and it is functional. There are two parts, which I have called GShared and GSharedPtr, though I'm not overly attached to those names. GSharedPtr is based on GShared, so it makes sense to talk about GShared first.

The basic idea is that instead of using g_malloc (or the slice allocator, system malloc, etc.) to allocate storage, you use g_shared_alloc (or one of the functions which calls it) and provide a GDestroyNotify in addition to the size. The code contains documentation for each function and macro, but the prototypes probably tell you everything you need to know:

    gpointer g_shared_alloc  (gsize size, GDestroyNotify notify);
    gpointer g_shared_alloc0 (gsize size, GDestroyNotify notify);
    #define g_shared_new(type,notify) \
      ((type*) g_shared_alloc(sizeof(type),notify))
    #define g_shared_new0(type,notify) \
      ((type*) g_shared_alloc0(sizeof(type),notify))

You then proceed to use your structure just as you did with malloc, but now you can add and remove references at will using g_shared_ref and g_shared_unref:

    gpointer g_shared_ref   (gpointer data);
    void     g_shared_unref (gpointer data);

Once the final reference is removed, the GDestroyNotify passed to the allocation function is invoked, then the memory is released. If the library provides *_init and *_destroy style methods which expect you to handle the allocation and freeing of memory yourself, instead of *_new and *_free style methods (libftdi comes to mind as an example), it is actually possible to use reference counting without any changes to the API whatsoever.
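A rough, self-contained sketch of the allocation scheme described so far, written in plain C11 so it builds without glib. This is not the code from the attachment: every name here (`shared_alloc0`, `shared_ref`, `shared_unref`, `widget`) is hypothetical, and it assumes the count is kept in a small header placed just before the memory handed back to the caller.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

typedef void (*destroy_notify) (void *data);

/* Bookkeeping stored immediately before the caller's memory. The union
 * rounds the header size up to max_align_t's alignment so that the
 * payload following it stays suitably aligned for any type. */
typedef union {
  struct {
    atomic_int     ref_count;
    destroy_notify notify;
  } info;
  max_align_t align;
} shared_header;

static shared_header *
header_of (void *data)
{
  return (shared_header *) data - 1;
}

/* Allocate `size` zeroed bytes with an initial reference count of 1. */
void *
shared_alloc0 (size_t size, destroy_notify notify)
{
  shared_header *header = calloc (1, sizeof (shared_header) + size);
  if (header == NULL)
    return NULL;
  atomic_init (&header->info.ref_count, 1);
  header->info.notify = notify;
  return header + 1;
}

void *
shared_ref (void *data)
{
  atomic_fetch_add (&header_of (data)->info.ref_count, 1);
  return data;
}

void
shared_unref (void *data)
{
  shared_header *header = header_of (data);
  /* atomic_fetch_sub returns the previous value: 1 means last reference. */
  if (atomic_fetch_sub (&header->info.ref_count, 1) == 1)
    {
      if (header->info.notify != NULL)
        header->info.notify (data);
      free (header);
    }
}

/* Hypothetical init/deinit-style type: the callback cleans up the
 * contents only, the allocator owns the memory. */
typedef struct { int fd; } widget;
static int widgets_destroyed = 0;
static void widget_deinit (void *data) { (void) data; widgets_destroyed++; }

/* Returns how many times widget_deinit ran for one widget. */
int
shared_demo (void)
{
  int before = widgets_destroyed;
  widget *w = shared_alloc0 (sizeof (widget), widget_deinit);
  assert (w != NULL);

  shared_ref (w);                         /* count: 2 */
  shared_unref (w);                       /* count: 1 */
  assert (widgets_destroyed == before);   /* still alive */
  shared_unref (w);                       /* count: 0, widget_deinit runs */

  return widgets_destroyed - before;
}
```

Here `widget_deinit` plays the role of a `*_destroy` style function: it only tears down the contents, which is why a type with init/destroy style methods can be reference counted this way without touching its API.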
Of course, this style of API can also be implemented alongside a new/free API, which allows libraries to provide optional reference counting without adding a dependency on glib, much less GObject, for those who do not require it. Even libraries which provide only a new/free API can add reference counting quite trivially without breaking compatibility. There is an example of creating a reference counted type towards the end of the code (grep for "foo").

In addition to reference counting, functions for adding and removing weak references are provided:

    guint    g_shared_weak_ref   (gpointer data, GFunc func, gpointer user_data, GDestroyNotify notify);
    gboolean g_shared_weak_unref (gpointer data, guint weak_ref_id);

The code does not currently allow for a weak reference callback to prevent destruction of a type, but adding such a feature would be easy. In short, GShared gives you reference counting and weak references for basically zero additional code.

The second part, GSharedPtr, is a generic reference counted container for a single element. Simply put, it is a way of adding a reference counting layer on top of a raw pointer which doesn't offer you a way to choose the allocation function. As mentioned above, it is implemented using GShared, so the same functions still work for adding and removing references (including weak references). GSharedPtrs are created using the creatively named g_shared_ptr_new. The entire public API looks like this:

    typedef struct _GSharedPtr {
      gpointer data;
    } GSharedPtr;

    GSharedPtr* g_shared_ptr_new (gpointer data, GDestroyNotify notify);

Again, full documentation is provided in the code, but you can probably guess how it works without reading it. Simply pass around the GSharedPtr* instead of the raw pointer. You can ref and unref it at will, and weak references work. When you need to access the original pointer, just use "ptr->data" instead of "data". I've included an example of GSharedPtr (search for "callbacks") which shows one use case.
The example is a bit complex, but the vast majority of that complexity is actually unrelated to GSharedPtr... it's just creating a situation where GSharedPtr is necessary. This is a not uncommon situation I've encountered many times, so it may be worth a bit of an explanation: lots of C APIs provide a function which takes several callbacks, and a single user data and destroy notify shared between them. GObject Introspection and Vala both have a hard time handling this, so libraries are adding alternate functions which allow for one user data and one destroy notify per callback. The difficulty is in maintaining a backwards-compatible function--you can't just copy the user data and destroy notify, since that would mean the destroy notify is invoked once per callback instead of once for all of the callbacks. The example uses a GSharedPtr*, with one reference per callback, which allows the data to be changed at will.

A few notes on the current implementation:

* It uses g_malloc, not the slice allocator. This can be changed pretty easily, but I'm not sure whether it should be.
* It returns a memory location a few bytes into a real allocation. This has the potential to cause some confusion when looking at valgrind output, but I think it's the right way to go... The only alternative I can think of would be to keep a map of pointers to their metadata, but that is kind of horrible IMHO.
* Weak refs are currently stored in a GSequence... it might be better to use a GSList or possibly GQueue, since it is probably more likely there will be a small number of items.
* I'm not entirely happy with GSharedPtr. I want to address the use case, I'm just not sure GSharedPtr (at least as it is currently implemented) is the best way to do that.

Some things which might be worth adding:

* g_shared_get_ref_count
* g_shared_remove_weak_ref_by_func
* g_shared_remove_weak_ref_by_data
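The shared user data problem described in this comment can be sketched roughly as follows. This is a simplified stand-in, not the example from the attachment: `shared_ptr`, `watcher`, `slot` and `closure` are hypothetical names, the count is a plain int (single-threaded only), and the "library" is reduced to two callback slots. The point it tries to show is that the old-style entry point can be built on the per-callback one while the caller's destroy notify still runs exactly once.

```c
#include <assert.h>
#include <stdlib.h>

typedef void (*destroy_notify) (void *data);
typedef void (*event_func) (void *user_data);

/* Counted box around a raw pointer, standing in for GSharedPtr.
 * The count is a plain int: this sketch is single-threaded. */
typedef struct { void *data; int ref_count; destroy_notify notify; } shared_ptr;

static shared_ptr *
shared_ptr_new (void *data, destroy_notify notify)
{
  shared_ptr *ptr = malloc (sizeof *ptr);
  if (ptr == NULL) abort ();
  *ptr = (shared_ptr) { data, 1, notify };
  return ptr;
}

static void
shared_ptr_unref (shared_ptr *ptr)
{
  if (--ptr->ref_count > 0) return;
  if (ptr->notify != NULL) ptr->notify (ptr->data);  /* runs exactly once */
  free (ptr);
}

/* Hypothetical new-style API: each callback has its own data and notify. */
typedef struct { event_func func; void *data; destroy_notify destroy; } slot;
typedef struct { slot slots[2]; } watcher;

static void
watcher_set_full (watcher *w, slot first, slot second)
{
  w->slots[0] = first;
  w->slots[1] = second;
}

static void
watcher_clear (watcher *w)
{
  for (int i = 0; i < 2; i++)
    if (w->slots[i].destroy != NULL)
      w->slots[i].destroy (w->slots[i].data);
}

/* Per-callback closure: the caller's callback plus one reference on the box. */
typedef struct { event_func func; shared_ptr *shared; } closure;

static void closure_invoke (void *p) { closure *c = p; c->func (c->shared->data); }
static void closure_free (void *p) { closure *c = p; shared_ptr_unref (c->shared); free (c); }

static slot
closure_slot (event_func func, shared_ptr *shared)
{
  closure *c = malloc (sizeof *c);
  if (c == NULL) abort ();
  shared->ref_count++;                       /* this closure's reference */
  *c = (closure) { func, shared };
  return (slot) { closure_invoke, c, closure_free };
}

/* Old-style API kept for compatibility: one data and notify for both callbacks. */
static void
watcher_set (watcher *w, event_func first, event_func second,
             void *user_data, destroy_notify notify)
{
  shared_ptr *shared = shared_ptr_new (user_data, notify);
  watcher_set_full (w, closure_slot (first, shared), closure_slot (second, shared));
  shared_ptr_unref (shared);                 /* drop the creation reference */
}

static int destroy_calls = 0;
static void count_destroy (void *data) { (void) data; destroy_calls++; }
static void count_event (void *data) { ++*(int *) data; }

/* Returns how many times the caller's destroy notify ran. */
int
shared_ptr_demo (void)
{
  watcher w;
  int hits = 0;
  destroy_calls = 0;
  watcher_set (&w, count_event, count_event, &hits, count_destroy);
  w.slots[0].func (w.slots[0].data);
  w.slots[1].func (w.slots[1].data);
  assert (hits == 2);                        /* both callbacks saw the same data */
  watcher_clear (&w);                        /* two closure frees, one destroy */
  return destroy_calls;
}
```

Copying the user data and notify into both slots directly would have made `count_destroy` run twice; holding one reference per callback on a shared box is what keeps it at one.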
I completely agree with the need for this. Before I dive into the code, can we bikeshed debate the name a little bit? "Shared" wouldn't be the first thing that comes to mind for this for me.

"Refcounted": GRefcounted is pretty clear, but g_refcounted_ref() is kind of redundant I admit...
"Refstruct": Same issue as above
"Struct": GStruct / g_struct_ref: Not quite right because you could be refcounting a union, and the name duplication is confusing
"Base": GBase / g_base_ref() - Dunno
"Boxed": See next comment

Any other ideas?
So one thing that's very important to me from an introspection perspective is that it should be easy to make boxed types for C library authors. With your GLib-only GShared, they'd still have to manually register it. Maybe that's not a big deal, it's just adding:

    G_DEFINE_BOXED_TYPE (MyBoxed, my_boxed, g_shared_ref, g_shared_unref)

But what about having this in GObject, calling it GGenericBoxed/GBaseBoxed, and having it automatically define a GType too?
(In reply to comment #2)
> I completely agree with the need for this. Before I dive into the code, can we
> bikeshed debate the name a little bit? "Shared" wouldn't be the first thing
> that comes to mind for this for me.

I'm not really fond of "Shared" either, but I still prefer it to the alternatives. FWIW, my inspiration for that was boost's (now C++'s) shared_ptr, so I think it will make sense to a decent number of developers. I've already spent more time than I'd care to admit trying to think of names and "Shared" was the best I could do, so at this point I don't think I have anything else to contribute here. For now, at least, I'm happy to punt on naming and let others decide.

(In reply to comment #3)
> So one thing that's very important to me from an introspection perspective is
> that it should be easy to make boxed types for C library authors. With your
> GLib-only GShared, they'd still have to manually register it. Maybe that's not
> a big deal, it's just adding:
>
> G_DEFINE_BOXED_TYPE (MyBoxed, my_boxed, g_shared_ref, g_shared_unref)
>
> But what about having this in GObject, calling it GGenericBoxed/GBaseBoxed, and
> having it automatically define a GType too?

Having it in glib seems more appropriate to me for a few reasons:

* It could be used for the reference counted stuff in glib, such as GArray, GHashTable, etc. It wasn't my initial goal when I started thinking about this stuff, but the more I think about that the more I think doing so makes a lot of sense. It allows you to:
  * Present a single API to ref/unref stuff (two if you include GObject, but that can easily be explained by pointing out that GObject is feature-packed and GShared is lightweight).
  * Get rid of some (admittedly, a rather small amount of) code duplication in glib.
  * Add support for weak references.
* Not everyone is going to need, or want, to register a type... I see lots of people using this for internal stuff where they have no need for a GType.
I think if you put it in glib and make it register a boxed type the result is that you're going to end up dissuading people from using it.
* I mentioned the use case of people other than the upstream authors wanting to use this without adding a dependency on libglib or libgobject to their libraries. Maybe it's just me, but I've always thought it impolite to register someone else's type.

I think the best solution would be to have it in glib but add a macro to GObject that looks something like this:

    #define G_DEFINE_SHARED_TYPE(TypeName, type_name) \
      G_DEFINE_BOXED_TYPE(TypeName, type_name, g_shared_ref, g_shared_unref)
This is something that has been discussed for quite some while, under the idea that it would get called g_ref()/g_unref() and we'd consider rebasing GObject on top of it...
Created attachment 217344 [details] Possible implementation (v2) Updated version, after discussing this with Ryan and Jürg on IRC. First, g_ref/g_unref instead of g_shared_ref/g_shared_unref. The biggest change is that there are now three levels of API. The lowest level (g_ref_*) has callbacks for the ref and unref functions, meaning it can work with things which already have ref/unref functions, including GObject. Everything now has versions for the slice allocator (previously there was only g_malloc/g_free). I've done away with weak references. It should be possible to use this to replace all the ref and unref functions in glib and gobject. It could also be used to add other types, such as a reference counted string.
I'm not very thrilled by the idea of a 'generic refcounting' api, tbh
(In reply to comment #7)
> I'm not very thrilled by the idea of a 'generic refcounting' api, tbh

The part about providing a simple api for people to add reference counting to their stuff or the part about using it for everything in glib? Or both?
if g_ref_ref()/g_ref() and g_ref_unref()/g_unref() are perceived as a naming issue (and I'd be inclined to agree with that), I propose g_ref_acquire() and g_ref_release().
(In reply to comment #8)
> (In reply to comment #7)
> > I'm not very thrilled by the idea of a 'generic refcounting' api, tbh
>
> The part about providing a simple api for people to add reference counting to
> their stuff or the part about using it for everything in glib? Or both?

Why do you think a 'simple' api for refcounting is needed? What's hard about +/- 1? Or g_atomic_int_add, if you like?
(In reply to comment #10)
> (In reply to comment #8)
> > (In reply to comment #7)
> > > I'm not very thrilled by the idea of a 'generic refcounting' api, tbh
> >
> > The part about providing a simple api for people to add reference counting to
> > their stuff or the part about using it for everything in glib? Or both?
>
> Why do you think a 'simple' api for refcounting is needed?
> What's hard about +/- 1? Or g_atomic_int_add, if you like?

Implementing basic reference counting isn't that hard, but it is a lot of boilerplate. With the increased importance of refcounting due to things like pygobject, gjs, vala, etc., why not make it as easy as possible in order to encourage adoption? With the currently proposed API, a reference counted type would actually take one line of code less than using g_malloc/g_free (or the slice allocator, system malloc, etc.).

That said, I didn't really mean to emphasize "simple". Although I do think that's important, I think the "generic" part is much more interesting (hence its appearance in the bug title). Even for types which already implement refcounting there are a lot of benefits to a generic API. A few which come to mind are:

• Decreased code duplication: how many basically identical foo_ref and foo_unref functions are floating around out there? (Actually, that made me a bit curious... `objdump -t /opt/gnome/lib64/*.so | grep -oP '[^ ]+_unref$' | sort | uniq | wc -l` = 366 for me)
• Higher quality implementations: even the refcounting implementations in glib don't bother to check to make sure the current refcount is > 0 before increasing it. The default ref implementation could do that, and return NULL if it fails. My code currently doesn't do that either (I really just copied what is already in glib), but I'll change it.
• More full-featured: I could easily add weak references back in to this. I actually really want to discuss this, but I think that conversation might be a bit premature.
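For context on the boilerplate argument, this is roughly what a hand-written refcounted type tends to look like. It is a generic illustration in plain C11, not code taken from glib or from any of the libraries counted above; `foo` and its functions are hypothetical. Every such type repeats the same three pieces: a count field, a ref function and an unref function that frees on the last release.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type with an embedded reference count. */
typedef struct {
  atomic_int ref_count;
  char      *name;
} foo;

foo *
foo_new (const char *name)
{
  size_t len = strlen (name) + 1;
  foo *self = malloc (sizeof *self);
  if (self == NULL)
    return NULL;
  self->name = malloc (len);
  if (self->name == NULL)
    {
      free (self);
      return NULL;
    }
  memcpy (self->name, name, len);
  atomic_init (&self->ref_count, 1);   /* the creator holds the first reference */
  return self;
}

foo *
foo_ref (foo *self)
{
  assert (self != NULL);
  atomic_fetch_add (&self->ref_count, 1);
  return self;
}

void
foo_unref (foo *self)
{
  assert (self != NULL);
  /* atomic_fetch_sub returns the previous value: 1 means last reference. */
  if (atomic_fetch_sub (&self->ref_count, 1) == 1)
    {
      free (self->name);
      free (self);
    }
}
```

Only the body of the final `if` is specific to `foo`; the rest is the part a generic API would let a type author skip.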
(In reply to comment #11)
>
> • Decreased code duplication: how many basically identical foo_ref and
> foo_unref functions are floating around out there? (Actually, that
> made me a bit curious... `objdump -t /opt/gnome/lib64/*.so | grep
> -oP '[^ ]+_unref$' | sort | uniq | wc -l` = 366 for me)

Starting new API proposals with concrete users is very important. However, your methodology here of parsing some unknown set of ELF shared libraries is...weird =) It's better to gather it via source code, and map that back to module names in GNOME.

So we've already mentioned the GHashTable/GPtrArray/GSource etc. custom ref/unrefs.

A large class of things that this patch improves could be succinctly described as "custom boxed types". gnome-menus has one:

http://git.gnome.org/browse/gnome-menus/tree/libmenu/gmenu-tree.h?id=b3f0c47b46b5456d220b7cb8b384a52e71018fa3#n105

GLib itself has a number of them like:

http://git.gnome.org/browse/glib/tree/gio/gfileattribute.h?id=f416ece1039f65ce77df6983a872950c82877e37#n65

> • Higher quality implementations: even the refcounting
> implementations in glib don't bother to check to make sure the
> current refcount is > 0 before increasing it. The default ref
> implementation could do that, and return NULL if it fails. My code
> currently doesn't do that either (I really just copied what is
> already in glib), but I'll change it.

But then do you expect applications to do:

    foo = g_ref (foo);
    if (!foo) { ... do something here? what? ... }

? That seems pointless. Mismatched refcounts should be handled via g_return_if_fail() like every other API precondition in GLib.
(In reply to comment #12)
> (In reply to comment #11)
> >
> > • Decreased code duplication: how many basically identical foo_ref and
> > foo_unref functions are floating around out there? (Actually, that
> > made me a bit curious... `objdump -t /opt/gnome/lib64/*.so | grep
> > -oP '[^ ]+_unref$' | sort | uniq | wc -l` = 366 for me)
>
> Starting new API proposals with concrete users is very important. However, your
> methodology here of parsing some unknown set of ELF shared libraries is...weird
> =) It's better to gather it via source code, and map that back to module names
> in GNOME.

It's not really unknown... it's the gnome jhbuild moduleset. I just wanted to get a quick feel for how much duplication there is... the answer is "a lot".

> So we've already mentioned the GHashTable/GPtrArray/GSource etc. custom
> ref/unrefs.
>
> A large class of things that this patch improves could be succinctly described
> as "custom boxed types". gnome-menus has one:
>
> http://git.gnome.org/browse/gnome-menus/tree/libmenu/gmenu-tree.h?id=b3f0c47b46b5456d220b7cb8b384a52e71018fa3#n105
>
> GLib itself has a number of them like:
> http://git.gnome.org/browse/glib/tree/gio/gfileattribute.h?id=f416ece1039f65ce77df6983a872950c82877e37#n65

Cogl is an interesting case, too. They have a bunch of CoglObject subclasses registered with this:

http://git.gnome.org/browse/cogl/tree/cogl/cogl-object-private.h?id=51b7fdbe17f300cf2edf42c2ca740ad748c9fd78

There are also a huge number of libraries which have been defining boxed types lately just so GObject Introspection works. Sometimes these are refcounted, such as GstAtomicQueue, so they can use the ref/unref functions for the GBoxedCopyFunc/GBoxedFreeFunc. Other times they aren't refcounted and code which actually copies and frees a new instance is used, like GstSegment.
Some of those legitimately shouldn't be refcounted (ClutterColor comes to mind, though that's just my opinion), but a lot of times I think it's just done because people can't be bothered to add refcounting. Other times neither copy nor unref functions are available (e.g., GstAudioDownmixMeta in gst-plugins-base), and G-I consumers (other than Vala) just can't use that API.

Another use case would be for libraries which don't actually want to depend on GLib. libftdi keeps popping into my mind for this, because it provides these functions:

    int  ftdi_init   (struct ftdi_context *ftdi);
    void ftdi_deinit (struct ftdi_context *ftdi);

And then their new/free functions just wrap that up in a malloc, so you're free to use whatever memory allocation scheme you want (including just putting a struct ftdi_context on the stack). That means we can do something like this:

    struct ftdi_context* ctx = g_shared_new_slice (struct ftdi_context, ftdi_deinit);

And we have a refcounted libftdi context. Other libraries, such as SQLite, provide the ability for users to set custom malloc-like functions. You could use a g_shared_alloc wrapper to get reference counted database connections, stored procedures, etc. The big problem I see with this is that it could be hard to know whether or not a type was refcounted... I'm not sure what the solution to that is, or if one exists. Perhaps we could just encourage the person doing it to create a new typedef, or maybe just rely on a G-I annotation.

> > • Higher quality implementations: even the refcounting
> > implementations in glib don't bother to check to make sure the
> > current refcount is > 0 before increasing it. The default ref
> > implementation could do that, and return NULL if it fails. My code
> > currently doesn't do that either (I really just copied what is
> > already in glib), but I'll change it.
>
> But then do you expect applications to do:
>
> foo = g_ref (foo);
> if (!foo) { ... do something here? what? ... }
>
> ?
> That seems pointless. Mismatched refcounts should be handled via
> g_return_if_fail() like every other API precondition in GLib.

The ref functions don't have void return values, so g_return_if_fail would be a bug. You would want g_return_val_if_fail(ref_count > 0, NULL) so that, hopefully, when you go to use the reference you'll be passing NULL instead of invalid memory and your application will emit a critical (because most methods which operate on that type should do their own g_return_if_fail (data != NULL)) instead of segfaulting.

That said, depending on what happens with the weak references stuff, we may have to not consider trying to ref an object with ref_count == 0 to be a bug (at least not for multi-threaded code with weak references), and just return NULL silently instead of using g_return_if_fail.
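The "return NULL silently when the count is already zero" behaviour can be sketched in plain C11 as below. This is an illustration under assumptions, not the proposed g_ref implementation: `counted`, `counted_try_ref` and `counted_unref` are hypothetical names, and the sketch assumes the object's memory is still valid while its count is zero (for example because a weak reference keeps the storage around). A separate "check, then increment" would race with a concurrent final unref, so the check and the increment are folded into one compare-and-exchange loop.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical counted object; only the count matters for this sketch. */
typedef struct {
  atomic_int ref_count;
} counted;

/* Try to take a reference. Returns the object on success, or NULL if the
 * count has already dropped to zero. */
counted *
counted_try_ref (counted *obj)
{
  assert (obj != NULL);

  int old = atomic_load (&obj->ref_count);
  while (old > 0)
    {
      /* On failure the call stores the current count into `old`, and the
       * loop condition re-checks it before trying again. */
      if (atomic_compare_exchange_weak (&obj->ref_count, &old, old + 1))
        return obj;
    }
  return NULL;
}

/* Returns true when the caller dropped the last reference and is now
 * responsible for destroying the object. */
bool
counted_unref (counted *obj)
{
  assert (obj != NULL);
  return atomic_fetch_sub (&obj->ref_count, 1) == 1;
}
```

Whether a failed `counted_try_ref` should also log a critical, as g_return_val_if_fail would, is exactly the open question in the comment above; the sketch stays silent.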
by the by, I have (Valgrind-safe) refcounted memory areas in glib-bonghits: https://github.com/ebassi/glib-bonghits/blob/master/glib-bonghits/gb-ref-ptr.c which I used for refcounted static strings.
Hi Evan, Are you still working on this? I think it'd be quite useful.
Colin, sorry, I somehow missed the notification for your comment. Feel free to ping me on IRC if I ignore you again :) I'm still interested, but I'm not really working on this right now. I'd be happy to pick it back up if I could see a path for this to get accepted, though. I think the general feeling was that people wanted to bolt more stuff on top of this (such as RTTI), to the point where it would basically rival GObject in scope, but nobody wants two GObject APIs in glib, so it would mean rebasing GObject on top of this (as Ryan mentioned in #c5). Unfortunately that would mean breaking backwards compatibility, which is not going to happen any time soon. IIRC Ryan didn't like the weak references part of this API, and wanted something like GWeakRef instead. Since that's pretty much the only reasonable way to provide thread safety I agree with him, and I'd be willing to update the patch if I thought it might lead to this being accepted. Actually, the reason I came back here is to grab the code so I could play with using hazard pointers to implement something like that.
-- GitLab Migration Automatic Message -- This bug has been migrated to GNOME's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/glib/issues/561.