Bug 678619 - Provide a generic reference counting API
Status: RESOLVED OBSOLETE
Product: glib
Classification: Platform
Component: general
Version: unspecified
Hardware: Other Linux
Importance: Normal enhancement
Target Milestone: ---
Assigned To: gtkdev
QA Contact: gtkdev
Reported: 2012-06-22 10:18 UTC by Evan Nemerson
Modified: 2018-05-24 14:18 UTC
Attachments
Possible implementation (15.34 KB, text/plain)
2012-06-22 10:22 UTC, Evan Nemerson
Possible implementation (v2) (18.05 KB, text/plain)
2012-06-27 04:40 UTC, Evan Nemerson

Description Evan Nemerson 2012-06-22 10:18:46 UTC
As bindings for languages with automatic memory management become more popular, the limited availability of reference-counted types outside of GObject is becoming more of an issue.  While adding reference counting to many of the types built into glib over the past several release cycles has definitely been a change for the better, in my opinion glib should make a stronger effort to facilitate creation and usage of reference counted types.

While GObject has a lot to offer, I think most everyone would agree that creating GObject-derived types in C can be a bit cumbersome in terms of the amount of (mostly boilerplate) code required.  Furthermore, GObjects provide many features which are simply unnecessary for many users, and some of those features can add significant execution time.

Using the atomic functions provided by glib has, unfortunately, not proven to be very popular.  There are examples of reference counted types outside of glib which are not derived from GObject, but they are rather few and far between.

What glib needs, in my opinion, is a generic reference counting API.  Something which is light and easy to use.
Comment 1 Evan Nemerson 2012-06-22 10:22:39 UTC
Created attachment 217014
Possible implementation

The code attached is my idea of what such an API could look like.  The
public API is quite small (actually, the whole thing is quite small)
but I believe it is quite useful, and it is functional.

There are two parts, which I have called GShared and GSharedPtr,
though I'm not overly attached to those names.  GSharedPtr is based on
GShared, so it makes sense to talk about GShared first.

The basic idea is that instead of using g_malloc (or the slice
allocator, system malloc, etc.) to allocate storage you instead use
g_shared_alloc (or one of the functions which calls it) and provide a
GDestroyNotify in addition to the size.  The code contains
documentation for each function and macro, but really the prototypes
probably tell you everything you need to know:


gpointer g_shared_alloc  (gsize size, GDestroyNotify notify);
gpointer g_shared_alloc0 (gsize size, GDestroyNotify notify);

#define g_shared_new(type,notify) \
  ((type*) g_shared_alloc(sizeof(type),notify))
#define g_shared_new0(type,notify) \
  ((type*) g_shared_alloc0(sizeof(type),notify))


You then proceed to use your structure just as you did with malloc, but
now you can add and remove references at will using g_shared_ref and
g_shared_unref:


gpointer g_shared_ref   (gpointer data);
void     g_shared_unref (gpointer data);


Once the final reference is removed, the GDestroyNotify passed to the
allocation function is invoked, then the memory is released.
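
A minimal sketch of how the scheme described above could work, assuming the
reference count and the destroy callback live in a small header placed
directly in front of the memory handed to the caller.  This is not the
attached code: shared_alloc, shared_ref, shared_unref, SharedHeader and
DestroyFunc are hypothetical stand-ins written against plain C11 so the
example builds without glib.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for GDestroyNotify. */
typedef void (*DestroyFunc)(void *data);

/* Bookkeeping stored immediately before the caller's memory.  The union
 * with max_align_t keeps the caller's pointer suitably aligned. */
typedef union {
  struct {
    atomic_int ref_count;
    DestroyFunc notify;
  } h;
  max_align_t align;
} SharedHeader;

/* Allocate size bytes with an initial reference count of 1. */
static void *shared_alloc(size_t size, DestroyFunc notify) {
  SharedHeader *hdr = malloc(sizeof *hdr + size);
  if (hdr == NULL)
    return NULL;
  atomic_init(&hdr->h.ref_count, 1);
  hdr->h.notify = notify;
  return hdr + 1; /* the caller sees the memory just past the header */
}

/* Step back from the caller's pointer to the hidden header. */
static SharedHeader *shared_header(void *data) {
  assert(data != NULL);
  return (SharedHeader *)data - 1;
}

static void *shared_ref(void *data) {
  atomic_fetch_add(&shared_header(data)->h.ref_count, 1);
  return data;
}

/* Drop one reference; on the last one, run notify and free the block. */
static void shared_unref(void *data) {
  SharedHeader *hdr = shared_header(data);
  if (atomic_fetch_sub(&hdr->h.ref_count, 1) == 1) {
    if (hdr->h.notify != NULL)
      hdr->h.notify(data);
    free(hdr);
  }
}

/* Small helper for trying the sketch out: counts destroy invocations. */
static int destroy_count = 0;
static void count_destroy(void *data) {
  (void)data;
  destroy_count++;
}
```

With this layout, a pointer returned by shared_alloc must only ever be
released through shared_unref; passing it to free would hand the allocator
an address in the middle of a block.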

If the library provides *_init and *_destroy style methods which
expect you to handle the allocation and freeing of memory yourself
instead of *_new and *_free style methods (libftdi comes to mind as an
example), it is actually possible to use reference counting without any
changes to the API whatsoever.  Of course, this style API can also be
implemented alongside of a new/free API, which allows libraries to
provide optional reference counting without adding a dependency on
glib, much less GObject, for those who do not require it.

Even libraries which provide only a new/free API can add reference
counting quite trivially without breaking compatibility.  There is an
example of creating a reference counted type towards the end of the
code (grep for "foo").

In addition to reference counting, functions for adding and removing
weak references are provided:


guint g_shared_weak_ref (gpointer data, GFunc func,
                         gpointer user_data,
                         GDestroyNotify notify);
gboolean g_shared_weak_unref (gpointer data, guint weak_ref_id);


The code does not currently allow for a weak reference callback to
prevent destruction of a type, but adding such a feature would be
easy.
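
A rough sketch of how weak references of this shape could behave, assuming
each one is a callback that fires just before the final destroy, and that the
destroy notify for the callback's user data runs either after the callback
fires or when the weak reference is removed.  All names here are
hypothetical, the bookkeeping is a plain linked list rather than the
GSequence the patch uses, and the counter is deliberately non-atomic, so this
is a single-threaded illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef void (*DestroyFunc)(void *data);
typedef void (*WeakFunc)(void *data, void *user_data);

typedef struct WeakRef {
  unsigned id;
  WeakFunc func;
  void *user_data;
  DestroyFunc user_data_notify;
  struct WeakRef *next;
} WeakRef;

/* Header placed in front of the caller's memory (hypothetical layout). */
typedef union {
  struct {
    int ref_count;
    unsigned next_weak_id;
    DestroyFunc notify;
    WeakRef *weak_refs;
  } h;
  max_align_t align;
} Header;

static Header *header_of(void *data) {
  assert(data != NULL);
  return (Header *)data - 1;
}

static void *shared_alloc(size_t size, DestroyFunc notify) {
  Header *hdr = malloc(sizeof *hdr + size);
  if (hdr == NULL)
    return NULL;
  hdr->h.ref_count = 1;
  hdr->h.next_weak_id = 1;
  hdr->h.notify = notify;
  hdr->h.weak_refs = NULL;
  return hdr + 1;
}

static void *shared_ref(void *data) {
  header_of(data)->h.ref_count++;
  return data;
}

/* Register a weak reference; returns a non-zero id, or 0 on failure. */
static unsigned shared_weak_ref(void *data, WeakFunc func, void *user_data,
                                DestroyFunc user_data_notify) {
  Header *hdr = header_of(data);
  WeakRef *w = malloc(sizeof *w);
  if (w == NULL)
    return 0;
  w->id = hdr->h.next_weak_id++;
  w->func = func;
  w->user_data = user_data;
  w->user_data_notify = user_data_notify;
  w->next = hdr->h.weak_refs;
  hdr->h.weak_refs = w;
  return w->id;
}

/* Remove a weak reference without firing it; true if the id was found. */
static bool shared_weak_unref(void *data, unsigned id) {
  Header *hdr = header_of(data);
  for (WeakRef **p = &hdr->h.weak_refs; *p != NULL; p = &(*p)->next) {
    if ((*p)->id == id) {
      WeakRef *w = *p;
      *p = w->next;
      if (w->user_data_notify != NULL)
        w->user_data_notify(w->user_data);
      free(w);
      return true;
    }
  }
  return false;
}

/* On the last unref, fire every weak reference, then destroy and free. */
static void shared_unref(void *data) {
  Header *hdr = header_of(data);
  if (--hdr->h.ref_count > 0)
    return;
  while (hdr->h.weak_refs != NULL) {
    WeakRef *w = hdr->h.weak_refs;
    hdr->h.weak_refs = w->next;
    w->func(data, w->user_data);
    if (w->user_data_notify != NULL)
      w->user_data_notify(w->user_data);
    free(w);
  }
  if (hdr->h.notify != NULL)
    hdr->h.notify(data);
  free(hdr);
}

/* Example weak callback: records that the object has gone away. */
static void mark_gone(void *data, void *user_data) {
  (void)data;
  *(int *)user_data = 1;
}
```

A weak callback in this sketch cannot keep the object alive; it only gets a
last look at it before the destroy callback runs.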

In short, GShared gives you reference counting and weak references
for essentially zero additional code.

The second part, GSharedPtr, is a generic reference counted container
for a single element.  Simply put, it is a way of adding a reference
counting layer on top of a raw pointer which doesn't offer you a way
to choose the allocation function.  As mentioned above, it is
implemented using GShared, so the same functions still work for adding
and removing references (including weak references).

GSharedPtrs are created using the creatively named g_shared_ptr_new.
The entire public API looks like this:


typedef struct _GSharedPtr {
  gpointer data;
} GSharedPtr;

GSharedPtr* g_shared_ptr_new (gpointer data, GDestroyNotify notify);


Again, full documentation is provided in the code, but you can
probably guess how it works without reading it.  Simply pass around
the GSharedPtr* instead of the raw pointer.  You can ref and unref it
at will, and weak references work.  When you need to access the
original pointer, just use "ptr->data" instead of "data".

I've included an example of GSharedPtr (search for "callbacks") which
shows one use case.  The example is a bit complex, but the vast
majority of that complexity is actually unrelated to GSharedPtr...
it's just creating a situation where GSharedPtr is necessary.  This
example is a not uncommon situation I've encountered many times, so it
may be worth a bit of an explanation:

Lots of C APIs provide a function which takes several callbacks, and a
single user data and destroy notify shared between them.  GObject
Introspection and Vala both have a hard time handling this so
libraries are adding alternate functions which allow for one user data
and one destroy notify per callback.  The difficulty is in maintaining
a backwards-compatible function--you can't just copy the user data and
destroy notify, since that would mean the destroy notify is invoked
once per callback instead of once for all of the callbacks.  The
example uses a GSharedPtr*, with one reference per callback, which
allows the data to be changed at will.
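
The situation above can be sketched roughly as follows.  This is not the
example from the attachment: Watcher, watcher_set, watcher_set_full and the
SharedPtr helpers are hypothetical names, and the counter is non-atomic for
brevity.  The assumption is that the library stores one refcounted box per
callback internally, so the legacy entry point can share a single box between
both callbacks and the shared destroy notify still runs exactly once.

```c
#include <assert.h>
#include <stdlib.h>

typedef void (*DestroyFunc)(void *data);
typedef void (*EventFunc)(void *user_data);

/* Refcounted box around a raw pointer, in the spirit of GSharedPtr. */
typedef struct {
  void *data;
  int ref_count;
  DestroyFunc notify;
} SharedPtr;

static SharedPtr *shared_ptr_new(void *data, DestroyFunc notify) {
  SharedPtr *p = malloc(sizeof *p);
  if (p == NULL)
    abort();
  p->data = data;
  p->ref_count = 1;
  p->notify = notify;
  return p;
}

static SharedPtr *shared_ptr_ref(SharedPtr *p) {
  p->ref_count++;
  return p;
}

/* The wrapped data is destroyed only when the last reference goes away. */
static void shared_ptr_unref(SharedPtr *p) {
  if (--p->ref_count == 0) {
    if (p->notify != NULL)
      p->notify(p->data);
    free(p);
  }
}

/* Hypothetical library object holding two callbacks. */
typedef struct {
  EventFunc on_open;
  SharedPtr *open_data;
  EventFunc on_close;
  SharedPtr *close_data;
} Watcher;

/* New-style API: one user data and one destroy notify per callback. */
static void watcher_set_full(Watcher *w,
                             EventFunc on_open, void *open_data,
                             DestroyFunc open_destroy,
                             EventFunc on_close, void *close_data,
                             DestroyFunc close_destroy) {
  w->on_open = on_open;
  w->open_data = shared_ptr_new(open_data, open_destroy);
  w->on_close = on_close;
  w->close_data = shared_ptr_new(close_data, close_destroy);
}

/* Legacy API: a single user data and destroy notify shared by both. */
static void watcher_set(Watcher *w, EventFunc on_open, EventFunc on_close,
                        void *user_data, DestroyFunc destroy) {
  SharedPtr *shared = shared_ptr_new(user_data, destroy);
  w->on_open = on_open;
  w->open_data = shared;                  /* first reference */
  w->on_close = on_close;
  w->close_data = shared_ptr_ref(shared); /* one more for the 2nd callback */
}

static void watcher_emit_open(Watcher *w)  { w->on_open(w->open_data->data); }
static void watcher_emit_close(Watcher *w) { w->on_close(w->close_data->data); }

/* Releases one reference per callback, whichever API installed them. */
static void watcher_clear(Watcher *w) {
  shared_ptr_unref(w->open_data);
  shared_ptr_unref(w->close_data);
  w->open_data = w->close_data = NULL;
}

/* Helpers for trying the sketch out. */
static int destroy_calls = 0;
static void count_destroy(void *data) { (void)data; destroy_calls++; }
static void bump(void *user_data) { ++*(int *)user_data; }
```

Installing callbacks through watcher_set and then calling watcher_clear runs
the shared destroy notify once, while the same sequence through
watcher_set_full runs each per-callback notify once.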

A few notes on the current implementation:

 * It uses g_malloc, not the slice allocator.  This can be changed
   pretty easily, but I'm not sure whether it should be.
 * It returns a memory location a few bytes into a real allocation.
   This has the potential to cause some confusion when looking at
   valgrind output, but I think it's the right way to go... The only
   alternative I can think of would be to keep a map of pointers to
   their metadata, but that is kind of horrible IMHO.
 * Weak refs are currently stored in a GSequence... it might be better
   to use a GSList or possibly GQueue since it is probably more likely
   there will be a small number of items.
 * I'm not entirely happy with GSharedPtr.  I want to address the use
   case, I'm just not sure GSharedPtr (at least as it is currently
   implemented) is the best way to do that.

Some things which might be worth adding:

 * g_shared_get_ref_count
 * g_shared_remove_weak_ref_by_func
 * g_shared_remove_weak_ref_by_data
Comment 2 Colin Walters 2012-06-22 14:45:03 UTC
I completely agree with the need for this.  Before I dive into the code, can we bikeshed debate the name a little bit?  "Shared" wouldn't be the first thing that comes to mind for this for me.  

"Refcounted": GRefcounted is pretty clear, but g_refcounted_ref() is kind of redundant I admit...
"Refstruct": Same issue as above
"Struct": GStruct / g_struct_ref: Not quite right because you could be refcounting a union, and the name duplication is confusing
"Base": GBase / g_base_ref() - Dunno
"Boxed": See next comment

Any other ideas?
Comment 3 Colin Walters 2012-06-22 14:58:02 UTC
So one thing that's very important to me from an introspection perspective is that it should be easy to make boxed types for C library authors.  With your GLib-only GShared, they'd still have to manually register it.  Maybe that's not a big deal, it's just adding:

G_DEFINE_BOXED_TYPE (MyBoxed, my_boxed, g_shared_ref, g_shared_unref)

But what about having this in GObject, calling it GGenericBoxed/GBaseBoxed, and having it automatically define a GType too?
Comment 4 Evan Nemerson 2012-06-22 17:52:13 UTC
(In reply to comment #2)
> I completely agree with the need for this.  Before I dive into the code, can we
> bikeshed debate the name a little bit?  "Shared" wouldn't be the first thing
> that comes to mind for this for me.

I'm not really fond of "Shared" either, but I still prefer it to the alternatives.  FWIW, my inspiration for that was boost's (now C++'s) shared_ptr, so I think it will make sense to a decent number of developers.  I've already spent more time than I'd care to admit trying to think of names and "Shared" was the best I could do, so at this point I don't think I have anything else to contribute here.  For now, at least, I'm happy to punt on naming and let others decide.

(In reply to comment #3)
> So one thing that's very important to me from an introspection perspective is
> that it should be easy to make boxed types for C library authors.  With your
> GLib-only GShared, they'd still have to manually register it.  Maybe that's not
> a big deal, it's just adding:
> 
> G_DEFINE_BOXED_TYPE (MyBoxed, my_boxed, g_shared_ref, g_shared_unref)
> 
> But what about having this in GObject, calling it GGenericBoxed/GBaseBoxed, and
> having it automatically define a GType too?

Having it in glib seems more appropriate to me for a few reasons:

 * It could be used for the reference counted stuff in glib, such as
   GArray, GHashTable, etc.  It wasn't my initial goal when I started
   thinking about this stuff, but the more I think about that the more
   I think doing so makes a lot of sense.  It allows you to:
   * Present a single API to ref/unref stuff (two if you include
     GObject, but that can easily be explained by pointing out that
     GObject is feature-packed and GShared is lightweight).
   * Get rid of some (admittedly, a rather small amount of) code
     duplication in glib.
   * Add support for weak references.
 * Not everyone is going to need, or want, to register a type...  I
   see lots of people using this for internal stuff where they have no
   need for a GType.  I think if you put it in glib and make it
   register a boxed type the result is that you're going to end up
   dissuading people from using it.
 * I mentioned the use case of people other than the upstream authors
   wanting to use this without adding a dependency on libglib or
   libgobject to their libraries.  Maybe it's just me, but I've always
   thought it impolite to register someone else's type.

I think the best solution would be to have it in glib but add a macro to GObject that looks something like this:

#define G_DEFINE_SHARED_TYPE(TypeName, type_name) \
  G_DEFINE_BOXED_TYPE(TypeName, type_name, g_shared_ref, g_shared_unref)
Comment 5 Allison Karlitskaya (desrt) 2012-06-26 21:02:13 UTC
This is something that has been discussed for quite some while, under the idea that it would get called g_ref()/g_unref() and we'd consider rebasing GObject on top of it...
Comment 6 Evan Nemerson 2012-06-27 04:40:30 UTC
Created attachment 217344
Possible implementation (v2)

Updated version, after discussing this with Ryan and Jürg on IRC.

First, g_ref/g_unref instead of g_shared_ref/g_shared_unref.

The biggest change is that there are now three levels of API.  The lowest level (g_ref_*) has callbacks for the ref and unref functions, meaning it can work with things which already have ref/unref functions, including GObject.

Everything now has versions for the slice allocator (previously there was only g_malloc/g_free).

I've done away with weak references.

It should be possible to use this to replace all the ref and unref functions in glib and gobject.  It could also be used to add other types, such as a reference counted string.
Comment 7 Matthias Clasen 2012-06-28 17:35:23 UTC
I'm not very thrilled by the idea of a 'generic refcounting' api, tbh
Comment 8 Evan Nemerson 2012-06-29 01:45:43 UTC
(In reply to comment #7)
> I'm not very thrilled by the idea of a 'generic refcounting' api, tbh

The part about providing a simple api for people to add reference counting to their stuff or the part about using it for everything in glib?  Or both?
Comment 9 Emmanuele Bassi (:ebassi) 2012-07-03 09:08:38 UTC
if g_ref_ref()/g_ref() and g_ref_unref()/g_unref() are perceived as a naming issue (and I'd be inclined to agree with that), I propose g_ref_acquire() and g_ref_release().
Comment 10 Matthias Clasen 2012-07-03 13:55:21 UTC
(In reply to comment #8)
> (In reply to comment #7)
> > I'm not very thrilled by the idea of a 'generic refcounting' api, tbh
> 
> The part about providing a simple api for people to add reference counting to
> their stuff or the part about using it for everything in glib?  Or both?

Why do you think a 'simple' api for refcounting is needed?
What's hard about +/- 1?  Or g_atomic_add, if you like?
Comment 11 Evan Nemerson 2012-07-04 07:25:47 UTC
(In reply to comment #10)
> (In reply to comment #8)
> > (In reply to comment #7)
> > > I'm not very thrilled by the idea of a 'generic refcounting' api, tbh
> > 
> > The part about providing a simple api for people to add reference counting to
> > their stuff or the part about using it for everything in glib?  Or both?
> 
> Why do you think a 'simple' api for refcounting is needed ?
> Whats hard about +/- 1 ? Or g_atomic_add, if you like ?

Implementing basic reference counting isn't that hard, but it is a lot of boilerplate.  With the increased importance of refcounting due to things like pygobject, gjs, vala, etc., why not make it as easy as possible in order to encourage adoption?  The currently proposed API would actually result in one less line of code in order to have a reference counted type instead of g_malloc/g_free (or the slice allocator, system malloc, etc.).

That said, I didn't really mean to emphasize "simple".  Although I do think that's important, I think the "generic" part is much more interesting (hence its appearance in the bug title).  Even for types which already implement refcounting there are a lot of benefits to a generic API.  A few which come to mind are:

• Decreased code duplication — how many basically identical foo_ref and
  foo_unref functions are floating around out there?  (Actually, that
  made me a bit curious... `objdump -t /opt/gnome/lib64/*.so | grep
  -oP '[^ ]+_unref$' | sort | uniq | wc -l` = 366 for me)
• Higher quality implementations — even the refcounting
  implementations in glib don't bother to check to make sure the
  current refcount is > 0 before increasing it.  The default ref
  implementation could do that, and return NULL if it fails.  My code
  currently doesn't do that either (I really just copied what is
  already in glib), but I'll change it.
• More full-featured — I could easily add weak references back in to
  this.  I actually really want to discuss this, but I think that
  conversation might be a bit premature.
Comment 12 Colin Walters 2012-07-05 13:53:49 UTC
(In reply to comment #11)
>
> • Decreased code duplication — how many basically identical foo_ref and
>   foo_unref functions are floating around out there?  (Actually, that
>   made me a bit curious... `objdump -t /opt/gnome/lib64/*.so | grep
>   -oP '[^ ]+_unref$' | sort | uniq | wc -l` = 366 for me)

Starting new API proposals with concrete users is very important.  However, your methodology here of parsing some unknown set of ELF shared libraries is...weird =)  It's better to gather it via source code, and map that back to module names in GNOME.

So we've already mentioned the GHashTable/GPtrArray/GSource etc. custom ref/unrefs.

A large class of things that this patch improves could be succinctly described as "custom boxed types".  gnome-menus has one:

http://git.gnome.org/browse/gnome-menus/tree/libmenu/gmenu-tree.h?id=b3f0c47b46b5456d220b7cb8b384a52e71018fa3#n105

GLib itself has a number of them like:
http://git.gnome.org/browse/glib/tree/gio/gfileattribute.h?id=f416ece1039f65ce77df6983a872950c82877e37#n65

> • Higher quality implementations — even the refcounting
>   implementations in glib don't bother to check to make sure the
>   current refcount is > 0 before increasing it.  The default ref
>   implementation could do that, and return NULL if it fails.  My code
>   currently doesn't do that either (I really just copied what is
>   already in glib), but I'll change it.

But then do you expect applications to do:

foo = g_ref (foo);
if (!foo) { ... do something here?  what? ... }

?  That seems pointless.  Mismatched refcounts should be handled via g_return_if_fail() like every other API precondition in GLib.
Comment 13 Evan Nemerson 2012-07-05 19:04:00 UTC
(In reply to comment #12)
> (In reply to comment #11)
> >
> > • Decreased code duplication — how many basically identical foo_ref and
> >   foo_unref functions are floating around out there?  (Actually, that
> >   made me a bit curious... `objdump -t /opt/gnome/lib64/*.so | grep
> >   -oP '[^ ]+_unref$' | sort | uniq | wc -l` = 366 for me)
> 
> Starting new API proposals with concrete users is very important.  However your
> methodology here of parsing some unknown set of ELF shared libraries is...weird
> =)  It's better to gather it via source code, and map that back to module names
> in GNOME.

It's not really unknown... it's the gnome jhbuild moduleset.  I just wanted to get a quick feel for how much duplication there is...  the answer is "a lot".

> So we've already mentioned the GHashTable/GPtrArray/GSource etc. custom
> ref/unrefs.
> 
> A large class of things that this patch improves could be succinctly described
> as "custom boxed types".  gnome-menus has one:
> 
> http://git.gnome.org/browse/gnome-menus/tree/libmenu/gmenu-tree.h?id=b3f0c47b46b5456d220b7cb8b384a52e71018fa3#n105
> 
> GLib itself has a number of them like:
> http://git.gnome.org/browse/glib/tree/gio/gfileattribute.h?id=f416ece1039f65ce77df6983a872950c82877e37#n65

Cogl is an interesting case, too.  They have a bunch of CoglObject subclasses registered with this:

http://git.gnome.org/browse/cogl/tree/cogl/cogl-object-private.h?id=51b7fdbe17f300cf2edf42c2ca740ad748c9fd78

There are also a huge number of libraries which have been defining boxed types lately just so GObject Introspection works.  Sometimes these are refcounted, such as GstAtomicQueue, so they can use the ref/unref functions for the GBoxedCopyFunc/GBoxedFreeFunc.

Other times they aren't refcounted and code which actually copies and frees a new instance is used, like GstSegment.  Some of those legitimately shouldn't be refcounted (ClutterColor comes to mind, though that's just my opinion), but a lot of times I think it's just done because people can't be bothered to add refcounting.

Other times neither copy nor unref functions are available (e.g., GstAudioDownmixMeta in gst-plugins-base), and G-I consumers (other than Vala) just can't use that API.

Another use case would be for libraries which don't actually want to depend on GLib.  libftdi keeps popping into my mind for this, because it provides these functions:


int ftdi_init (struct ftdi_context *ftdi);
void ftdi_deinit (struct ftdi_context *ftdi);


And then their new/free functions just wrap that up in a malloc, so you're free to use whatever memory allocation scheme you want (including just putting a struct ftdi_context on the stack).  That means we can do something like this:


struct ftdi_context* ctx =
    g_shared_new_slice (struct ftdi_context, (GDestroyNotify) ftdi_deinit);


And we have a refcounted libftdi context.  Other libraries, such as SQLite, provide the ability for users to set custom malloc-like functions.  You could use a g_shared_alloc wrapper to create reference counted database connections, stored procedures, etc.

The big problem I see with this is that it could be hard to know whether or not a type was refcounted...  I'm not sure what the solution to that is, or if one exists.  Perhaps we could just encourage the person doing it to just create a new typedef, or maybe just rely on a G-I annotation.

> > • Higher quality implementations — even the refcounting
> >   implementations in glib don't bother to check to make sure the
> >   current refcount is > 0 before increasing it.  The default ref
> >   implementation could do that, and return NULL if it fails.  My code
> >   currently doesn't do that either (I really just copied what is
> >   already in glib), but I'll change it.
> 
> But then do you expect applications to do:
> 
> foo = g_ref (foo);
> if (!foo) { ... do something here?  what? ... }
> 
> ?  That seems pointless.  Mismatched refcounts should be handled via
> g_return_if_fail() like every other API precondition in GLib.

The ref functions don't have void return values, so g_return_if_fail would be a bug.  You would want g_return_val_if_fail(ref_count > 0, NULL) so that, hopefully, when you go to use the reference you'll be passing NULL instead of invalid memory and your application will emit a critical (because most methods which operate on that type should do their own g_return_if_fail (data != NULL)) instead of segfaulting.

That said, depending on what happens with the weak references stuff, we may have to not consider trying to ref an object with ref_count == 0 to be a bug (at least not for multi-threaded code with weak references), and just return NULL silently instead of using g_return_if_fail.
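
A small sketch of the "return NULL silently" variant, assuming the storage
that holds the counter is still valid when the count has already reached
zero (for instance because a weak-reference scheme keeps it alive).  The
names Counted, counted_try_ref and counted_unref are hypothetical; the point
is only that the increment and the "> 0" check have to happen as one atomic
step, which a compare-and-swap loop provides.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
  atomic_int ref_count;
} Counted;

/* Take a reference only if the object is still alive.
 * Returns obj on success, or NULL if the count had already hit zero. */
static Counted *counted_try_ref(Counted *obj) {
  int old = atomic_load(&obj->ref_count);
  while (old > 0) {
    /* On failure, old is reloaded with the current value and we retry. */
    if (atomic_compare_exchange_weak(&obj->ref_count, &old, old + 1))
      return obj;
  }
  return NULL;
}

/* Drop a reference; returns true if this was the last one. */
static bool counted_unref(Counted *obj) {
  return atomic_fetch_sub(&obj->ref_count, 1) == 1;
}
```

A plain load followed by a separate increment would not be enough here,
since another thread could drop the last reference between the two steps.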
Comment 14 Emmanuele Bassi (:ebassi) 2013-09-10 16:08:14 UTC
by the by, I have (Valgrind-safe) refcounted memory areas in glib-bonghits: https://github.com/ebassi/glib-bonghits/blob/master/glib-bonghits/gb-ref-ptr.c

which I used for refcounted static strings.
Comment 15 Colin Walters 2014-01-02 14:31:46 UTC
Hi Evan,

Are you still working on this?  I think it'd be quite useful.
Comment 16 Evan Nemerson 2014-01-25 05:12:34 UTC
Colin, sorry, I somehow missed the notification for your comment.  Feel free to ping me on IRC if I ignore you again :)

I'm still interested, but I'm not really working on this right now.  I'd be happy to pick it back up if I could see a path for this to get accepted, though.

I think the general feeling was that people wanted to bolt more stuff on top of this (such as RTTI), to the point where it would basically rival GObject in scope, but nobody wants two GObject APIs in glib, so it would mean rebasing GObject on top of this (as Ryan mentioned in #c5).  Unfortunately that would mean breaking backwards compatibility, which is not going to happen any time soon.

IIRC Ryan didn't like the weak references part of this API, and wanted something like GWeakRef instead.  Since that's pretty much the only reasonable way to provide thread safety I agree with him, and I'd be willing to update the patch if I thought it might lead to this being accepted.  Actually, the reason I came back here is to grab the code so I could play with using hazard pointers to implement something like that.
Comment 17 GNOME Infrastructure Team 2018-05-24 14:18:12 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/glib/issues/561.