Bug 548954 – weak references are not threadsafe

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 548954 - weak references are not threadsafe


Summary:	weak references are not threadsafe


Status:	RESOLVED FIXED

Product:	glib
Classification:	Platform
Component:	gobject
Version:	unspecified
Hardware:	Other All

Importance:	Normal normal
Target Milestone:	---
Assigned To:	Simon McVittie
QA Contact:	gtkdev

URL:
Whiteboard:

Depends on:
Blocks:	665211

Reported:	2008-08-22 06:23 UTC by Allison Karlitskaya (desrt)
Modified:	2012-01-05 14:56 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
GBitLock: turn assumptions of g_futex_int_address into a static assertion (1.03 KB, patch) 2011-12-06 19:40 UTC, Simon McVittie	committed	Details \| Review
g_pointer_bit_lock_and_get: add (3.71 KB, patch) 2011-12-06 19:41 UTC, Simon McVittie	none	Details \| Review
_object_weak_ref, g_object_add_weak_pointer: document non-thread-safety (1.75 KB, patch) 2011-12-06 19:41 UTC, Simon McVittie	committed	Details \| Review
GWeakRef: add (11.35 KB, patch) 2011-12-06 19:42 UTC, Simon McVittie	none	Details \| Review
GDBusConnection: use GWeakRef to make the singletons thread-safe (3.52 KB, patch) 2011-12-06 19:43 UTC, Simon McVittie	none	Details \| Review
Add regression test for (atomic) weak-refs used to cache a singleton (4.06 KB, patch) 2011-12-07 15:09 UTC, Simon McVittie	none	Details \| Review
GObject: add threadsafe weak references (10.51 KB, patch) 2011-12-07 19:51 UTC, Allison Karlitskaya (desrt)	none	Details \| Review
Add test for weak-refs used to cache a singleton (3.47 KB, patch) 2011-12-07 19:51 UTC, Allison Karlitskaya (desrt)	none	Details \| Review
GWeakRef: add a weak GObject reference believed to be thread-safe (12.85 KB, patch) 2011-12-08 18:12 UTC, Simon McVittie	reviewed	Details \| Review
Add deterministic tests for the API of GWeakRef (4.50 KB, patch) 2011-12-08 18:13 UTC, Simon McVittie	committed	Details \| Review
Add regression test for GWeakRef used to cache a singleton (4.06 KB, patch) 2011-12-08 18:13 UTC, Simon McVittie	committed	Details \| Review
GDBusConnection: use GWeakRef to make the singletons thread-safe (3.44 KB, patch) 2011-12-08 18:14 UTC, Simon McVittie	committed	Details \| Review
GWeakRef: add a weak GObject reference believed to be thread-safe (11.49 KB, patch) 2012-01-02 17:26 UTC, Allison Karlitskaya (desrt)	committed	Details \| Review
test_weak_ref: free dynamic_weak earlier, to prove no use-after-free (802 bytes, patch) 2012-01-05 14:56 UTC, Simon McVittie	none	Details \| Review

Description Allison Karlitskaya (desrt) 2008-08-22 06:23:40 UTC

a fairly common code pattern, i'd guess is to have a static variable acting as a sort of 'cache' for an object that wants to come and go as people use it or stop using it.

for example:

foo *
get_foo_obj ()
{
  static foo *foo_cache;

  if (foo_cache == NULL)
    {
      foo_cache = new_foo ();
      weak_ref (foo_cache, &foo_cache);
    }
  else
    ref (foo_cache);

  return foo_cache;
}


unfortunately, there is absolutely no way for this to be done in a treadsafe way without abusing toggle refs.  the problem is:

1) user finishes with their reference to foo, calls unref()
2) they were the last one using it so it will be destroyed now
3) as part of the destruction, foo_cache is set to NULL.

but if a user in another thread comes along between steps #2 and #3 then they will see 'foo_cache' is non-NULL but the object that it points to has already been partially destroyed.

there's no way to work around this problem because you can't reasonably expect the user to acquire a lock before unreferencing and because by the time that the pointer is set to NULL it's far too late.

Comment 1 Behdad Esfahbod 2008-08-22 06:29:33 UTC

Agreed.  My use case is that weakrefs are used to break cycles of strong refs.  However, the weakref can never be safely used as the object at the other end may simply go away any moment.  So we need a MT-safe weakref.

Comment 2 Allison Karlitskaya (desrt) 2008-08-22 06:35:57 UTC

btw: i think that just about any solution to this problem will involve a function like:


GObject *g_object_safe_ref (GObject **object_ptr);

and then you'd write:


if ((foo = g_object_safe_ref (&foo_cache)))
  return foo;


after that, there are about 100 ways that the synchronisation between g_object_safe_ref and g_object_unref could work...

Comment 3 Behdad Esfahbod 2008-08-22 06:42:48 UTC

Actually I can't think of a hundred ways.  Obviously you can't dereference the pointer to get to any lock.  So any lock used should only depend on the value of the object_ptr.  So it's either a global lock (or set of global locks...), or a per weakref one.  Per-weakref would work but the GObject has to go lock them all before starting to finalize the object.   Which may be ok.  Other than that, your bitluck stuff obviously can use the lower bit of the weakref to do what we want.

Comment 4 Paul Davis 2010-02-26 01:41:22 UTC

the pattern shown in comment 2 is isomorphous with the boost::shared_ptr and boost::weak_ptr relationship. given a weak_ptr to an object: 

   boost::weak_ptr<SomeObject> wptr;

you can ONLY get a strongref via an explicit locking operation that will return null if the object has been deleted:

   boost::shared_ptr<SomeObject> sptr = wptr.lock ();
   if (!sptr) {
      // object deleted
      ...
   }
   /* else ... use sptr freely */

just FYI.

Comment 5 Florian Scandella 2010-02-28 14:38:31 UTC

i'm no glib developer, but my first thought was: why not use a function callback for cleanup (or just a notification that the object was/is going to be deleted) ? then the user is responsible for setting it's own variable to null and do locking if needed.

Comment 6 Havoc Pennington 2010-09-22 21:50:01 UTC

> 2) they were the last one using it so it will be destroyed now
> 3) as part of the destruction, foo_cache is set to NULL.

Is this true? It looks to me like weak refs are notified in dispose() and the object could still be resurrected after that point. So the refcount is not yet 0 and object may yet not be finalized, until after the weak refs are all null.

Comment 7 Havoc Pennington 2010-09-22 21:50:42 UTC

by "in dispose()" I guess I mean "right after dispose, in unref()" aka "conceptually in dispose, not in finalize"

Comment 8 Behdad Esfahbod 2010-09-22 22:16:04 UTC

Havoc, but what if thread 1 reads the weak ref while it's not null, context switch, thread 2 completely disposes object including null'ing the weakref, context switch, thread 1 continues with ref'ing the nonexistent object?

Ie.:

  strongref = weakref ? ref (weakref) : NULL;

What guarantees that between reading "weakref" (both times) and ref'ing it, it's not disposed by another thread?

Comment 9 Havoc Pennington 2010-09-22 22:19:33 UTC

right, weak pointers like this just not a good solution with threads I guess.

Comment 10 Behdad Esfahbod 2010-09-22 22:41:51 UTC

Well, we can fix it using bitlocks.

In g_object_unref(), before starting destruction we lock the low bit of all the weakrefs to the object, then continue with destruction if refcount is still zero.  When done, we need to atomically do 1) release the bitlock, 2) set the weakref to NULL.  This needs support from the bitlock.

Then add g_object_ref_weak() that would lock the low bit of the weakref, ref the object if it's not NULL, release the bitlock.

Comment 11 Simon McVittie 2011-12-02 13:15:48 UTC

(In reply to comment #10)
> all the weakrefs to the object

This needs a new weak-ref construct (another opaque pointer) somewhat analogous to g_object_add_weak_pointer(), right?

(As far as I can see, it has to be new API because g_object_add_weak_pointer() doesn't require API users to do anything special to access the weak pointer - but in any thread-safe scheme, API users *must* do something special.)

I'll try to implement this; the rest of this comment is discussion of alternatives.

(In reply to comment #0)
> a fairly common code pattern, i'd guess is to have a static variable acting as
> a sort of 'cache'
...
> 1) user finishes with their reference to foo, calls unref()
> 2) they were the last one using it so it will be destroyed now
> 3) as part of the destruction, foo_cache is set to NULL.
...
> but if a user in another thread comes along between steps #2 and #3 then they
> will see 'foo_cache' is non-NULL but the object that it points to has already
> been partially destroyed.

Bug #665211 is this exact race, seen "in real life" on a singleton GDBusConnection. The problem is that by the time the singleton weak ref is notified, the decision to dispose has already been made, so even if the object gets resurrected by a thread racing with the weak unref, it's already (at best) non-functional and (at worst) crashy.

(See <https://mail.gnome.org/archives/gtk-devel-list/2011-December/msg00000.html> for discussion of the extent to which it is or isn't acceptable for a disposed object to be crashy.)

Three solutions spring to mind:

* Abuse toggle-refs instead; if a language binding is using toggle-refs
  for their intended purpose at the same time, the cached object becomes
  immortal

* Make weak refs happen before we've committed to the decision to dispose
  the object, and afterwards, re-check whether the object is still dead
  in case another thread won the race and resurrected it; or introduce
  an incompatible flavour of weak refs that does happen sufficiently soon
  (I started prototyping this under the name "fragile refs", on the basis
  that they're weaker than normal weak refs)

* Make weak refs atomic, requiring an incompatible flavour of weak refs
  with a different API, as discussed above; any thread-safe situation
  must use these

Looking at the second suggestion again, I think has another undesirable race, though: if the intention of the cache is to have a pseudo-singleton object that is "the" session GDBusConnection (or whatever), but the "get" thread wins the race against the "unref" thread, then it'll resurrect the old singleton. Meanwhile, the next thread to try retrieving the singleton from the cache gets a new singleton.

In GDBusConnection this is a particular problem because messages on a single connection are totally ordered, but messages on a pair of connections get an undefined ordering (the order in which poll() happens to notice socket events), breaking expectations.

Another possibility I've thought about for GDBus is to force serialization of significant events by pushing them into the GDBusWorker thread, which can only do one thing at a time, resulting in everything happening in an internally-consistent order. However, that doesn't work in general (GDBus happens to have a convenient thread it can use, but not everything has one of those), and if GDBus ever starts terminating the GDBusWorker thread when nothing is using it (which is currently #if 0), it'll have the same problem with the singleton representing that thread!

Comment 12 Simon McVittie 2011-12-06 19:40:43 UTC

Created attachment 202931 [details] [review]
GBitLock: turn assumptions of g_futex_int_address into a  static assertion

We'll probably never encounter a platform where these fail, but that's
what static assertions are for...

Comment 13 Simon McVittie 2011-12-06 19:41:21 UTC

Created attachment 202932 [details] [review]
g_pointer_bit_lock_and_get: add

Atomic weak references turn out to benefit from this.

---

The implementation is by no means optimal:

+  /* FIXME: someone who speaks fluent x86 assembler could probably add a
+   * more optimal variant of this method similar to the USE_ASM_GOTO case
+   * above. I am not that person.
+   */

Comment 14 Simon McVittie 2011-12-06 19:41:45 UTC

Created attachment 202933 [details] [review]
_object_weak_ref, g_object_add_weak_pointer: document  non-thread-safety

Transparent access to a weak pointer from the thread performing the
weak -> strong conversion is incompatible with thread-safety: that
thread will have to do something special. This is GNOME#548954.

Comment 15 Simon McVittie 2011-12-06 19:42:42 UTC

Created attachment 202935 [details] [review]
GWeakRef: add

I believe these are thread-safe, and they fix the GDBusConnection
case seen in GNOME#665211, but I haven't stress-tested them beyond that.

---

More for review/comment than for merging, at this stage. At the very least, they need proper in-tree regression tests, rather than the standalone executable using GDBusConnection that I attached to Bug #665211.

Comment 16 Simon McVittie 2011-12-06 19:43:16 UTC

Created attachment 202936 [details] [review]
GDBusConnection: use GWeakRef to make the singletons  thread-safe

---

If all my previous patches are correct, then this closes Bug #665211.

Comment 17 Simon McVittie 2011-12-06 19:53:08 UTC

(In reply to comment #5)
> i'm no glib developer, but my first thought was: why not use a function
> callback for cleanup (or just a notification that the object was/is going to be
> deleted) ?

Problems with this, for the record:

* Which thread do we call the callback in? Is it OK for it to run in
  the thread where the last-unref happened?

* Are we holding any locks when we call it?

* Can it ref (and resurrect) the object? If so, why? If not, why not?

* Not atomic in a multi-threaded environment

I feel as though notify callbacks might be necessary in addition to/as part of GWeakRef, though - particularly if we use them to solve the long-standing Bug #118536.

Here's a straw-man:

    typedef void (*GWeakRefNotify) (gpointer user_data);

    /**
     * @context: (allow-none): if %NULL, use the default main context
     * @user_data: (allow-none):
     *
     * Call @callback with @user_data in @context after the
     * weak ref has been invalidated. By the time @callback is
     * called, g_weak_ref_lock (weak_ref) will already produce %NULL.
     *
     * For instance, this could be used to release memory that was
     * previously used to associate state with @weak_ref, such as
     * signals that used it as their user_data.
     *
     * If g_weak_ref_lock (weak_ref) is already %NULL, @callback
     * will be called in @context as soon as possible.
     */
    void g_weak_ref_add_notify (GWeakRef *weak_ref,
                                GMainContext *context,
                                GWeakRefNotify callback,
                                gpointer user_data);

or however you spell that in terms of closures.

Then g_signal_connect_object() could have a GWeakRef as the user-data, with the additional behaviour that when the object dies, the signal is later disconnected, avoiding the memory that g_signal_connect_object() currently "leaks"?

Comment 18 Behdad Esfahbod 2011-12-06 20:43:52 UTC

Thanks Simon for working this.  I have not read the patches yet, but want to note that internally we can simply threat GWeakRef and the current plain weak refs the same.  It's really up to the user to decide whether to use them unsafely as in the past, or use the new name and get correct treatment.

The only thing changing with what I'm suggesting is that we're making the old-style weakrefs slightly more dangerous as the object finalization will now be changing the value of the weakref pointer, but that pointer would have been unsafe anyway.

Comment 19 Behdad Esfahbod 2011-12-06 20:46:29 UTC

+ * This is equivalent to g_bit_lock, but working on pointers (or other
+ * pointer-sized values). Additionally, it returns the value the
+ * pointer would have if not locked (i.e. if the lock bit was zero).

The last sentence is vague.  lock_bit is the index of the bit, so "if the lock bit was zero" can be misleading.  How about "(ie. the bit being lock is zero in the return value)"

Comment 20 Simon McVittie 2011-12-07 11:52:25 UTC

(In reply to comment #18)
> I have not read the patches yet, but want to
> note that internally we can simply threat GWeakRef and the current plain weak
> refs the same.

I'm not at all sure that we can: the current weak refs are notified after dispose, whereas my new GWeakRef is notified before dispose, specifically to avoid the possibility that the cached singleton gets disposed while still available to other threads. Would it be considered a compatible change if we moved weak ref notification up to before dispose?

Also, the API of my GWeakRef doesn't currently support either of the patterns used by the current weak refs (if I added notify functions, then it would, sort of, but see below). Instead of having one GObject <-> n weak ref "objects" where each weak ref "object" is a pointer, I have one GObject <-> one GWeakRef <-> n refs to the GWeakRef. Essentially, the GWeakRef is treated as the part of the GObject that must survive beyond dispose; as an implementation detail, it's lazily-allocated, so that GObjects that are never weakly referenced don't even have to allocate the memory for it.

The other model I could have used would be to do as you suggested, and have the GObject store a list of *locations* as used by g_object_add_weak_pointer(), but this has disadvantages:

* (I think this is the most severe)
  currently, if you forget to unref a GWeakRef, you leak 2 words of memory,
  which is a thing we're relatively good at detecting (and never a security
  vulnerability beyond slow DoS); if locations are tracked, and you forget
  to unregister the weak pointer before freeing the memory in which it is
  located, you get a use-after-free

* the GObject needs to manipulate lists to store n locations - either with
  realloc() (GArray), or many small allocations (linked list); in my
  implementation, lists are only needed if we support notify functions

* if we used a bitlock in each weak pointer then the n weak pointers could
  not be invalidated atomically, leading to this surprising situation:

  * we have 2 weak refs a, b to an instance X

  * thread 1: X is about to lose its last ref

  * thread 1: a is invalidated and NULLed out

  * thread 2: retrieve weak ref from b, strengthen it to strong ref

  * thread 1: either b is invalidated and NULLed out, or notice that
    X now has a ref and do nothing [1]

  * thread 1: notice that X now has a ref and do not dispose or finalize it

  * now we have:

    * thread 2 still points to the old X
    * if any thread tries to strengthen a it will get NULL, and probably
      react by allocating a new instance Y
    * if any thread tries to strengthen b it is implementation-dependent
      whether it gets NULL or X (because of [1]), so retrieving the
      global singleton will either give you X or Y
  
    and we have X and Y, two values for "the global singleton"! This
    is bad if it's the GDBusConnection, because messages received on X
    are not totally ordered with respect to messages received on Y
    due to not sharing a GSocketConnection

* if we used a bitlock in each weak pointer then reading it while locked
  would yield an odd pointer and a strange crash, whereas the current
  situation is likely to yield either the old value or NULL, which is at
  least understandable

* if we used a global lock then every g_weak_ref_lock() and every
  g_object_unref() would have to take it; in GWeakRef,
  every g_object_unref() takes a global lock and (briefly) a
  weak-ref-specific bitlock, while every g_weak_ref_lock() only takes
  the bitlock

The other problem I can see with implementing the current weak refs in terms of the new weak refs is that the current weak refs cause the notify function to be called immediately, from the thread that performed the last-unref; whereas if I add notify functions to GWeakRef, I think I'd prefer them to be called later, from a thread of the caller's choice (treated as a way to clean up leaks, rather than something to rely on for correctness - you always use g_weak_ref_lock(), and behave according to whether it's NULL or non-NULL, to give you correctness).

(In reply to comment #19)
> + * This is equivalent to g_bit_lock, but working on pointers (or other
> + * pointer-sized values). Additionally, it returns the value the
> + * pointer would have if not locked (i.e. if the lock bit was zero).
> 
> The last sentence is vague.  lock_bit is the index of the bit, so "if the lock
> bit was zero" can be misleading.  How about "(ie. the bit being lock is zero in
> the return value)"

Yes, I wasn't completely happy with that wording either. How about this?

"Additionally, it returns the value the pointer would have had if not locked (i.e. the bit being used as the lock is zero in the return value)"

My implementation of GWeakRef doesn't strictly *need* this, because GWeakRef is opaque: I could have used (one bit of) a third word in the struct as a bit-lock, used bit 31 of the refcount as the bit-lock, or used a GMutex embedded in the struct.

Comment 21 Simon McVittie 2011-12-07 11:59:26 UTC

(In reply to comment #11)
> Looking at the second suggestion again, I think has another undesirable race,
> though: if the intention of the cache is to have a pseudo-singleton object that
> is "the" session GDBusConnection (or whatever), but the "get" thread wins the
> race against the "unref" thread, then it'll resurrect the old singleton.
> Meanwhile, the next thread to try retrieving the singleton from the cache gets
> a new singleton.

I believe my proposed implementation avoids this possibility, because it does not call out to user-supplied callbacks while NULLing out the weak ref, and it uses a lock to ensure that NULLing out the weak ref is atomic with respect to strengthening the weak ref.

So, the only things that can cause an object to be resurrected after the weak
refs have been notified are:

* it refs itself (directly or indirectly) in its own dispose() as called by
  its last-unref: this is pretty rare, and the object is already useless
  at this point, so it's good that it's no longer the global singleton;
  nobody should have a pointer to it at this point anyway

* someone has incorrectly kept a borrowed pointer to the object,
  which they ref after the weak refs have already been notified:
  this is going to be unsafe regardless, because they're racing with the
  object's destruction, and if they lose the race, that object will crash
  when they try to use it (or maybe even when they ref it)

Comment 22 Simon McVittie 2011-12-07 15:08:31 UTC

Comment on attachment 202936 [details] [review]
GDBusConnection: use GWeakRef to make the singletons  thread-safe

Also attached to Bug #665211.

Comment 23 Simon McVittie 2011-12-07 15:09:45 UTC

Created attachment 202997 [details] [review]
Add regression test for (atomic) weak-refs used to cache a singleton

---

We probably need more test coverage than this... but this illustrates the problem, and that these patches solve it, independently of GDBus.

Comment 24 Allison Karlitskaya (desrt) 2011-12-07 19:51:13 UTC

Created attachment 203028 [details] [review]
GObject: add threadsafe weak references

Comment 25 Allison Karlitskaya (desrt) 2011-12-07 19:51:16 UTC

Created attachment 203029 [details] [review]
Add test for weak-refs used to cache a singleton

Comment 26 Allison Karlitskaya (desrt) 2011-12-07 19:51:55 UTC

This is my proposal, with an adapted test case.

Comment 27 Allison Karlitskaya (desrt) 2011-12-07 20:03:01 UTC

A couple of notes:

I'm rather attached to the approach in general due to its simplicity relative to the other patches.

I'm not totally happy with the API for a few reasons.

 - the add_weak_pointer API would ideally set the value of the pointer
   to point at the object

 - the process of removing a weak reference that may be dead is extremely
   annoying, as seen in the test:

      strengthened = g_object_ref_weak_pointer (&weak);
      if (strengthened)
        {
          g_object_remove_weak_pointer (strengthened, &weak);
          g_object_unref (strengthened);
        }

 - it's still possible to access the pointer directly

 - the g_object_ref_weak_pointer() function name suggests that a GObject*
   should perhaps be the first argument


I could therefore see something more like this as an API:

typedef struct
{
  GObject * volatile ptr;
} GWeakRef;

void       g_weak_ref_init       (GWeakRef *ref,
                                  gpointer  object);
gpointer   g_weak_ref_get_object (GWeakRef *ref);
void       g_weak_ref_clear      (GWeakRef *ref);

Comment 28 Behdad Esfahbod 2011-12-07 21:11:47 UTC

I like Ryan's patch in that it adds minimal API.  However, I think that makes it easier to use the API wrongly.  So I'm leaning towards deprecating the old API and adding the new one, with a typedef like:

typedef gsize GWeakRef;

implementation remains as is, it's just to avoid dereferencing.  That also helps finding application that needs to be fixed as the old API is deprecated.

My only comment on the implementation is that I think we should clear the weakref upon removal.  And I suggest we don't support the multiple-times added case.  In fact, make ref_init() functionally ref_clear the existing weakref.

Comment 29 Allison Karlitskaya (desrt) 2011-12-07 21:28:12 UTC

It's worth noting that the existing weak pointer system is totally valid for non-multithreaded applications and that the most noteworthy consumer of GLib is Gtk, which doesn't even attempt to be threadsafe.

The reason that the weakref isn't set on add or cleared on removal is because those APIs are pre-existing and to modify them would be an incompatible change.  It's quite clear that if we add the new APIs, we should have them do as you suggest.

Comment 30 Behdad Esfahbod 2011-12-07 21:30:26 UTC

Good point.

Comment 31 Allison Karlitskaya (desrt) 2011-12-07 21:47:31 UTC

One more thing that may be worth talking about: python has support for constructing a dictionary with weak references on the values such that when the value is finalized it is removed from the hashtable.

We could use the new APIs that I proposed in a similar way to Simon's code to accomplish this by allocating a piece of memory to contain the weakref and adding that to the hashtable.  This has the disadvantage of not freeing the memory or removing it from the hashtable (ie: the hashtable could grow indefinitely).

We could attempt to introduce a special-case in GObject's code that removes items from a hashtable at the same time as setting weak pointers to NULL.  This sounds insane on the surface and only gets worse when you think about it more: you'd have to call into the hashtable to remove an items that you possibly don't know the key of while holding a global lock and deal with the side effects of the GDestroyNotify on the key.  Add to that the fact that you'd probably want to acquire another lock to ensure that you're accessing the hashtable safely...

An interesting alternative would be to install both a weak pointer and a traditional callback-based weakref.  Lookup would call g_weak_ref_get_object() and the traditional weak callback would actually remove the item from the hashtable (while holding the lock to the table).  That way you'd get the NULL return immediately and then later on the memory would actually stop being used.

It might be worth it to have a class that implements this logic and/or to write documentation on how to do it properly for yourself.

Comment 32 Simon McVittie 2011-12-08 11:53:33 UTC

(In reply to comment #27)
> void       g_weak_ref_init       (GWeakRef *ref,
>                                   gpointer  object);
> gpointer   g_weak_ref_get_object (GWeakRef *ref);
> void       g_weak_ref_clear      (GWeakRef *ref);

I like this API better.

(In reply to comment #28)
> And I suggest we don't support the multiple-times added
> case.  In fact, make ref_init() functionally ref_clear the existing weakref.

I don't think this is a good idea. "init" in GLib normally means "assume contents are uninitialized", for use like this:

    GObject *object = ...;
    GWeakRef ref;                /* uninitialized */

    g_weak_ref_init (&ref, object);

If g_weak_ref_init() cleared the weak-ref before use, this would crash.

(Examples in GLib: g_value_init, g_datalist_init, g_hash_table_iter_init, etc.; libdbus also has this pattern.)

We could have a g_weak_ref_reset() for convenience, though (again, name stolen from GValue):

    /**
     * @ref: a weak reference, which must have been initialized to some
     *  value
     */
    void
    g_weak_ref_reset (GWeakRef *ref,
                      GObject  *object)
    {
      g_weak_ref_clear (ref);
      g_weak_ref_init (ref, object);
    }

If we have a stack-allocatable GWeakRef, I'm inclined to say that g_weak_ref_init() should allow @object to be NULL, with these semantics:

    g_weak_ref_init (&ref, NULL);                         /* not an error */
    g_assert (g_weak_ref_get_object (&ref) == NULL);      /* not an error */

This would mean that GDBusConnection's the_session_bus, the_system_bus would each just be a GWeakRef, containing either NULL (if not currently connected to that bus) or a non-NULL weak ref.

We could even say that

    GWeakRef ref = G_WEAK_REF_EMPTY;      /* or ..._NULL or something */

is the same thing.

(In reply to comment #28)
> So I'm leaning towards deprecating the old
> API and adding the new one, with a typedef like:
> 
> typedef gsize GWeakRef;

guintptr, please, if we go this way - strictly speaking, I don't think C guarantees that size_t is the same size as void *.

(This may be another of these assumptions that is not guaranteed by ISO C, but true on all widely available CPUs and relied on by GLib, though; like NULL being all-bits-zero, function pointers being data-pointer-sized, pointers being either one or two ints (please review Attachment #202931 [details]), and so on.)

Perhaps even better would be:

typedef struct _GWeakRefContents GWeakRefContents;
typedef struct {
    GWeakRefContents *opaque;
} GWeakRef;

because then you still can't dereference the contents (without a cast or a compiler warning as a hint that you're doing the wrong thing), but you can initialize it with { NULL }, rather than relying on { 0 } being the right thing on all CPUs we support?

(In reply to comment #31)
> An interesting alternative would be to install both a weak pointer and a
> traditional callback-based weakref.  Lookup would call g_weak_ref_get_object()
> and the traditional weak callback would actually remove the item from the
> hashtable (while holding the lock to the table).  That way you'd get the NULL
> return immediately and then later on the memory would actually stop being used.

Yes, I think we need this - g_signal_connect_object() should use it, for a start. (Stop delivering the signal before first dispose, and clear up the signal-subscription data during finalize.)

I wonder whether my "clear up data later, in main context X" idea is still also useful, or whether that can just be implemented as the notify callback in a traditional callback-based weak ref by anyone who needs it?

> It might be worth it to have a class that implements this logic and/or to write
> documentation on how to do it properly for yourself.

Definitely.

Comment 33 Dan Winship 2011-12-08 12:53:43 UTC

(In reply to comment #29)
> The reason that the weakref isn't set on add or cleared on removal is because
> those APIs are pre-existing and to modify them would be an incompatible change.

Your patch changes weak pointers to be nullified at the start of the disposing process rather than at the end, which could also potentially break existing code (if it depended on the fact that weak pointers would still be valid during the class's own dispose() implementation).

>       /* we are about tp remove the last reference */

fix the typo while you're there...

Comment 34 Simon McVittie 2011-12-08 15:12:31 UTC

(In reply to comment #27)
> I'm rather attached to the approach in general due to its simplicity relative
> to the other patches.

Your implementation looks generally good, I've redone my patches in terms of it (testing now).

>  - the add_weak_pointer API would ideally set the value of the pointer
>    to point at the object

Addressed by:

    g_weak_ref_init (GWeakRef *weak_ref_location, gpointer object)

>  - the process of removing a weak reference that may be dead is extremely
>    annoying, as seen in the test:

Implemented as:

    g_weak_ref_clear (GWeakRef *weak_ref_location)

>  - it's still possible to access the pointer directly

In my current implementation you'd have to access weak_ref_location->priv.p, which is hopefully enough of a hint that you're wrong. I could change it to be a pointer to a dummy struct if we want to force a cast?

I've also implemented g_weak_ref_reset() (which turns out to be quite useful), and made it valid to have a GWeakRef pointing at NULL (equivalent to one whose object already died).

Comment 35 Simon McVittie 2011-12-08 15:13:56 UTC

(In reply to comment #33)
> Your patch changes weak pointers to be nullified at the start of the disposing
> process rather than at the end, which could also potentially break existing
> code (if it depended on the fact that weak pointers would still be valid during
> the class's own dispose() implementation).

I've kept GWeakRef as a distinct type instead of improving g_object_add_weak_pointer, mainly for this reason.

Comment 36 Simon McVittie 2011-12-08 15:14:54 UTC

Comment on attachment 203028 [details] [review]
GObject: add threadsafe weak references

>+gpointer
>+g_object_ref_weak_pointer (volatile void *weak_pointer_location)

Why does this need to be volatile? I don't see any accesses to it that aren't protected by the rwlock.

Comment 37 Allison Karlitskaya (desrt) 2011-12-08 15:27:26 UTC

volatile is like const -- it's not about _needing_ to be volatile, but saying "if you have a volatile variable that you wish to access with this, we'll treat it properly".

Comment 38 Simon McVittie 2011-12-08 18:11:24 UTC

(In reply to comment #12)
> Attachment #202931 [details]
> GBitLock: turn assumptions of g_futex_int_address into a  static assertion

Would still be nice to apply, but no longer relevant to this bug.

(In reply to comment #13)
> Attachment #202932 [details]
> g_pointer_bit_lock_and_get: add

No longer needed.

(In reply to comment #14)
> Attachment #202933 [details]
> _object_weak_ref, g_object_add_weak_pointer: document  non-thread-safety

Still relevant.

The rest of the patches here are superseded by those I'm about to attach.

Comment 39 Simon McVittie 2011-12-08 18:12:46 UTC

Created attachment 203099 [details] [review]
GWeakRef: add a weak GObject reference believed to be thread-safe

Heavily based on an implementation by Ryan Lortie, but with various API
changes.

---

This is my new proposal.

Comment 40 Simon McVittie 2011-12-08 18:13:12 UTC

Created attachment 203100 [details] [review]
Add deterministic tests for the API of GWeakRef

These don't address the thread-safety, but do address basic use.

Comment 41 Simon McVittie 2011-12-08 18:13:54 UTC

Created attachment 203101 [details] [review]
Add regression test for GWeakRef used to cache a singleton

---

Basically Bug #665211, but with the GDBus bits stripped out.

Comment 42 Simon McVittie 2011-12-08 18:14:34 UTC

Created attachment 203102 [details] [review]
GDBusConnection: use GWeakRef to make the singletons thread-safe

---

This fixes Bug #665211, but I'm putting it here for review because it's the first "real" use of our new API.

Comment 43 Matthias Clasen 2011-12-09 04:12:54 UTC

Review of attachment 203099 [details] [review]:

::: gobject/gobject.c
@@ +3744,3 @@
+ *
+ * If the object's #GObjectClass.dispose method results in additional
+ * references to the object being held, any #GWeakRef<!-- -->s taken

No need for that <!-- --> anymore - gtk-doc can deal with #GWeakRefs nowadays.

::: gobject/gobject.h
@@ +588,3 @@
+void     g_weak_ref_reset      (GWeakRef          *weak_ref,
+                                gpointer           object);
+gpointer g_weak_ref_get_object (volatile GWeakRef *weak_ref);

I think I'd prefer to shorten the last function to g_weak_ref_get - that would go better with the other three.

Comment 44 Matthias Clasen 2011-12-09 04:13:31 UTC

Review of attachment 202931 [details] [review]:

Looks good to me

Comment 45 Matthias Clasen 2011-12-09 04:14:45 UTC

Review of attachment 202933 [details] [review]:

Makes sense

Comment 46 Matthias Clasen 2011-12-09 04:18:39 UTC

Review of attachment 203100 [details] [review]:

looks good

Comment 47 Matthias Clasen 2011-12-09 04:20:07 UTC

Review of attachment 203100 [details] [review]:

::: gobject/tests/reference.c
@@ +330,3 @@
+  /* clear and free dynamic_weak... */
+  g_weak_ref_clear (dynamic_weak);
+

Are you not leaking dynamic_weak here ? I would have expected a g_free (dynamic_clear);

Comment 48 Matthias Clasen 2011-12-09 04:21:23 UTC

Review of attachment 203101 [details] [review]:

Looks good

Comment 49 Matthias Clasen 2011-12-09 04:23:04 UTC

Review of attachment 203102 [details] [review]:

Looks good

Comment 50 Simon McVittie 2011-12-09 10:37:02 UTC

(In reply to comment #47)
> ::: gobject/tests/reference.c
> @@ +330,3 @@
> +  /* clear and free dynamic_weak... */
> +  g_weak_ref_clear (dynamic_weak);
> +
> 
> Are you not leaking dynamic_weak here ? I would have expected a g_free
> (dynamic_clear);

Yes, you're right. Which also means that the test doesn't yet verify that there isn't a use-after-free on dynamic_weak, so I should check that (with valgrind).

(In reply to comment #43)
> No need for that <!-- --> anymore - gtk-doc can deal with #GWeakRefs nowadays.

Good to know!

> +gpointer g_weak_ref_get_object (volatile GWeakRef *weak_ref);
> 
> I think I'd prefer to shorten the last function to g_weak_ref_get - that would
> go better with the other three.

Good thinking. I also wonder whether to change g_weak_ref_reset to _set, since it can set a non-NULL value.

Do you think the "state machine" of when you're allowed to call which method is obvious enough? For the record, it is:

    old state:       uninitialized     empty     non-empty
    ------------------------------------------------------
    operation:    |
         init     |  initialized   (initialized) (crash)
         (re)set  |  (crash)        initialized  initialized
         clear    |  (crash)        empty        empty

where "initialized" means "empty or non-empty according to the parameter"; "(initialized)" means it happens to work correctly, but is documented as not allowed; and "(crash)" means undefined behaviour, but in practice it'll result in a crash sooner or later.

Comment 51 Behdad Esfahbod 2011-12-09 15:23:39 UTC

I personally prefer g_weak_ref_ref(), but that's just me.

Comment 52 Allison Karlitskaya (desrt) 2011-12-09 17:15:58 UTC

I don't care a whole lot about g_weak_ref_get() vs. g_weak_ref_get_object().

I'm not convinced that the reset API is needed.

Comment 53 Allison Karlitskaya (desrt) 2011-12-09 17:18:44 UTC

about _init and clear:

In my opinion, _init() operates on uninitialised memory only.  If you call it more than once, you're entering the undefined behaviour zone (usually by leaking memory, but in this case possibly by crashing or crashing later or something).

_clear() operates only on something that had _init() called on it before and by the time _clear() is over, the memory can be assumed to be uninitialised.  You have to call _init() on it again before you use it for anything else.

Comment 54 Simon McVittie 2011-12-09 18:00:12 UTC

(In reply to comment #53)
> _clear() operates only on something that had _init() called on it before and by
> the time _clear() is over, the memory can be assumed to be uninitialised.  You
> have to call _init() on it again before you use it for anything else.

That's not the API that _clear() currently has. If you want _clear() to mean this, we could make it clear the pointer to 0xDEADBEEF or something... but "indistinguishable from a weakref whose object has died" is no harder to implement, and seems more useful.

I don't think we want to require that people _clear() their dead weakrefs? (Unless the implementation ends up allocating/holding memory.)

After the object has died, the weak ref is already in a stable state, and won't be dereferenced again; there seems no reason to require it to be cleared further.

Comment 55 Simon McVittie 2011-12-09 18:01:19 UTC

(In reply to comment #53)
> In my opinion, _init() operates on uninitialised memory only.  If you call it
> more than once, you're entering the undefined behaviour zone (usually by
> leaking memory, but in this case possibly by crashing or crashing later or
> something).

Yes, in this case re-init()ing a weakref to any living object is likely to cause a crash when that object dies.

Comment 56 Matthias Clasen 2011-12-27 20:29:56 UTC

Review of attachment 203099 [details] [review]:

::: gobject/gobject.c
@@ +3857,3 @@
+ * empty.
+ *
+ * g_weak_ref_lock() can be used to obtain a "strong" reference to the

g_weak_ref_lock doesn't exist in this patch, should be g_weak_ref_get_object

@@ +3866,3 @@
+ *
+ * Returns: (transfer full): a weak reference to @object, to be
+ *  freed with g_weak_ref_unref()

Returns: does not make sense for this void function (and g_weak_ref_unref does not exist anyway)

@@ +3902,3 @@
+ *
+ * Returns: (transfer full): a weak reference to @object, to be
+ *  freed with g_weak_ref_unref()

Like above, Returns: on a void function, pointing to nonexisting api

@@ +3925,3 @@
+ * using g_object_unref().
+ *
+ * Returns: (transfer full) (type GObject.Object): the object pointed to

is (type GObject.Object) correct ? I'm not sure about the typenames to use in these annotations, but "(type GObject)" would have looked more natural to me.

Also, would be good to say "a new reference to ..."

Comment 57 Matthias Clasen 2011-12-27 20:31:50 UTC

Shall we wrap this up ? 

I agree with Ryan that reset seems somewhat unneeded, considering the docs say that it is basically a shortcut for reset(); init()

Comment 58 Behdad Esfahbod 2012-01-02 13:52:47 UTC

Happy 2012 everyone.  Lets get over with this!  Looking forward to use it in Pango...

Comment 59 Allison Karlitskaya (desrt) 2012-01-02 17:26:02 UTC

Created attachment 204462 [details] [review]
GWeakRef: add a weak GObject reference believed to be thread-safe

This patch is a joint work with Simon McVittie.

Comment 60 Allison Karlitskaya (desrt) 2012-01-02 17:27:30 UTC

Attachment 203100 [details] pushed as fa5ff39 - Add deterministic tests for the API of GWeakRef
Attachment 203101 [details] pushed as 146aa7a - Add regression test for GWeakRef used to cache a singleton
Attachment 203102 [details] pushed as 1e09bfc - GDBusConnection: use GWeakRef to make the singletons thread-safe
Attachment 204462 [details] pushed as 46c2f57 - GWeakRef: add a weak GObject reference believed to be thread-safe

Comment 61 Allison Karlitskaya (desrt) 2012-01-02 17:31:32 UTC

Comment on attachment 203100 [details] [review]
Add deterministic tests for the API of GWeakRef

Don't know what happened here with git-bz....

Comment 62 Simon McVittie 2012-01-05 14:55:52 UTC

(In reply to comment #50)
> > Are you not leaking dynamic_weak here ? I would have expected a g_free
> > (dynamic_clear);
> 
> Yes, you're right. Which also means that the test doesn't yet verify that there
> isn't a use-after-free on dynamic_weak, so I should check that (with valgrind).

Ryan fixed this, but in an order that doesn't verify that a cleared ref isn't use-after-free'd. I'll attach a patch to change this to what I'd intended, which I've checked under valgrind to confirm that there is no use-after-free.

Comment 63 Simon McVittie 2012-01-05 14:56:23 UTC

Created attachment 204690 [details] [review]
test_weak_ref: free dynamic_weak earlier, to prove no use-after-free