After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 642026 - Race condition in g_static_private_free
Race condition in g_static_private_free
Status: RESOLVED FIXED
Product: glib
Classification: Platform
Component: general
unspecified
Other Linux
: Normal critical
: ---
Assigned To: gtkdev
gtkdev
Depends on:
Blocks:
 
 
Reported: 2011-02-10 13:28 UTC by Jürg Billeter
Modified: 2011-05-28 18:03 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Patch (1.86 KB, patch)
2011-02-10 13:28 UTC, Jürg Billeter
none Details | Review
test case (1.00 KB, text/plain)
2011-05-23 16:04 UTC, Simon McVittie
  Details
[PATCH 1/5] GRealThread: remove obsolete comment about gmain.c, which no longer has a copy (802 bytes, patch)
2011-05-24 17:42 UTC, Simon McVittie
committed Details | Review
[PATCH 2/5] g_static_private_free: defer non-trivial destruction til after we unlock (2.04 KB, patch)
2011-05-24 17:42 UTC, Simon McVittie
committed Details | Review
[PATCH 3/5] Refactor GStaticPrivate accessors to facilitate protecting them with locks (4.03 KB, patch)
2011-05-24 17:42 UTC, Simon McVittie
committed Details | Review
[PATCH 4/5] GStaticPrivate: protect GRealThread.private_data with a bit-lock (2.95 KB, patch)
2011-05-24 17:43 UTC, Simon McVittie
committed Details | Review
[PATCH 5/5] Add a regression test for GNOME#642026 (3.21 KB, patch)
2011-05-24 17:44 UTC, Simon McVittie
committed Details | Review

Description Jürg Billeter 2011-02-10 13:28:26 UTC
Created attachment 180577 [details] [review]
Patch

g_thread_cleanup and g_static_private_free might run at the same time. This may cause the GDestroyNotify callback to be invoked twice for the same data pointer (once in the thread calling g_static_private_free and once in the thread in cleanup phase).

The attached patch works for me but maybe there is a simpler or more elegant way to fix this. I didn't use the g_thread mutex as g_thread must be unlocked during the destruction callback, and I suspect that we can't completely fix the race condition with that constraint.
Comment 1 Simon McVittie 2011-05-23 16:04:00 UTC
Created attachment 188394 [details]
test case

The problem case here is:

* have a pointer, p, with a notifier, n

* have a GStaticPrivate, say sp

* have a thread, say t1, which has called g_static_private_set (&sp, p, n)

* have another thread, say t2 (in my test case, it's the main thread)

* race these two actions:

  - t1 terminates itself with g_thread_exit() or by returning

  - t2 calls g_static_private_free (&sp)

* this will occasionally result in two calls, which could even happen
  simultaneously if you're really unlucky:

  - t1 calls n(p)
  - t2 also calls n(p)

To given an idea of how often this test case will crash, it crashed with g_error on 10/10 runs (1e5 iterations per run) on my laptop, but with only 1e4 iterations in the loop, it frequently succeeded.
Comment 2 Simon McVittie 2011-05-23 17:32:44 UTC
To fix this particular double-free it looks as though it ought to be sufficient to lock the main threading mutex while stealing the private_data GArray, then call the destructors once we've escaped from under the lock:

      G_LOCK (g_thread);
      array = thread->private_data;
      thread->private_data = NULL;
      G_UNLOCK (g_thread);

      if (array)
	{
	  guint i;

	  for (i = 0; i < array->len; i++ )
	    {
	      GStaticPrivateNode *node =
		&g_array_index (array, GStaticPrivateNode, i);
	      if (node->destroy)
		node->destroy (node->data);
	    }
	  g_array_free (array, TRUE);
	}

However, there are more locking problems with g_static_private_free:

* It locks to iterate over all threads, but unlocks during each iteration
  to call the destructor: this seems wrong, because it's only that lock
  that protects it from g_thread_all_threads being altered during iteration.
  It looks as though it should instead put all (data, destructor) pairs with
  a non-NULL destructor into a temporary list, then call all of their
  destructors after it unlocks?

* The fact that g_static_private_free touches other threads' private_data
  also invalidates the assumption in g_static_private_get and
  g_static_private_set that it's OK to use private_data without a lock,
  so either those functions should lock too, or this stuff should all be
  converted to use atomic pointers or something?
Comment 3 Simon McVittie 2011-05-24 17:42:02 UTC
Created attachment 188493 [details] [review]
[PATCH 1/5] GRealThread: remove obsolete comment about gmain.c,  which no longer has a copy
Comment 4 Simon McVittie 2011-05-24 17:42:17 UTC
Created attachment 188494 [details] [review]
[PATCH 2/5] g_static_private_free: defer non-trivial destruction til  after we unlock
Comment 5 Simon McVittie 2011-05-24 17:42:33 UTC
Created attachment 188495 [details] [review]
[PATCH 3/5] Refactor GStaticPrivate accessors to facilitate  protecting them with locks

* g_static_private_get: have a single entry and exit

* g_static_private_set: delay creation of GArray so the whole tail of
  the function can be under the private_data lock without risking
  deadlock with the g_thread lock; call the destructor last, after
  we could have unlocked

* g_static_private_free: choose next thread in list before accessing
  private_data, to keep all accesses together

* g_thread_cleanup: steal private_data first, then work exclusively with
  the stolen array (which doesn't need to be under a lock any more)
Comment 6 Simon McVittie 2011-05-24 17:43:20 UTC
Created attachment 188496 [details] [review]
[PATCH 4/5] GStaticPrivate: protect GRealThread.private_data with a  bit-lock

This could be replaced with uses of the global g_thread lock, or busy-looping on a call to g_atomic_pointer_compare_and_exchange, if desired.
Comment 7 Simon McVittie 2011-05-24 17:44:04 UTC
Created attachment 188497 [details] [review]
[PATCH 5/5] Add a regression test for GNOME#642026

This is Attachment #188394 [details], turned into a patch that adds a regression test.
Comment 8 Simon McVittie 2011-05-24 18:02:50 UTC
(In reply to comment #6)
> GStaticPrivate: protect GRealThread.private_data with a  bit-lock
> 
> This could be replaced with uses of the global g_thread lock, or busy-looping
> on a call to g_atomic_pointer_compare_and_exchange, if desired.

I'd particularly appreciate feedback from GThread maintainers on the relative costs of locking mechanisms, and which one should be used here. The original intention here seems to be that each thread could access its own private data extremely cheaply, without locks, but the addition of g_static_private_free (which also calls the destructors from "the wrong thread") makes that unsafe.

I've tried to keep accesses within "the right thread" relatively cheap (there can never be contention for each GRealThread's bit-lock unless g_static_private_free is called, even if other threads are doing things that take the g_thread lock), while making g_static_private_free operate safely.
Comment 9 Matthias Clasen 2011-05-27 22:40:28 UTC
Review of attachment 188493 [details] [review]:

Obviously correct
Comment 10 Matthias Clasen 2011-05-27 22:45:00 UTC
Review of attachment 188494 [details] [review]:

Yeah, clearly correct.
Comment 11 Matthias Clasen 2011-05-27 22:49:26 UTC
Review of attachment 188495 [details] [review]:

Looks all fine to me.
Comment 12 Matthias Clasen 2011-05-27 22:56:25 UTC
Review of attachment 188496 [details] [review]:

::: glib/gthread.c
@@ +263,3 @@
+   * holding this (particularly on the g_thread lock). */
+  volatile gint private_data_lock;
+  GArray *private_data;

Good that you add a comment, but tiny pet peeve of mine: I like to keep the
closing */ lined up with the other stars, so move it to the next line.

Also, given that you add this extra int for the lock, no real need to use bit locks here, could just use g_atomic_int_compare_and_exchange... Or, avoid the extra int and use a low bit in private_data, but that is a bit more involved, I guess.
Comment 13 Matthias Clasen 2011-05-28 00:29:49 UTC
Review of attachment 188496 [details] [review]:

::: glib/gthread.c
@@ +263,3 @@
+   * holding this (particularly on the g_thread lock). */
+  volatile gint private_data_lock;
+  GArray *private_data;

Actually, I was talking nonsense here. Seems fine to use a bit lock.
Comment 14 Matthias Clasen 2011-05-28 13:28:14 UTC
Looking at this some more, I can't help but feel that the handling of private_key->index is not really safe either.

- It gets accessed without any protection in g_static_private_get
- g_static_private_set takes the thread lock when setting it (arguably
  because the lock protects the global list of free indices)
- g_static_private_free reads and sets it to 0 outside the lock

One thing that can clearly happen here is that set can race against free and set key to an index taken off the list of free indices after free set the key to 0. That would probably only cause us to leak an index.

I guess for correctness private_key->index should be set atomically ?
Comment 15 Matthias Clasen 2011-05-28 14:12:34 UTC
It also seems that you could have two calls of g_static_private_free on the same static private race against each other and put the same index on the free list twice, which would be bad news. It seems clear that static privates with a bounded lifetime need some external protection, like a rule that only the same thread that created them can free them, or so.
Comment 16 Matthias Clasen 2011-05-28 18:00:36 UTC
Review of attachment 188493 [details] [review]:

Committed
Comment 17 Matthias Clasen 2011-05-28 18:01:09 UTC
Review of attachment 188494 [details] [review]:

Committed
Comment 18 Matthias Clasen 2011-05-28 18:01:22 UTC
Review of attachment 188495 [details] [review]:

Looks all fine to me.
Comment 19 Matthias Clasen 2011-05-28 18:02:30 UTC
Review of attachment 188497 [details] [review]:

Committed
Comment 20 Matthias Clasen 2011-05-28 18:02:45 UTC
Review of attachment 188496 [details] [review]:

Committed