GNOME Bugzilla – Bug 642026
Race condition in g_static_private_free
Last modified: 2011-05-28 18:03:02 UTC
Created attachment 180577 [details] [review] Patch

g_thread_cleanup and g_static_private_free can run at the same time. This can cause the GDestroyNotify callback to be invoked twice for the same data pointer: once in the thread calling g_static_private_free, and once in the thread that is in its cleanup phase.

The attached patch works for me, but maybe there is a simpler or more elegant way to fix this. I didn't use the g_thread mutex, since g_thread must be unlocked during the destruction callback, and I suspect that we can't completely fix the race condition under that constraint.
Created attachment 188394 [details] test case

The problem case here is:

* have a pointer, p, with a notifier, n
* have a GStaticPrivate, say sp
* have a thread, say t1, which has called g_static_private_set (&sp, p, n)
* have another thread, say t2 (in my test case, it's the main thread)
* race these two actions:
  - t1 terminates itself with g_thread_exit() or by returning
  - t2 calls g_static_private_free (&sp)
* this will occasionally result in two calls, which could even happen simultaneously if you're really unlucky:
  - t1 calls n(p)
  - t2 also calls n(p)

To give an idea of how often this test case will crash: it crashed with g_error on 10/10 runs (1e5 iterations per run) on my laptop, but with only 1e4 iterations in the loop it frequently succeeded.
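For reference, a minimal sketch of the shape of that race (this is not the attached test case itself; notifier, thread_func and the iteration count are illustrative):

  #include <glib.h>

  static volatile gint destructor_calls = 0;

  /* The notifier n: g_error out if it ever runs twice for one datum. */
  static void
  notifier (gpointer p)
  {
    if (!g_atomic_int_compare_and_exchange (&destructor_calls, 0, 1))
      g_error ("notifier called twice for %p", p);
  }

  /* t1: set the private value, then terminate (returning is equivalent
   * to calling g_thread_exit). */
  static gpointer
  thread_func (gpointer data)
  {
    g_static_private_set ((GStaticPrivate *) data, GINT_TO_POINTER (1),
                          notifier);
    return NULL;
  }

  int
  main (void)
  {
    guint i;

    g_thread_init (NULL);

    for (i = 0; i < 100000; i++)
      {
        GStaticPrivate sp = G_STATIC_PRIVATE_INIT;
        GThread *t1;

        g_atomic_int_set (&destructor_calls, 0);
        t1 = g_thread_create (thread_func, &sp, TRUE, NULL);
        g_static_private_free (&sp);  /* t2 races with t1's cleanup */
        g_thread_join (t1);
      }

    return 0;
  }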
To fix this particular double-free, it looks as though it ought to be sufficient to lock the main threading mutex while stealing the private_data GArray, then call the destructors once we've escaped from under the lock:

  G_LOCK (g_thread);
  array = thread->private_data;
  thread->private_data = NULL;
  G_UNLOCK (g_thread);

  if (array)
    {
      guint i;

      for (i = 0; i < array->len; i++)
        {
          GStaticPrivateNode *node = &g_array_index (array, GStaticPrivateNode, i);

          if (node->destroy)
            node->destroy (node->data);
        }

      g_array_free (array, TRUE);
    }

However, there are more locking problems with g_static_private_free:

* It locks to iterate over all threads, but unlocks during each iteration to call the destructor: this seems wrong, because it's only that lock that protects it from g_thread_all_threads being altered during iteration. It looks as though it should instead put all (data, destructor) pairs with a non-NULL destructor into a temporary list, then call all of their destructors after it unlocks (see the sketch below).

* The fact that g_static_private_free touches other threads' private_data also invalidates the assumption in g_static_private_get and g_static_private_set that it's OK to use private_data without a lock, so either those functions should lock too, or this stuff should all be converted to use atomic pointers or something?
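To illustrate the first bullet, here's a sketch of that collect-then-destroy shape for g_static_private_free; DestroyEntry and the stand-in globals are illustrative, not the real gthread.c internals:

  #include <glib.h>

  typedef struct
  {
    gpointer data;
    GDestroyNotify destroy;
  } DestroyEntry;

  static GMutex *thread_lock; /* g_mutex_new () at init; stands in for G_LOCK (g_thread) */
  static GSList *all_arrays;  /* stands in for g_thread_all_threads */

  static void
  static_private_free_deferred (guint idx)
  {
    GSList *pending = NULL, *iter;

    g_mutex_lock (thread_lock);
    for (iter = all_arrays; iter != NULL; iter = iter->next)
      {
        GArray *array = iter->data;

        if (array != NULL && idx < array->len)
          {
            DestroyEntry *entry = &g_array_index (array, DestroyEntry, idx);

            /* Queue the (data, destructor) pair instead of calling the
             * destructor while the lock is held. */
            if (entry->destroy != NULL)
              {
                DestroyEntry *copy = g_slice_new (DestroyEntry);

                *copy = *entry;
                pending = g_slist_prepend (pending, copy);
              }

            entry->data = NULL;
            entry->destroy = NULL;
          }
      }
    g_mutex_unlock (thread_lock);

    /* Only now, with no lock held, run the destructors. */
    for (iter = pending; iter != NULL; iter = iter->next)
      {
        DestroyEntry *entry = iter->data;

        entry->destroy (entry->data);
        g_slice_free (DestroyEntry, entry);
      }
    g_slist_free (pending);
  }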
Created attachment 188493 [details] [review] [PATCH 1/5] GRealThread: remove obsolete comment about gmain.c, which no longer has a copy
Created attachment 188494 [details] [review] [PATCH 2/5] g_static_private_free: defer non-trivial destruction til after we unlock
Created attachment 188495 [details] [review] [PATCH 3/5] Refactor GStaticPrivate accessors to facilitate protecting them with locks

* g_static_private_get: have a single entry and exit
* g_static_private_set: delay creation of the GArray so the whole tail of the function can be under the private_data lock without risking deadlock with the g_thread lock; call the destructor last, after we could have unlocked
* g_static_private_free: choose the next thread in the list before accessing private_data, to keep all accesses together
* g_thread_cleanup: steal private_data first, then work exclusively with the stolen array (which doesn't need to be under a lock any more)
Created attachment 188496 [details] [review] [PATCH 4/5] GStaticPrivate: protect GRealThread.private_data with a bit-lock

This could be replaced with uses of the global g_thread lock, or with busy-looping on a call to g_atomic_pointer_compare_and_exchange, if desired.
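Roughly, the pattern this patch introduces looks like this (the Thread struct is a cut-down, illustrative stand-in for GRealThread, and the lock bit number is an illustrative choice):

  #include <glib.h>

  #define PRIVATE_DATA_LOCK_BIT 0

  typedef struct
  {
    /* Must never be acquired while holding another lock
     * (particularly the g_thread lock). */
    volatile gint private_data_lock;
    GArray *private_data;
  } Thread;

  static GArray *
  thread_steal_private_data (Thread *thread)
  {
    GArray *array;

    g_bit_lock (&thread->private_data_lock, PRIVATE_DATA_LOCK_BIT);
    array = thread->private_data;
    thread->private_data = NULL;
    g_bit_unlock (&thread->private_data_lock, PRIVATE_DATA_LOCK_BIT);

    return array;
  }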
Created attachment 188497 [details] [review] [PATCH 5/5] Add a regression test for GNOME#642026

This is Attachment #188394 [details], turned into a patch that adds a regression test.
(In reply to comment #6)
> GStaticPrivate: protect GRealThread.private_data with a bit-lock
>
> This could be replaced with uses of the global g_thread lock, or busy-looping
> on a call to g_atomic_pointer_compare_and_exchange, if desired.

I'd particularly appreciate feedback from GThread maintainers on the relative costs of these locking mechanisms, and which one should be used here.

The original intention here seems to be that each thread could access its own private data extremely cheaply, without locks, but the addition of g_static_private_free (which also calls the destructors from "the wrong thread") makes that unsafe. I've tried to keep accesses within "the right thread" relatively cheap (there can never be contention for a GRealThread's bit-lock unless g_static_private_free is called, even if other threads are doing things that take the g_thread lock), while making g_static_private_free operate safely.
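For comparison, the busy-looping alternative I mean would look roughly like this (the BUSY sentinel and the cut-down Thread struct are illustrative, not from the patch):

  #include <glib.h>

  #define BUSY ((gpointer) 0x1)  /* illustrative "locked" sentinel */

  typedef struct
  {
    volatile gpointer private_data;  /* a GArray *, or BUSY while borrowed */
  } Thread;

  static GArray *
  thread_borrow_private_data (Thread *thread)
  {
    gpointer array;

    /* Spin until we atomically swap the current pointer for BUSY. */
    do
      array = g_atomic_pointer_get (&thread->private_data);
    while (array == BUSY ||
           !g_atomic_pointer_compare_and_exchange (&thread->private_data,
                                                   array, BUSY));

    return array;
  }

  static void
  thread_return_private_data (Thread *thread, GArray *array)
  {
    g_atomic_pointer_set (&thread->private_data, array);
  }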
Review of attachment 188493 [details] [review]: Obviously correct
Review of attachment 188494 [details] [review]: Yeah, clearly correct.
Review of attachment 188495 [details] [review]: Looks all fine to me.
Review of attachment 188496 [details] [review]:

::: glib/gthread.c
@@ +263,3 @@
+   * holding this (particularly on the g_thread lock). */
+  volatile gint private_data_lock;
+  GArray *private_data;

Good that you add a comment, but a tiny pet peeve of mine: I like to keep the closing */ lined up with the other stars, so move it to the next line.

Also, given that you add this extra int for the lock, there's no real need to use bit locks here; you could just use g_atomic_int_compare_and_exchange... Or avoid the extra int and use a low bit in private_data, but that is a bit more involved, I guess.
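Concretely, the comment layout I'm asking for in the hunk above would be something like:

   * holding this (particularly on the g_thread lock).
   */
  volatile gint private_data_lock;
  GArray *private_data;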
Review of attachment 188496 [details] [review]:

::: glib/gthread.c
@@ +263,3 @@
+   * holding this (particularly on the g_thread lock). */
+  volatile gint private_data_lock;
+  GArray *private_data;

Actually, I was talking nonsense here. Seems fine to use a bit lock.
Looking at this some more, I can't help but feel that the handling of private_key->index is not really safe either:

- It gets accessed without any protection in g_static_private_get.
- g_static_private_set takes the thread lock when setting it (arguably because the lock protects the global list of free indices).
- g_static_private_free reads it, and sets it to 0, outside the lock.

One thing that can clearly happen here is that set can race against free, and store into the key an index taken off the list of free indices after free has set the key to 0. That would probably only cause us to leak an index. I guess that, for correctness, private_key->index should be set atomically?
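Something like this, perhaps (Key is an illustrative stand-in for GStaticPrivate, not the real type):

  #include <glib.h>

  typedef struct
  {
    volatile guint index;  /* stand-in for GStaticPrivate.index */
  } Key;

  static guint
  key_get_index (Key *key)
  {
    return (guint) g_atomic_int_get ((volatile gint *) &key->index);
  }

  static void
  key_set_index (Key *key, guint idx)
  {
    g_atomic_int_set ((volatile gint *) &key->index, (gint) idx);
  }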
It also seems that you could have two calls of g_static_private_free on the same static private race against each other and put the same index on the free list twice, which would be bad news. It seems clear that static privates with a bounded lifetime need some external protection, like a rule that only the same thread that created them can free them, or so.
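For the free-list half of that, an atomic steal of the index would at least ensure that only one of two racing frees returns it to the free list. A sketch, using the same illustrative Key stand-in as above (this doesn't remove the need for an external lifetime rule):

  static guint
  key_steal_index (Key *key)
  {
    gint idx;

    /* Loop until we either observe 0 (someone else already freed the
     * key) or successfully swap a nonzero index for 0. */
    do
      idx = g_atomic_int_get ((volatile gint *) &key->index);
    while (idx != 0 &&
           !g_atomic_int_compare_and_exchange ((volatile gint *) &key->index,
                                               idx, 0));

    return (guint) idx;  /* 0: nothing to do; nonzero: we own the index */
  }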
Review of attachment 188493 [details] [review]: Committed
Review of attachment 188494 [details] [review]: Committed
Review of attachment 188497 [details] [review]: Committed
Review of attachment 188496 [details] [review]: Committed