GNOME Bugzilla – Bug 321886
GTK+ cannot be reliably used in multi-threaded applications
Last modified: 2011-02-04 16:11:01 UTC
The following construct is used throughout GTK+: static gboolean idle_callback (gpointer data) { SomeWidget *instance = data; GDK_THREADS_ENTER(); /* do something with instance */ GDK_THREADS_LEAVE(); return whatever; } static void some_widget_something (SomeWidget *instance) { instance->idle_id = g_idle_add(idle_callback, instance); } static void some_widget_finalize (GObject *object) { SomeWidget *instance = SOME_WIDGET(object); if (instance->idle_id) g_source_remove(instance->idle_id); (* G_OBJECT_CLASS (parent_class)->finalize) (object); } Timeline of events: 1) in thread x: the mt application grabs the GDK lock 2) in the main thread: idle_callback() is executed and waits to acquire the lock 3) in thread x: the mt application does something which causes the instance to be finalized 4) in the main thread: my_idle_callback() acquires the lock, proceeds, and crashes because the instance has been destroyed while it was waiting for the lock The attached conceptual patch provides an example solution for GtkTextView. If the solution is deemed acceptable, it should be generalized to other objects which are affected by the problem (note that the patch breaks the GTK+ ABI on 64-bit platforms; it is only a quick example).
Created attachment 54951 [details] [review] example solution
_gtk_safe_source_remove() errata: gboolean _gtk_safe_source_remove (GtkSafeSourceHandle *handle) { guint source_id; g_return_val_if_fail (handle != NULL, FALSE); g_return_val_if_fail (handle->source_id > 0, FALSE); source_id = handle->source_id; handle->source_id = 0; return g_source_remove (source_id); }
hmm. i see the problem, and i understand why the proposed solution works, but wit will be a lot of work to implement that throughout gtk. can't we find a ssolution or ar least some support for handling this situation in glib ?
Created attachment 55000 [details] [review] an alternative What about this one?
Owen, I know we talked about this bug, but I don't remember any conclusions. Do you ?
Owen reminded me: > I think what we decided was that there should be some way (GLib API > addition) to find out if the currently invoked GSource has been > destroyed.
Created attachment 65622 [details] [review] Alternative approach based on g_source_is_current_destroyed Ok, I've just been badly burnt by this bug, so here is an approach based on storing the current source in thread-local storage and exposing the ability to test whether the currently firing source has been destroyed. ... This still requires the review of all idle handlers within GTK+.
I think there's a race in your approach, because GLib dispatches sources with the context lock not held: thread 1: static gboolean idle_callback (gpointer data) thread 1: { thread 1: SomeWidget *self = data; thread 1: GDK_THREADS_ENTER (); thread 1: if (!g_main_is_current_source_destroyed ()) thread 1: { thread 2: g_source_remove(the_source); thread 1: /* boom */ Besides, I think gtk_idle_add_locked() would be more useful, since it removes the need to use the same construct in all the GDK-locked callbacks.
AIUI, that race is prevented by the reference counting on the source (raised when the source is added to the pending dispatches array). From my application POV having a pure glib method plays nicely with other uses of g_(timeout|idle)_add. However, I agree that a construct such as gtk_idle_add_locked() is required to simplify maintenance of GTK+ and widget libraries. Instead of supplying a new source_lock function we could use a custom g_idle_source that acquired the lock and performed the test before dispatching the callback. ie along the lines of static GSourceFuncs gtk_locked_idle_funcs = { g_idle_prepare, g_idle_check, gtk_locked_idle_dispatch, NULL }; static gboolean gtk_locked_idle_dispatch (GSource *source, GSourceFunc callback, gpointer user_data) { gboolean ret=FALSE; GDK_THREADS_ENTER(); if (!g_main_is_current_source_destroy ()) ret = callback (user_data); GDK_THREADS_LEAVE(); return ret; } GSource * gtk_locked_idle_source_new (void) { GSource *source; source = g_source_new (>k_locked_idle_funcs, sizeof (GSource)); g_source_set_priority (source, G_PRIORITY_DEFAULT_IDLE); return source; } guint gtk_locked_idle_add_full() { GSource *source; guint id; g_return_val_if_fail (function != NULL, 0); source = gtk_locked_idle_source_new (); if (priority != G_PRIORITY_DEFAULT_IDLE) g_source_set_priority (source, priority); g_source_set_callback (source, function, data, notify); id = g_source_attach (source, NULL); g_source_unref (source); return id; } and similarly for the timeouts.
Nevermind my comment; this morning I forgot that the finalize method should also be called with the appropriate lock held (GDK's in this case), so your construct is safe (my patch had the same construct anyway, but inside g_main_dispatch()). Regarding the API, I like your approach, it is more elegant / less cumbersome than my lock callbacks.
Created attachment 65721 [details] [review] glib patch for g_source_is_destroyed This is the GLib portion implementing the TLS storage as before. The difference is that I needed a g_source_set_funcs to overload g_(idle|timeout)_source_new for the new gtk_locked_(idle|timeout)_source.
Created attachment 65725 [details] [review] GTK+ patch to implement safer MT idle callbacks This patch creates a couple of new sources that acquire the GDK mutex and then test to see if the source is still valid before dispatching the user callback. Note that this patch requires g_source_set_funcs from the GLib patch, but does not actually use g_main_is_current_source_destroyed(). Though it might be cleaner to add a g_source_is_destroyed() rather than testing flags directly.
Created attachment 65728 [details] Test case And a test case...
Created attachment 65729 [details] Test case Again, without tabs.
I've committed the GLib patch, just changing g_main_is_current_source_destroyed by g_source_is_destroyed
urm, i really think the proposed solution (checking source "destroyed" state) is a bad hack around the real issue. to recap, the problem scenario was: 1) in thread x: the mt application grabs the GDK lock 2) in the main thread: idle_callback() is executed and waits to acquire the lock 3) in thread x: the mt application does something which causes the instance to be finalized 4) in the main thread: idle_callback() acquires the lock, proceeds, and crashes because the instance has been destroyed while it was waiting for the lock the problem here is really about 3. you shouldn't be able to destroy the instance while an idle handler is pending for it. usually this is taken care of by doing: static void idle_handler (void *data) { GObject *object = G_OBJECT (data); GDK_THREADS_ENTER(); /* do stuff here */ g_object_unref (object); GDK_THREADS_LEAVE(); return FALSE; } g_idle_add (idle_handler, g_object_ref (object)); and about comment #15, Chris patch originally also cared about the case where the passed in source is NULL (albeit wrongly because it returned FALSE not TRUE there). what's in glib now doesn't. also, i think the whole notion of g_main_current_source() is completely bogus, especially in the light of nested loops. if you wanted to support this properly, it should be a function on a main_context, and it needs a stack to cope properly with recursion. and finally, thinking of user code, i'd say it makes more sense to be able to query a source's "destroyed" state by id (that#d also much simply the attachment in comment #1). i think the changes that went into CVS here really should be reviewed by Owen, since i doubt he'd approve them. so i'm adding him to CC:, sorry for the spam if i'm mistaken.
Tim, I completely agree that the usual idiom is to hold a reference to the object. In fact, outside of GtkWidgets I tend to use: static void idle_handler (void *data) { MyObject *object = MY_OBJECT (data); /* do stuff here */ return FALSE; } g_idle_add_full (G_PRIORITY_DEFAULT_IDLE, idle_handler, g_object_ref (object), (GDestroyNotify)g_object_unref); Just in case I remember to destroy the GMainContext ;). However, on occasion, it reallly is simpler to only pass a weak reference to the idle handler, and it would be useful if Gtk+ provided some convenient support functions. And as noted, the original motivation for this bug is that the unsafe idiom is used throughout Gtk+. </preaching to the choir>
I don't really see how referencing the source the way you write it is going to work, especially for timeouts (cursor blink in GtkEntry, say). How do you remove the timeout when the widget is destroyed? You could wait for the timeout to run, and have the timeout notice itself that the widget is no longer focused or realized, and remove itself, but waiting the duration of the timeout (which in some circumstances could quite reasonably be 30 minutes or more) is reasonable. Chris's approach of using the GDestroyNotify passed to g_source_set_callback() seems to be a bit better. I don't see any fundamental problems with it. *But* you do have to make sure that all your idles and timeouts can deal with objects that have been destroyed and unrealized, which I doubt is even close to the case in GTK+. Being able to cancel a pending source is something we've always had in the GLib API, because it is an extremely useful capability. I'd like to be able to do that: - In a threadsafe manner - Preserving the same ordering guarantees that the GDK lock gives us elsewhere in GTK+. I'd like to be able to assume that once I call g_source_remove(source), the code in the source won't be run at some later point. When we discussed this issue earlier, we considered other ways of allowing source cancellation in a thread-safe manner; for example, passing a pointer to a mutex to GLib that it acquires before invoking the callback, but it gets really messy and ugly. Being able to check whether the currently invoked source is destroyed was the simplest and cleanest solution we could come up with. I don't see any issue with g_main_current_source() as a concept ... it's a perfectly well defined concept per thread, and if you check it at the beginning of a source callback, it's very clear which source you mean.
BTW - the GTK+ API I'd add here would be something along the lines of: void gdk_threads_set_source_callback (GSource *source, GSourceFunc func, gpointer data, GDestroyNotify notify); // Convenience guint gdk_threads_add_timeout (guint interval, GSourceFunc function, gpointer data); // Convenience guint gdk_threads_add_idle (GSourceFunc function, gpointer data); Repeating another 5 lines of code dozens or hundreds of places over GTK+ is clearly a bad idea; the patch might even be an overall code size reduction by centralizing the GDK_THREADS_ENTER/LEAVE... The documentation content referring to GDK_THREADS_ENTER/LEAVE and GSource callbacks should likely be updated as well.
Owen, i'm not really arguing against g_source_destroyed(). and yes, you have a point in that in some cases ref counting isn't good enough. however note that (also adressing comment #17): - passing g_object_unref as GDestroyNotify function to the source does *not* work. the problem here is that when g_object_unref is called you don't know whether the gdk lock is being held, because the last source reference could be dropped by some g_source_remove() inside the gdk lock, or by the gmain.c code after dispatching without the lock being held. - i think having g_main_current_source() work only at the start of the dispatch callback is a very bad idea. at least, with the current code in gmain.c, i don't see how: dispatch (void*) { GSource *s1 = g_main_current_source(); g_main_loop_run (other_loop); /* could also recurse */ GSource *s2 = g_main_current_source(); g_assert (s1 == s2); } will hold. i'd think breaking this assertion intentionally will result in many astonished users of g_main_current_source(), but maybe that's just me and your users are going to be more clever... - what is the reasoning for not providing g_main_current_source() as a function on GMainContext? if convenience is a concern, g_main_current_source() could be implemented in terms of: GSource* g_main_current_source (void) { /* getting current context from a per-thread context stack */ GMainContext *mc = g_main_context_get_current(); return g_main_context_get_current_source (mc); }
Hmm, true about the g_object_unref() comment ... I knew there was some reason that pattern made me uncomfortable other than the lack of ordering. Most GTK+ finalizers are probably pretty safe to call outside of the GDK lock, but I wouldn't really recommend it. Talking to rambokid on IRC, we both find the naming GSList *source; in GMainDispatch a little confusing when reading the code, but since, despite the name, it is a stack of sources, there shouldn't be a problem with "only at the start of a dispatch callback". And with the stack, we can hold off on adding a function to get the current source for a specified context until someone comes up with a use case.
The intention inside GLib was to maintain a stack of sources in order to support mainloop recursion: ==== g_main_dispatch () current->depth++; current->source = g_slist_prepend (current->source, source); need_destroy = ! dispatch (source, callback, user_data); current->source = g_slist_remove (current->source, source); current->depth--; ==== where the g_slist_remove could have been: g_assert (current->source->data == source); { GSList *l = current->source; current->source = l->next; g_slist_free1 (l); } So I believe your g_assert (s1 == s2) holds. The issue with providing an acessor on GMainContext is that we must cater for the simple minded user of g_timeout_add who does not know his context. Owen, could you flesh some more of your idea of the Gtk+ API. I am interested to see if there is a cleaner approach than overloading g_timeout_funcs.
ok, so apart from renaming current->source to make its stack nature more obvious, and maybe hinting in the documentation that it is intended to be used inside a source callback, are there other gripes with whats currently in gmain.c ?
re comment #23: nope, not anymore. we figured we'd wait for a use case before changing the API.
Chris: I don't think anything very complex is neeed. I wouldn't dig into GLib internals and write custom sources either for efficiency or to get a high degree of abstraction. As far as I can see, you can easily write, say, a gdk_threads_add_idle() that creates a callback structure, and does: GDK_THREADS_ENTER(); if (/* source isn't destroyed */) callback->func(callback->data); GDK_THREADS_LEAVE(); (The gdk_threads_set_source_callback() I have above, is, upon reflection, not easily implementable, because you can't generically call the callback. So, probably we can just add the simple versions. You *can* always add the g_source_destroyed() check directly into the end code if needed. I'm only really comfortable with the techniques in g_source_set_closure() because of GLib and GObject are distributed together as a single package.)
Matthias, here's a gripe... My patch used g_slice for a g_static_private without specifying a destroyer. If we invoke g_slice_free during the GDestroyNotify then we risk recursing in the thread cleanup logic, so it seems safer to use a simple g_new() instead. ~ line 1721 - dispatch = g_slice_new0 (GMainDispatch); - g_static_private_set (&depth_private, dispatch, NULL); + dispatch = g_new0 (GMainDispatch); + g_static_private_set (&depth_private, dispatch, g_free);
(In reply to comment #26) > Matthias, here's a gripe... > > My patch used g_slice for a g_static_private without specifying a destroyer. ok, yes. seems your patch leaks one GMainDispatch per thread that way. > If we invoke g_slice_free during the GDestroyNotify then we risk recursing in > the thread cleanup logic, so it seems safer to use a simple g_new() instead. here, you talk about the GDestroyNotify of g_static_private_set() right? i don't understand how that could recurse, can you please elaborate?
Well I may have confused myself, it wouldn't be the first time... I was under the presumption that the GStaticPrivate would be destroyed by g_thread_cleanup, and by calling g_slice_free which would in turn call g_thread_self and potentially restore the GPrivate associated with g_thread_self. Add to that my vague memory of seeing some code that looped over G*Privates inside the thread cleanup to handle such cases... However, I can not find any trace of any such behaviour inside GThread. My apologies, I'm confusing pthread_[sg]etspecific behaviour with G*Private. So calling g_slice_free in a G*Private destructor will result in a call to g_thread_self after pthread has nullified its associated GPrivate TSD key, resulting in a new GRealThread (and one more loop around the pthread cleanup). [I think.]
Nope, wrong again. g_slice_free doesn't call g_thread_self but g_private_get... So instead of allocating a new GRealThread we would allocate an entire new pair of magazines and then push our solitary bit of memory into a slot. Loop around the pthread cleanup and then return it back to the slab. [I think ;]
(In reply to comment #28) > I was under the presumption that the GStaticPrivate would be destroyed by > g_thread_cleanup, yes, in a loop over all GStaticPrivate items. > My apologies, I'm confusing pthread_[sg]etspecific behaviour with G*Private. (In reply to comment #29) > g_slice_free doesn't call g_thread_self but g_private_get... right. note the difference between GPrivate and GStaticPrivate, you can still use GSlice after all GStaticPrivate slots have been destroyed. > So instead of allocating a new GRealThread we would allocate an entire new > pair of magazines and then push our solitary bit of memory into a slot. Loop > around the pthread cleanup and then return it back to the slab. [I think ;] nope ;) the g_private_get() executed upon g_slice_free() will do no allocation whatsoever. it just accesses a GPrivate item that has already been setup earlier when you did g_slice_alloc() (so the magazines, etc. are of course still allocated). i think your confusion just comes from treating GPrivate and GStaticPrivate the same. they are not, GStaticPrivate is what users are supposed to use and GSlice will work before and after GStaticPrivate is available to user code.
Created attachment 66860 [details] [review] Implement gdk_threads_add_idle and friends Rebased patch on current glib. Renamed functions to match Owen's hints. gdk_threads_add_idle is probably as good as it gets - it is quite distinct to the deprecated gtk_idle_add and clearly shows the link between it and the gdk_threads_mutex. The patch still uses the original method of a new source dispatch func. I'll give Owen's suggestion a whirl shortly...
Created attachment 66862 [details] [review] Alternative gdk_threads_add_idle This patch avoids creating a pair of custom sources. Instead it creates a simple callback for g_idle_add that performs the wrapping for the real callback.
given that this is not very useful without the extra work to convert all of GTK+ and GDK to it, I'll put it on 2.12 API freeze, with the intention of merging it early in the 2.12 cycle.
*** Bug 350076 has been marked as a duplicate of this bug. ***
Created attachment 70307 [details] [review] First attempt at full conversion This patch is mostly a blind-conversion of any place calling g_(timeout|idle)_add and wrapping its callback with GDK_THREADS_(ENTER|LEAVE). I've not converted the few places where custom sources had been used. Indeed their number suggests that there is a need for gdk_threads_idle_add_object. During the conversion a couple of real bugs in the source tree (ie places where GDK_THREAD_(ENTER|LEAVE) were required but had been forgotten...). demos/gtk-demo/images.c | 2 demos/gtk-demo/pixbufs.c | 2 demos/pixbuf-demo.c | 2 demos/testanimation.c | 2 demos/testpixbuf.c | 2 docs/tools/widgets.c | 2 examples/gtkdial/gtkdial.c | 4 examples/progressbar/progressbar.c | 2 gdk/directfb/gdkwindow-directfb.c | 4 gdk/gdk.c | 206 +++++++++++++++++++++++ gdk/gdk.h | 17 + gdk/gdk.symbols | 4 gdk/gdkwindow.c | 4 gdk/win32/gdkinput-win32.c | 2 gtk/gtkbutton.c | 6 gtk/gtkcalendar.c | 8 gtk/gtkcellrenderertext.c | 6 gtk/gtkclipboard.c | 4 gtk/gtkclist.c | 12 - gtk/gtkcombo.c | 3 gtk/gtkcombobox.c | 24 -- gtk/gtkcontainer.c | 6 gtk/gtkdnd.c | 18 -- gtk/gtkentry.c | 30 --- gtk/gtkexpander.c | 8 gtk/gtkfilechooserdefault.c | 6 gtk/gtkfilesystemunix.c | 12 - gtk/gtkfilesystemwin32.c | 18 -- gtk/gtkicontheme.c | 6 gtk/gtkiconview.c | 24 -- gtk/gtkimage.c | 8 gtk/gtklist.c | 12 - gtk/gtkmenu.c | 22 -- gtk/gtkmenuitem.c | 6 gtk/gtknotebook.c | 26 -- gtk/gtkpathbar.c | 8 gtk/gtkprintoperation-unix.c | 3 gtk/gtkprintoperation-win32.c | 4 gtk/gtkprintoperation.c | 29 --- gtk/gtkprintunixdialog.c | 2 gtk/gtkrange.c | 13 - gtk/gtkrecentchooserdefault.c | 12 - gtk/gtkrecentchoosermenu.c | 6 gtk/gtkrecentmanager.c | 4 gtk/gtkselection.c | 19 -- gtk/gtkspinbutton.c | 10 - gtk/gtkstatusicon.c | 2 gtk/gtktext.c | 6 gtk/gtktextview.c | 38 +--- gtk/gtktoolbar.c | 5 gtk/gtktooltips.c | 8 gtk/gtktreeview.c | 75 +------- gtk/gtkuimanager.c | 5 gtk/gtkwindow.c | 4 modules/printbackends/cups/gtkprintbackendcups.c | 7 tests/autotestfilechooser.c | 4 tests/stresstest-toolbar.c | 2 tests/testassistant.c | 2 tests/testcombo.c | 2 tests/testdnd.c | 8 tests/testentrycompletion.c | 2 tests/testmenus.c | 2 tests/testmerge.c | 2 tests/testsocket_common.c | 2 tests/teststatusicon.c | 4 tests/testtext.c | 2 tests/testtoolbar.c | 4 tests/testtreeflow.c | 2 68 files changed, 389 insertions(+), 429 deletions(-)
Thanks for the patch. I'll look at it once we have branched.
2006-12-22 Matthias Clasen <mclasen@redhat.com> * gdk/gdk.symbols: * gdk/gdk.h: * gdk/gdk.c: Add functions to allow threadsafe handling of idles and timeouts wrt. to the GDK lock. (#321886, Chris Wilson)
2006-12-22 Matthias Clasen <mclasen@redhat.com> * *.c: Replace a lot of idle and timeout calls by the new gdk_threads api.
I currently find 5 places where we construct our own source with g_cclosure_new_object. Not sure if it is worth adding a gdk_threads_add_idle_object for that.