After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 681334 - Add GObject API to move finalizers to the main context
Add GObject API to move finalizers to the main context
Status: RESOLVED OBSOLETE
Product: glib
Classification: Platform
Component: gobject
unspecified
Other All
: Normal enhancement
: ---
Assigned To: gtkdev
gtkdev
Depends on:
Blocks:
 
 
Reported: 2012-08-06 20:40 UTC by Colin Walters
Modified: 2018-05-24 14:26 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
WIP: Clean up GdkCursorX11 in default main context (3.53 KB, patch)
2012-08-06 20:40 UTC, Colin Walters
rejected Details | Review
gobject: Add g_object_class_set_flags() with FINALIZE_IN_DEFAULT_CONTEXT (12.00 KB, patch)
2012-08-08 00:21 UTC, Colin Walters
rejected Details | Review
gobject: Add g_object_class_set_flags() with DISPOSE_IN_DEFAULT_CONTEXT (14.59 KB, patch)
2012-08-12 16:25 UTC, Colin Walters
rejected Details | Review

Description Colin Walters 2012-08-06 20:40:14 UTC
Language bindings with threaded garbage collectors may invoke
g_object_unref() in a non-default thread.  GDK should automatically
handle this by proxying destruction of thread-unsafe native calls
to the default main context.
Comment 1 Colin Walters 2012-08-06 20:40:17 UTC
Created attachment 220496 [details] [review]
WIP: Clean up GdkCursorX11 in default main context
Comment 2 Colin Walters 2012-08-06 20:46:17 UTC
See thread following from https://mail.gnome.org/archives/gtk-devel-list/2012-August/msg00008.html
Comment 3 Colin Walters 2012-08-07 21:08:12 UTC
So in reality, GTK+ depends on being run in the NULL context.  It might be reasonable then to add a GObject class feature which invoked finalizers in the default context.
Comment 4 Colin Walters 2012-08-08 00:21:53 UTC
Created attachment 220620 [details] [review]
gobject: Add g_object_class_set_flags() with FINALIZE_IN_DEFAULT_CONTEXT

To make GTK+ work with language bindings that have threaded garbage
collectors, this flag is the easiest path forward.

See https://mail.gnome.org/archives/gtk-devel-list/2012-August/msg00008.html
Comment 5 Steve Frécinaux 2012-08-08 07:44:16 UTC
Review of attachment 220620 [details] [review]:

::: gobject/gobject.c
@@ +2952,3 @@
+#ifdef	G_ENABLE_DEBUG
+  IF_DEBUG (OBJECTS)
+  {

Indentation is wrong

::: gobject/gobject.h
@@ +422,3 @@
+void        g_object_interface_install_property (gpointer     g_iface,
+						 GParamSpec  *pspec);
+

I think you duplicated this definition by mistake.
Comment 6 Benjamin Otte (Company) 2012-08-08 10:10:31 UTC
The patch is missing handling for dispose, no?
Also, do we have any numbers about performance implications?

For the record: I consider this complete crack. Making gobject hide the problems of threading when combining external libraries is a sure path to debugging madness.
Comment 7 Colin Walters 2012-08-08 13:54:21 UTC
(In reply to comment #6)
> The patch is missing handling for dispose, no?

I don't think we need that - GObject dispose handlers should just basically be calling g_object_unref() on other GObjects.  Unthreadsafe cleanup should happen in finalize.

Are you aware of code which has thread-unsafe dispose?

> Also, do we have any numbers about performance implications?

Not yet.

> For the record: I consider this complete crack. Making gobject hide the
> problems of threading when combining external libraries is a sure path to
> debugging madness.

What do you suggest instead?
Comment 8 Allison Karlitskaya (desrt) 2012-08-08 13:55:39 UTC
This feature really scares the hell out of me as well.

If we are going to be injecting GMainContext support into the base object system (which is something that might actually make sense, in my opinion) then we need to do one hell of a better job than this...
Comment 9 Colin Walters 2012-08-08 13:58:51 UTC
(In reply to comment #8)

> If we are going to be injecting GMainContext support into the base object
> system (which is something that might actually make sense, in my opinion) then
> we need to do one hell of a better job than this...

-EVAGUE

What are you concretely suggesting?
Comment 10 Benjamin Otte (Company) 2012-08-08 14:12:54 UTC
(In reply to comment #7)
> Are you aware of code which has thread-unsafe dispose?
> 
Wanna vouch that http://git.gnome.org/browse/gtk+/tree/gtk/gtkwidget.c?h=3.4.0#n10320 does not have any threading issues? Note that we call back into windowing system vfuncs in at least hide() and unrealize(). But I think we also do in the destroy signal.

> > For the record: I consider this complete crack. Making gobject hide the
> > problems of threading when combining external libraries is a sure path to
> > debugging madness.
> 
> What do you suggest instead?
>
Fix the external libraries instead.

The approach of "let's support crack" was tried in early GStreamer. It didn't work.
The biggest problem I have with that is the implications. If Python runs threaded GC and OpenGL can't cope with it, then it's suddenly my bug as a GTK developer to fix this? Why? Go fix GL or Python!
Comment 11 Pavel Holejsovsky 2012-08-08 14:25:44 UTC
(In reply to comment #10)
> The biggest problem I have with that is the implications. If Python runs
> threaded GC and OpenGL can't cope with it, then it's suddenly my bug as a GTK
> developer to fix this? Why? Go fix GL or Python!

python's GC run in separate thread is just tip of the iceberg.  It can be demostrated also e.g. with Lua, which runs incremental GC always in the context of the thread executing Lua context (and there can be only one thread execting single Lua context). The problem can appear when callback from C to Lua happens from the worker thread.  This can be e.g. signal from gstreamer, or (as suggested in ML) even some Gio callback, like GCancellable::cancelled or GDBusInterfaceSkeleton::g-authorize-method

In this case, if GC kicks in while executing the callback, the g_object_unref() might be invoked from worker thread. In this scenario, there is no GL, no external thread spawned by some external library.  

IMHO the problem arises because glib supports, encourages and uses multithreading programming, while gtk/gdk wants to avoid it like hell.  So i believe that it is reasonable to try to solve the problem on the glib/gobject/gtk level, because it is actually caused by the interaction of these components.

Of course, you might argue "fix the GC to not kick in when running in the bad thread", but this is dangerously close to "we don't support binding to languages using GC", which naturally leads to "we support only C and Vala".
Comment 12 Benjamin Otte (Company) 2012-08-08 14:41:09 UTC
(In reply to comment #11)
> IMHO the problem arises because glib supports, encourages and uses
> multithreading programming, while gtk/gdk wants to avoid it like hell.
> 
And this is exactly the problem I want to avoid.

It's not a GTK problem. GTK code itself doesn't care at all from where you call whatever function. It works perfectly well in a threaded context.

The problem is libX11, OpenGL or the win32 API we build the toolkit on. And I don't want to be responsible for hacking around documented limitations in those libraries.

There's an easy fix for your problem btw: Annotate functions that may run from a random thread in the introspection and handle them in a special way (make Lua not invoke the GC or marshal things back into the main context or don't even support them).
Comment 13 Allison Karlitskaya (desrt) 2012-08-08 14:55:30 UTC
so GObject has a pretty big problem.  We made (in my opinion) the ancient mistake of making ref() and unref() be threadsafe.

What this _actually_ means is that it is possible to create threadsafe classes based on GObject, in which it is safe to call all sorts of methods (including ref() and unref()) from any thread.

What people take this to mean, in practice, is that it is safe to call ref() and unref() from any thread for any object of any class.  That's a reasonable assumption, but it's actually quite false at present.  The reason for that is because we generally do not consider it reasonable that dispose() or finalize() could find itself running in any thread at all.

The problem gets even worse with GDK threads because now you should probably be holding the GDK lock in order to call unref() on (for example) a widget...

I'm not sure we can ever fix this problem fully without some very substantial changes, but I think the correct fix in short-term is to stop assuming that it's okay to call unref() from arbitrary threads on arbitrary objects.  The same applies to GDestroyNotify and various other 'clean-up' type code that the caller assumes can be run anywhere and the callee generally assumes will only be run in "their" thread.

A more complete fix in my opinion would be to group GObject classes into disjoint sets: threadsafe and non-threadsafe ones.  Ideally this would be easily identified by the type system and we could have (for example) non-atomic refcounting operations on the non-threadsafe objects.  We should probably do it in a way that it's possible to "upgrade" non-threadsafe objects to being threadsafe objects in the future without users of the API noticing.  That's dreamer territory for the time being.

We can also talk about giving each (threadsafe) object an owner context.  It would be essential that _any_ context could own an object (not just hard-coding to the global default).  We could possibly run finalize() and dispose() from this context (although in a world where objects are aware of the fact that they should be implemented in a thread-safe way, I'm not sure why this would be necessary).  More usefully, we would probably want to support dispatch-to-thread functionality in GSignal (since many objects in GIO are already implementing this pattern of each object having a 'home' thread and doing all signal dispatches from it).

In short, I think there are really only two solutions to your problem:

 - if python implementations of GObjects are not threadsafe then you should
   start harassing people who are calling unref() on your objects from strange
   threads

 - if you don't want to do this, then you should consider that your objects
   are threadsafe, in which case you should deal with finalize() being called
   from any thread.  At the very worst, why can't you just do an internal
   dispatch to deal with this?
Comment 14 Dan Winship 2012-08-08 15:00:34 UTC
(In reply to comment #7)
> Are you aware of code which has thread-unsafe dispose?

GtkObjects run destroy() from dispose(), so basically any widget's dispose needs to be run in the main context.

(In reply to comment #13)
> I think the correct fix in short-term is to stop assuming that
> it's okay to call unref() from arbitrary threads on arbitrary objects.  The
> same applies to GDestroyNotify and various other 'clean-up' type code that the
> caller assumes can be run anywhere and the callee generally assumes will only
> be run in "their" thread.

Yeah. Like, in particular, anything that uses g_simple_async_result_run_in_thread() needs to make sure that it does not free any data from within its GSimpleAsyncThreadFunc, etc.
Comment 15 Allison Karlitskaya (desrt) 2012-08-08 15:03:07 UTC
Sorry.  I got that entirely backwards by not properly reading the bug description.  The problem is that the language binding in question is the one exhibiting the bad behaviour of calling unref() on arbitrary GObjects from strange threads.

This is just totally and completely a bug.  Language bindings should simply not be doing that.
Comment 16 Pavel Holejsovsky 2012-08-08 15:08:54 UTC
(In reply to comment #12)
> It's not a GTK problem. GTK code itself doesn't care at all from where you call
> whatever function. It works perfectly well in a threaded context.
> 
> The problem is libX11, OpenGL or the win32 API we build the toolkit on. And I
> don't want to be responsible for hacking around documented limitations in those
> libraries.

I think we both agree upon "let's fix things at the place they are broken". 
The problem here is that the things are broken on the layer which is not easily
fixable (libX11, GL) if fixable at all (win32).  I believe that we should
extend the rule above to "let's fix things at the place they are broken, or at
the nearest possible place, so that the breakage doesn't spread across the
stack".  This results in an attempt to fix problems in gtk/gobject as
implemented by Colin's patch.  Your attitude is "its not broken in my code, go
fix it elsewhere".  With this attitude, many problems are gradually shifted to
upper layers (higher level libs and apps), causing to reimplement workarounds
for such problem again and again, usually with lots of bugs etc.

In summary, I believe that it is more sane to have workaround for broken
libx11/win32/GL inside gdk (which directly encapsulates it) instead of an
application using gtk.

Sorry for getting too much into offtopic/handwaving territory, I'll stop now.
Comment 17 Pavel Holejsovsky 2012-08-08 15:16:15 UTC
(In reply to comment #15)
> Sorry.  I got that entirely backwards by not properly reading the bug
> description.  The problem is that the language binding in question is the one
> exhibiting the bad behaviour of calling unref() on arbitrary GObjects from
> strange threads.
> 
> This is just totally and completely a bug.  Language bindings should simply not
> be doing that.

This is one possible solution, I've tried to explain why it might not be good idea to force marshal every g_object_unref() into main context in https://mail.gnome.org/archives/gtk-devel-list/2012-August/msg00038.html
Comment 18 Allison Karlitskaya (desrt) 2012-08-08 15:26:19 UTC
I consider the original proposal from the linked thread to be a bit more sane than what is being discussed here, but only a bit.

Having a qdata slot for an optional "owner" GMainContext is quite similar to what I was discussing above.  Let's say we use this on an advisory basis to tell threads "you must not call methods on this object except while in a thread that is the (acquired) owner of this main context".

Now we are somewhere: the GC of the binding can now query this qdata to discover if must use g_main_context_invoke() in order to do the unref().  Note: the ENTIRE unref() really needs to be done in the other thread -- there are all kinds of things other than finalize: dispose(), signal disconnects with destroynotifies, qdata free functions, weakref callbacks, toggleref notifies, etc.

As an optimisation we could have a function that implements a 'fast path' for unref() in the "boring" cases (ie: refcount > 2, or refcount > 1 and not using toggle refs) and uses g_main_context_invoke() to normal unref() for all other cases.  This function could be unref() itself, but I think that I don't like that...

I still this that this entire proposal is just a bit crazy.  Why not capture the maincontext at the time you get handed the object in the first place and make the assumption that it's always safe to call from this thread?  You could store the qdata for yourself... it seems that the answer to this question is "because performance" which is fair enough.  In that case it seems that what you really need is a way to tell the threadsafe objects from the non-threadsafe ones, and then we're back at my pie-in-the-sky proposal.
Comment 19 Dan Winship 2012-08-08 16:05:36 UTC
(In reply to comment #18)
> I still this that this entire proposal is just a bit crazy.  Why not capture
> the maincontext at the time you get handed the object in the first place and
> make the assumption that it's always safe to call from this thread?

That doesn't actually work. There's no guarantee that the language binding will see the object for the first time in its "home" thread. (Eg, the object may be created by C code in the main thread, but then first seen by the binding as an argument of a signal emitted from another thread.)
Comment 20 Allison Karlitskaya (desrt) 2012-08-08 16:24:28 UTC
That's true, but I'd argue that the bug there is in passing a pointer to an object through a public API in a thread in which it is not permitted to call back into the object (for even ref/unref purposes).  Can you think of a single public API that does that?  If you can, I'd argue that it's broken.
Comment 21 Colin Walters 2012-08-08 17:17:53 UTC
(In reply to comment #15)
> ... The problem is that the language binding in question is the one
> exhibiting the bad behaviour of calling unref() on arbitrary GObjects from
> strange threads.
> 
> This is just totally and completely a bug.  Language bindings should simply not
> be doing that.

There are a lot of advantages to concurrent GC.  For example, it can *significantly* decreate latency, which is really important to us as client-side software developers.

While I wouldn't call it out of the question to conclude in this bug that bindings should be proxying g_object_unref to the main context, do realize it has a lot of performance implications, and if there's a better solution possible (which might involve patching *both* bindings and GTK+/glib), it's well worth searching for it.
Comment 22 Allison Karlitskaya (desrt) 2012-08-08 17:48:22 UTC
(In reply to comment #21)
> While I wouldn't call it out of the question to conclude in this bug that
> bindings should be proxying g_object_unref to the main context, do realize it
> has a lot of performance implications, and if there's a better solution
> possible (which might involve patching *both* bindings and GTK+/glib), it's
> well worth searching for it.

I agree 100% with everything said here.  The proposed patch is totally insufficient, however.
Comment 23 Dan Winship 2012-08-08 18:04:23 UTC
(In reply to comment #20)
> That's true, but I'd argue that the bug there is in passing a pointer to an
> object through a public API in a thread in which it is not permitted to call
> back into the object (for even ref/unref purposes).  Can you think of a single
> public API that does that?  If you can, I'd argue that it's broken.

Well, first, as has been noted above, we don't currently have a concept of "thread where you're not allowed to ref/unref an object".

I don't have a specific example, but all it takes is an async op that uses run_in_thread() and then at some point during the operation either directly or indirectly causes an object-valued property of some object which is known to the binding to be set to some other object which is not yet known to the binding, which the binding might then observe via a ::notify handler.

(Of course there is probably lots of code that doesn't deal well with getting a ::notify in an unexpected thread...)
Comment 24 Benjamin Otte (Company) 2012-08-08 19:36:46 UTC
(In reply to comment #22)
> (In reply to comment #21)
> > While I wouldn't call it out of the question to conclude in this bug that
> > bindings should be proxying g_object_unref to the main context, do realize it
> > has a lot of performance implications, and if there's a better solution
> > possible (which might involve patching *both* bindings and GTK+/glib), it's
> > well worth searching for it.
> 
> I agree 100% with everything said here.  The proposed patch is totally
> insufficient, however.
>
I don't.

It assumes that fine-grained threading support is a good idea and I still don't buy that. I do buy that marshaling jobs into a separate thread is a good idea, but only if there's a well-established and well-understood protocol. "Random vfuncs of your object may be called from random threads" does not qualify here.

For example if garbage collection is happening in a different thread, it's that thread's job to establish a protocol to communicate with the other threads about which objects to destroy. "Let's just make all functions synchronized" is not it.
In particular because no library developer will ever test it and it will end up in the current situation where everyone is just hand-waving for as long as possible.
Comment 25 Colin Walters 2012-08-08 20:37:54 UTC
(In reply to comment #24)
> "Let's just make all functions synchronized" is not it.

We're not talking about "all functions" - just dispose/finalize handlers.

I'm leaning a lot more towards a variant of my original patch which keeps the "flag classes with thread-unsafe dispose/finalize handlers" approach, but does what comment #18 suggests - change g_object_unref() to check if the old refcount == 1 and the flag is set, move dispose to the default context.

Should we support arbitrary contexts?  I'm not aware of anyone who would want that currently.  But it is worth considering.
Comment 26 Colin Walters 2012-08-08 20:39:31 UTC
Comment on attachment 220620 [details] [review]
gobject: Add g_object_class_set_flags() with FINALIZE_IN_DEFAULT_CONTEXT

Per discussion, needs to handle dispose at least too.
Comment 27 Benjamin Otte (Company) 2012-08-08 20:46:46 UTC
(In reply to comment #25)
> We're not talking about "all functions" - just dispose/finalize handlers.
> 
We're mostly talking "all the code called when finalizing: dispose, finalize, destroy, signal handlers, weak refs, unmap, unrealize, state changes etc. That's a whole lot of things. And it's not at all clear _how_ to achieve that thread safety.
Comment 28 Colin Walters 2012-08-08 20:55:43 UTC
(In reply to comment #27)
> (In reply to comment #25)
> > We're not talking about "all functions" - just dispose/finalize handlers.
> > 
> We're mostly talking "all the code called when finalizing: dispose, finalize,
> destroy, signal handlers, weak refs, unmap, unrealize, state changes etc.
> That's a whole lot of things. And it's not at all clear _how_ to achieve that
> thread safety.

If we proxy dispose/finalize to the default main context, then everything they call that you're talking about becomes safe.
Comment 29 Colin Walters 2012-08-08 21:06:55 UTC
I talked to irc.mozilla.org:#jsapi - in newer SpiderMonkey, the GC finalizers are apparently automatically proxied to the main thread:

http://dxr.mozilla.org/mozilla-central/js/src/jsgc.cpp.html#l1709

But from what I can tell of the current js185, that's not true.
Comment 30 Colin Walters 2012-08-12 16:25:40 UTC
Created attachment 220946 [details] [review]
gobject: Add g_object_class_set_flags() with DISPOSE_IN_DEFAULT_CONTEXT

To make GTK+ work with language bindings that have threaded garbage
collectors, this flag is the easiest path forward.

https://bugzilla.gnome.org/show_bug.cgi?id=681334

See https://mail.gnome.org/archives/gtk-devel-list/2012-August/msg00008.html
Comment 31 Colin Walters 2012-08-13 15:10:32 UTC
Pavel, can you try testing this against Lua?  What would really help us is to get a reproducible crasher.

I think a Gjs script which just allocates a lot of objects (say GdkCursor) that have thread-unsafe dispose in a loop might do it.
Comment 32 Allison Karlitskaya (desrt) 2012-08-14 04:21:34 UTC
I still don't like any API that only works for the default context.
Comment 33 Pavel Holejsovsky 2012-08-14 06:43:05 UTC
(In reply to comment #32)
> I still don't like any API that only works for the default context.

We can kind of combine Colin's and my proposal; have defined gdata slot which contains GMainContext instance to run dispoase with, and if it is not present, use default context.  This way, there is a possibility to use non-default context, but for the major cases where default context is to be used, no memory is wasted (for storing default slot).

The slot could be set during object construction with (pseudocode):

if ((class_flags & DISPOSE_IN_CONTEXT) != 0 && default_slot != thread_default_slot)
  g_object_set_data (obj, CONTEXT_ID, thread_default_slot);
Comment 34 Pavel Holejsovsky 2012-08-14 06:49:50 UTC
(In reply to comment #31)
> Pavel, can you try testing this against Lua?  What would really help us is to
> get a reproducible crasher.

I'll try to create Lua crasher :-).  

The question is, do we mark *every* Gtk/Gdk object with DISPOSE_IN_MAIN_CONTEXT flag, or only Gdk objects wrapping native handles which are freed during last unref() ?

I always thought that the latter would be the case, but then we have problem that threaded GC might invoke some Gtk object finalizer concurrently with main thread invoking some other Gtk code, and Gtk will (rightly) not like it.  But since practically none GObject-based library provides full thread safety (i.e. synchronization on the level of individual objects), we would end up marking practically every GObject-based object with DISPOSE_IN_MAIN_CONTEXT flag.
Comment 35 Colin Walters 2012-08-14 13:40:33 UTC
(In reply to comment #34)
>
> I always thought that the latter would be the case, but then we have problem
> that threaded GC might invoke some Gtk object finalizer concurrently with main
> thread invoking some other Gtk code, and Gtk will (rightly) not like it.

Concrete example?  If we only mark objects with unsafe dispose, it'd be completely fine for the GC thread to dispose an object which then calls g_object_unref() on an object marked with _DISPOSE_IN_MAIN_CONTEXT, because it'll automatically be proxied.
Comment 36 Pavel Holejsovsky 2012-08-14 15:09:24 UTC
(In reply to comment #35)
> Concrete example?  If we only mark objects with unsafe dispose, it'd be
> completely fine for the GC thread to dispose an object which then calls
> g_object_unref() on an object marked with _DISPOSE_IN_MAIN_CONTEXT, because
> it'll automatically be proxied.

But it will not be fine if GC thread disposes an object which invokes some other methods of some other GTK object in its dispose/finalize.  An artificial example would be some GTK object which modifies state of another GTK widget.  I'm sorry for not providing the example in existing sources, I'm in a bit time pressure now, but I hope that my theoretical example illustrates my concern.  I'm just not sure if is it legal to call methods of another GTK object in object's finalizer.
Comment 37 Colin Walters 2012-08-14 16:21:08 UTC
(In reply to comment #36)
> (In reply to comment #35)
> > Concrete example?  If we only mark objects with unsafe dispose, it'd be
> > completely fine for the GC thread to dispose an object which then calls
> > g_object_unref() on an object marked with _DISPOSE_IN_MAIN_CONTEXT, because
> > it'll automatically be proxied.
> 
> But it will not be fine if GC thread disposes an object which invokes some
> other methods of some other GTK object in its dispose/finalize.  An artificial
> example would be some GTK object which modifies state of another GTK widget. 
> I'm sorry for not providing the example in existing sources, I'm in a bit time
> pressure now, but I hope that my theoretical example illustrates my concern. 

That's a good point; I guess it's similar to what Benjamin was saying, but I didn't quite understand the scenario.

So for these sorts of cases, I see three options:

1) Remember, we only need to annotate root classes.  So if we say e.g. GtkWidget should be disposed in the default context, then *all subclasses* will too.  This should cover a lot of things.
2) We can modify classes "manually" to do the proxying, as my original GTK+ patch did
3) Fix whatever the data is to have an internal mutex
Comment 38 Jasper St. Pierre (not reading bugmail) 2013-01-07 20:54:11 UTC
This came up with gjs/gnome-shell. The GDBus worker thread simply calls g_object_ref on the GDBusConnection, which will flip a toggle notify, which will get called in the wrong thread.

SpiderMonkey, not being thread-safe, hates this, and aborts.
Comment 39 Colin Walters 2013-01-07 23:24:12 UTC
(In reply to comment #38)
> This came up with gjs/gnome-shell. The GDBus worker thread simply calls
> g_object_ref on the GDBusConnection, which will flip a toggle notify, which
> will get called in the wrong thread.
> 
> SpiderMonkey, not being thread-safe, hates this, and aborts.

This is the opposite case from that discussed in this bug, where a GLib-side thread is calling into SpiderMonkey from an unexpected thread.

We discussed queuing the toggle refs via dispatch to the NULL maincontext.  However, the problem with this is it introduces a whole other level of concurrency.

Maybe instead, we add the notification to a list (protected by a mutex) on the context.  Then we process this notification list whenever we transition from JS -> C, like function returns.
Comment 40 Colin Walters 2013-01-07 23:27:52 UTC
(In reply to comment #0)
> Language bindings with threaded garbage collectors may invoke
> g_object_unref() in a non-default thread.

Rereading this original bug, I made a comment here:
https://bugzilla.gnome.org/show_bug.cgi?id=681334#c29

We could take this to be "fixed in newer spidermonkey, no GObject/GTK+ changes needed".  It only helps though if we actually upgrade GNOME to use a newer spidermonkey.
Comment 41 Jasper St. Pierre (not reading bugmail) 2013-01-08 00:47:11 UTC
Note that the assert only triggers in DEBUG builds (which I'm using simply for JS_DumpHeap and for other asserts), and only if we build with JS_THREADSAFE, which Fedora does. Future versions of SpiderMonkey drop JS_THREADSAFE builds entirely, at which point this problem will go away.
Comment 42 Allison Karlitskaya (desrt) 2013-10-09 13:56:42 UTC
Review of attachment 220946 [details] [review]:

Any solution here must _not_ hardcode the global default main context as being special in any way.  Although that's appropriate for single-threaded language bindings (gjs), this is something that we're increasingly moving away from in (particularly) GIO, so to add a new API that makes this assumption would be a step backwards.
Comment 43 Allison Karlitskaya (desrt) 2013-10-09 14:24:37 UTC
So here's my hope to try to nail down some of the ideas I started mentioning in comment 13 and expand on them a bit.

The two types of objects we have:

1) Those objects that are utterly threadsafe and can have any operation performed on them from any thread (including refs and unrefs).  They may therefore call 'dispose' in any thread.  They may also emit signals in any thread.  GDBusConnection is a great example of this.

2) Those objects that are really tied to a single thread (by which I mean main context).  It is only ever safe to access these objects from that one thread (by which i mean 'while having acquired the relevant main context').  These objects always fire their signals into a particular thread and are only ever disposed from the same thread.  All of Gtk pretty much falls into this category for the main thread, but we have many examples of objects that are like this in GIO for arbitrary main contexts (GDBusProxy, GDBusActionGroup, GDBusMenuModel, GSettings, GFileMonitor, GAppInfoMonitor...).

We also have some objects that are a bit of the mix of the above due to unclear ideas in the past about precisely how we should approach these sorts of problems.  For example, even though I list GSettings above as only having operations performed on it from one thread, we call g_object_ref() on it from other threads, which can result in bindings seeing 'toggle ref increment' calls (see bug 709223 for problems caused by that).

We may (or maybe not) say that g_object_ref() is a safe operation from every thread always and require language bindings to be safe against toggle ref increments happening in the 'wrong thread'.  This is something that they would have to deal with anyways for the other type of object (which is explicitly safe for any operation from any thread).

Arguing against the above point, I'd look at a common problem that we get into caused by GSettings being reffed from another thread.  People do this all the time:

  s = g_settings_new (...);
  data = new_data();
  g_signal_connect (s, "changed", handler, data);

  ...

  g_object_unref (s);
  free (data);

which, right now, is a bug.  If during the time of "..." a change event happens on a key on that GSettings object, then it will be picked up (with g_object_ref()) from the worker thread and kept alive long enough to fire a signal with the now-freed data.

I've been telling everyone for a while that they must explicitly disconnect their signal handlers before they free the data for them, but looking around lots of GNOME code I see that this is a very widely-repeated mistake (and a difficult one to find because it only very rarely actually causes a problem).  I consider this to be a deficiency of my API -- that the average user of it gets it wrong in difficult-to-discover ways.

So that's a good argument for only ever calling ref() from the 'owner thread' of an object -- it gives some nice additional guarantees about what the lifecycle of the object will be.  You never have a case where you have a pending handler dispatched from another thread keeping your object alive.

Making this restriction would also allow us to have non-atomic refcounting in the case of objects bound to a specific thread.


So here's a concrete proposal:

We add some new interface like GThreadLocal or something like that with:

 #define G_IS_THREAD_LOCAL()
 GMainContext *g_thread_local_get_context (GThreadLocal *gtl);

and maybe even

  void g_thread_local_unref_safely (GThreadLocal *gtl);

which would do little more than dispatching an idle (so maybe we can just skip this API).

Implementing _get_context() would be very easy for most thread-local objects since (in GIO at least) we always store the GMainContext in the instance struct.

GtkWidget would implement this interface by just always returning the global default main context.

I'd want this to be a "special" interface in the sense that gobject.c knows about it in a very fast way, without going through the type system (perhaps as a flag on the GObjectClass, as suggested in Colin's patch).  We could then do optimisations such as skipping the atomic ops on refcounting (and could also make G_IS_THREAD_LOCAL() faster -- maybe even implementing it entirely as a macro).  Or not.  If this is only done in the GC of language bindings then maybe it doesn't need to be -that- fast (and interface lookup isn't exactly super-slow).

If a class implements this interface then it must only _ever_ do anything (including g_object_ref()) on a particular instance from the advertised main context (and instruct its users to do the same).

Exception: g_thread_local_get_context() and G_IS_THREAD_LOCAL() are expected to have to work from _any_ thread, even on one-thread-only objects.  The reason for this is to accommodate bindings with threaded GC -- when getting ready to destroy an object, they could first call these in order to determine if it was safe to do so in-place and, if not, which context they should dispatch to.
Comment 44 GNOME Infrastructure Team 2018-05-24 14:26:59 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/glib/issues/585.