GNOME Bugzilla – Bug 742509
Thread-safety annotation needed
Last modified: 2018-02-08 12:31:12 UTC
Most of the introspected object types cannot be assumed to be thread-safe, but there are some important exceptions, such as GMainContext, GMainLoop, and some types in GStreamer. For languages where thread safety can be expressed at the language level, an annotation would help to generate bindings for such types accordingly. So far the only language known to me where this is important is Rust, though Haskell people might find this interesting as well. In my nascent Rust bindings (https://github.com/mzabaluev/grust), I have resorted to defining wrappers for the introspected types as unsafe to pass or share between tasks by default, and relaxing that restriction on a case-by-case basis, which will require some solution for whitelisting.

P.S. In Gio, there is GCancellable, which is peculiar: its own methods are thread-safe, but the GObject it derives from is not.
I'm currently (yesterday and today) working on a GContextSpecific interface that can be used (dynamically) to determine to which main context an object 'belongs' (if any). The reason that this is dynamic is because we can have objects of a particular type that belong to a particular context and other objects of the same type that do not.
(In reply to comment #1)
> I'm currently (yesterday and today) working on a GContextSpecific interface
> that can be used (dynamically) to determine to which main context an object
> 'belongs' (if any).

It's definitely useful, and I have implemented something similar as a bolt-on back when an M:N runtime was a thing in Rust. However:

1. Rust is a compiled language on par with C and C++, so performing dynamic checks in every method call does not feel like a good solution.
2. If the default for objects is not to have any context affinity, it's unclear how to prevent accidental concurrent use in the majority of cases. Rust emphasizes safe concurrency.
3. Contexts != threads. If an object is affine to a context that is not the current thread default, this does not necessarily mean it is not intended to be safely used in the thread (though messing with objects belonging to a different event context does not seem like a sane thing to do in general).

Note that it's possible in Rust to pass a non-Send value between tasks, it just has to be done in "unsafe" code. Unsafe bindings to the C API are going to be public as well. The problem is in enabling easy concurrent usage for types that can be so used, even if they don't come from GLib itself.
Not sure how main context and thread safety go together here either. They shouldn't have anything to do with each other except for special cases like GDK/GTK. In GStreamer we mark functions and objects as thread-safe inside the documentation (not everywhere though because it started to become useless as basically every bit of our API is supposed to be threadsafe), maybe this could just become a gobject-introspection annotation?
The main idea is that we tend to use main contexts (not threads) as exclusion mechanisms for a lot of cases -- essentially all of GLib and Gtk is effectively under this system. The question of if it is valid to access an object or not is not tied to which thread you are in but rather which main contexts you are the owner of. 99% of the time this is effectively the same thing, of course. The reason that we do that is that often we need to know the context anyway (for dispatching) so we may as well call that the owner. I'm not totally sure that we're discussing the same thing here, but ever since thread-default main contexts became a thing, GLib has been going in this direction and it's about to take another big jump that way. What would you use this information for?
I'm talking about thread safety as in the definition of "MT-Safe", for example here: https://www.gnu.org/software/libc/manual/html_node/POSIX-Safety-Concepts.html#POSIX-Safety-Concepts

That is completely different from anything to do with main contexts. What you seem to mean is objects that are tied to one specific thread, where you must not call anything on them from any other thread, even if no other thread is using them at the same time. That is the case e.g. for GDK/GTK objects, but not for most/all of GIO (you can safely use a GSocket from multiple threads if you're careful) or the basic data structures in GLib (GList is not thread-safe but can be used from multiple threads if you handle the locking yourself).

The information would be used in Rust to declare whether it is safe to pass objects between threads or not. That is more related to what I meant than to what you meant, AFAIU :)

The GContextSpecific thing seems useful nonetheless, but it solves a completely different problem, one that does not exist anywhere in GStreamer (we don't even require a GMainContext to exist anywhere).
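The "handle the locking yourself" case for GList-like data can be illustrated in Rust terms. This is only a sketch under assumptions: `ListLike` and `push_from_threads` are hypothetical names, not part of any GLib binding. The type is made sendable but not shareable, and a `Mutex` supplies the external locking.

```rust
use std::cell::Cell;
use std::marker::PhantomData;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for a GList-like container: it may be moved between
/// threads (Send) but must not be used by two threads at once (!Sync).
pub struct ListLike {
    items: Vec<i32>,
    // PhantomData<Cell<()>> is Send but not Sync, so ListLike is as well.
    _not_sync: PhantomData<Cell<()>>,
}

impl ListLike {
    pub fn new() -> Self {
        ListLike { items: Vec::new(), _not_sync: PhantomData }
    }

    pub fn push(&mut self, value: i32) {
        self.items.push(value);
    }

    pub fn len(&self) -> usize {
        self.items.len()
    }
}

/// Pushes one item from each of `n` threads. The Mutex is the "bring your own
/// lock" part: Mutex<T> is Sync whenever T is Send.
pub fn push_from_threads(n: i32) -> usize {
    let shared = Arc::new(Mutex::new(ListLike::new()));
    let handles: Vec<_> = (0..n)
        .map(|i| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                shared.lock().unwrap().push(i);
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let len = shared.lock().unwrap().len();
    len
}
```

Sharing an `Arc<ListLike>` across threads without the `Mutex` would be rejected by the compiler, which is the static equivalent of the "if you're careful" caveat.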
I've got these object markers for the Rust compiler: https://github.com/mzabaluev/grust/blob/master/src/marker.rs Basically, ObjectMarker provides compiler poison preventing a containing object from being safely shared between threads. SyncObjectMarker does not have the poison. By default, objects bound from GObject-introspection would not be sendable and will be confined to the task where they were created or otherwise obtained. For the thread-safe objects (annotated in GIR or in some custom way), the generator would put in the less restrictive SyncObjectMarker so that they can be shared with reference counting (the value being actually passed around is of type refcount::SyncRef).
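For readers unfamiliar with the technique, here is a minimal sketch of how such markers can work. The marker names are taken from the comment above, but the bodies are an assumption and not the actual contents of marker.rs; `Widget`, `Context` and the helper functions are hypothetical.

```rust
use std::marker::PhantomData;

/// Assumed implementation: a raw pointer is neither Send nor Sync, so any
/// struct containing this marker is "poisoned" the same way.
pub struct ObjectMarker(PhantomData<*mut ()>);

/// Assumed implementation: no poison, so the containing struct keeps
/// whatever Send/Sync status its other fields give it.
pub struct SyncObjectMarker(PhantomData<()>);

/// Hypothetical binding for a thread-confined object.
pub struct Widget {
    _marker: ObjectMarker,
}

/// Hypothetical binding for a thread-safe object.
pub struct Context {
    _marker: SyncObjectMarker,
}

/// Compiles only for types that may be sent and shared between threads.
pub fn assert_send_sync<T: Send + Sync>() -> bool {
    true
}

pub fn context_is_shareable() -> bool {
    // Replacing Context with Widget here would be a compile-time error.
    assert_send_sync::<Context>()
}
```

The check costs nothing at run time: the restriction exists only in the type system.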
I can think of a few types of objects that we have inside of GLib:

- objects that can be used from any thread, without restriction (like GMainContext, GVariant, etc.)
- objects that can be sent between threads as long as only one thread at a time is using them (like GHashTable, GSequence, etc.)
- objects that are bound to a specific thread (via the main context) and must only be used while that context is acquired, even for simple things like refcounting (GtkWidget for example)
- objects that are bound to a specific thread (via context) and can have some subset of operations performed on them from another thread (like GSettings)
- objects that start out non-threadsafe until some sort of 'sealing' operation occurs on them after which point they are threadsafe (like GSource being added to a GMainContext and DConfChangesets)
- probably more examples

It is worth noting that some types can have objects that are in one category or another, depending on the particular instance. GDBusConnections that are backed by kdbus, for example, will be thread-local, whereas dbus-1 GDBusConnections will be globally shared.
With the Rust bindings, I favor a conservative approach: values of bound types are considered unsafe for sending, and an initially simple system of annotations can tell the generator to relax this restriction in some commonly needed ways. My expectation is that the bindings will be mostly used in application scenarios, where one thread is dealing with Gtk+ UI, or one thread uses a "tree" of Gio objects on its main context, or one thread controlling GStreamer pipeline(s), and so on. Where inter-thread communication is needed, it can be done using in-language means; and there's always the fire axe of "unsafe".

(In reply to comment #7)
> - objects that can be used from any thread, without restriction (like
> GMainContext, GVariant, etc.)

These will be wrapped into types that implement Send/Sync. A thread-safety annotation on the introspected type would be helpful here. GMainContext already gets special treatment in the bindings though, and GVariant will get its own as well.

> - objects that can be sent between threads as long as only one thread at a
> time is using them (like GHashTable, GSequence, etc.)

This sounds like "Send, but not Sync" in Rust type kind bounds. For non-refcounted, deeply copied types the generator could just emit the bindings like that unless annotated otherwise. Refcounted types (ignoring GHashTable, which needs to be in the core bindings anyway) appear problematic to represent with the current type model, so they will be lumped in with the next case.

I'm thinking to implement one trick, however: the initial reference to a newly created object would not be "parked" at its thread, so it will be sendable. To obtain more references you'd have to "park" the original reference, tying the object to a thread as far as Rust is concerned. This will enable use cases like creating objects to dispatch callbacks with them on another thread's context with g_main_context_invoke() and similar methods.
To use this feature in safe bindings for constructors that create non-aliased objects, an annotation would be needed to distinguish from cases when the returned object shares state with some objects otherwise available, is attached to the thread-default context, etc.

> - objects that are bound to a specific thread (via the main context) and must
> only be used while that context is acquired, even for simple things like
> refcounting (GtkWidget for example)

This is currently enforced for GObject-derived objects by default.

> - objects that are bound to a specific thread (via context) and can have some
> subset of operations performed on them from another thread (like GSettings)

If the thread-safe methods are annotated, I can see how an additional helper type could be generated, to be obtainable in a SyncRef from the plain type. The thread-safe reference would be sendable and only expose the safe methods. How commonly is this needed in real usage?

> - objects that start out non-threadsafe until some sort of 'sealing' operation
> occurs on them after which point they are threadsafe (like GSource being
> added to a GMainContext and DConfChangesets)

Here too, dual types with sealing conversion. GSource will need to be bound in the core library, so I should implement something like below:

    impl Source {
        pub fn attach(self, ctx: &mut MainContext) -> AttachedSource {
            // ...
        }
    }

The method consumes its recipient Source by value. Rust's borrow checker ensures there are no other references to the object at the time; linear types are awesome. For bindings outside core GLib, an annotation for such sealing methods would be needed.

> It is worth noting that some types can have objects that are in one category or
> another, depending on the particular instance. GDBusConnections that are
> backed by kdbus, for example, will be thread-local, whereas dbus-1
> GDBusConnections will be globally shared.
These will have to be locked down to the worst case as far as the static type system is concerned. We'll likely add some unsafe shortcuts to the bindings as common needs are discovered.

To summarize, we have identified four different annotation cases we'd ultimately like to make use of. I'm listing them in the order of perceived necessity:

- thread-safe objects;
- methods sealing thread safety on their object;
- unparked thread-unsafe object as return value;
- thread-safe methods on otherwise thread-unsafe objects.
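To make the "sealing method" case concrete, here is a self-contained sketch of the dual-type idea. It is an illustration under assumptions, not the grust implementation: the field layout, `set_name`, `name`, `attached_count` and `attach_and_send` are hypothetical, and no real GLib calls are made.

```rust
use std::marker::PhantomData;
use std::thread;

/// Hypothetical stand-in for a main context wrapper.
pub struct MainContext {
    attached: usize,
}

impl MainContext {
    pub fn new() -> Self {
        MainContext { attached: 0 }
    }

    pub fn attached_count(&self) -> usize {
        self.attached
    }
}

/// Unsealed source: mutable, and confined to its thread by the raw-pointer
/// marker (neither Send nor Sync).
pub struct Source {
    name: String,
    _thread_confined: PhantomData<*mut ()>,
}

/// Sealed source: only read-only access remains, so it is left Send + Sync.
pub struct AttachedSource {
    name: String,
}

impl Source {
    pub fn new(name: &str) -> Self {
        Source { name: name.to_string(), _thread_confined: PhantomData }
    }

    pub fn set_name(&mut self, name: &str) {
        self.name = name.to_string();
    }

    /// Consumes the Source by value, so no unsealed handle can survive.
    pub fn attach(self, ctx: &mut MainContext) -> AttachedSource {
        ctx.attached += 1;
        AttachedSource { name: self.name }
    }
}

impl AttachedSource {
    pub fn name(&self) -> &str {
        &self.name
    }
}

/// Configures a source, seals it, then reads it from another thread.
pub fn attach_and_send() -> String {
    let mut ctx = MainContext::new();
    let mut source = Source::new("idle");
    source.set_name("timeout");
    let attached = source.attach(&mut ctx);
    thread::spawn(move || attached.name().to_string())
        .join()
        .unwrap()
}
```

Moving `source` itself into `thread::spawn` would fail to compile, and using `source` after `attach` is rejected as a use after move, so the sealing is checked entirely at compile time.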
(In reply to comment #7)
> - objects that are bound to a specific thread (via the main context) and must
> only be used while that context is acquired, even for simple things like
> refcounting (GtkWidget for example)

Here's a related set of questions:

How common/weird/anti-patternish is it to have multiple threads acquire one context at different times? I could greatly relax the restrictions on what could go into async callbacks if I knew that a callback closure is only expected to be invoked in the thread where it was created. Though concurrency is prevented by the acquire lock, being invoked in another thread may still screw up things dependent on thread-local storage and the like. I assume this concern is not unique to Rust. Can we consider such context-stealing, possibly done by other libraries or applications, to be unlikely and inherently unsafe?

Are there any methods on (generally) thread-unsafe objects taking callbacks that will be invoked in another thread/context? I can cover g_source_set_callback(), as mentioned in the previous comment. Any other cases out there? I'd like to use the object's thread-safety as the default determinant for whether its callback closures should be restricted with the Send bound. Do we need an annotation for any known exceptions?
[Mass-moving gobject-introspection tickets to its own Bugzilla product - see bug 708029. Mass-filter your bugmail for this message: introspection20150207 ]
I was thinking about this recently, and would like something in the direction of:

(thread-safety external|yes|main|current)

* external: You're responsible for controlling access to the object externally to the object. Bring your own locks.
* yes: Holding a reference to the object is sufficient to call this function.
* main: this may only be called from the main thread (i.e. thread-1)
* current: this may only be accessed from the thread that created it

Things like "current" would mean that bindings could add an extra pointer to track g_thread_self() in their wrapper structure and assert before calling. It would also make it easier to add a "thread sanitizer" to bindings when using gtk/gdk that assert thread-1.
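A rough sketch of what the "current" check could look like in a binding's wrapper structure. It is written in Rust, the language most of this thread is concerned with, and uses `std::thread::current().id()` as a stand-in for g_thread_self(); `ThreadBound` and its methods are hypothetical names.

```rust
use std::thread::{self, ThreadId};

/// Hypothetical wrapper that remembers which thread created it.
pub struct ThreadBound<T> {
    owner: ThreadId,
    value: T,
}

impl<T> ThreadBound<T> {
    /// Records the creating thread alongside the wrapped value.
    pub fn new(value: T) -> Self {
        ThreadBound { owner: thread::current().id(), value }
    }

    /// True when called on the thread that created the wrapper.
    pub fn is_on_owner_thread(&self) -> bool {
        thread::current().id() == self.owner
    }

    /// Asserts the thread affinity before handing out the value, as a
    /// binding would do before calling into the C function.
    pub fn get(&self) -> &T {
        assert!(
            self.is_on_owner_thread(),
            "object accessed from a thread other than its creator"
        );
        &self.value
    }
}

/// Creates a wrapper here and runs the affinity check on another thread.
pub fn check_from_other_thread() -> bool {
    let bound = ThreadBound::new(7);
    thread::spawn(move || bound.is_on_owner_thread())
        .join()
        .unwrap()
}
```

A "main" variant could compare against a thread ID recorded once at initialization instead of the creating thread.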
Note that in Rust this is on a type level, not function level. So either your whole type is thread-safe (in one way or another), or nothing is.

From the Rust point of view, only the "yes" (and to some degree "main") case would be useful: it would resolve to the Send and Sync traits and involve no runtime checks or anything. "external", "main" and "current" would all mean that type is not thread-safe in any way.

For "main" you could then automatically generate some runtime checks that it only happens on the main thread. Which is currently added manually to the relevant code.

Do you have an example for something that is "current" and not "external"/"main"?

One thing that seems to be missing here however is the equivalent of Rust's Send trait. An object that can be created from any thread and can be sent to any thread, as long as only a single thread at a time has references to the object. This would be the case for most non-reference-counted* types in glib.

* reference counting has the problem that you can create new references that look like the original object, and could send those to another thread. So you would need runtime checks or some type-trickery to prevent this from happening.
(In reply to Sebastian Dröge (slomo) from comment #12)
> Do you have an example for something that is "current" and not "external"/"main"?

Anything that uses the thread default GMainContext was my thought (SoupServer comes to mind).

> One thing that seems to be missing here however is the equivalent of Rust's
> Send trait. An object that can be created from any thread and can be sent to
> any thread, as long as only a single thread at a time has references to the
> object. This would be the case for most non-reference-counted* types in glib.

I meant for this case to be covered by "external", but perhaps that's not enough information for the case you describe.
(In reply to Sebastian Dröge (slomo) from comment #12)
> Note that in Rust this is on a type level, not function level. So either
> your whole type is thread-safe (in one way or another), or nothing is.
>
> From the Rust point of view, only the "yes" (and to some degree "main") case
> would be useful: it would resolve to the Send and Sync traits and involve no
> runtime checks or anything. "external", "main" and "current" would all mean
> that type is not thread-safe in any way.

I think "current" is representable by `!Send + !Sync` types, and I assumed it to be the default for GObject-derived classes in my bindings. It provides weaker thread safety than "external" (that's `Send + !Sync`, I presume?), and its guarantees, as far as I can see, are only broken by classes that are actually "main", whose instances constructed off the main thread don't make much practical sense anyway.

> For "main" you could then automatically generate some runtime checks that it
> only happens on the main thread. Which is currently added manually to the
> relevant code.

(thread-safety main) could be statically bound to Rust types where instances can only be constructed with the main thread's context object, which will be a singleton non-Send type distinct from the type wrapping an arbitrary `GMainContext`.
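The mapping discussed in this comment can be written down as a small table that a binding generator might consult. This is a sketch under assumptions: the annotation was only a proposal, the enum and method names are hypothetical, and the `external` row follows the presumption made above rather than any agreed semantics.

```rust
/// Values of the proposed (thread-safety ...) annotation.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ThreadSafety {
    External,
    Yes,
    Main,
    Current,
}

impl ThreadSafety {
    /// Parses the annotation value as spelled in the proposal.
    pub fn parse(value: &str) -> Option<Self> {
        match value {
            "external" => Some(ThreadSafety::External),
            "yes" => Some(ThreadSafety::Yes),
            "main" => Some(ThreadSafety::Main),
            "current" => Some(ThreadSafety::Current),
            _ => None,
        }
    }

    /// Returns (send, sync): which auto traits a generated wrapper would get.
    pub fn auto_traits(self) -> (bool, bool) {
        match self {
            // Holding a reference is enough: Send + Sync.
            ThreadSafety::Yes => (true, true),
            // Presumed Send + !Sync: movable, but callers bring their own lock.
            ThreadSafety::External => (true, false),
            // Thread-affine objects: !Send + !Sync.
            ThreadSafety::Main | ThreadSafety::Current => (false, false),
        }
    }
}
```

Under this reading, "main" and "current" differ only in what a run-time check would compare against, not in the static traits.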
(In reply to Mikhail Zabaluev from comment #14)
> (In reply to Sebastian Dröge (slomo) from comment #12)
> > Note that in Rust this is on a type level, not function level. So either
> > your whole type is thread-safe (in one way or another), or nothing is.
> >
> > From the Rust point of view, only the "yes" (and to some degree "main") case
> > would be useful: it would resolve to the Send and Sync traits and involve no
> > runtime checks or anything. "external", "main" and "current" would all mean
> > that type is not thread-safe in any way.
>
> I think "current" is representable by `!Send + !Sync` types, and I assumed
> it to be the default for GObject-derived classes in my bindings. It provides
> weaker thread safety than "external" (that's `Send + !Sync`, I presume?),
> and its guarantees, as far as I can see, are only broken by classes that are
> actually "main", whose instances constructed off the main thread don't make
> much practical sense anyway.

Yes

> > For "main" you could then automatically generate some runtime checks that it
> > only happens on the main thread. Which is currently added manually to the
> > relevant code.
>
> (thread-safety main) could be statically bound to Rust types where instances
> can only be constructed with the main thread's context object, which will be
> a singleton non-Send type distinct from the type wrapping an arbitrary
> `GMainContext`.

This seems like the wrong place to discuss that, let's move that to https://github.com/gtk-rs/glib/ . While what you say would work, it would be quite inconvenient from a usability point of view. Currently this is all handled with runtime checks, which are not that expensive either.
-- GitLab Migration Automatic Message -- This bug has been migrated to GNOME's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/gobject-introspection/issues/119.