GNOME Bugzilla – Bug 661689
GDBusConnection methods assume that init has succeeded
Last modified: 2011-10-21 15:15:49 UTC
g_dbus_connection_start_message_processing() dereferences connection->worker, which may be NULL if initialization has not been started (possibly also if it failed). g_dbus_connection_flush_sync(), g_dbus_connection_close(), g_dbus_connection_send_message_unlocked() are similar. The GInitable docs suggest that this is meant to be considered to be a runtime error (graceful failure with GError), not a programming error (assertion). If it's considered to be a programming error, an assertion or check would be better than a segfault. One subtlety is that it's necessary to call g_dbus_connection_send_message_unlocked() during initialization. (By way of background: I've been analyzing crash-dumps in which worker is NULL. I can't see how the specific case seen in the crash-dumps can happen, and at least one of the devices where a crash was seen is suspected to have bad RAM, which is why I haven't opened a bug for that yet.)
This is non-trivial to fix because the locking discipline is a bit unclear. Am I right in thinking that, as a consequence of the g_bus_get() family returning a singleton, GDBusConnection is expected to be fully thread-safe (all public methods may be called from arbitrary threads), an exception to the usual rule that you can only use a given GObject from one thread at a time?
(In reply to comment #0) > g_dbus_connection_start_message_processing() dereferences connection->worker, > which may be NULL if initialization has not been started (possibly also if it > failed). > > g_dbus_connection_flush_sync(), g_dbus_connection_close(), > g_dbus_connection_send_message_unlocked() are similar. > > The GInitable docs suggest that this is meant to be considered to be a runtime > error (graceful failure with GError), not a programming error (assertion). If > it's considered to be a programming error, an assertion or check would be > better than a segfault. > > One subtlety is that it's necessary to call > g_dbus_connection_send_message_unlocked() during initialization. Yeah, it's not at all valid to use a GDBusConnection instance until it has been initialized. Do you have a need for this? (Btw, I actually didn't know that the GInitable docs suggested returning graceful errors if the instance is not successfully initialized... but I can't seem to think of any valid reason for trying to use an object instance before it's properly initialized - not even in language bindings. Maybe we should just fix the GInitable docs to not suggest that it's worth gracefully handling such as weird corner case. Or maybe I haven't thought it through. Alex?)
(In reply to comment #1) > This is non-trivial to fix because the locking discipline is a bit unclear. Am > I right in thinking that, as a consequence of the g_bus_get() family returning > a singleton, GDBusConnection is expected to be fully thread-safe (all public > methods may be called from arbitrary threads), an exception to the usual rule > that you can only use a given GObject from one thread at a time? That is correct, GDBusConnection is supposed to be 100% thread-safe.
(In reply to comment #2) > Yeah, it's not at all valid to use a GDBusConnection instance until it has been > initialized. Do you have a need for this? Not really, but I've been looking through some automated crash-dumps where this appears to have already happened... (if I'm filling in the missing things correctly - the cores don't contain heap data so I have to guess). > Btw, I actually didn't know that the GInitable docs suggested returning > graceful errors if the instance is not successfully initialized... I'll try to put together some patches to do either a graceful error or g_return_if_fail, depending on feedback on this bug and/or whether a graceful error is feasible.
(In reply to comment #3) > That is correct, GDBusConnection is supposed to be 100% thread-safe. Right, we definitely need more locking then. I have a half-finished branch, I'll see what I can do.
(In reply to comment #5) > (In reply to comment #3) > > That is correct, GDBusConnection is supposed to be 100% thread-safe. > > Right, we definitely need more locking then. I have a half-finished branch, > I'll see what I can do. Bug #661992
(In reply to comment #4) > I'll try to put together some patches to do either a graceful error or > g_return_if_fail, depending on feedback on this bug and/or whether a graceful > error is feasible. (Luckily, GDBusConnection is pretty similar in 2.30 and in master.) The following struct members are read-only after initialization: stream, auth, worker, bus_unique_name, guid, capabilities, credentials, initialization_error. Any function that accesses one of these has two constraints: * it must either check that the connection has been initialized, or not care whether struct member has its pre-initialization value * it must have a memory barrier if we want to guarantee that initialization in thread A will yield the post-initialization value in thread B So having checks for functions that dereference these members is necessary for both Bug #661992 and this bug. My proposal is to turn is_initialized into a member of an atomically-accessed flags word, and change its semantics from "has been initialized successfully" to "has completed initialization, successfully or not". This means an atomic read is a memory barrier for thread-safe correctness - and we want to read it for the "is it initialized?" check, anyway. There are several categories among the public API: * Functions that raise a GError and dereference or return a post-init struct member. The GInitable docs are fairly clear that these are expected to raise G_IO_ERROR_NOT_INITIALIZED if it failed or hasn't been done yet, even though that violates the usual "don't GError on programmer error" rule. Or, we could change the GInitable docs, and make them g_return_val_if_fail. * g_dbus_connection_close (and sync version) * g_dbus_connection_flush (and sync version) * g_dbus_connection_send_message (and with_reply versions, and wrappers for call, emit_signal etc.) * Functions that raise a GError but don't need to dereference or return a post-init struct member. These currently work fine before init, or after a failed init. We could make them raise a GError, or just silently work anyway. * g_dbus_connection_register_object (and subtree) * One function that dereferences a post-init struct member and doesn't have a GError. This currently crashes. It can't have its side-effect correctly until initialized: it could either do nothing silently, or warn/critical and do nothing. * g_dbus_connection_start_message_processing * Functions that return a post-init struct member pointer and don't have a GError. These don't currently crash, but their caller probably does; not all of them can even return NULL under normal circumstances. They could either silently return NULL, or warn/critical and return NULL. * g_dbus_connection_get_stream * g_dbus_connection_get_guid * g_dbus_connection_get_unique_name * g_dbus_connection_get_peer_credentials * One function that returns a post-init struct member scalar and doesn't have a GError. The caller gets a meaningless value, but won't crash. This could silently return 0, or warn/critical and return 0. * g_dbus_connection_get_capabilities * Functions that don't return or dereference a post-init struct member and don't have a GError. These currently work fine. They could either continue to have their side-effects (if any), warn/critical but continue to have their side-effects, or critical and do nothing. * g_dbus_connection_is_closed * g_dbus_connection_get_exit_on_close (and its setter) * g_dbus_connection_unregister_object (and subtree) * g_dbus_connection_signal_subscribe * g_dbus_connection_signal_unsubscribe * g_dbus_connection_signal_add_filter * g_dbus_connection_signal_remove_filter Prior art in GLib, apart from GDBus: * GSocket's various simple getters/setters (g_socket_get_fd, etc.) don't check - even the one that can fail with G_IO_ERROR_NOT_SUPPORTED. Operations with side-effects, like g_socket_listen(), all check. g_socket_condition_check() silently returns 0 ("nothing has happened") if not initialized. GSocket's initable_init is less problematic than GDBusConnection's, because it doesn't set struct members that have a public getter. * GCharsetConverter warns in its reset() (which doesn't have a GError), raises a GError from its convert() (which does), and silently succeeds in its simple getters/setters. GSocket's initable_init is less problematic than GDBusConnection's, because it doesn't set struct members that have a public getter.
Hey, thanks for the comprehensive analysis in comment 7 - I asked alexl about his opinion on IRC (since he wrote GInitiable originally): <davidz> alexl: hey - re https://bugzilla.gnome.org/show_bug.cgi?id=661689 I'm in favor of just changing the GInitable docs (e.g. replace "should return G_IO_ERROR_NOT_INITIALIZED" with "should g_warning(), g_critical") ... I just don't see how it can ever be valid to use a GInitable instance until it has been initialized (but maybe I'm not looking hard enough)... thoughts? <alexl> davidz: Can't think of any useful scenario either. <alexl> davidz: go for it <davidz> alexl: sure, I'll follow up in the bug Thus, I think we should change the GInitable docs to suggest that use-before-initialization is a programming error in the general case ... and that programmers should e.g. g_return_if_fail(), g_critical() instead of returning G_IO_ERROR_NOT_INITIALIZED. Thoughts?
agreed, and likewise for use-after-initialization-has-failed; the only thing you can do if g_initable_init() returns FALSE is destroy the object. The GInitable docs should probably also xref GAsyncInitable a bit more. (eg, if an object is both GInitable and GAsyncInitable, then you can call either g_initable_init() or g_async_initable_init_async() to initialize it). Also, presumably you are allowed to g_object_ref/unref() an initable before initializing it, which the docs currently technically forbid.
(In reply to comment #7) > * Functions that raise a GError and dereference or return a post-init > struct member. The GInitable docs are fairly clear that these are expected > to raise G_IO_ERROR_NOT_INITIALIZED if it failed or hasn't been done yet, > even though that violates the usual "don't GError on programmer error" rule. > Or, we could change the GInitable docs, and make them g_return_val_if_fail. > > * g_dbus_connection_close (and sync version) > * g_dbus_connection_flush (and sync version) > * g_dbus_connection_send_message (and with_reply versions, > and wrappers for call, emit_signal etc.) Since these have never worked correctly, I'd be in favour of g_return_if_fail for master. I'm not sure that that's appropriate for the stable branch, though: perhaps we should raise a GError there? > * Functions that raise a GError but don't need to dereference or return > a post-init struct member. These currently work fine before init, > or after a failed init. We could make them raise a GError, or just silently > work anyway. > > * g_dbus_connection_register_object (and subtree) There's no real reason they need the object to be initialized, but that does dilute the "it's wrong to not initialize your GInitable immediately" message. It would be most consistent to have these fail in the same way as send_message() and friends. > * One function that dereferences a post-init struct member and > doesn't have a GError. This currently crashes. It can't have its > side-effect correctly until initialized: it could either do nothing > silently, or warn/critical and do nothing. > > * g_dbus_connection_start_message_processing Since this can't possibly work, I think it should have the check, and critical if the object isn't initialized, rather than silently not doing what was asked for. > * Functions that return a post-init struct member pointer and > don't have a GError. These don't currently crash, but their caller > probably does; not all of them can even return NULL under normal > circumstances. They could either silently return NULL, or warn/critical > and return NULL. > > * g_dbus_connection_get_stream > * g_dbus_connection_get_guid > * g_dbus_connection_get_unique_name > * g_dbus_connection_get_peer_credentials Strictly speaking, I don't think anything guarantees that setting a pointer variable is atomic with respect to threads, so in the worst possible case, we could read the struct-member pointer when it was half-updated, and have an invalid pointer. This couldn't happen on a sane architecture, but even so, I don't think we should read these variables unless we've confirmed that they have their final values. So, I think these should all assert that the object is initialized, and critical if it isn't. If it isn't, I think the safest thing to return is NULL. An extra subtlety is that we shouldn't use g_return_val_if_fail (at least, not directly), because that would drop the memory barrier if G_DISABLE_CHECKS is defined! get_stream() could maybe return non-NULL, silently, if the object was given a stream at construct-time; I don't think it's really worth it (we'd have to have a flag for "it had a stream at construct time"). Similarly for get_guid(). Similarly, get_unique_name() and get_peer_credentials() could maybe return NULL, silently, if the object isn't a bus connection, or isn't a server, respectively. I don't think this is worth it either. > * One function that returns a post-init struct member scalar and doesn't have > a GError. The caller gets a meaningless value, but won't crash. > This could silently return 0, or warn/critical and return 0. > > * g_dbus_connection_get_capabilities In theory this could return a junk intermediate value too, although that at least wouldn't crash anything. I think it should critical and return 0. Again, we shouldn't use g_return_val_if_fail directly, because the memory barrier is a side-effect. > * Functions that don't return or dereference a post-init struct member and > don't have a GError. These currently work fine. They could either continue > to have their side-effects (if any), warn/critical but continue to have > their side-effects, or critical and do nothing. > > * g_dbus_connection_is_closed > * g_dbus_connection_get_exit_on_close (and its setter) > * g_dbus_connection_unregister_object (and subtree) > * g_dbus_connection_signal_subscribe > * g_dbus_connection_signal_unsubscribe > * g_dbus_connection_add_filter > * g_dbus_connection_remove_filter g_dbus_connection_is_closed() and the exit-on-close accessors need thread-safety work, but that's a job for another bug. It would be most consistent with GSocket if the simple getters/setters silently worked, even before init. set_exit_on_close needs to work before initialization anyway, because it's a construct property. On the other hand, if we're declaring that failure to initialize a GInitable is a g_return_if_fail situation, maybe GSocket needs changing... If we make registering objects check for initialization before having any side-effects, then unregistering objects is going to warn anyway (because it can't possibly find a registration), so there's certainly no harm in making these warn too. I'd be inclined to escalate to a critical warning (the same as g_return_if_fail), to be honest. I'm not sure about signals or filters, but by this point, a critical warning seems most consistent with the rest of GDBusConnection.
(In reply to comment #8) > Thus, I think we should change the GInitable docs to suggest that > use-before-initialization is a programming error in the general case ... and > that programmers should e.g. g_return_if_fail(), g_critical() instead of > returning G_IO_ERROR_NOT_INITIALIZED. ... (In reply to comment #9) > agreed, and likewise for use-after-initialization-has-failed; the only thing > you can do if g_initable_init() returns FALSE is destroy the object. I've cloned Bug #662208. This yak is increasingly difficult to shave :-P
(In reply to comment #10) > (In reply to comment #7) > > * Functions that raise a GError and dereference or return a post-init > > struct member. The GInitable docs are fairly clear that these are expected > > to raise G_IO_ERROR_NOT_INITIALIZED if it failed or hasn't been done yet, > > even though that violates the usual "don't GError on programmer error" rule. > > Or, we could change the GInitable docs, and make them g_return_val_if_fail. > > > > * g_dbus_connection_close (and sync version) > > * g_dbus_connection_flush (and sync version) > > * g_dbus_connection_send_message (and with_reply versions, > > and wrappers for call, emit_signal etc.) > > Since these have never worked correctly, I'd be in favour of g_return_if_fail > for master. I'm not sure that that's appropriate for the stable branch, though: > perhaps we should raise a GError there? I would just do the same on the stable branch. > > * Functions that raise a GError but don't need to dereference or return > > a post-init struct member. These currently work fine before init, > > or after a failed init. We could make them raise a GError, or just silently > > work anyway. > > > > * g_dbus_connection_register_object (and subtree) > > There's no real reason they need the object to be initialized, but that does > dilute the "it's wrong to not initialize your GInitable immediately" message. > > It would be most consistent to have these fail in the same way as > send_message() and friends. Yeah, just return_val_if_fail() if used when not initialized. > [Good stuff about how we should add checks for use-before-initialization > that I largely agree with] Sure, I don't mind adding checks and making sure the checks work properly in the case that someone is accessing the uninitialized object from another thread than where it is being initialized. But I'm still not sure who we are adding the checks for? I mean, use of a GDBusConnection before initialization is not a very common error I think. Especially not since with the way g_bus_get() and g_dbus_connection_new() work.. I mean, we are only returning the GDBusConnection instance in the finish() methods if initialization succeeds (specifically, g_bus_get() and g_dbus_connection_new() does not return the instance). But maybe you have run into cases where code is using a GDBusConnection instance before it's initialized? Or maybe you just prefer to play it extra safe?
(In reply to comment #12) > Sure, I don't mind adding checks and making sure the checks work properly in > the case that someone is accessing the uninitialized object from another thread > than where it is being initialized. > > But I'm still not sure who we are adding the checks for? Quietly getting undefined behaviour is really hard to debug; quietly getting undefined behaviour in multi-threaded situations doubly so. When something goes wrong with GDBusConnection (and I can see that ->closed isn't used in a thread-safe way, I just haven't yet worked out what the failure mode would be or how to fix it), I'd like to get assertion failures and criticals, rather than silent breakage. > I mean, use of a GDBusConnection before initialization is not a very common > error I think. [...] > But maybe you have run into cases where code is using a GDBusConnection > instance before it's initialized? I have a crash-dump here (not public, sorry) with a connection->worker NULL dereference in one of its many threads (while sending Hello(), iirc), and the only possible explanations I can see are this, and bad RAM :-/ Normally I'd be tempted to say "if you're doing it that wrong, you get to keep both pieces"; but GDBusConnection is subtle, because it uses threads internally and is designed to be used by even more threads, so I think it's worth being maximally careful. (Also, the current documentation for GInitable could be interpreted as implying that this isn't undefined behaviour at all, which can't help.)
(In reply to comment #13) > (In reply to comment #12) > > Sure, I don't mind adding checks and making sure the checks work properly in > > the case that someone is accessing the uninitialized object from another thread > > than where it is being initialized. > > > > But I'm still not sure who we are adding the checks for? > > Quietly getting undefined behaviour is really hard to debug; quietly getting > undefined behaviour in multi-threaded situations doubly so. When something goes > wrong with GDBusConnection (and I can see that ->closed isn't used in a > thread-safe way, I just haven't yet worked out what the failure mode would be > or how to fix it), I'd like to get assertion failures and criticals, rather > than silent breakage. OK, fair enough. > > I mean, use of a GDBusConnection before initialization is not a very common > > error I think. > [...] > > But maybe you have run into cases where code is using a GDBusConnection > > instance before it's initialized? > > I have a crash-dump here (not public, sorry) with a connection->worker NULL > dereference in one of its many threads (while sending Hello(), iirc), and the > only possible explanations I can see are this, and bad RAM :-/ > > Normally I'd be tempted to say "if you're doing it that wrong, you get to keep > both pieces"; but GDBusConnection is subtle, because it uses threads internally > and is designed to be used by even more threads, so I think it's worth being > maximally careful. > > (Also, the current documentation for GInitable could be interpreted as implying > that this isn't undefined behaviour at all, which can't help.) OK, makes sense. (Btw, I think connection->worker being NULL might be yet another symptom of one of the bugs where the worker thread kept running and it's calling back into an already finalized GDBusConnection instance. I can't remember but 322e25b535a63a631f2f53439a876a4d7d9c1f87 and f0b04acfd31b768151a88db3f8d3347f55b2a7b3 might be relevant. Just guessing in the hope that it might be helpful.)
Created attachment 199579 [details] [review] GDBusConnection: replace is_initialized with an atomic flag The comment implied that even failed initialization would set is_initialized = TRUE, but this wasn't the case - failed initialization would only set initialization_error, and it was necessary to check both. It turns out the documented semantics are nicer than the implemented semantics, since this lets us use atomic operations, which are also memory barriers, to avoid needing separate memory barriers or locks for initialization_error (and other members that are read-only after construction). I expect to need more than one atomically-accessed flag to fix thread safety, so instead of a minimal implementation I've turned is_initialized into a flags word.
Created attachment 199580 [details] [review] GDBusConnection: check for initialization where needed for thread-safety Also document which fields require such a check in order to have correct threading semantics. This usage doesn't matches the GInitable documentation, which suggests use of a GError - but using an uninitialized GDBusConnection is programming error, and not usefully recoverable. (The GInitable documentation may have been a mistake - GNOME#662208.) Also, not all of the places where we need it can raise a GError. The check serves a dual purpose: it turns a non-deterministic crash into a deterministic critical warning, and is also a memory barrier for thread-safety. All of these functions dereference or return fields that are meant to be protected by FLAG_INITIALIZED, so they could crash or return an undefined value to their caller without this, if called from a thread that isn't the one that called initable_init() (although I can't think of any way to do that without encountering a memory barrier, undefined behaviour, or a race condition that leads to undefined behaviour if the non-initializing thread wins the race). One exception is that initable_init() itself makes a synchronous call. We deal with that by passing new internal flags up the call stack, to reassure g_dbus_connection_send_message_unlocked() that it can go ahead.
Created attachment 199581 [details] [review] GDBusConnection: document use while uninitialized as undefined behaviour --- We might not need this if we document it in G[Async]Initable instead.
Created attachment 199582 [details] [review] GDBusConnection: check for initializedness in most public API The only exceptions are those of the trivial getters/setters that don't already need the initialization check for its secondary role as a memory barrier (this is consistent with GSocket, where trivial getters/setters don't check): * g_dbus_connection_set_exit_on_close * g_dbus_connection_get_exit_on_close * g_dbus_connection_is_closed g_dbus_connection_set_exit_on_close needs to be safe for use before initialization anyway, so it can be set at construct-time.
Review of attachment 199579 [details] [review]: LGTM
Review of attachment 199580 [details] [review]: Looks good to me. Thanks.
Review of attachment 199581 [details] [review]: I think it's a good idea to document it even with GInitable and GAsyncInitable being documented. Please commit.
Review of attachment 199582 [details] [review]: Yup. Thanks for the patches!
First three patches (thread safety bugfixes, and documentation) pushed to glib-2-30 for 2.30.2; all four pushed to master for 2.31.1.