GNOME Bugzilla – Bug 756498
kerberos: fail early on ticket request when ticketing disabled
Last modified: 2021-07-05 10:58:31 UTC
Right now, it's possible to crash gnome-online-account if you turn of the kerberos slider at just the right moment when it's doing a refresh. This is because it will try to get a ticket when ticketing has already been disabled (and the ticketing interface is unset). This commit adds a check for that.
Created attachment 313183 [details] [review] kerberos: fail early on ticket request when ticketing disabled
Review of attachment 313183 [details] [review]: Thanks for the patch, Ray. It looks good to me, but I noticed two things unrelated to this: ::: src/goabackend/goakerberosprovider.c @@ +285,3 @@ identifier = goa_account_get_identity (account); ticketing = goa_object_get_ticketing (GOA_OBJECT (object)); The GOA_OBJECT cast is unnecessary because 'object' is already a pointer to a GoaObject. I am doubtful about using goa_object_peek_ticketing. This code can be run in a separate thread (other than the one where the GDBusObjectManagerServer was created) when called from ensure_credentials_sync because goa_provider_ensure_credentials will run the synchronous variant in a separate thread.
Review of attachment 313183 [details] [review]: ::: src/goabackend/goakerberosprovider.c @@ +292,3 @@ + GOA_ERROR, + GOA_ERROR_NOT_SUPPORTED, + _("ticketing is disabled for account")); Nitpick: won't it be better to use capital 'T' if this shows up in the UI?
Review of attachment 313183 [details] [review]: So i chatted a bit with rishi about this on irc. The peek_ticking problem he pointed out is actually a more serious general issue in goa. more than one backend relies on the interfaces to remain on the object while ensure_credentials_sync runs. We have a 2 fold plan for dealing with this: 1) don't remove interfaces until ensure_credentials operations complete 2) cancel pending operations when the interface should get removed 1) could alternatively be "stop using the peek_* functions and only use the get_* functions from the non main thread" Doing those things should obsolete the need for the below patch, though, so rejecting it.
(In reply to Ray Strode [halfline] from comment #4) > Review of attachment 313183 [details] [review] [review]: > > So i chatted a bit with rishi about this on irc. The peek_ticking problem he > pointed out is actually a more serious general issue in goa. more than one > backend relies on the interfaces to remain on the object while > ensure_credentials_sync runs. We have a 2 fold plan for dealing with this: > > 1) don't remove interfaces until ensure_credentials operations complete > 2) cancel pending operations when the interface should get removed > > 1) could alternatively be "stop using the peek_* functions and only use the > get_* functions from the non main thread" > Doing those things should obsolete the need for the below patch, though, so > rejecting it. I *think* (2) won't generally work across all providers, particularly ones like Google or ownCloud that have multiple switches against them. Each switch corresponds to a different interface (eg., mail, calendars, photos, etc.), and are usually associated with a different application. Currently, there is no way for an application to tell gnome-online-accounts that it is only interested in a particular interface. The way it works is, if you toggle the photos switch, then gnome-photos will react to the DBus notification/signal from g-o-a and stop showing photos from your Google account. Toggling the mail switch will have a similar effect on evolution and so on. So, if both evolution and gnome-photos had called EnsureCredentials on the Google account, we don't want to cancel evolution's invocation just because someone switched off photos. And, as I noted above, there is no way for goa-daemon to know that an invocation of EnsureCredentials is from an application that is only interested in a particular interface. I don't think that this limitation is a big problem per se. When the user toggles a switch off, the application has to cancel a bunch of operations related to it, anyway. Hence, it is not that unreasonable to expect it to cancel its own EnsureCredentials call too. eg., if you turn off Google photos, then gnome-photos should cancel downloading thumbnails, or if you turn off Google mail, then evolution should cancel downloading email, and so on. That said, there is room for some improvements. Since the credentials are same across the whole account, we should compress multiple EnsureCredentials invocations into one network call. We should also cancel invocations when the caller's bus name disappears. However, I guess, these are unrelated to the issue in this bug. I agree that the Kerberos provider is different from Google or ownCloud in this respect, because it doesn't have a "remote login" switch to govern SSH or a "browser" switch for HTTP access. It has one big switch to rule them all. If we go for (1), then attempts to authenticate over Kerberos at the very same instant that the switch was turned off will succeed. That's not so bad, is it? It's like any other race and as long as we resolve it consistently it should be OK, no?
(In reply to Ray Strode [halfline] from comment #4) > Review of attachment 313183 [details] [review] [review]: > > So i chatted a bit with rishi about this on irc. The peek_ticking problem he > pointed out is actually a more serious general issue in goa. more than one > backend relies on the interfaces to remain on the object while > ensure_credentials_sync runs. We have a 2 fold plan for dealing with this: > > 1) don't remove interfaces until ensure_credentials operations complete > 2) cancel pending operations when the interface should get removed > > 1) could alternatively be "stop using the peek_* functions and only use the > get_* functions from the non main thread" If we do go for (1), then I prefer s/peek/get/ than giving the peek methods some special meaning. The semantics of peek vs get and get vs dup are already well documented and (thanks to gdbus-codegen) should be in wide circulation. So, special-casing this and giving peek a different meaning would be confusing.
Christophe submitted some patches (bug 755316) that refactor the cargo culted code fragments around EnsureCredentials and friends.
(In reply to Debarshi Ray from comment #5) > When the user toggles a switch off, the application has to cancel a bunch of > operations related to it, anyway. Hence, it is not that unreasonable to > expect it to cancel its own EnsureCredentials call too. eg., if you turn off > Google photos, then gnome-photos should cancel downloading thumbnails, or if > you turn off Google mail, then evolution should cancel downloading email, > and so on. After reading through the GDBus code, it appears to me that a cancellation on the client-side is not seen by the service. If that is the case, then I am not sure how we can avoid making an unnecessary network call. :/ > That said, there is room for some improvements. Since the credentials are > same across the whole account, we should compress multiple EnsureCredentials > invocations into one network call. This is now in place. See bug 751524
(In reply to Ray Strode [halfline] from comment #0) I was playing with Kerberos a bit this evening. > Right now, it's possible to crash gnome-online-account if you > turn of the kerberos slider at just the right moment when it's > doing a refresh. I am curious about the "just the right moment" bit. If I add a Kerberos account (either through the UI or via kinit) and then turn off the "Network Resources" switch, then I can very predictably crash goa-daemon because of this bug. It will either crash on it's own when polling EnsureCredentials due to using a kernel keyring, or as a result of an explicit EnsureCredentials call. I am a little surprised that we weren't flooded with bug reports about this for so long. Or maybe we were and I didn't notice? Or am I missing something?
Review of attachment 313183 [details] [review]: ::: src/goabackend/goakerberosprovider.c @@ +285,3 @@ identifier = goa_account_get_identity (account); ticketing = goa_object_get_ticketing (GOA_OBJECT (object)); Actually, I was hinting at the other goa_object_peek_* calls, and not specifically goa_object_peek_ticketing, which are not using at all.
Created attachment 319862 [details] [review] kerberos: Fail early on ticket request when ticketing disabled I fixed the minor nitpicks. While I am continuing to mark the patch as "rejected", I am inclined to merge it. Avoiding a crash seems more important than avoiding an extra network call due to not cancelling any queued/in-flight operations when the switch has been disabled. We can address the larger issue of cancelling pending calls in the near future.
From #gnome-hackers on GIMPNet: <mclasen> I agree, no crashes >> no network traffic
From #gnome-hackers on GIMPNet: 17:37 <halfline> rishi: okay, but i think when we do the more complete change that patch won't be necessary anymore 17:38 <halfline> we can pull it out then i guess 17:38 <rishi> halfline: Of course.
Pushed to master and gnome-3-18.
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org. As part of that, we are mass-closing older open tickets in bugzilla.gnome.org which have not seen updates for a longer time (resources are unfortunately quite limited so not every ticket can get handled). If you can still reproduce the situation described in this ticket in a recent and supported software version, then please follow https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines and create a new ticket at https://gitlab.gnome.org/GNOME/gnome-online-accounts/-/issues/ Thank you for your understanding and your help.