GNOME Bugzilla – Bug 766733
kerberos, telepathy: Stopping goa-daemon removes account or expires credentials
Last modified: 2017-01-06 17:47:40 UTC
See: https://bugzilla.redhat.com/show_bug.cgi?id=1267534 When goa-daemon dies, GoaIdentityService signs out the Kerberos identities and GoaTpAccountLinker deletes the corresponding TpAccount. This is because they use GoaClient::account-removed to keep the GoaObjects in sync with the underlying Kerberos credentials and Telepathy accounts. Unfortunately, when the goa-daemon process dies (eg., due to restarting it or a crash) all GoaClient instances emit account-removed. Both GoaIdentityService and GoaTpAccountLinker mistake this emission as an explicit user action to delete the account. They shouldn't do that. This is documented [1] and recommended against: "... must not destroy data if an account object is removed (e.g. when the “account-removed” signal is emitted) - for example, if the goa-daemon program crashes or is restarted on software upgrade, account objects will be removed only to be added back the next time goa-daemon is started." [1] https://developer.gnome.org/goa/stable/overview-writing.html
Created attachment 328278 [details] [review] provider: Export ensure_builtins_loaded as internal API
Created attachment 328279 [details] [review] daemon: Abstract the way we initialize the Kerberos provider
Created attachment 328280 [details] [review] identity: Minor clean up
Created attachment 328281 [details] [review] provider: Re-order and re-align the GoaProviderClass members
Created attachment 328282 [details] [review] provider: Re-align the function prototypes
Created attachment 328283 [details] [review] provider: Add a new vfunc for backend-specific work on account removal
Created attachment 328284 [details] [review] daemon: Rename EnsureData as ObjectInvocationData
Created attachment 328285 [details] [review] daemon: Call goa_provider_remove_account during account removal
Created attachment 328286 [details] [review] kerberos: Implement GoaProvider:remove_account for signing out
Review of attachment 328278 [details] [review]: sure
Review of attachment 328279 [details] [review]: yea i like this a lot better. I guess there's no problem with doing the other providers' class initialization earlier. ::: src/daemon/goadaemon.c @@ -225,3 @@ - if (provider != NULL) - { - g_debug ("activated kerberos provider"); Might want to move this debug message into the the backend. (maybe there's enough other messages that it doesn't matter)
Review of attachment 328280 [details] [review]: if it's really only needed in that one place, that's fine. I guess starting to consolidate the goa code, move us toward removing the cycle we were talking about.
Review of attachment 328281 [details] [review]: this is fine, but the commit message only says why it's okay to do, not why you're doing it. maybe add something like "The order now is rather ad-hoc and the alignment uneven. build_object was previously erroneously in the pure virtual section when it has a concrete default implementation. We're free to move things around, because ..."
Review of attachment 328282 [details] [review]: sure
Review of attachment 328283 [details] [review]: this seems like an important enhancement, given the semantics of the account-removed signal, but i'm a little uneasy about the asymmetry between remove_account and add_account, thoughts?
Review of attachment 328284 [details] [review]: sure
Since an explicit user action to delete an account will lead to a org.gnome.OnlineAccounts:Remove call, we let the GoaProvider sub-classes hook into the handling of the call inside goa-daemon to do any extra clean up that they may need to run. eg., signing out the Kerberos identity or deleting the Telepathy account. So far these patches only address Kerberos, but the solution for Telepathy will be similar. I am planning to drive GoaTpAccountLinker from the Telepathy backend instead of just spawning it from goa-daemon's main.c. GoaTpAccountLinker will grow an API to let the backend trigger the clean-up from the GoaProvider:remove_account hook. Since we only want to run the linker inside goa-daemon, and not just any other process that might link to libgoa-backend-1.0.so (eg., gnome-control-center), we will now need a way to let GoaDaemon ask the backend to run it. This isn't an issue for Kerberos because GoaIdentityService is a separate D-Bus daemon and D-Bus ensures that it is a singleton. I am thinking of adding another virtual method in GoaProvider that GoaDaemon will call to start extra provider-specific monitoring, if any.
(In reply to Ray Strode [halfline] from comment #11) > Review of attachment 328279 [details] [review] [review]: > > yea i like this a lot better. I guess there's no problem with doing the > other providers' class initialization earlier. Yes, it should be fine. goa_provider_get_for_provider_type ensures all the builtin extensions, so we were already doing that. This is what I tried to convey by the "simpler" part of the commit message. Maybe I should reword it? > ::: src/daemon/goadaemon.c > @@ -225,3 @@ > - if (provider != NULL) > - { > - g_debug ("activated kerberos provider"); > > Might want to move this debug message into the the backend. (maybe there's > enough other messages that it doesn't matter) Yes, sure, I will move it. Even if we decide to drop it, we should do that in a separate commit.
(In reply to Ray Strode [halfline] from comment #15) > Review of attachment 328283 [details] [review] [review]: > > this seems like an important enhancement, given the semantics of the > account-removed signal, but i'm a little uneasy about the asymmetry between > remove_account and add_account, thoughts? The asymmetry between add_account and remove_account bothers me too. I thought about calling it remove_account_post or something, but couldn't come up with a nice name. Fortunately remove_account is entirely private, so we can change it without affecting any application.
Thanks for the reviews so far, halfline!
Review of attachment 328285 [details] [review]: seems fine. It's a little strange to have the invocation as a list, but i guess that's better than having a completely separate structure for Ensure and Remove
Review of attachment 328286 [details] [review]: ::: src/goaidentity/goaidentityservice.c @@ -1554,3 @@ - account_identity = goa_account_get_identity (account); - - g_debug ("Kerberos account %s removed and should now be signed out", account_identity); would be good to have this debug message in the provider.
> Yes, it should be fine. goa_provider_get_for_provider_type ensures all the > builtin extensions, so we were already doing that. This is what I tried to > convey by the "simpler" part of the commit message. Maybe I should reword it? Oh right, of course. So maybe change this: Instead of explicitly calling out to the Kerberos provider by creating a dummy instance, we can just ensure that all the provider extensions are registered. This is better because it hides the implementation details of each backend. to There is no reason to call out to the Kerberos provider explicitly by creating a dummy instance. The provider class will perform all necessary initialization at the time of extension registration. Merely ensuring the extensions are registered is better because it makes the backend implementation details more opaque. (or something)
(In reply to Debarshi Ray from comment #19) > The asymmetry between add_account and remove_account bothers me too. I > thought about calling it remove_account_post or something, but couldn't come > up with a nice name. One idea is we could rename add_account to "show_new_account_dialog" or something like that. anyway, something that can be changed later.
Comment on attachment 328278 [details] [review] provider: Export ensure_builtins_loaded as internal API Pushed to master.
Created attachment 328305 [details] [review] daemon: Abstract the way we initialize the Kerberos provider Pushed to master after improving the commit message and restoring the g_debug.
Review of attachment 328286 [details] [review]: ::: src/goabackend/goakerberosprovider.c @@ +1476,3 @@ + const gchar *identifier; + + ensure_identity_manager (); Stupid mistake on my part. We can't wait for the GCond in the same thread that is supposed to signal it. One option is to implement asynchronous remove_account as a synchronous function running in a separate thread.
Created attachment 328562 [details] [review] kerberos: Implement GoaProvider:remove_account for signing out * Restore the g_debug * Fix the threading problem
(In reply to Debarshi Ray from comment #27) > One option is to implement asynchronous remove_account as a synchronous > function running in a separate thread. Ah, yea, you could probably just remove that line, since the identity manager should already be available, I think, from the sync ensure credentials call at start up, but deferring to a thread is clearly better.
Created attachment 328593 [details] [review] provider: Add a synchronous version of goa_provider_get_all
Created attachment 328594 [details] [review] provider: Add a new vfunc for backend-specific initialization
Created attachment 328595 [details] [review] daemon: Call goa_provider_bus_acquired during start-up
Created attachment 328596 [details] [review] telepathy: Implement GoaProvider:bus_acquired to start the linker
Review of attachment 328593 [details] [review]: ::: src/goabackend/goaprovider.c @@ +1338,3 @@ + * GMainContext for invoking the asynchronous callbacks, we can't + * push a new GMainContext here. + */ I really don't like this, but I don't know of any better alternatives other than fixing telepathy-glib. It would be nice to be able to synchronously get the list of GoaProviders in the early stages of goa-daemon's start-up.
Review of attachment 328594 [details] [review]: ::: src/goabackend/goaprovider-priv.h @@ +85,3 @@ gboolean just_added, GError **error); + void (*bus_acquired) (GoaProvider *self); I couldn't come up with a nice name so I stole the nomenclature from g_bus_own_name. I considered things like "initialize", but this is initialization specific to goa-daemon. Any other process (eg., gnome-control-center) linking to libgoa-backend-1.0.so shouldn't run this. Suggestions?
Review of attachment 328562 [details] [review]: So one thing I wonder about... what happens if a user hits "Sign In" then removes the account very quickly. could we end up in a situation where the account gets signed in after it gets signed out?
Review of attachment 328593 [details] [review]: ::: src/goabackend/goaprovider.c @@ +1338,3 @@ + * GMainContext for invoking the asynchronous callbacks, we can't + * push a new GMainContext here. + */ I guess it doesn't matter that much, since it's all daemon code and we're not going to run it in other processes. @@ +1341,3 @@ + + context = g_main_context_ref_thread_default (); + data.loop = g_main_loop_new (context, FALSE); So I think what you're saying is it just uses the default main context unconditionally. I think it would be clearer to use g_main_context_default() or NULL here, instead of relying on the fact that a different thread-default main context hasn't been pushed.
(In reply to Debarshi Ray from comment #35) > I couldn't come up with a nice name so I stole the nomenclature from > g_bus_own_name. I considered things like "initialize", but this is > initialization specific to goa-daemon. Any other process (eg., > gnome-control-center) linking to libgoa-backend-1.0.so shouldn't run this. Well, I don't think initialize is that bad. The declaration is in the private header, so control-center won't get tempted to call it anyway. bus_acquired seems weird, since you call it before the bus connection is acquired. I think "initialize" or "daemon_initialize" are better than bus_acquired, but in the end, it doesn't really matter. The other option is you could provide a way for the provider to query if it's running in the daemon, and then just use the provider instance init function directly
Review of attachment 328595 [details] [review]: ::: src/daemon/goadaemon.c @@ +241,3 @@ + { + GoaProvider *provider = GOA_PROVIDER (l->data); + goa_provider_bus_acquired (provider); again it's a little strange to call it bus_acquired when it's called before g_bus_get_sync() below.
Review of attachment 328596 [details] [review]: i like that you're moving this to the backend and making the daemon more generic. I notice you're still connecting to account-removed. You'll switch to remove_account in a follow up commit? ::: src/daemon/main.c @@ +41,3 @@ static GoaDaemon *the_daemon = NULL; #ifdef GOA_TELEPATHY_ENABLED probably can get rid of this #ifdef
(In reply to Ray Strode [halfline] from comment #40) > Review of attachment 328596 [details] [review] [review]: > > i like that you're moving this to the backend and making the daemon more > generic. I notice you're still connecting to account-removed. You'll > switch to remove_account in a follow up commit? Yes. > ::: src/daemon/main.c > @@ +41,3 @@ > static GoaDaemon *the_daemon = NULL; > > #ifdef GOA_TELEPATHY_ENABLED > > probably can get rid of this #ifdef Of course. I will remove it.
(In reply to Ray Strode [halfline] from comment #36) > Review of attachment 328562 [details] [review] [review]: > > So one thing I wonder about... what happens if a user hits "Sign In" then > removes the account very quickly. could we end up in a situation where the > account gets signed in after it gets signed out? Good point. That's a distinct possibility. I will check.
(In reply to Ray Strode [halfline] from comment #37) > Review of attachment 328593 [details] [review] [review]: > > ::: src/goabackend/goaprovider.c > @@ +1338,3 @@ > + * GMainContext for invoking the asynchronous callbacks, we can't > + * push a new GMainContext here. > + */ > > I guess it doesn't matter that much, since it's all daemon code and we're > not going to run it in other processes. Ok. g_bus_own_name is (the only other asynchronous operation) running on the global default GMainContext at this point. It means that name_acquired_handler or name_lost_handler can be called while we are still inside the goa_daemon_new call. The former is a dummy, but the latter leads to g_main_loop_quit. As far as I can make out, if g_main_loop_quit is called before the internal asynchronous goa_provider_get_all is finished, we stand to leak some of its user_data, and will observe "strange" behaviour where a synchronous method is not blocking the thread. I guess that is ugly, but harmless. > @@ +1341,3 @@ > + > + context = g_main_context_ref_thread_default (); > + data.loop = g_main_loop_new (context, FALSE); > > So I think what you're saying is it just uses the default main context > unconditionally. I think it would be clearer to use g_main_context_default() > or NULL here, instead of relying on the fact that a different thread-default > main context hasn't been pushed. Of course. I will change it to NULL. It was easy to alternate between g_main_context_ref_thread_default, and g_main_context_new followed by g_main_context_push/pop_thread_default, when I was poking at it to figure out the hang. Got carried away and forgot to do the right thing.
(In reply to Ray Strode [halfline] from comment #38) > (In reply to Debarshi Ray from comment #35) > > I couldn't come up with a nice name so I stole the nomenclature from > > g_bus_own_name. I considered things like "initialize", but this is > > initialization specific to goa-daemon. Any other process (eg., > > gnome-control-center) linking to libgoa-backend-1.0.so shouldn't run this. > > Well, I don't think initialize is that bad. The declaration is in the > private header, so control-center won't get tempted to call it anyway. Ok, sold. I will change it to "initialize". I meant to say that this is unlike initializing the Kerberos backend, which is something that we do every time the type is registered for every process. Instead, this is code that will only be run inside goa-daemon. So, I was trying to come up with a name that reflects this, but I might have over-thought this. I wasn't worried about gnome-control-center accidentally calling it, because, as you noticed, it isn't meant to. This makes me wonder. Maybe we can turn goa-identity-service into a kitchen sink where we put all such helper code? That way, we just poke it and D-Bus will ensure that there is only instance per user/session. But, I guess, we shouldn't get too side-tracked in this bug. > bus_acquired seems weird, since you call it before the bus connection is > acquired. I think "initialize" or "daemon_initialize" are better than > bus_acquired, but in the end, it doesn't really matter. I meant that it is called after g_bus_own_name's bus_acquired_handler has been invoked. :) > The other option is you could provide a way for the provider to query if > it's running in the daemon, and then just use the provider instance init > function directly I thought about it, but I don't know how to do that.
Created attachment 329295 [details] [review] provider: Re-order and re-align the GoaProviderClass members
Created attachment 329296 [details] [review] provider: Re-align the function prototypes
Review of attachment 328596 [details] [review]: ::: src/goabackend/goatelepathyprovider.c @@ +758,3 @@ + if (g_once_init_enter (&once_init_value)) + { + tp_linker = goa_tp_account_linker_new (); We should only start the linker after the name as been acquired, as the original code was doing. Otherwise we will be in trouble due to our use of both GDBusObjectManagerClient and Server against the same name in the same process. Since "initialize" is called in goa_daemon_init, GoaTpAccountLinker might create the GoaClient before the name has been acquired. If it does, then GoaClient's GDBusObjectManagerClient will get a notify::g-name-owner. The handler will run in the main thread and make a synchronous call against GoaDaemon's GDBusObjectManagerServer. Since the Client is blocking the main thread, the Server won't be able to handle the call and it will time out.
Created attachment 329328 [details] [review] telepathy: Implement GoaProvider:initialize to start the linker
Created attachment 329329 [details] [review] tplinker: Shuffle some code around
Created attachment 329330 [details] [review] telepathy: Use GoaProvider:remove_account for deleting the TpAccount
I have put everything in wip/rishi/account-remove. Restarting goa-daemon gives these CRITICALs when "preparing" the TpAccountManager inside GoaTpAccountLinker: (goa-daemon:16022): tp-glib-CRITICAL **: tp_proxy_add_interface_by_id: assertion 'tp_proxy_get_invalidated (self) == NULL' failed (goa-daemon:16022): tp-glib-CRITICAL **: tp_proxy_add_interface_by_id: assertion 'tp_proxy_get_invalidated (self) == NULL' failed (goa-daemon:16022): tp-glib-CRITICAL **: tp_proxy_add_interface_by_id: assertion 'tp_proxy_get_invalidated (self) == NULL' failed (goa-daemon:16022): tp-glib-CRITICAL **: tp_proxy_add_interface_by_id: assertion 'tp_proxy_get_invalidated (self) == NULL' failed (goa-daemon:16022): tp-glib-CRITICAL **: tp_proxy_add_interface_by_id: assertion 'tp_proxy_get_invalidated (self) == NULL' failed (goa-daemon:16022): tp-glib-CRITICAL **: tp_proxy_add_interface_by_id: assertion 'tp_proxy_get_invalidated (self) == NULL' failed They aren't new. We used to get them before. I will see if we can do something about those.
(In reply to Ray Strode [halfline] from comment #36) > Review of attachment 328562 [details] [review] [review]: > > So one thing I wonder about... what happens if a user hits "Sign In" then > removes the account very quickly. could we end up in a situation where the > account gets signed in after it gets signed out? With the old code (let's say gnome-3-18), clicking the "sign in" button and the "-" button in quick succession first shows the shell modal dialog to enter the password and then an error dialog pops up with "Decrypt integrity check failed". That is krb5 lingo for "bad password" [1]. I am thinking of passing a GCancellable to the get_ticket_sync call in refresh_account (which is what happens during "Sign In") and then cancelling it in the remove_account vfunc. [1] http://www.cmf.nrl.navy.mil/krb/kerberos-faq.html#badpass
(In reply to Debarshi Ray from comment #52) > (In reply to Ray Strode [halfline] from comment #36) > > Review of attachment 328562 [details] [review] [review] [review]: > > > > So one thing I wonder about... what happens if a user hits "Sign In" then > > removes the account very quickly. could we end up in a situation where the > > account gets signed in after it gets signed out? > > With the old code (let's say gnome-3-18), clicking the "sign in" button and > the "-" button in quick succession first shows the shell modal dialog to > enter the password and then an error dialog pops up with "Decrypt integrity > check failed". That is krb5 lingo for "bad password" [1]. > > I am thinking of passing a GCancellable to the get_ticket_sync call in > refresh_account (which is what happens during "Sign In") and then cancelling > it in the remove_account vfunc. I wasn't thinking so hard ... We can't do the cancellation in libgoa-backend.so because remove_account will be invoked in goa-daemon, while the refresh_account will be happening in gnome-control-center. We will have to do it in goa-identity-service.
(In reply to Debarshi Ray from comment #52) > (In reply to Ray Strode [halfline] from comment #36) > > Review of attachment 328562 [details] [review] [review] [review]: > > > > So one thing I wonder about... what happens if a user hits "Sign In" then > > removes the account very quickly. could we end up in a situation where the > > account gets signed in after it gets signed out? > > With the old code (let's say gnome-3-18), clicking the "sign in" button and > the "-" button in quick succession first shows the shell modal dialog to > enter the password and then an error dialog pops up with "Decrypt integrity > check failed". That is krb5 lingo for "bad password" [1]. This was an utter thinko on my part. The "bad password" thing was happening simply because I was repeatedly entering the wrong password. When I enter the right password, I get asked for the password and if I click the "-" button again, then it removes the account. Looking at GoaKerberosProvider and GoaKerberosIdentityManager, I notice two things: (a) refresh_account performs a serious of synchronous operations. That makes it impossible to activate the "-" button before the "Sign In" button has finished what it set out to do. One can't switch windows to run 'kdestroy' because the system modal will prevent that. (b) GoaKerberosIdentityManager serializes all incoming sign in/out requests in a GAsyncQueue, so there is no way the sign-out operation can start before the sign-in operation has finished. This makes me think that there shouldn't be a race.
(In reply to Debarshi Ray from comment #54) > (a) refresh_account performs a serious of synchronous operations. That makes > it impossible to activate the "-" button before the "Sign In" button has > finished what it set out to do. One can't switch windows to run 'kdestroy' > because the system modal will prevent that. 'kdestroy' is irrelevant because we don't remove the GoaObject due to it. So, even if we have a shell script running that runs kdestroy in a loop, while we work through the "Sign In" sequence it doesn't matter. The account will be signed in, and then get signed out due to the loss of the kerberos ticket.
Hi, > (b) GoaKerberosIdentityManager serializes all incoming sign in/out requests > in a GAsyncQueue, so there is no way the sign-out operation can start before > the sign-in operation has finished. right, great thanks.
(In reply to Debarshi Ray from comment #51) > Restarting goa-daemon gives these CRITICALs when "preparing" the > TpAccountManager inside GoaTpAccountLinker: > > (goa-daemon:16022): tp-glib-CRITICAL **: tp_proxy_add_interface_by_id: > assertion 'tp_proxy_get_invalidated (self) == NULL' failed > > (goa-daemon:16022): tp-glib-CRITICAL **: tp_proxy_add_interface_by_id: > assertion 'tp_proxy_get_invalidated (self) == NULL' failed > > (goa-daemon:16022): tp-glib-CRITICAL **: tp_proxy_add_interface_by_id: > assertion 'tp_proxy_get_invalidated (self) == NULL' failed > > (goa-daemon:16022): tp-glib-CRITICAL **: tp_proxy_add_interface_by_id: > assertion 'tp_proxy_get_invalidated (self) == NULL' failed > > (goa-daemon:16022): tp-glib-CRITICAL **: tp_proxy_add_interface_by_id: > assertion 'tp_proxy_get_invalidated (self) == NULL' failed > > (goa-daemon:16022): tp-glib-CRITICAL **: tp_proxy_add_interface_by_id: > assertion 'tp_proxy_get_invalidated (self) == NULL' failed > > They aren't new. We used to get them before. I will see if we can do > something about those. These are caused by the GOA telepathy-mission-control plugin [*] using GoaClient::account-removed and trying to remove the TpAccounts as one goa-daemon goes away and another comes back. When the new goa-daemon tries to prepare a TpAccountManager that GoaTpAccountLinker can use, the GOA telepathy-mission-control plugin is trying to remove some accounts from underneath it. I don't see a way for the new goa-daemon to wait for the previous instance to unown the org.gnome.OnlineAccounts name *and* the telepathy-mission-control plugin to mark the accounts as deleted. As far as I know, these CRITICALs are harmless. I suggest ignoring them, unless someone can demonstrate that there is an actual user-visible problem. [*] The GoaTpAccountLinker bridge is for "standalone" Telepathy accounts like XMPP, SIP, etc.. For things like Google, there is a telepathy-mission-control plugin shipped by empathy that uses GoaClient and exports GoaObjects as TpAccounts.
Created attachment 329499 [details] [review] daemon: Add a synchronous version of goa_provider_get_all
Created attachment 329500 [details] [review] provider: Add a new vfunc for backend-specific initialization
Created attachment 329501 [details] [review] daemon: Call goa_provider_initialize during start-up
Created attachment 329502 [details] [review] telepathy: Implement GoaProvider:initialize to start the linker
Created attachment 329555 [details] [review] tplinker: Shuffle some code around
Created attachment 329556 [details] [review] telepathy: Use GoaProvider:remove_account for deleting the TpAccount
I went ahead and pushed the Telepathy-half of the patches after the 3.21.3 to master. I am reasonably confident that they are fine, but if you do notice some breakage or have any comments otherwise, let me know. We can fix them up in master. Thanks for all the comments so far!
*** Bug 731162 has been marked as a duplicate of this bug. ***
*** Bug 709490 has been marked as a duplicate of this bug. ***