GNOME Bugzilla – Bug 794848
keyring busy-loops forever if ssh-agent times out
Last modified: 2018-05-13 14:13:59 UTC
Created attachment 370357
patch

Symptom: tried to ssh somewhere, the client hung, and gnome-keyring-daemon was pegging a CPU.

In gkd_ssh_agent_process_connect(), it manually iterates the main context, waiting for a 5s timeout. But it seems like the timeout never fires because g_main_context_iterate() is called with may_block=FALSE. This means it loops forever.

Attached is a fix. Instead of manually iterating the main context, start a main loop that is stopped when the agent starts or when a timeout is reached.

In principle, the main loop should also be stopped if the ssh-agent process terminates (on_child_watch). This would mean not having to wait 5s for a timeout, and might make it easier to propagate errors from the agent.

A more thorough approach would be to make the _connect() API asynchronous.

Forgive my code; my GNOME is rusty.
(In reply to Dafydd Harries from comment #0)
> In gkd_ssh_agent_process_connect(), it manually iterates the main context,
> waiting for a 5s timeout. But it seems like the timeout never fires because
> g_main_context_iterate() is called with may_block=FALSE. This means it loops
> forever.

I'm surprised that ssh-agent could take 5s for a normal startup, and also that may_block=FALSE would prevent events from being dispatched. Are both of these really true?

For example, if I create a wrapper script like this:

```
#!/bin/sh
sleep 10
exec /usr/bin/ssh-agent.orig "$@"
```

I actually get this line in the journal:

```
gnome-keyring-daemon[7988]: couldn't connect to ssh-agent: ssh-agent process is not ready
```

which means the timeout handler is called.

Anyway, the symptom sounds very similar to bug 794631, where I added a fix in a different place. Could you check which version of gnome-keyring you are using?
Also, before going further, have you tried simply changing may_block to TRUE? Did that help?
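To make the may_block question above concrete, here is a minimal sketch of the difference between non-blocking and blocking waits. It is an analogy written against plain POSIX poll(), not GLib and not the gnome-keyring code: the names wait_readable() and now_ms() are hypothetical, and the may_block parameter only mimics the meaning of the GLib flag. It illustrates why a non-blocking wait pegs a CPU until the deadline while a blocking wait sleeps; it does not settle whether the real timeout handler fires.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <poll.h>
#include <time.h>

/* Monotonic clock in milliseconds (hypothetical helper for this sketch). */
static long long now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Wait up to timeout_ms for fd to become readable.
 *
 * may_block == 0: every poll() call returns immediately, so the loop
 *                 spins (burning CPU) until the deadline passes.
 * may_block != 0: poll() sleeps for the remaining time, so the loop
 *                 normally runs only once or twice.
 *
 * Returns 1 if fd is readable, 0 on timeout, -1 on error.
 * *iterations receives the number of poll() calls made.
 */
static int wait_readable(int fd, int timeout_ms, int may_block,
                         unsigned long *iterations)
{
    long long deadline = now_ms() + timeout_ms;
    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    *iterations = 0;
    for (;;) {
        long long remaining = deadline - now_ms();
        if (remaining <= 0)
            return 0;               /* deadline reached: give up */

        ++*iterations;
        int rc = poll(&pfd, 1, may_block ? (int)remaining : 0);
        if (rc > 0)
            return 1;               /* fd is ready */
        if (rc < 0 && errno != EINTR)
            return -1;              /* real error */
        /* rc == 0 (nothing ready yet) or EINTR: go around again */
    }
}
```

Calling wait_readable() on the read end of an empty pipe with may_block set to 0 makes a very large number of poll() calls before the timeout, while the same call with may_block set to 1 typically makes one. Both variants still return at the deadline, which matches the observation that the timeout handler does get called.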
Any update? If this is a real issue, I would like to have a fix in 3.28.1, which is scheduled for next Monday.
Closing this bug report as no further information has been provided. Please feel free to reopen this bug report if you can provide the information that was asked for in a previous comment. Thanks!
I found that the recent commit fb0d66553753bdc0d700cb5c0bb2803d0690e9ff references this bug report, and its commit message says it potentially fixes the issue described here. However, it breaks ssh on my desktop. Both ssh and ssh-add seem to hang forever after this commit. Unsetting SSH_AUTH_SOCK or going back to the previous commit makes ssh work for me again.
Yes, that's true, and it is the reason why I didn't merge the commit into the 3.28.2 release. I forgot to revert it in master (it should be reverted now). I really hope that the original reporter gives us a precise analysis rather than a potential solution.