Bug 794848 - keyring busy-loops forever if ssh-agent times out
Status: RESOLVED INCOMPLETE
Product: gnome-keyring
Classification: Core
Component: ssh-agent
Version: 3.28.x
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: GNOME keyring maintainer(s)
QA Contact: GNOME keyring maintainer(s)
Depends on:
Blocks:
 
 
Reported: 2018-03-30 19:17 UTC by Dafydd Harries
Modified: 2018-05-13 14:13 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
patch (3.55 KB, patch)
2018-03-30 19:17 UTC, Dafydd Harries

Description Dafydd Harries 2018-03-30 19:17:49 UTC
Created attachment 370357
patch

Symptom: tried to ssh somewhere, client hung, gnome-keyring-daemon was pegging a CPU

In gkd_ssh_agent_process_connect(), it manually iterates the main context, waiting for a 5s timeout. But it seems like the timeout never fires because g_main_context_iterate() is called with may_block=FALSE. This means it loops forever.

Attached is a fix. Instead of manually iterating the main context, start a main loop that is stopped when the agent starts or when a timeout is reached. In principle, the main loop should also be stopped if the ssh-agent process terminates (on_child_watch). This would mean not having to wait 5s for a timeout, and might make it easier to propagate errors from the agent. A more thorough approach would be to make the _connect() API asynchronous.

Forgive my code; my GNOME is rusty.
Comment 1 Daiki Ueno 2018-03-31 05:34:57 UTC
(In reply to Dafydd Harries from comment #0)

> In gkd_ssh_agent_process_connect(), it manually iterates the main context,
> waiting for a 5s timeout. But it seems like the timeout never fires because
> g_main_context_iterate() is called with may_block=FALSE. This means it loops
> forever.

It's surprising to me that ssh-agent could take 5s for normal startup, and also that may_block=FALSE would prevent events from being dispatched.  Are these really true?

For example, if I create a wrapper script like this:
```
#!/bin/sh

sleep 10

exec /usr/bin/ssh-agent.orig "$@"
```
I actually get this line in the journal:
```
gnome-keyring-daemon[7988]: couldn't connect to ssh-agent: ssh-agent process is not ready
```
which means the timeout handler is called.

Anyway, the symptom sounds very similar to bug 794631, where I added a fix in a different place.  Could you check which version of gnome-keyring you are using?
Comment 2 Daiki Ueno 2018-03-31 11:49:48 UTC
Also, before going further, have you tried simply changing may_block to TRUE?  Did that help?
Comment 3 Daiki Ueno 2018-04-04 07:51:37 UTC
Any update?  If this is a real issue, I would like to have a fix for 3.28.1, which is scheduled for next Monday.
Comment 4 Daiki Ueno 2018-04-09 08:30:28 UTC
Closing this bug report as no further information has been provided. Please feel free to reopen this bug report if you can provide the information that was asked for in a previous comment.
Thanks!
Comment 5 Ting-Wei Lan 2018-05-13 13:30:01 UTC
I found that the recent commit fb0d66553753bdc0d700cb5c0bb2803d0690e9ff references this bug report, and its commit message says it potentially fixes the issue described here. However, it breaks ssh on my desktop. Both ssh and ssh-add seem to hang forever after this commit. Unsetting SSH_AUTH_SOCK or going back to the previous commit makes ssh work for me again.
Comment 6 Daiki Ueno 2018-05-13 14:13:59 UTC
Yes, that's true, and it is the reason I didn't merge the commit into the 3.28.2 release.  I forgot to revert it in master (it should be reverted now).

I really hope that the original reporter gives us a precise analysis rather than a potential solution.