Bug 794014 – NetworkManager does not automatically connect to WLAN

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 794014 - NetworkManager does not automatically connect to WLAN


Summary:	NetworkManager does not automatically connect to WLAN


Status:	RESOLVED FIXED

Product:	NetworkManager
Classification:	Platform
Component:	general
Version:	1.10.x
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	---
Assigned To:	NetworkManager maintainer(s)
QA Contact:	NetworkManager maintainer(s)

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2018-03-03 12:21 UTC by Dominique Leuenberger
Modified:	2018-03-08 10:37 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
Fix setting of the block reason. (1.24 KB, patch) 2018-03-06 13:04 UTC, Fabian Vogt	none	Details \| Review

Description Dominique Leuenberger 2018-03-03 12:21:20 UTC

This was originally reported downstream, see
https://bugzilla.opensuse.org/show_bug.cgi?id=1079672

The issue started with the move from NM 1.8.x to 1.10.x

The original report was
"""
Given KDE Plasma 5.11.95, after updating to NM 1.10.2 (TW20180203) it does not connect automatically to my WLAN anymore, even though the option "Automatically connect to this network when it is available” in the KDE Network applet is chosen. In fact, it tries to connect, but quits after a while with the message "no password provided", even though its there (I am using Kwallet). 

I have to connect manually by using the "connect" button in the KDE Network applet.
"""

One of our KDE Maintainers helped track it down and came to this conclusion:

"""
Fix setting the autoconnect block reason if no agents registered

I debugged it a bit and I think I found the issue.
NM didn't set the right flag anymore when a connection fails because of no registered agents.
"""

And this simple patch that would fix it:
>--- NetworkManager-1.10.4/src/settings/nm-agent-manager.c	
>+++ NetworkManager-1.10.4/src/settings/nm-agent-manager.c	
>@@ -1571,7 +1571,7 @@ nm_agent_manager_init (NMAgentManager *s
> {
> 	NMAgentManagerPrivate *priv = NM_AGENT_MANAGER_GET_PRIVATE (self);
> 
>-	priv->agent_version_id = 1;
>+	priv->agent_version_id = 0;
> 	c_list_init (&priv->requests);
> 	priv->agents = g_hash_table_new_full (nm_str_hash, g_str_equal, g_free, g_object_unref);
> }

It has been tested on the systems that expressed the error before and they auto-connect again, as the users would expect.

Comment 1 Dominique Leuenberger 2018-03-03 14:06:08 UTC

Turns out this only had a one-time effect on boot, but not on logout/login of the system.

A 2nd chunk is needed there:

>--- NetworkManager-1.10.4/src/nm-policy.c	
>+++ NetworkManager-1.10.4/src/nm-policy.c	
>@@ -1760,8 +1760,7 @@ device_state_changed (NMDevice *device,
> 				 * We detect this by using a version-id of the agent-manager, which increments
> 				 * whenever new agents register. */
> 				con_v = nm_settings_connection_get_last_secret_agent_version_id (connection);
>-				if (   con_v == 0
>-				    || con_v != nm_agent_manager_get_agent_version_id (priv->agent_mgr))
>+				if (con_v == nm_agent_manager_get_agent_version_id (priv->agent_mgr))
> 					block_no_secrets = TRUE;
> 			}
> 


The entire issue seems to be a regression in e2c8ef45a

Comment 2 Thomas Haller 2018-03-04 16:42:31 UTC

I think the only issue here is:

    || con_v != nm_agent_manager_get_agent_version_id (priv->agent_mgr)
             ^^

this should be ==, no?

Comment 3 Thomas Haller 2018-03-04 16:48:30 UTC

(if you agree, can you create a patch with `git-format-patch`, including a nice commit message + author information, and attach it? Thanks)

Comment 4 Fabian Vogt 2018-03-06 13:04:58 UTC

Created attachment 369380 [details] [review]
Fix setting of the block reason.

I attached the full patch, it contains an explanation as well.

Comment 5 Thomas Haller 2018-03-06 13:32:23 UTC

> - The impossible happens (con_v can never be 0, except for integer overflows)

I think this is not correct. con_v will be zero, if we never requested any secrets for this connection. Arguably, if we never asked secrets for a connection, how can it fail to activate with NM_DEVICE_STATE_REASON_NO_SECRETS? I don't know, and maybe this condition can never happen anyway. But if it can happen for some reasons, we still should block it due to no secrets.

In other words: I think the "con_v == 0 ||" part should stay.

If you test it, is it necessary to remove the "con_v == 0" condition to fix the issue?

Comment 6 Fabian Vogt 2018-03-06 13:38:17 UTC

(In reply to Thomas Haller from comment #5)
> > - The impossible happens (con_v can never be 0, except for integer overflows)
> 
> I think this is not correct. con_v will be zero, if we never requested any
> secrets for this connection. Arguably, if we never asked secrets for a
> connection, how can it fail to activate with
> NM_DEVICE_STATE_REASON_NO_SECRETS? I don't know, and maybe this condition
> can never happen anyway. But if it can happen for some reasons, we still
> should block it due to no secrets.

IMO no. You just move from one unexpected state into another.

> In other words: I think the "con_v == 0 ||" part should stay.

If you really want, you can add it again.

> If you test it, is it necessary to remove the "con_v == 0" condition to fix
> the issue?

No, I only tested cases where the second part of the || is true, so the first one doesn't concern me.

Also: I very very much recommend not to use magic constants.
This should be a good example why: I did not know that "0" in this case means "no secrets requested" and it shouldn't as ID 0 is normally valid.

Comment 7 Thomas Haller 2018-03-07 22:37:36 UTC

> IMO no. You just move from one unexpected state into another.

Zero means that the connection never asked for secrets, although failing with REASON_NO_SECRETS. My guess is, that cannot happen. If we could be sure, we could instead write "nm_assert(con_s != 0)", but it's hard to be sure whether this condition holds. Also, a crash due to an wrong assertion is extremely annoying, especially because this condition is not an essential requirement that would invalidate the program. Meaning, even if the assertion would fail, we might just accept that, and not search for a bug somewhere else.

In this case, it seems easier to just handle it the best way, than proving that it cannot happen.

Without having a strong argument, I still think that blocking autoconnect for con_s==0 is the best way. 

> Also: I very very much recommend not to use magic constants.

Then let's add a comment?

Comment 8 Fabian Vogt 2018-03-08 07:56:14 UTC

> Also, a crash due to an wrong assertion is extremely annoying,

Not finding a bug because of silent breakage is even more annoying.
IMO this is exactly what assertions are there for.
You could just print a warning and continue ("soft assert").

> Then let's add a comment?

What about an enum instead? That way you're more or less forced to use it.
IMO it doesn't fit into the patch anyway.

Comment 9 Thomas Haller 2018-03-08 10:37:24 UTC

merged modified patch upstream:

master: https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=d2f019409d0906814ccd2050ce39609903f879f7

nm-1-10: https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=0824a327039d7fe7f9723f316da30538c7b688d0


Patches for any follow-up improvements welcome.


Thanks