After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 742823 - Connectivity checking sets wrong state when DNS servers cannot be reached
Connectivity checking sets wrong state when DNS servers cannot be reached
Status: RESOLVED FIXED
Product: NetworkManager
Classification: Platform
Component: IP and DNS config
unspecified
Other Linux
: Normal normal
: ---
Assigned To: NetworkManager maintainer(s)
NetworkManager maintainer(s)
Depends on:
Blocks:
 
 
Reported: 2015-01-12 20:55 UTC by Dan Williams
Modified: 2015-01-15 16:14 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Dan Williams 2015-01-12 20:55:50 UTC
After a687d1f9e0f75b987f40335934b54aa748f6724b, if there is an ACTIVATED connection, but DNS servers cannot be contacted (or, presumably the connectivity URL fails to be read), the nm_connectivity_check_cb() will return NM_CONNECTIVITY_LIMITED, which causes nm-manager.c::checked_connectivity() to set NM_STATE_CONNECTED_SITE.  This is fine and is what should happen.

But the next time nm_manager_update_state() is called, find_best_device_state() will return NM_STATE_CONNECTING which then overrides NM_STATE_CONNECTED_SITE because find_best_device_state() ignores the NMConnectivity state in the NM_ACTIVE_CONNECTION_STATE_ACTIVATED case for non-global connectivity.

find_best_device_state() should somehow mirror what checked_connectivity() does to determine GLOBAL/SITE connectivity perhaps?
Comment 1 Dan Williams 2015-01-12 20:56:48 UTC
Should be pretty easy to reproduce if you just give NM bogus DNS servers and ensure connectivity checking is on.
Comment 2 Dan Winship 2015-01-14 15:31:56 UTC
pushed danw/connectivity-bgo742823

In addition to the actual fix, it includes some debug logging improvements to nm-connectivity, and a fix for a bug noticed because of the debug logging...

We still have a problem that we end up doing two or three connectivity checks rather than just one during the SITE->GLOBAL transition, but this is more difficult to fix, and my first attempt didn't work, so I'm leaving it "broken" for now. (This is not new breakage, we just hadn't noticed it before.)
Comment 3 Dan Williams 2015-01-14 18:27:10 UTC
> connectivity: avoid redundant connectivity checks

Why the g_timeout_add (0, ...) instead of g_idle_add(); is that to get the higher priority G_PRIORITY_DEFAULT?  Maybe a comment there about why we want higher priority if this is the case.

The rest LGTM
Comment 4 Dan Winship 2015-01-14 19:59:14 UTC
(In reply to comment #3)
> Why the g_timeout_add (0, ...) instead of g_idle_add(); is that to get the
> higher priority G_PRIORITY_DEFAULT?  Maybe a comment there about why we want
> higher priority if this is the case.

There's no specific reason why I wanted the higher priority. I've just come to the conclusion that g_idle_add() is almost always wrong (for the case of "I want this to happen the next time the main loop runs"), because of its lower priority.
Comment 5 Jiri Klimes 2015-01-15 11:51:13 UTC
The branch looks good to me.

It fixes the endless loop. I run into the problem too when I was about to make a new NM update for Fedora 21. When I tested the build I got struck by the connectivity loop.
I have applied the patches on nm-0-9-10 (not pushed upstream) and made Fedora 21 scratch-build. It works for me.
http://koji.fedoraproject.org/koji/taskinfo?taskID=8623419
Comment 6 Dan Winship 2015-01-15 15:28:19 UTC
pushed to master and nm-1-0
Comment 7 Jiri Klimes 2015-01-15 16:14:04 UTC
and to nm-0-9-10 as well

master commits are:
5828a4b connectivity: merge branch 'danw/connectivity-bgo742823'
66b8f2b connectivity: avoid redundant connectivity checks
0997c4b connectivity: simplify redundant code
5e182d5 connectivity: fix manager state during connectivity check (bgo #742823)
53f2642 connectivity: improve debug logging