Bug 794190 - VPN doesn't work after underlying connection restarts
Status: RESOLVED OBSOLETE
Product: NetworkManager
Classification: Platform
Component: VPN (general)
Version: git master
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: NetworkManager maintainer(s)
QA Contact: NetworkManager maintainer(s)
Depends on:
Blocks:
 
 
Reported: 2018-03-09 01:19 UTC by David Woodhouse
Modified: 2020-11-12 14:28 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description David Woodhouse 2018-03-09 01:19:27 UTC
I made NetworkManager-openconnect set 'can-persist':
https://git.gnome.org/browse/network-manager-openconnect/commit/?id=f8e386e2c9522cb93a604eab9e534cb1f98ff18c

Has anyone ever actually tested this? It's been in NM for a *long* time, but doesn't seem to work.

When the underlying Ethernet goes down, the VPN stays logically connected, which is OK... but when a new physical connection is started, the VPN never works again. This is because the route to the VPN gateway... is through the VPN device 'vpn0'.

On initial VPN bringup, NM will set a host route for the VPN server, through the original device. It needs to do that again when a new device is used. 

It would also be really useful to *tell* the VPN service that it should trigger a reconnect. Sending SIGUSR2 to openconnect would make it reconnect immediately, regardless of where in its Dead Peer Detection and periodic timed retry cycle it was.

It's possible that NM really is trying to route through the 'vpn0' device intentionally, since it sees a separate unmanaged logical 'vpn0' connection in addition to the VPN connection it brought up for itself:
NAME                             UUID                                  TYPE             DEVICE    
VPN2                             bda8e791-d172-491d-b41e-35aa8a26fb3d  vpn              vpn0      
Wired connection 1               2415d366-c770-4fd1-9f5a-403e417998af  802-3-ethernet   enp0s31f6 
virbr0                           491c2da0-a8b2-4a8f-8e88-9ad0ea56f282  bridge           virbr0    
vpn0                             ddacf9cb-0ea3-4317-8488-2f9ed0888c3a  tun              vpn0
Comment 1 David Woodhouse 2018-03-09 01:23:08 UTC
To clarify:

After 'nmcli con down eth0; nmcli con up eth0', VPN is broken.

I must manually

 ip route add $VPNGATEWAY via $LOCALGW dev eth0

and then either wait an unspecified amount of time, or 

 killall -USR2 openconnect
Comment 2 David Woodhouse 2018-03-09 11:51:14 UTC
I filed the 'unmanaged' vpn0 device separately as bug 794200. Not sure if it's contributing to the problem here or not.
Comment 3 David Woodhouse 2018-03-20 16:14:10 UTC
FWIW this hack as a dispatcher script seems to suffice to make things work.


#!/bin/sh
# NetworkManager dispatcher script: $1 is the interface name, $2 is the action.
DEV="$1"

case "$2" in
    up)
        # Is the VPN running? Ideally NM would be able to tell us this (and the 'vpn0' device name)
        VPNGW=$(ps axf | sed -n "s%.*/usr/sbin/openconnect.*--interface vpn0 \([0-9.]\+\):443$%\1%p")
        [ "$VPNGW" = "" ] && exit 0

        # Is there a default route on the currently-coming-up physical device?
        DEVROUTE=$(ip route | sed -n  "s/^default via \([0-9.]\+\) .*dev $DEV .*/\1/p")
        [ "$DEVROUTE" = "" ] && exit 0

        # Is the VPN trying to route through itself?
        # Extract the outgoing device from the 'ip route get' output.
        VPNGWROUTE=$(ip route get "$VPNGW" | sed -n 's/.*dev \([^ ]\+\).*/\1/p')
        [ "$VPNGWROUTE" != "vpn0" ] && exit 0

        # XX: We ideally want to do this for a wired Corporate connection too
        if nmcli con show 'Corporate Wi-Fi Settings (wpa2)' | grep -q 'GENERAL\.STATE:[[:space:]]*activated'; then
            logger -p daemon.info "$0 killing VPN because wpa2 is connected"
            nmcli con down 'Corporate VPN' &
            exit 0
        fi
        logger -p daemon.info "$0 restoring route to $VPNGW via $DEVROUTE on $DEV"
        ip route add $VPNGW via $DEVROUTE dev $DEV
        killall -USR2 openconnect
        ;;
    *)
        exit 0
        ;;
esac

exit 0
Comment 4 David Woodhouse 2018-09-19 18:57:30 UTC
Don't think this is fixed yet; I still need the workaround at least in 1.10.6...
Comment 5 David Woodhouse 2018-10-19 09:57:25 UTC
There are also issues with captive portals — if the hotel network is wanting me to click every single day to tell it I haven't changed my mind about the terms and conditions (grrr) then that doesn't work when my default route goes through the VPN.

The captive portal detection — and the pop-up browser to deal with it — need to run in a non-VPN network namespace.

It would also be good to *remove* the VPN routes when there's no underlying network connection, instead of leaving them active. That way, when I was on the VPN and resume my laptop on a plane, my online login can fail immediately and cached login can proceed. As it is, I get to wait a number of minutes as the system has a *route* to the corporate network, but it strangely seems that all connections are timing out...
Comment 6 David Woodhouse 2018-10-19 10:20:25 UTC
Correction: Add an unreachable default route while the underlying connection is down, don't just delete the VPN routes.

Deleting them would be a security problem during the period when stuff is coming back up again, and you temporarily have a route on the local network when you *should* be routing everything to the VPN.
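
A minimal sketch of the unreachable-route idea in shell, not something NetworkManager does today. It assumes the VPN installs its default route at metric 50 (as in the routing tables in comment 9), so the hypothetical BLOCK_METRIC of 40 makes the unreachable route win while it is present. The function names and the DRY_RUN switch are made up for illustration.

```shell
#!/bin/sh
# Sketch only: block traffic with an "unreachable" default route while the
# underlying link is down, instead of deleting the VPN routes.
# ASSUMPTION: metric 40 is hypothetical; it only needs to be lower than the
# metric of the VPN's default route.
BLOCK_METRIC="${BLOCK_METRIC:-40}"

# Print the iproute2 command for the given action ("add" or "del").
block_route_cmd() {
    printf 'ip route %s unreachable default metric %s\n' "$1" "$BLOCK_METRIC"
}

# Run the command, or only print it when DRY_RUN=1.
block_route() {
    cmd=$(block_route_cmd "$1")
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$cmd"
    else
        # Intentionally unquoted so the string splits into command words.
        $cmd
    fi
}
```

With DRY_RUN=1, 'block_route add' only prints the command; without it the command needs root. More specific routes, such as a host route to the VPN gateway, still take precedence over this default route, so the VPN client can reconnect while everything else is blocked. Something would still have to call 'block_route del' once the VPN is back up.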
Comment 7 David Woodhouse 2019-02-06 13:02:34 UTC
I had a play with implementing the unreachable route thing in my hackish dispatcher script workaround. However, I don't have the necessary hook to *remove* the unreachable route, when the VPN client actually manages to re-establish its connection.

OpenConnect does re-invoke its "vpnc-script" (which in our case is just nm-openconnect-service-openconnect-helper, which sends everything back to NM by D-Bus) with reason=reconnect, but NM would have to be able to act on that event.
Comment 8 David Woodhouse 2019-05-08 15:19:02 UTC
As discussed on IRC, it looks like this probably ought to work... but it doesn't.

It looks like the DEVICE_CHANGED signal only gets emitted when the underlying connection goes *down*, not when it comes back up.

I added a debug message in the device_changed() function...


<info>  [1557328368.8554] device (vpn0): Activation: successful, device activated.
<info>  [1557328384.2709] device (ens3): state change: activated -> deactivating (reason 'user-requested', sys-iface-state: 'managed')
<info>  [1557328384.2743] audit: op="connection-deactivate" uuid="9ffa4756-2375-3716-add2-787167c1984c" name="ens3" pid=4239 uid=0 result="success"
<info>  [1557328384.2746] device (ens3): state change: deactivating -> disconnected (reason 'user-requested', sys-iface-state: 'managed')
<info>  [1557328384.2873] dhcp4 (ens3): canceled DHCP transaction, DHCP client pid 4160
<info>  [1557328384.2874] dhcp4 (ens3): state changed bound -> done
VPN plugin: device changed, can_persist 1 state activated (7) ifindex 3 old (nil) new 0x562d894a4a90
<info>  [1557328384.2998] policy: set 'Amazon VPN' (vpn0) as default for IPv4 routing and DNS
<info>  [1557328389.3759] device (ens3): Activation: starting connection 'ens3' (9ffa4756-2375-3716-add2-787167c1984c)
<info>  [1557328389.3766] audit: op="connection-activate" uuid="9ffa4756-2375-3716-add2-787167c1984c" name="ens3" pid=4256 uid=0 result="success"
<info>  [1557328389.3769] device (ens3): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
<info>  [1557328389.3775] device (ens3): state change: prepare -> config (reason 'none', sys-iface-state: 'managed')
<info>  [1557328389.3964] device (ens3): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed')
<info>  [1557328389.3972] dhcp4 (ens3): activation: beginning transaction (timeout in 45 seconds)
<info>  [1557328389.4001] dhcp4 (ens3): dhclient started with pid 4262
<info>  [1557328389.4300] dhcp4 (ens3):   address 192.168.122.10
<info>  [1557328389.4303] dhcp4 (ens3):   plen 24 (255.255.255.0)
<info>  [1557328389.4305] dhcp4 (ens3):   gateway 192.168.122.1
<info>  [1557328389.4307] dhcp4 (ens3):   lease time 3600
<info>  [1557328389.4308] dhcp4 (ens3):   nameserver '192.168.122.1'
<info>  [1557328389.4309] dhcp4 (ens3):   domain name 'infradead.org'
<info>  [1557328389.4311] dhcp4 (ens3): state changed unknown -> bound
<info>  [1557328389.4336] device (ens3): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed')
<info>  [1557328389.4348] device (ens3): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'managed')
<info>  [1557328389.4353] device (ens3): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed')
<info>  [1557328389.4477] device (ens3): Activation: successful, device activated.
<info>  [1557328391.4326] policy: set 'ens3' (ens3) as default for IPv6 routing and DNS
^C<info>  [1557328568.9393] caught SIGINT, shutting down normally.
Comment 9 David Woodhouse 2019-05-08 15:36:15 UTC
[dwmw2@localhost src]$ nmcli con show 'Amazon VPN' | grep ens
GENERAL.DEVICES:                        ens3
[dwmw2@localhost src]$ sudo nmcli con down ens3 ; ip route ; sleep 5 ; sudo nmcli con up ens3 ; ip route
Connection 'ens3' successfully deactivated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/6)
default dev vpn0 proto static scope link metric 50 
10.95.122.252 dev vpn0 proto kernel scope link src 10.95.122.252 metric 50 
Connection successfully activated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/9)
default dev vpn0 proto static scope link metric 50 
default via 192.168.122.1 dev ens3 proto dhcp metric 100 
10.95.122.252 dev vpn0 proto kernel scope link src 10.95.122.252 metric 50 
192.168.122.0/24 dev ens3 proto kernel scope link src 192.168.122.10 metric 100 
[dwmw2@localhost src]$ nmcli con show 'Amazon VPN' | grep ens
[dwmw2@localhost src]$
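
The transcript above can be checked mechanically. The following shell sketch reads 'ip route' output and reports the default routes in preference order, plus whether a host route for a given address (for example the VPN gateway) exists. The function names are hypothetical, and 203.0.113.7 in the usage note is a placeholder gateway address, since the real one does not appear in this report.

```shell
#!/bin/sh
# Sketch only; function names are hypothetical.

# Read `ip route` output on stdin and print "metric device" for every
# default route, lowest metric (the preferred route) first.
default_routes() {
    awk '$1 == "default" {
        dev = ""; metric = 0
        for (i = 2; i < NF; i++) {
            if ($i == "dev") dev = $(i + 1)
            if ($i == "metric") metric = $(i + 1)
        }
        print metric, dev
    }' | sort -n
}

# Read `ip route` output on stdin; succeed if it contains a host route
# for the address given as $1.
has_host_route() {
    awk -v gw="$1" '($1 == gw) || ($1 == (gw "/32")) { found = 1 }
                    END { exit !found }'
}
```

Usage would be 'ip route | default_routes' and 'ip route | has_host_route 203.0.113.7 || echo "no host route to VPN gateway"'. Fed the final routing table from the transcript, default_routes prints '50 vpn0' followed by '100 ens3', which shows the VPN device still winning the default route after ens3 comes back, with no host route through ens3 for the gateway.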
Comment 10 André Klapper 2020-11-12 14:28:55 UTC
bugzilla.gnome.org is being shut down in favor of a GitLab instance. 
We are closing all old bug reports and feature requests in GNOME Bugzilla which have not seen updates for a long time.

If you still use NetworkManager and if you still see this bug / want this feature in a recent and supported version of NetworkManager, then please feel free to report it at https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/

Thank you for creating this report and we are sorry it could not be implemented (workforce and time are unfortunately limited).