GNOME Bugzilla – Bug 792252
NetworkManager only trying the first gateway in redundant remote Gateways list
Last modified: 2018-01-19 16:24:33 UTC
My university provides us with a client.ovpn profile that we use to connect to the VPN. In the file it has a configuration section that looks like this: setenv FORWARD_COMPATIBLE 1 client server-poll-timeout 4 nobind remote vpn.dss.ucdavis.edu 1194 udp remote vpn.dss.ucdavis.edu 1194 udp remote vpn.dss.ucdavis.edu 443 tcp remote vpn.dss.ucdavis.edu 1194 udp remote vpn.dss.ucdavis.edu 1194 udp remote vpn.dss.ucdavis.edu 1194 udp remote vpn.dss.ucdavis.edu 1194 udp remote vpn.dss.ucdavis.edu 1194 udp dev tun dev-type tun ns-cert-type server setenv opt tls-version-min 1.0 or-highest reneg-sec 604800 sndbuf 100000 rcvbuf 100000 auth-user-pass # NOTE: LZO commands are pushed by the Access Server at connect time. # NOTE: The below line doesn't disable LZO. comp-lzo no verb 3 setenv PUSH_PEER_INFO As you can see, there are 8 redundant remote Gateways (I don't know why they are all the same, but I have zero control over it). Now, if I evoke the client.ovpn file from the openvpn command line, I can get a connection ONLY through the port 443 tcp gateway. This usually wouldn't bother me, however, if I try to connect through NetworkManager after importing the file it consistently fails. The Gateway box is populated as follows: vpn.dss.ucdavis.edu:1194:udp, vpn.dss.ucdavis.edu:1194:udp, vpn.dss.ucdavis.edu:443:tcp, vpn.dss.ucdavis.edu:1194:udp, vpn.dss.ucdavis.edu:1194:udp, vpn.dss.ucdavis.edu:1194:udp, vpn.dss.ucdavis.edu:1194:udp, vpn.dss.ucdavis.edu:1194:udp Which looks fine, meaning that it's not parsing weird. On trying to connect and failing, journalctl reports: >> <warn> [1515177413.9186] vpn-connection[0x555fd1c24550,c29b6aa3-27f2-41d2-9389-b0450f623fea,"DSS",0]: VPN plugin: failed: connect-failed (1) >> SIGTERM[hard,] received, process exiting event_wait : Interrupted system call (code=4) >> <warn> [1515177413.9179] vpn-connection[0x555fd1c24550,c29b6aa3-27f2-41d2-9389-b0450f623fea,"DSS",0]: VPN connection: connect timeout exceeded. >> Connect timer expired, disconnecting. >> NOTE: UID/GID downgrade will be delayed because of --client, --pull, or --up-delay >> UDP link remote: [AF_INET]169.237.160.75:1194 >> UDP link local: (not bound) >> TCP/UDP: Preserving recently used remote address: [AF_INET]169.237.160.75:1194 >> NOTE: the current --script-security setting may allow this configuration to call user-defined scripts >> WARNING: --ns-cert-type is DEPRECATED. Use --remote-cert-tls instead. >> library versions: OpenSSL 1.1.0g 2 Nov 2017, LZO 2.10 >> OpenVPN 2.4.4 x86_64-unknown-linux-gnu [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [PKCS11] [MH/PKTINFO] [AEAD] built on Sep 26 2017 Which seems to show that it is only trying a single of the multiple remote gateways. Further, if I switch the 443 tcp gateway to be the first of the list of 8, I can get a successful connection. Thus, I think there's an issue with it trying all the connections in series. I'm running this on ArchLinux btw.
Created attachment 366480 [details] [review] [PATCH] utils: fix parsing of empty remote
Created attachment 366509 [details] [review] utils: fix failure return value for nmovpn_remote_parse() Fixes: 3c5c7efba75ffd121be3b0ac179c36ca9aa772b0
(In reply to Beniamino Galvani from comment #1) > Created attachment 366480 [details] [review] [review] > [PATCH] utils: fix parsing of empty remote I think this is not the right fix. If you follow the code, then an empty @str should be handled just right. How about comment 2? Btw, I don't understand how this fixes the original report (comment 0).
Created attachment 366516 [details] [review] properties: fix parsing remptes when checking gateway entry Fixes: 3c5c7efba75ffd121be3b0ac179c36ca9aa772b0
(In reply to Thomas Haller from comment #3) > (In reply to Beniamino Galvani from comment #1) > > Created attachment 366480 [details] [review] [review] [review] > > [PATCH] utils: fix parsing of empty remote > > I think this is not the right fix. If you follow the code, then an empty > @str should be handled just right. > > How about comment 2? Yes, looks good. > Btw, I don't understand how this fixes the original report (comment 0). As explained in the commit message of comment 1, empty strings (resulting from splitting the gateway string at spaces and commas) are not ignored as they should, and thus a NULL remote is passed as second argument to the openvpn binary: /usr/sbin/openvpn --remote valid.remote.n1 1194 udp --remote 1194 udp ... Britt, can you please run this command: nmcli general logging level KEEP domains VPN_PLUGIN:DEBUG then try to connect and paste here the openvpn command line from the system journal found, for example, in this way: journalctl -u NetworkManager -b | grep nm-openvpn | grep EXEC
both merged to master: https://git.gnome.org/browse/network-manager-openvpn/commit/?id=350196b1f7652c241c00e75075e6120cd9df5360 https://git.gnome.org/browse/network-manager-openvpn/commit/?id=74e2f1f789a8bc0dfd43f51a3d8301cbda597947
Here is the output from the command you asked for: Jan 10 09:05:45 Michelangelo NetworkManager[380]: nm-openvpn[2211] <debug> EXEC: '/usr/sbin/openvpn --remote vpn.dss.ucdavis.edu 1194 udp --remote vpn.dss.ucdavis.edu 1194 udp --remote vpn.dss.ucdavis.edu 443 tcp-client --remote vpn.dss.ucdavis.edu 1194 udp --remote vpn.dss.ucdavis.edu 1194 udp --remote vpn.dss.ucdavis.edu 1194 udp --remote vpn.dss.ucdavis.edu 1194 udp --remote vpn.dss.ucdavis.edu 1194 udp --comp-lzo no --nobind --dev tun --dev-type tun --auth-nocache --tls-auth /home/britt/.cert/nm-openvpn/client-tls-auth.pem 1 --ns-cert-type server --reneg-sec 604800 --verb 5 --syslog nm-openvpn --script-security 2 --up /usr/lib/nm-openvpn-service-openvpn-helper --debug 6 2211 --bus-name org.freedesktop.NetworkManager.openvpn.Connection_2 --tun -- --up-restart --persist-key --persist-tun --management /var/run/NetworkManager/nm-openvpn-c29b6aa3-27f2-41d2-9389-b0450f623fea unix --management-client-user root --management-client-group root --management-query-passwords --auth-retry interact --route-noexec --ifconfig-noexec --client --ca /home/britt/.cert/nm-openvpn/client-ca.pem --cert /home/britt/.cert/nm-openvpn/client-cert.pem --key /home/britt/.cert/nm-openvpn/client-key.pem --auth-user-pass --user nm-openvpn --group nm-openvpn' The output for just nm-openvpn (not EXEC) is: https://pastebin.com/ZjKWC9SR
So, it seems the patch above is not related to your problem. This needs more investigation... in the meantime, the other following patch should fix the failed assertion seen in your log.
Created attachment 366832 [details] [review] [PATCH] service: keep reference to plugin while main loop is running
(In reply to Beniamino Galvani from comment #9) > Created attachment 366832 [details] [review] [review] > [PATCH] service: keep reference to plugin while main loop is running lgtm. Merged: https://git.gnome.org/browse/network-manager-openvpn/commit/?id=9f45977058f7c8cf1fb0c67636cab43e2c85f228
VPN plugins use a timer to ensure that the connection is established within 60 seconds and so there is not enough time to try other servers in the list. I'm not sure how to fix this, though...
Weird, when I connect directly in the command line with openvpn it doesn't take 60 seconds to connect, it tries the servers in order within just a couple seconds before finding the one that works
(In reply to Britt Yazel from comment #12) > Weird, when I connect directly in the command line with openvpn it doesn't > take 60 seconds to connect, it tries the servers in order within just a > couple seconds before finding the one that works Right, it's because of the: server-poll-timeout 4 directive, which is not supported by NM at the moment. I think we should implement it.
Created attachment 366877 [details] [review] [PATCH] service,properties: add support for connect-timeout
(In reply to Beniamino Galvani from comment #14) > Created attachment 366877 [details] [review] [review] > [PATCH] service,properties: add support for connect-timeout The files properties/tests/conf/server-poll-timeout.ovpn properties/tests/conf/connect-timeout.ovpn are missing can you add tooltips to the GTK GUI? Especially, like we commonly do mention how this option is called in openvpn manual. + <child> + <placeholder/> + </child> </object> drop this? rest lgtm
Created attachment 366887 [details] [review] [PATCH v2] service,properties: add support for connect-timeout (In reply to Thomas Haller from comment #15) > The files > properties/tests/conf/server-poll-timeout.ovpn > properties/tests/conf/connect-timeout.ovpn > are missing > can you add tooltips to the GTK GUI? Especially, like we commonly do mention > how this option is called in openvpn manual. > drop this? All fixed, thanks.
(In reply to Beniamino Galvani from comment #16) > Created attachment 366887 [details] [review] [review] > [PATCH v2] service,properties: add support for connect-timeout lgtm. This patch allows the user to get the configuration accordingly. But it doesn't make it work by default, with the 60 seconds timeout. Do you think we should do something about that? Maybe if - user didn't configure --connect-timeout - and there are multiple remotes, automatically specify a connect-timeout of ... e.g. 30 seconds?
Created attachment 367066 [details] [review] [PATCH v3 1/2] service,properties: add support for connect-timeout
Created attachment 367067 [details] [review] [PATCH v3 2/2] service: lower the connect timeout if there are multiple remotes
(In reply to Thomas Haller from comment #17) > This patch allows the user to get the configuration accordingly. But it > doesn't make it work by default, with the 60 seconds timeout. Do you think > we should do something about that? Maybe if > - user didn't configure --connect-timeout > - and there are multiple remotes, > automatically specify a connect-timeout of ... e.g. 30 seconds? Sounds like a good idea to me, added in patch 2.
+ g_ptr_array_add (args, + (gpointer) g_strdup_printf ("%u", NM_MAX (60 / num_remotes, 20U))); indentation? Also, maybe in this case you don't need the (gpointer) cast? Otherwise, v3 lgtm
(In reply to Thomas Haller from comment #21) > + g_ptr_array_add (args, > + (gpointer) g_strdup_printf ("%u", NM_MAX (60 / > num_remotes, 20U))); > > indentation? Also, maybe in this case you don't need the (gpointer) cast? Right. Applied to master: https://git.gnome.org/browse/network-manager-openvpn/commit/?id=665ba502e27c962867b338a255957b6a9b9003ec https://git.gnome.org/browse/network-manager-openvpn/commit/?id=b7195649ffdb66c8b9f4dbae6b868d797dd29406