Bug 746422 - [CVE-2018-1000135] Unencrypted DNS queries leaked outside full-tunnel VPN
Status: RESOLVED OBSOLETE
Product: NetworkManager
Classification: Platform
Component: IP and DNS config
Version: unspecified
OS: Other Linux
Priority: Normal
Severity: normal
Target Milestone: ---
Assigned To: NetworkManager maintainer(s)
QA Contact: NetworkManager maintainer(s)
Duplicates: 780913 (view as bug list)
Depends on:
Blocks: nm-review
 
 
Reported: 2015-03-18 22:30 UTC by Dan Williams
Modified: 2020-11-12 14:31 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Dan Williams 2015-03-18 22:30:03 UTC
If the VPN routes all traffic (e.g. its ipv4.never-default=false), that usually indicates that the VPN's nameservers should be used instead of the parent interface's nameservers, since the parent interface's nameservers would themselves be reached over the VPN anyway (as it routes all traffic).

But with dns=dnsmasq, the dnsmasq plugin always does split DNS regardless of the never-default value of the VPN's IPv4 config:

	/* Use split DNS for VPN configs */
	for (iter = (GSList *) vpn_configs; iter; iter = g_slist_next (iter)) {
		if (NM_IS_IP4_CONFIG (iter->data))
			add_ip4_config (conf, NM_IP4_CONFIG (iter->data), TRUE);
		else if (NM_IS_IP6_CONFIG (iter->data))
			add_ip6_config (conf, NM_IP6_CONFIG (iter->data), TRUE);
	}

Instead, I think each config should be added with split DNS only if ipv4.never-default=true for that config.  That would ensure that split DNS is not used when the VPN routes all traffic, but is used when it does not.

If the user really does want to use the parent interface's nameservers even though they will be contacted over the VPN, they can either add custom dnsmasq options to /etc/NetworkManager/dnsmasq.d or enter them manually for the connection.
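As an illustration of that workaround (not taken from this bug; the domain and address are hypothetical), a drop-in file under /etc/NetworkManager/dnsmasq.d can pin a domain to a specific nameserver using dnsmasq's server= syntax:

```
# /etc/NetworkManager/dnsmasq.d/local-dns.conf (hypothetical example)
# Send queries for corp.example to the parent interface's nameserver,
# even when the VPN's nameservers are otherwise in use.
server=/corp.example/10.0.0.2
```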

I seem to recall that the behavior I'm suggesting was always intended, but apparently we changed it a long time ago, possibly without realizing it?
Comment 1 Pavel Simerda 2015-03-19 11:15:43 UTC
In my opinion it is useful to use a split DNS view in all cases and only use the never-default setting to decide the global DNS.

Rationale: There is no such thing as sending all traffic across a VPN, only default-route traffic, i.e. traffic for which there's no specific route over a specific interface. As specific routes (as found in the routing table) are still used even with the default route over the VPN, I believe that specific zones (as found in per-connection lists of domains) should be maintained as well.
Comment 2 John 'Warthog9' Hawley 2015-03-20 07:58:02 UTC
Pavel, I'll admit to not 100% following what you've suggested, so please excuse me if I've horribly misunderstood.  I disagree with the assertion that "there is no such thing as sending all traffic across a VPN".  The parent interface will have a local route (mainly so you can reach the gateway) as well as a route to the VPN endpoint you need to push traffic at.  However, there are mitigating circumstances where forcing split DNS, so that the DNS on the VPN serves ONLY the search spaces pushed, is exactly the opposite of what a user likely wants and/or causes some rather broken behavior.

- VPNs can, and often do, have IP space overlap issues.  So if the parent interface's network happens to be in the 10.0.0.0/255.255.252.0 range (gateway 10.0.0.1, DNS servers 10.0.0.2 & 10.0.0.3) and the VPN uses 10.0.1.0/255.255.255.0, you can end up in some very screwed-up situations.  This is actually taken from a real-world scenario (which is how I learned of the change to default to split DNS at all).  If you are routing all traffic over the VPN you have now lost access to the two parent interface's DNS servers, and with split DNS you now have *NO* DNS access at all.  As it currently stands, the only way to fix this is to either manually edit /etc/resolv.conf or to restart NM without dnsmasq.

- DNS is not equal at all locations, which I think your split-DNS view assumes.  DNS zones mean that something that resolves one way externally may resolve completely differently (or not at all) internally.  example.com, to an external resolver, may go to a colocated public instance, while the same DNS entry from an internal DNS server may not.  Assuming the VPN only pushes a search of internal.example.com, but doesn't push a search for example.com (on the assumption that people will just type it), the internal site is now unreachable.  Keep in mind I'm talking about VPNs, and those are typically used in more corporate environments where you are dealing with corporate IT departments.

- In a more casual environment, let's say a hotel, part of the reason to use a VPN is that the user EXPLICITLY doesn't trust anything about the network they are on, up to and including the DNS servers.  The user, if they are routing everything, is pretty likely to trust the DNS servers on the far side of the VPN.  There are actually plenty of security concerns there, and I for one would have assumed that if I'm routing all my traffic over the VPN, my DNS traffic was as well (meaning I wouldn't have been relying on, or trusting, DNS servers I don't trust).

These are the three issues that concern me most about keeping split DNS as the default without a choice.  There are situations where I want split DNS; it's a fantastic feature.  But in the case of a VPN that's intended to route all traffic, I'd argue the expected behavior is that all traffic, including all DNS queries, goes over the VPN and is not split.
Comment 3 Alexander E. Patrakov 2015-11-28 17:37:01 UTC
There is one more interesting use case currently unsupported by NetworkManager.

SafeDNS Inc. (a company that provides a filtered DNS server for parental control, with configurable filter settings) currently runs private beta testing of their VPN service, based on OpenVPN, with mobile phones and tablets being the primary target.

Their VPN is a split-tunnel setup, i.e. the pushed config explicitly routes their subnet (which contains their DNS server and various types of the "site is blocked" page) through the VPN, and nothing else. Their DNS server uses the fact that the queries come through the VPN in order to reliably identify the user and to apply his/her personal filtering settings.

For the filter to work, it is crucial that there are no other DNS servers used in the system. That's how OpenVPN Connect on both Android and iOS behaves, but, unfortunately, not how NetworkManager works.

So, in summary, they absolutely need all other DNS servers to be forgotten by the client, even though they don't want to route all traffic through their VPN servers.


For the record, here is what their ovpn file looks like:

client
remote vpn.safedns.com 1194 udp
nobind
dev tun
persist-tun
persist-key
verify-x509-name vpn.safedns.com name
remote-cert-tls server
cipher AES-128-CBC

<ca>
-----BEGIN CERTIFICATE-----
...snip...
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
...snip...
-----END CERTIFICATE-----
</ca>
<cert>
...snip...
-----END CERTIFICATE-----
</cert>
<key>
-----BEGIN PRIVATE KEY-----
...snip...
-----END PRIVATE KEY-----
</key>

They push routes to 195.46.39.0/29 and 195.46.39.39, and push 195.46.39.39 as the DNS server.
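On the server side, pushes like those would typically be expressed with OpenVPN push directives. This is a hedged reconstruction of what the server might send, not taken from their actual config (a /29 corresponds to netmask 255.255.255.248):

```
# Hypothetical server-side equivalents of the pushed options described above
push "route 195.46.39.0 255.255.255.248"
push "route 195.46.39.39 255.255.255.255"
push "dhcp-option DNS 195.46.39.39"
```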
Comment 4 Dan Williams 2015-12-03 18:21:04 UTC
Thanks for the data point, Alexander.  We probably need another option on the IP4 and IP6 settings for all connections indicating whether or not split DNS should be used if it is available for that connection.  There would be a default value that would preserve existing behavior.

For Pavel's case (with a VPN grabbing all default-route traffic, but where some routes/devices are not routed over the VPN) that property would indicate that split DNS should be used.

For the SafeDNS case (where the VPN grabs all default-route traffic, but no other DNS should be used) that property would indicate that no split DNS should be used, and the DNS manager would merge/ignore DNS based on connection priority (since we will very soon support multiple VPN connections).

If the property was left as default, then ipX.never-default would control whether split DNS was used or not.
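The proposed three-way property could be sketched like this; names and values are hypothetical illustrations, not NetworkManager API:

```python
# Hypothetical sketch of the proposed per-connection split-DNS property.
# "default" preserves existing behavior by falling back to never-default.
def use_split_dns(split_dns_prop, never_default):
    if split_dns_prop == "yes":      # force split DNS (Pavel's case)
        return True
    if split_dns_prop == "no":       # force full DNS (the SafeDNS case)
        return False
    # "default": split DNS only when the VPN does not take the default route
    return never_default
```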
Comment 5 Thomas Haller 2016-05-25 16:03:01 UTC
(In reply to Alexander E. Patrakov from comment #3)

> So, in summary, they absolutely need all other DNS servers to be forgotten
> by the client, even though they don't want to route all traffic through
> their VPN servers.

For the record, this is now possible by bug 758772
https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=8da3e658f7313f56928d22cfe13f9ab78cc1dd3c
Comment 6 David Woodhouse 2017-04-05 08:51:56 UTC
*** Bug 780913 has been marked as a duplicate of this bug. ***
Comment 7 David Woodhouse 2017-04-05 08:56:08 UTC
If a VPN connection is set to take all traffic then we should definitely be using its DNS servers for all lookups. We might not even be able to *reach* the DNS server advertised by the "local" network. We *might* if it's physically on the local subnet, but almost certainly not further afield.

Note also that if you *wanted* to do split DNS, you have no idea which domains to do split DNS *FOR*. You have a list of default search domains, but that is a DIFFERENT THING. A search domain of example.com means "if the user looks up foo and it doesn't exist, then also look for foo.example.com before failing". It doesn't mean any more than that. In particular, there can be domains which exist in a VPN (such as example.internal) which are *not* added to the search domains.

If you want to add an option for doing split-DNS, it can't be a boolean and abuse the search domains. It has to be an explicit list of the domains for which you want to use that DNS service. Unless we have a separate list of "domains which exist here" for each connection, which is *distinct* from the search domains?
Comment 8 Beniamino Galvani 2017-04-11 08:24:45 UTC
(In reply to David Woodhouse from comment #7)
> Note also that if you *wanted* to do split DNS, you have no idea which
> domains to do split DNS *FOR*. You have a list of default search domains,
> but that is a DIFFERENT THING. A search domain of example.com means "if the
> user looks up foo and it doesn't exist, then also look for foo.example.com
> before failing". It doesn't mean any more than that.

It also means that foo.example.com should be queried over the
interface that has it in the search list. I don't think it makes any
sense that interface X specifies domain D in the search list but we
try to resolve it on interface Y.

> In particular, there
> can be domains which exist in a VPN (such as example.internal) which are
> *not* added to the search domains.

Yes, basically the search list is only a subset of the domains for
which we want to do split DNS.

> If you want to add an option for doing split-DNS, it can't be a boolean and
> abuse the search domains. It has to be an explicit list of the domains for
> which you want to use that DNS service. Unless we have a separate list of
> "domains which exist here" for each connection, which is *distinct* from the
> search domains?

I looked at systemd-resolved documentation and it introduces the
concept of "routing-only" domains, which are those domains only used
for deciding the output interface, as opposed to "search" domains that
are used for both completing unqualified names and for routing:

 https://www.freedesktop.org/software/systemd/man/systemd.network.html#Domains=

and I think we could do something similar. For example, interpret
domains starting with a tilde in ipvX.dns-search as routing-only. All
other domains automatically obtained from DHCP/VPN/... would still be
considered as standard search domains.

When the DNS backend supports split DNS (i.e. dnsmasq or
systemd-resolved) I think we should always use split-DNS for the
domains in the search list; and with always I mean also for non-VPN
connections. In this way, queries for a domain in the search list will
be forwarded only to the interface that specified it. Then, of course,
we need to add wildcard rules to forward non-matching queries to all
interfaces.

Borrowing another concept from systemd-resolved, we could support the
wildcard routing domain "~." that means "send all queries that
don't match specific search domains to this interface". To keep
backwards compatibility, if no connection provides a wildcard routing
domain, we would forward all non-matching queries to all interfaces,
except to VPNs that provide a search list. In this way:
 - we still do split DNS for VPNs by default
 - this https://bugzilla.gnome.org/show_bug.cgi?id=766769 (VPNs that
   don't push any domains should get all queries) keeps working as is

In case of a full-tunnel VPN, one could set ipv4.dns-search to "~*" on
the VPN connection to direct all queries to the VPN DNS server. Queries
for a domain provided by a local connection would still go through the
local interface.
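As a rough model of the scheme sketched in this comment (purely illustrative Python, not NetworkManager code), query routing with '~'-prefixed routing-only domains and a "~." wildcard might look like:

```python
def route_query(name, connections):
    """Pick which connection(s) receive a DNS query for `name`.

    `connections` maps a connection name -> {'domains': [...], 'is_vpn': bool};
    a leading '~' marks a routing-only domain, and '~.' is the wildcard.
    """
    best_len, best, wildcard = -1, [], []
    for conn, cfg in connections.items():
        for dom in cfg["domains"]:
            if dom == "~.":                  # wildcard routing domain
                wildcard.append(conn)
                continue
            d = dom.lstrip("~")              # drop the routing-only marker
            if name == d or name.endswith("." + d):
                if len(d) > best_len:        # most specific suffix wins
                    best_len, best = len(d), [conn]
                elif len(d) == best_len and conn not in best:
                    best.append(conn)
    if best:
        return best
    if wildcard:
        return wildcard
    # Compatibility fallback: everything except VPNs that supplied domains
    return [c for c, cfg in connections.items()
            if not (cfg["is_vpn"] and cfg["domains"])]
```

For example, with a VPN providing "example.com", a query for foo.example.com goes only to the VPN, while unmatched names fall back to the local interface unless the VPN also provides "~.".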

Any opinions about this idea? I pushed a draft implementation to branch
bg/dns-domains-bgo746422 and it seems that it should cover all the use
cases mentioned in this bz.
Comment 9 David Woodhouse 2017-04-11 09:22:08 UTC
(In reply to Beniamino Galvani from comment #8)
>  - we still do split DNS for VPNs by default
>  - this https://bugzilla.gnome.org/show_bug.cgi?id=766769 (VPNs that
>    don't push any domains should get all queries) keeps working as is

VPNs which don't push any *routing* domains should get all queries. So that's *all* existing VPN configs. From the automatic configuration of VPNs we only ever get *search* domains.

> In case of a full-tunnel VPN, one could set ipv4.dns-search to "~*" on
> the VPN connection to direct all to the VPN DNS server.

This needs to be the default, surely?

> Queries for a domain provided by a local connection would still go on 
> through local interface.

Doesn't that leave me with the same problem, that it's trying to perform DNS queries to the "local" DNS server which is actually upstream (e.g. 4.2.2.1), and I can't even *route* to that IP address because all my traffic is going to the VPN?

At the very least, this logic would need to be based on whether the VPN takes the default route or not, wouldn't it? If a VPN takes the default route, it *definitely* needs all DNS traffic. If it doesn't, it probably still should unless explicitly configured otherwise.
Comment 10 Beniamino Galvani 2017-04-11 12:47:02 UTC
(In reply to David Woodhouse from comment #9)
> (In reply to Beniamino Galvani from comment #8)
> >  - we still do split DNS for VPNs by default
> >  - this https://bugzilla.gnome.org/show_bug.cgi?id=766769 (VPNs that
> >    don't push any domains should get all queries) keeps working as is
>
> VPNs which don't push any *routing* domains should get all queries. So
> that's *all* existing VPN configs. From the automatic configuration of VPNs
> we only ever get *search* domains.

I think a search domain should also be implicitly used for routing,
and thus VPNs do push routing domains.

IOW, if a connection provides search domain X, queries for names
ending in X should only go through that connection, no?

> > In case of a full-tunnel VPN, one could set ipv4.dns-search to "~*" on
> > the VPN connection to direct all to the VPN DNS server.
>
> This needs to be the default, surely?

See below.

> > Queries for a domain provided by a local connection would still go on
> > through local interface.
>
> Doesn't that leave me with the same problem, that it's trying to perform DNS
> queries to the "local" DNS server which is actually upstream (e.g. 4.2.2.1),
> and I can't even *route* to that IP address because all my traffic is going
> to the VPN?

The scenario I'm referring to is: I'm connected to a VPN that gets the
default route. I configure "~." as a search domain on it to perform all
queries through the VPN. At the same time, the DHCP server on the LAN
announces a local DNS server with domain "local.foobar.com". I
want every query ending in this domain to be resolved locally, not
using the VPN DNS server.

If the DNS server announced by DHCP is not on LAN, I don't expect any
search domain to be present for the LAN connection and so every DNS
query will go through the VPN.

> At the very least, this logic would need to be based on whether the VPN
> takes the default route or not, wouldn't it? If a VPN takes the default
> route, it *definitely* needs all DNS traffic.

First, if we start to decide DNS policy based on routing, this would
be a change in behavior and would possibly break users'
configurations. If we restrict the change only to VPNs with default
routes, it's probably less of a problem, and I think we can do it.

> If it doesn't, it probably still should unless explicitly configured
> otherwise.

I think many people using split tunnel VPNs would complain about this
change in behavior because suddenly (and possibly, without knowing it)
they would start sending all DNS queries to the VPN, which can have
bad privacy implications (e.g. when it's a corporate VPN).
Comment 11 Beniamino Galvani 2017-06-26 13:39:06 UTC
I've reworked the branch a bit. The main change is a new
ipv{4,6}.dns-default property that replaces the "~." domain of the
previous version.

I've also written a short description of the changes at:

https://wiki.gnome.org/Projects/NetworkManager/DNS

Please have a look at branch bg/dns-domains-bgo746422.
Comment 12 Thomas Haller 2017-06-27 11:20:06 UTC
>> dns: introduce routing domains
    

+         if (domain[0] == '~' && search_only)
+              continue;

+         if (domain[0] == '~') {
+              if (search_only)
+                   continue;
+         }


I don't mind which, but let's make it consistent.



>> libnm-core,cli: add support for DNS default property

+     * name servers specified by this connection. When set to %FALSE, such
+     * queries are sent through all connections, excluding VPNs that provide
+     * a search list.

this is not clear to me. Instead of "When set to %FALSE,...", should it be:
"when no currently active connections have this property set to %TRUE, ..."




>> dns: add 'default' attribute to exported DNS entries
    

+         /* Add default */
+         g_variant_builder_add (&entry_builder,
+                                "{sv}",
+                                "default",
+                                g_variant_new_boolean (is_default));

Maybe only add the "default" key if the value is actually TRUE. A missing key already means FALSE. The reason would be not to teach users to rely on "default" being present.
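In Python terms (an illustrative sketch only, not the GVariant code above), the suggestion amounts to:

```python
# Illustrative sketch of the review suggestion: omit the "default" key
# entirely when it would be FALSE, so a missing key already means FALSE.
def export_dns_entry(nameservers, is_default):
    entry = {"nameservers": list(nameservers)}
    if is_default:
        entry["default"] = True
    return entry
```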


minor fixup commits pushed.


the rest lgtm
Comment 13 David Woodhouse 2017-06-27 11:40:42 UTC
(In reply to Beniamino Galvani from comment #11)
> I've reworked a bit the branch. The main change is a new
> ipv{4,6}.dns-default property that replaces the "~." domain of the
> previous version.

I'm still really unhappy with this. DNS is *not* a per-protocol thing. It doesn't make sense to have separate ipv4 vs. ipv6 dns defaults, any more than it does to have defaults for different IP ranges (i.e. DNS for 10.x.x.x vs. for 11.x.x.x). (Sure we have those for reverse-DNS but not like this.)

I understand that we have inherited this to a certain extent but I'd really like to avoid entrenching it further.
Comment 14 Thomas Haller 2017-06-29 08:13:01 UTC
(In reply to David Woodhouse from comment #13)
> (In reply to Beniamino Galvani from comment #11)
> > I've reworked a bit the branch. The main change is a new
> > ipv{4,6}.dns-default property that replaces the "~." domain of the
> > previous version.
> 
> I'm still really unhappy with this. DNS is *not* a per-protocol thing. It
> doesn't make sense to have separate ipv4 vs. ipv6 dns defaults, any more
> than it does to have defaults for different IP ranges (i.e. DNS for 10.x.x.x
> vs. for 11.x.x.x). (Sure we have those for reverse-DNS but not like this.)
> 
> I understand that we have inherited this to a certain extent but I'd really
> like to avoid entrenching it further.

That is true.

but we cannot change API, so, all we could do is deprecate the current properties and add a new one. But the old one still has to be supported. It's not clear that this is cleaner in the result.

As for adding new DNS related settings, it probably makes sense to instead add a new "ip" section and place it there. One day, we might deprecate ipv4.dns and ipv6.dns in favor of an ip.dns.
So let's do ^^?
Comment 15 Thomas Haller 2017-06-29 08:14:20 UTC
Or

[network]
dns-default=...


like, [Network] in `man systemd.network`
Comment 16 Beniamino Galvani 2017-06-29 08:46:52 UTC
(In reply to Thomas Haller from comment #14)
> but we cannot change API, so, all we could do is deprecate the current
> properties and add a new one. But the old one still has to be supported.
> It's not clear that this is cleaner in the result.
> 
> As for adding new DNS related settings, it probably makes sense to instead
> add a new "ip" section and place it there. One day, we might deprecate
> ipv4.dns and ipv6.dns in favor of an ip.dns.

Or maybe DNS parameters should have their own "DNS" setting as:

 dns.servers
 dns.domains
 dns.options
 dns.priority
 dns.is-default

The old properties would be copied to the new setting when the connection is normalized, to provide backwards compatibility.

An "IP" setting would instead make sense if there are other protocol-agnostic IP properties we plan to add (but I can't imagine which ones now).
Comment 17 Thomas Haller 2017-06-29 09:23:36 UTC
proxy...

Each setting instance has a large implementation overhead. We could have instead

[network]
proxy-*=
dns-*=


arguably,

  `nmcli connection modify $NAME dns.*`
is much nicer than
  `nmcli connection modify $NAME network.dns-*`

(while nmcli is not required to expose nm-settings exactly the same way as they are in libnm and on D-Bus, it makes sense to do that).


tl;dr: +1 for a "dns" section.
Comment 18 Beniamino Galvani 2018-01-10 14:01:10 UTC
(In reply to Thomas Haller from comment #17)
> tl;dr: +1 for a "dns" section.

Ok, I'll add the new 'default' property to a 'dns' setting and I'll also move the existing connection.mdns property there, since we haven't done any official release that includes it.

I pushed some preliminary patches to branch bg/dns-domains-pt1-bgo746422, please review.
Comment 19 Thomas Haller 2018-01-11 12:55:39 UTC
(In reply to Beniamino Galvani from comment #18)
> (In reply to Thomas Haller from comment #17)
> > tl;dr: +1 for a "dns" section.
> 
> Ok, I'll add the new 'default' property to a 'dns' setting and I'll also
> move the existing connection.mdns property there, since we haven't done any
> official release that includes it.

I agree.

> I pushed some preliminary patches to branch bg/dns-domains-pt1-bgo746422,
> please review.

+         if (search_only && domain_is_routing_only (str))
+              continue;

it's a bit confusing that the parameter is called "search_only", while it compares it with "routing_only". Could you rename "search_only"?
Also, the inverse naming is confusing to me:

  add_dns_domains (array, ip_config, FALSE, FALSE);

has search_only=FALSE, the double-inverse will result in ~all~. Could we rename "search_only" to "with_routing_only" (and inverse meaning)?




+              if (!domain_is_valid (str, FALSE))
+                   continue;

@str possibly has a leading tilde. Shouldn't you strip it with "nm_utils_parse_dns_domain (str, NULL)" ?



-    /* If this link is never the default (e.g. only used for resources on this
-     * network) add a routing domain. */
-    route_only =   addr_family == AF_INET
-                 ? !nm_ip4_config_best_default_route_get (NM_IP4_CONFIG (config))
-                 : !nm_ip6_config_best_default_route_get (NM_IP6_CONFIG (config));
-

this behaviour came originally from https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=c4864ba63f4a13e9938a978787490005f5ba48fb. But despite the commit message, I don't understand why we would set routing-only if nm_ip4_config_get_never_default(). I guess this new behavior makes much more sense to me.



pushed one fixup.
Comment 20 Beniamino Galvani 2018-01-11 14:55:32 UTC
(In reply to Thomas Haller from comment #19)
> (In reply to Beniamino Galvani from comment #18)
> > (In reply to Thomas Haller from comment #17)
> > > tl;dr: +1 for a "dns" section.
> > 
> > Ok, I'll add the new 'default' property to a 'dns' setting and I'll also
> > move the existing connection.mdns property there, since we haven't done any
> > official release that includes it.
> 
> I agree.
> 
> > I pushed some preliminary patches to branch bg/dns-domains-pt1-bgo746422,
> > please review.
> 
> +         if (search_only && domain_is_routing_only (str))
> +              continue;
> 
> it's a bit confusing that the parameter is called "search_only", while it
> compares it with "routing_only". Could you rename "search_only"?
> Also, the inverse naming is confusing to me:
> 
>   add_dns_domains (array, ip_config, FALSE, FALSE);
>
> has search_only=FALSE, the double-inverse will result in ~all~. Could we
> rename "search_only" to "with_routing_only" (and inverse meaning)?

Changed.


> +              if (!domain_is_valid (str, FALSE))
> +                   continue;
> 
> @str possibly has a leading tilde. Shouldn't you strip it with
> "nm_utils_parse_dns_domain (str, NULL)" ?

Good catch, fixed.

> -    /* If this link is never the default (e.g. only used for resources on
> this
> -     * network) add a routing domain. */
> -    route_only =   addr_family == AF_INET
> -                 ? !nm_ip4_config_best_default_route_get (NM_IP4_CONFIG
> (config))
> -                 : !nm_ip6_config_best_default_route_get (NM_IP6_CONFIG
> (config));
> -
> 
> this behaviour came originally from
> https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/
> ?id=c4864ba63f4a13e9938a978787490005f5ba48fb. But despite the commit
> message, I don't understand why we would set routing-only if
> nm_ip4_config_get_never_default(). I guess, this new behavior makes much
> more sense to me.

The original behavior didn't make much sense and caused trouble (as in bug 783024 and bug 782469). I think it's better if we add domains as search domains by default.
Comment 21 Thomas Haller 2018-01-11 15:53:26 UTC
bg/dns-domains-pt1-bgo746422 lgtm
Comment 23 David Woodhouse 2018-03-08 21:24:21 UTC
So... a VPN configuration which wants *all* DNS queries to go to the VPN's nameservers would add '~.' to the dns search list? Or just '~'?
Comment 24 Beniamino Galvani 2018-03-09 08:41:44 UTC
(In reply to David Woodhouse from comment #23)
> So... a VPN configuration which wants *all* DNS queries to go to the VPN's
> nameservers would add '~.' to the dns search list? Or just '~'?

Yes, that is the plan ('~.') for part 2 (not implemented yet).
Comment 25 David Woodhouse 2018-03-09 09:25:41 UTC
Per discussion on IRC, I really don't like this very much. There is too much confusion already.

We should have a *clear* separation of two entirely different fields.

 • Search list

This is the list of domains that will be appended to failing lookups and tried again, before eventually failing. So if my search list contains "company.com, company.internal" and I attempt to look up "foo", then I get to wait while all these lookups are done in turn:

   foo.
   foo.company.com.
   foo.company.internal.

Note that this is an *ordered* list. And it should be kept as short as possible because we don't *want* to wait for all those extra lookups. And it might not be set at all, because we might want people to use fully qualified domain names in hyperlinks and other places, without relying on this auto-completion.
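In resolv.conf terms, the ordered search list above would look like this (standard resolv.conf syntax; the domains are the examples from this comment):

```
# /etc/resolv.conf fragment: ordered search list, tried in turn
search company.com company.internal
```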

 • Lookup list

This is the list of domains for which this connection's nameserver should be used. For a large company which has merged with lots of others, there may be *many* domains which are only visible internally, or which have schizo-DNS presenting a different view to the internal network. This may contain all of the elements in the search list, but would *also* contain others like "some-company-we-bought.com", "company-corp.com", etc.

This is an unordered list. And for full-tunnel VPNs we should use the VPN DNS for *everything*, which is the subject of this bug. This is quite a serious security bug, as described in https://bugzilla.redhat.com/show_bug.cgi?id=1553634
Comment 26 David Woodhouse 2018-03-09 10:07:05 UTC
Some further notes:

 • I can make NetworkManager-openconnect provide both 'search domains' and
   'split dns domain' in separate config items. Right now we mash them all
   into the search domains because we know NM abuses that field for both,
   but we should fix that too.

 • We might perhaps take this opportunity to kill the weirdness that we have
   redundant DNS information in *both* IPv4 and IPv6 configs. That was always
   wrong, and a clean break to make it sane might be better than letting it
   persist for ever.

 • For a split-tunnel VPN, it's OK to have the split-dns lookup list default
   to the same as the search list (plus all the reverse domains which we do
   already get right, I believe). It should be possible to *add* entries to
   the split-dns lookup list that aren't in the search list, both manually
   in the config, and automatically from the VPN plugin (see first point above).

 • for a full-tunnel VPN, the split-dns lookup list should cover everything.
   There *might* be an exception list — for example IF we allow routing to
   the local network (cf. bug 767288 which asks for an option not to) then
   the reverse lookup ranges in .ip6.arpa and the split-dns domain list of
   the local network would be the only thing that's permitted *not* to go
   to the VPN's nameservers.


In the general case, DNS lookups for reverse ranges in .ip6.arpa and .in-addr.arpa should go to the nameservers of the connection to which we are routing those addresses. We *mostly* get that right, except for full-tunnel VPN right now.

Forward DNS lookups by default should go to the DNS servers of the connection which has the default route. Again, we don't get that right for full-tunnel VPN right now.

Forward DNS lookups for domains which are listed in some connection's "split-dns domains list" (which I called the 'lookup list' above), could go to that connection's DNS server UNLESS there's a reason not to (like a full tunnel VPN wanting to disable *all* local access as discussed in bug 767288).
Comment 27 David Woodhouse 2018-03-12 12:27:26 UTC
Hm... since commit db9b7e10a for bug 707617, NMIP4Config already *has* separate fields for 'domains' vs. 'searches'.

Surely the "domains" list is already intended to be "routing domains", i.e. "domains for which we should do the DNS lookup to this connection's nameservers"? 
While "searches" is purely the auto-completion one? 

However, we seem to process them all the same. Perhaps we should stop doing that, then we don't need the '~' prefix hack...
Comment 28 Beniamino Galvani 2018-03-22 08:44:56 UTC
What is currently missing in my opinion is a flexible way to decide
which connections are used for default DNS queries (those not matching
any lookup domain).

A possible way to do this is to choose connections that have the
highest value of a new 'dns.default-priority' property. Since we want
to have default values that work for most users, the default value of
the property would be 'auto' (0), which means:

 * 1000 for full-tunnel VPNs
 *  500 for non-VPN connections
 *   -1 for split-tunnel VPNs. -1 means that the connection is never
        used for default DNS lookups
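The selection rule above can be modeled roughly as follows (illustrative Python only; the 'auto' values are the ones listed above, and the property names are hypothetical):

```python
# Sketch of the proposed dns.default-priority resolution:
# highest value wins; 0 means 'auto'; negative means never the default.
def default_priority(conn):
    p = conn.get("default_priority", 0)   # 0 == 'auto'
    if p != 0:
        return p
    if conn["is_vpn"]:
        return 1000 if conn.get("full_tunnel") else -1
    return 500

def default_dns_connections(conns):
    prios = {name: default_priority(c) for name, c in conns.items()}
    best = max(prios.values())
    if best < 0:                          # nobody wants default queries
        return []
    return [n for n, p in prios.items() if p == best]
```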

For example, if you have a full-tunnel VPN with search domain
'example.com' and a local connection with search domain 'local.com',
the following entries would be added to dnsmasq:

/example.com/VPN-nameserver
/local.com/local-nameserver
VPN-nameserver                 # default

But if the VPN is split-tunnel (doesn't get the default route):

/example.com/VPN-nameserver
/local.com/local-nameserver
local-nameserver               # default

If you want all queries to go through the full-tunnel VPN with no
exceptions, also set ipvx.dns-priority to -1 for the VPN and dnsmasq
will be configured with:

/example.com/VPN-nameserver
VPN-nameserver                 # default

BTW, for ipvx.dns-priority we consider lower values with higher
priority while for dns.default-priority it's the other way around. I
believe doing ipvx.dns-priority that way was a mistake because it is
counterintuitive.

Users can also set a custom value for dns.default-priority to tweak the
configuration to their needs.

What do you think? Any other ideas?
Comment 29 Beniamino Galvani 2018-03-22 08:56:40 UTC
(In reply to David Woodhouse from comment #26)
> Some further notes:
> 
>  • I can make NetworkManager-openconnect provide both 'search domains' and
>    'split dns domain' in separate config items. Right now we mash them all
>    into the search domains because we know NM abuses that field for both,
>    but we should fix that too.

You can put them all in the NM_VPN_PLUGIN_IP4_CONFIG_DOMAINS key and prefix routing-only domains with a '~'.
 
>  • We might perhaps take this opportunity to kill the weirdness that we have
>    redundant DNS information in *both* IPv4 and IPv6 configs. That was always
>    wrong, and a clean break to make it sane might be better than letting it
>    persist for ever.

Changing this is a huge pain for users, and developers too (but this probably doesn't matter much). I can't think of a way to achieve a smooth transition from the separate DNS properties into a new common setting.
Comment 30 David Woodhouse 2018-03-22 09:11:45 UTC
> For example, if you have a full-tunnel VPN with search domain
> 'example.com' and a local connection with search domain 'local.com',
> the following entries would be added to dnsmasq:

Please let's not talk about search domains. Those are completely different things, not related to what we're talking about here.

Search domains are purely a way to enable users to be lazy. I can type "intranet" into my browser and it gets autocompleted to "intranet.company.com", for example.

They (should) have *nothing* to do with the choice of which DNS lookups get sent out on which connection. (Apart from the fact that we're doing this horrid thing with mixing them all together and prefixing one with ~, which is a technical detail.)

A full-tunnel VPN should end up with a *LOOKUP* domain of "" or "*" or "." or however you want to represent the default (currently there's no way for me to specify that even manually to work around this issue, I think). 

I think that implementing the "~." support as suggested in comment 24 and then making full-tunnel VPNs automatically add that, would go a long way to dealing with this problem.

I'm not sure I understand the benefit of adding 'dns.default-priority' too. Is that *purely* a special case of ipvx.dns-priority, except for the "~." lookup domain alone? Does it need special-casing? Can't *all* the lookup domains of a given connection have the same priority, whether they're "~." or "~company.com." ?
Comment 31 David Woodhouse 2018-03-22 09:19:31 UTC
(In reply to Beniamino Galvani from comment #29)
> >  • We might perhaps take this opportunity to kill the weirdness that we have
> >    redundant DNS information in *both* IPv4 and IPv6 configs. That was always
> >    wrong, and a clean break to make it sane might be better than letting it
> >    persist for ever.
> 
> Changing this is a huge pain for users, and developers too (but this
> probably doesn't matter much). I can't think of a way to achieve a smooth
> transition from the separate DNS properties into a new common setting.

It's a short-term pain, which will eventually go away.

Can't we start by shadowing the ipv?.dns-* options into a generic dns.* option set so that they're identical? We can give people a long time to start using the new location, before eventually taking away the old ones.

We definitely shouldn't be making the problem *worse* by doing things like the ~ prefix hack or adding any *more* fields to ipv?.dns-*. Can't we at least add that as dns.lookup-domain even if it's all by itself in the "DNS" settings for now?
Comment 32 David Woodhouse 2018-03-22 09:37:09 UTC
I think this can work without a new default-priority field, and just a simple set of lookup domains per connection in conjunction with the existing dns-priority.

You have a set of { domain, connection, priority } tuples. (Where all lookup domains of a given connection have the *same* priority right now; there's no real need to make them individually configurable I think).

Where multiple lookup rules exist for a given domain, the highest priority (numerically lowest) one wins. And when a given rule is for a domain which is a *subdomain* of another domain with a negative dns-priority, that subdomain also loses (and is dropped).

So your first example in comment 28 would look like this:

{ "example.com", vpn0, 50 }
{ "local.com", eth0, 100 }
{ ".", vpn0, 50 }
{ ".", eth0, 100 }  # This one gets dropped due to the previous one

Your second example looks like this (because a split tunnel VPN doesn't add the "~." lookup domain):

{ "example.com", vpn0, 50 }
{ "local.com", eth0, 100 }
{ ".", eth0, 100 }

And your final example looks like this, because the user has set dns-priority=-1:


{ "example.com", vpn0, -1 }
{ "local.com", eth0, 100 } # This one gets dropped due to the next one
{ ".", vpn0, -1 }
{ ".", eth0, 100 }  # This one gets dropped due to the previous one
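The dropping rules above can be modelled with a short sketch (illustrative Python, not NetworkManager code): for a given domain the numerically lowest priority wins, and a rule is dropped when another connection claims a parent domain (including ".") with a negative priority.

```python
# Sketch of the { domain, connection, priority } resolution described
# in this comment.  Illustrative Python, not NetworkManager code.
def resolve(rules):
    # rules: list of (domain, connection, priority); lower number wins
    best = {}
    for d, c, p in rules:
        best[d] = min(best.get(d, p), p)

    def shadowed(d, c):
        # dropped if another connection holds d (or a parent of d,
        # including the default domain ".") with a negative priority
        for nd, nc, np in rules:
            if np >= 0 or nc == c:
                continue
            if nd == "." and d != ".":
                return True
            if d != nd and d.endswith("." + nd):
                return True
        return False

    return [(d, c, p) for d, c, p in rules
            if p == best[d] and not shadowed(d, c)]
```

Run against the two examples above, it reproduces exactly the tuples marked as dropped.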
Comment 33 David Woodhouse 2018-03-22 09:39:05 UTC
The idea is that VPNs would automatically have the "~." default lookup domain added, according to the never-default flag.
Comment 34 Beniamino Galvani 2018-03-22 10:22:04 UTC
(In reply to David Woodhouse from comment #31)
> (In reply to Beniamino Galvani from comment #29)
> > >  • We might perhaps take this opportunity to kill the weirdness that we have
> > >    redundant DNS information in *both* IPv4 and IPv6 configs. That was always
> > >    wrong, and a clean break to make it sane might be better than letting it
> > >    persist for ever.
> > 
> > Changing this is a huge pain for users, and developers too (but this
> > probably doesn't matter much). I can't think of a way to achieve a smooth
> > transition from the separate DNS properties into a new common setting.
> 
> It's a short-term pain, which will eventually go away.

Maybe, but existing properties are part of the API and so they will never be
dropped because we don't break API between releases. We'll have to maintain
them, together with the new properties and the code to sync them forever.

> Can't we start by shadowing the ipv?.dns-* options into a generic dns.*
> option set so that they're identical? We can give people a long time to
> start using the new location, before eventually taking away the old ones.

If you mean that ipv4.dns-*, ipv6.dns-* and dns.* should be all identical,
that is a change in behavior and would badly break users.

If by shadowing you mean keeping them in sync (with the dns.* as the union
of ipv4.dns-* and ipv6.dns-*), that is possible but would create some other
problems in my opinion. 
 
> We definitely shouldn't be making the problem *worse* by doing things like
> the ~ prefix hack or adding any *more* fields to ipv?.dns-*. Can't we at
> least add that as dns.lookup-domain even if it's all by itself in the "DNS"
> settings for now?

Ok, we aren't going to add any new properties to ipvx.dns-*. Yes, I think we
can add a dns.lookup-domain property.
Comment 35 David Woodhouse 2018-03-22 12:16:25 UTC
Right, the idea was the dns.* properties would be the union of the ipv4.dns-* and ipv6.dns-* (which right now are treated identically and just appended to each other, I believe?).


Note: it's "lookup domains" we should be using for the proxy setup we poke into PacRunner, not "search domains". Can we fix that too please?
Comment 36 Beniamino Galvani 2018-03-26 16:40:20 UTC
I pushed branch bg/dns-bgo746422. I think it should solve the leaks in
case of full-tunnel VPNs when using dnsmasq or systemd-resolved. It
basically:

 - allows users to specify that connections get the default lookup
   domain through the special entry "~."

 - automatically adds "~." to connections with the default route

 - applies rules from comment 32 to decide which domains are used
   based on DNS priority.

Other things that I didn't do, but can be done later (if necessary)
are:

 - as noticed by Thomas, you can't override the NM decision of
   automatically adding "~." to connections with default route.
   Actually, you can add "~." to another connection with a lower
   priority value, and it will override the "~." added by NM on the
   first connection. Perhaps this is enough. Otherwise we could
   introduce another special domain.

 - I haven't added a new dns setting with duplicates of the existing
   ipvx properties. I think it's really a different issue and should
   be solved separately.

 - Also I didn't add the dns.lookup-domain property, as the "~."
   domain is sufficient and it is just another case of the
   routing-only domains we already support.

What do you think?
Comment 37 Thomas Haller 2018-03-26 17:21:21 UTC
> dns: dnsmasq: fix adding multiple domains

can you add a "Fixes" comment?


> core: allow '~.' domain in ip4 and ip6 configs
    
+    if (!nm_streq (search, "~.")) {
          len = strlen (search);
          if (search[len - 1] == '.')
               search[len - 1] = 0;
+    }

this seems wrong, for example if "search" is "~..".
Either, do a full normalization and drop such entries (or clean up the duplicate dots). Or: normalize "~." to "~" too. It's odd that the special domain is treated entirely differently regarding the trailing dot.
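One possible shape of that normalization (illustrative Python, not the actual NM code): strip a single trailing dot, reject names containing empty labels, and leave "~." alone.

```python
# Sketch of a normalization along the lines discussed here: strip one
# trailing dot, drop entries with empty labels such as "my..domain" or
# "~..", keep the special default domain "~." as-is.  Illustrative
# Python, not the actual NetworkManager code.
def normalize_domain(entry):
    routing = entry.startswith("~")
    name = entry[1:] if routing else entry
    if routing and name == ".":
        return "~."                # special default-route domain
    if name.endswith("."):
        name = name[:-1]           # "domain." is equivalent to "domain"
    if not name or "" in name.split("."):
        return None                # invalid: empty name or empty label
    return "~" + name if routing else name
```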


> dns: use dns-priority to provide a preprocessed domain list to plugins

          _LOGD ("update-dns: updating plugin %s", plugin_name);
+         rebuild_domain_lists (self);
          if (!nm_dns_plugin_update (plugin,

do we need to rebuild the list every time? We know when an update changes something. Can we cache the result, generate it when needed, and clear it when something changes?


+    struct {
+         const char **search;
+         char **reverse;
+    } domains;
 } NMDnsIPConfigData;


I think these are leaked.


> dns: dnsmasq: honor dns-priority
    
+    int addr_family, i, j, num;

let's use guint for index variables of arrays? Also matches 
  num = nm_ip_config_get_num_nameservers (ip_config);



> dns: sd-resolved: honor dns-priority
    
-         NMIPConfig *ip_config = elem->data;
+         NMDnsIPConfigData *data = elem->data;
+         NMIPConfig *ip_config = data->ip_config;

this seems wrong.



Overall, lgtm
Comment 38 Beniamino Galvani 2018-03-28 12:29:49 UTC
(In reply to Thomas Haller from comment #37)
> > dns: dnsmasq: fix adding multiple domains
> 
> can you add a "Fixes" comment?

Done.

> > core: allow '~.' domain in ip4 and ip6 configs
>     
> +    if (!nm_streq (search, "~.")) {
>           len = strlen (search);
>           if (search[len - 1] == '.')
>                search[len - 1] = 0;
> +    }
> 
> this seems wrong, for example if "search" is "~..".
> Either, do a full normalization and drop such entries (or clean up the
> duplicate dots). Or: normalize "~." to "~" too. It's odd that the special
> domain is treated entirely differently regarding the trailing dot.

What do you mean by "full normalization"? 

We can convert 'domain.' into 'domain' because it's equivalent, but
'domain..' or 'my..domain' are invalid and they should be dropped. I
think this would be the best approach.

If we normalize '~.' to '~', once we strip the ~ prefix the domain
would become empty, which is not desirable in my opinion.

> > dns: use dns-priority to provide a preprocessed domain list to plugins
> 
>           _LOGD ("update-dns: updating plugin %s", plugin_name);
> +         rebuild_domain_lists (self);
>           if (!nm_dns_plugin_update (plugin,
> 
> do we need to rebuild the list every time? We know when an update changes
> something. Can we cache the result, generate it when needed, and clear it
> when something changes?

If nothing changes, the DNS configuration hash will be the same and
update_dns() won't be called at all by nm_dns_manager_end_updates().

Ok, nm_dns_manager_set_hostname() calls update_dns() without checking
the hash, but do we need another caching mechanism different from the
existing one just for this case?

> +    struct {
> +         const char **search;
> +         char **reverse;
> +    } domains;
>  } NMDnsIPConfigData;
>
> I think these are leaked.

Oops, fixed.

> > dns: dnsmasq: honor dns-priority
>     
> +    int addr_family, i, j, num;
> 
> let's use guint for index variables of arrays? Also matches 
>   num = nm_ip_config_get_num_nameservers (ip_config);

Ok.

> > dns: sd-resolved: honor dns-priority
>     
> -         NMIPConfig *ip_config = elem->data;
> +         NMDnsIPConfigData *data = elem->data;
> +         NMIPConfig *ip_config = data->ip_config;
> 
> this seems wrong.

Why? Looks fine to me.
Comment 39 Thomas Haller 2018-03-28 12:50:01 UTC
(In reply to Beniamino Galvani from comment #38)
> (In reply to Thomas Haller from comment #37)

> What do you mean by "full normalization"? 
> 
> We can convert 'domain.' into 'domain' because it's equivalent, but
> 'domain..' or 'my..domain' are invalid and they should be dropped. I
> think this would be the best approach.

I mean, to handle "my..domain". Either normalize the double . away, or verify it and drop it as invalid. If you don't do either, then "~.." will end up being treated like "~." which is wrong.

> If we normalize '~.' to '~', once we strip the ~ prefix the domain
> would become empty, which is not desirable in my opinion.

Yes, after dropping the default domain becomes "". It's not that you currently use the domain "." as-is. You do:

  nm_streq (domain, ".") ? NULL : domain)

that could also be:

  nm_streq (domain, "") ? NULL : domain)

or

#define DEFAULT_DOMAIN ""
  nm_streq (domain, DEFAULT_DOMAIN) ? NULL : domain)

(maybe a define is in order either way).


> > > dns: use dns-priority to provide a preprocessed domain list to plugins
> > 
> >           _LOGD ("update-dns: updating plugin %s", plugin_name);
> > +         rebuild_domain_lists (self);
> >           if (!nm_dns_plugin_update (plugin,
> > 
> > do we need to rebuild the list every time? We know when an update changes
> > something. Can we cache the result, generate it when needed, and clear it
> > when something changes?
> 
> If nothing changes, the DNS configuration hash will be the same and
> update_dns() won't be called at all by nm_dns_manager_end_updates().
> 
> Ok, nm_dns_manager_set_hostname() calls update_dns() without checking
> the hash, but do we need another caching mechanism different from the
> existing one just for this case?

Maybe the hashing mechanism is anyway ugly, and should be dropped (in a future commit). Instead of implementing a SHA-hashing of ~some~ parameters, implement a cmp() function. It's more efficient, and easier to get right.


> > > dns: sd-resolved: honor dns-priority
> >     
> > -         NMIPConfig *ip_config = elem->data;
> > +         NMDnsIPConfigData *data = elem->data;
> > +         NMIPConfig *ip_config = data->ip_config;
> > 
> > this seems wrong.
> 
> Why? Looks fine to me.

you are right.
Comment 40 Beniamino Galvani 2018-04-08 09:09:45 UTC
(In reply to Thomas Haller from comment #39)
> (In reply to Beniamino Galvani from comment #38)
> > (In reply to Thomas Haller from comment #37)
> 
> > What do you mean by "full normalization"? 
> > 
> > We can convert 'domain.' into 'domain' because it's equivalent, but
> > 'domain..' or 'my..domain' are invalid and they should be dropped. I
> > think this would be the best approach.
> 
> I mean, to handle "my..domain". Either normalize the double . away, or
> verify it and drop it as invalid. If you don't do either, then "~.." will
> end up being treated like "~." which is wrong.
> 
> > If we normalize '~.' to '~', once we strip the ~ prefix the domain
> > would become empty, which is not desirable in my opinion.
> 
> Yes, after dropping the default domain becomes "". It's not that you
> currently use the domain "." as-is. You do:
> 
>   nm_streq (domain, ".") ? NULL : domain)
> 
> that could also be:
> 
>   nm_streq (domain, "") ? NULL : domain)
> 
> or
> 
> #define DEFAULT_DOMAIN ""
>   nm_streq (domain, DEFAULT_DOMAIN) ? NULL : domain)
> 
> (maybe a define is in order either way).

Yeah, considering "" internally as the wildcard domain simplifies things a bit. I've updated the branch.

> Maybe the hashing mechanism is anyway ugly, and should be dropped (in a
> future commit). Instead of implementing a SHA-hashing of ~some~ parameters,
> implement a cmp() function. It's more efficient, and easier to get right.

I agree, this could be an improvement (for the future).
Comment 41 Thomas Haller 2018-04-13 11:45:05 UTC
> core: allow '~.' domain in ip4 and ip6 configs

commit bebafff7844228d018d17f0d11c5d6035dbf6a91,
lgtm, but the commit message no longer seems to match, does it?


> dns: use dns-priority to provide a preprocessed domain list to plugins

+    c_list_for_each_entry (ip_data, _ip_config_lst_head (self), ip_config_lst) {

the list head is accessed on every iteration. Could we pre-cache the value like:

    CList *ip_config_lst_head;

    ip_config_lst_head = _ip_config_lst_head (self);

    c_list_for_each_entry (ip_data, ip_config_lst_head, ip_config_lst) {

    (and below again)


+         int i, n, n_domains = 0;

these variables are used to iterate over a number of guint values. Like

  n = nm_ip_config_get_num_searches (ip_config);

where get_num_searches() returns a guint. Could we consistently use the correct type? (yes, I know that C might optimize signed for loops better, because it assumes that signed cannot overflow. But IMO consistent use of types is more important than what the compiler might do).


+         priority = nm_ip_config_get_dns_priority (ip_config);
+         nm_assert (priority != 0);

this invariant is not enforced by NMIPxConfig, so, you basically assert that all callers that create an NMIPxConfig properly initialize priority to a non-zero value. It asserts something that is two layers of code away. That is non-obvious. Not sure what to do about this. Ok, fine as is :)


add the wilcard domain to all non-VPN
        ^^^^^^^


+    parent_priority = GPOINTER_TO_INT (g_hash_table_lookup (ht, "."));
+    if (parent_priority < 0 && parent_priority < priority) {
+         *out_parent = ".";
+         *out_parent_priority = parent_priority;

is "." still right? Should this be "" to mean the wildcard domain?




could you add a
   nm_assert (!g_hash_table_contains (ht, domain));
to domain_is_shadowed() -- because that one is already checked before calling domain_is_shadowed().



         g_free (ip_data->domains.search);

I always like avoiding unnecessary copies. But in this case, "search" array will point inside strings owned by NMIPxConfig. That seems quite fragile to me. Should we not clone them? In practice it's not necessary, but it feels fragile. What can we do about that?



    /* Free the array and return NULL if the only element was the ending NULL */
    strv = (char **) g_ptr_array_free (domains, (domains->len == 1));

    return _nm_utils_strv_cleanup (strv, FALSE, FALSE, TRUE);

skip_repeated=TRUE likely won't help much, because duplicated domains are probably not sorted after each other. Maybe sort them first? Maybe by cherry-picking "shared: add nm_utils_strv_sort() helper" from https://github.com/NetworkManager/NetworkManager/pull/91
Comment 42 Michael Biebl 2018-04-14 11:11:17 UTC
Will those changes be pulled into the nm-1-10 branch as well?
Comment 43 Beniamino Galvani 2018-04-30 12:32:23 UTC
(In reply to Thomas Haller from comment #41)
> > core: allow '~.' domain in ip4 and ip6 configs
> 
> commit bebafff7844228d018d17f0d11c5d6035dbf6a91,
> lgtm, but the commit message no longer seams to match, does it?

> the list head is accessed on every iteration. Could we pre-cache the value
> like:

> +         int i, n, n_domains = 0;
> 
> these variables are used to iterate over a number of guint values. Like
> 
>   n = nm_ip_config_get_num_searches (ip_config);
> 
> where get_num_searches() returns a guint. Could we consistently use the
> correct type?

> add the wilcard domain to all non-VPN

> is "." still right? Should this be "" to mean the wildcard domain?

> could you add a
>    nm_assert (!g_hash_table_contains (ht, domain));
> to domain_is_shadowed() -- because that one is already checked before
> calling domain_is_shadowed().

Fixed all the above.


> I always like avoiding unnecessary copies. But in this case, "search" array
> will point inside strings owned by NMIPxConfig. That seems quite fragile to
> me. Should we not clone them? In practice it's not necessary, but it feels
> fragile. What can we do about that?

Now the list is cleared just after nm_dns_plugin_update() returns to ensure
that elements in the list don't become stale. What do you think?

>     /* Free the array and return NULL if the only element was the ending
> NULL */
>     strv = (char **) g_ptr_array_free (domains, (domains->len == 1));
> 
>     return _nm_utils_strv_cleanup (strv, FALSE, FALSE, TRUE);
> 
> skip_repeated=TRUE likely won't help much, because duplicated domains are
> probably not sorted after each other. Maybe sort them first? Maybe by
> cherry-picking "shared: add nm_utils_strv_sort() helper" from
> https://github.com/NetworkManager/NetworkManager/pull/91

I don't understand, duplicate domains don't have to be consecutive with
current _nm_utils_strv_cleanup() implementation.
Comment 44 Beniamino Galvani 2018-04-30 12:33:39 UTC
I've added some NM-CI tests to ensure the behavior is the one we expect:

https://github.com/NetworkManager/NetworkManager-ci/pull/180
Comment 45 Beniamino Galvani 2018-04-30 12:42:40 UTC
(In reply to Michael Biebl from comment #42)
> Will those changes be pulled into the nm-1-10 branch as well?

Let's wait until the branch is merged to master and see if there are
any issues. If there are no complaints, we can think about backporting
it to a stable branch.
Comment 46 Thomas Haller 2018-04-30 13:18:56 UTC
> I don't understand, duplicate domains don't have to be consecutive with
> current _nm_utils_strv_cleanup() implementation.

How do you mean?

calling "_nm_utils_strv_cleanup (strv, FALSE, FALSE, TRUE);" with skip_repeated=TRUE will remove "consecutive duplicates". That doesn't seem useful, because most of the time (given how the array is constructed), duplicates are not consecutive.
so either: do not drop any duplicates and do skip_repeated=FALSE
or: drop all duplicates, by sorting the array first, followed by skip_repeated=TRUE.
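The point about consecutive duplicates can be seen with a small model of the helper (illustrative Python mirroring the behaviour of skip_repeated=TRUE as described here, not the C code itself):

```python
# Model of skip_repeated=TRUE as described in this comment: only
# *consecutive* duplicates are removed.  Illustrative Python, not the
# actual _nm_utils_strv_cleanup() implementation.
def cleanup_skip_repeated(strv):
    out = []
    for s in strv:
        if not out or out[-1] != s:
            out.append(s)
    return out

domains = ["example.com", "local.com", "example.com"]
# The non-adjacent duplicate survives:
assert cleanup_skip_repeated(domains) == domains
# Sorting first makes duplicates adjacent, so they are dropped:
assert cleanup_skip_repeated(sorted(domains)) == ["example.com", "local.com"]
```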
Comment 47 Lubomir Rintel 2018-05-14 07:36:41 UTC
bg/dns-domains-bgo746422 LGTM
Comment 48 Lubomir Rintel 2018-05-14 07:59:27 UTC
That is, bg/dns-bgo746422 LGTM
Comment 50 David Woodhouse 2018-05-14 14:44:19 UTC
FWIW the original situation appears to be fairly broken regardless of the security situation. I've lost count of the number of times in my recent travels that the DNS servers of the airport/hotel/etc. in which I find myself are *not* directly on the local subnet. And thus aren't accessible as soon as I join the VPN.

If that situation is going to be allowed to persist in any form (especially if it's the default and we have to take special action to give the VPN nameservers top priority), then we should fix that by explicitly adding routes to them, as I've been having to do manually:

 $ ip route
default dev vpn0  proto static  scope link  metric 50 
default via 10.246.8.1 dev wlp2s0  proto static  metric 600 
8.8.4.4 via 10.246.8.1 dev wlp2s0 
8.8.8.8 via 10.246.8.1 dev wlp2s0
Comment 51 Olivier Tilloy 2018-10-15 15:02:19 UTC
(In reply to Beniamino Galvani from comment #45)
> (In reply to Michael Biebl from comment #42)
> > Will those changes be pulled into the nm-1-10 branch as well?
> 
> Let's wait until the branch is merged to master and see if there are
> any issues. If there are no complaints, we can think about backporting
> it to a stable branch.

Now that this has been in master and 1.12 for a few months, is this something you would consider backporting to 1.10 ?
Comment 52 Thomas Haller 2018-10-15 18:35:16 UTC
(In reply to Olivier Tilloy from comment #51)
> (In reply to Beniamino Galvani from comment #45)
> > (In reply to Michael Biebl from comment #42)
> > > Will those changes be pulled into the nm-1-10 branch as well?
> > 
> > Let's wait until the branch is merged to master and see if there are
> > any issues. If there are no complaints, we can think about backporting
> > it to a stable branch.
> 
> Now that this has been in master and 1.12 for a few months, is this
> something you would consider backporting to 1.10 ?

Can you describe who would pick up the upstream patches if they get backported? 
Would 1.10 branch be far enough?

Would you then rebase to a minor 1.10.z release, or would you just cherry-pick the patches?
Comment 53 David Woodhouse 2018-10-15 21:03:59 UTC
I'd like to see them in a Fedora release, which is preferably via a 1.10.z release rather than just cherry-picking the patches.

Also Ubuntu but they basically never fix anything so I suppose that's unlikely. I'd still like it in a 1.10.z release so they don't have any *excuse* though.
Comment 54 Sebastien Bacher 2018-10-16 08:39:10 UTC
@David, I don't think that comment is either constructive or true. We have somewhat limited resources and struggle a bit sometimes to keep up with network-manager bugfixes/updates, but we would backport a fix for this bug to our stable series if there is one landing in the 1.10 vcs.
Comment 55 Beniamino Galvani 2018-10-17 15:46:55 UTC
The backport is quite complicated due to several changes that made the
DNS code in nm-1-12 diverge from nm-1-10. The th/policy-and-mdns
branch [1] alone changed:

 src/dns/nm-dns-dnsmasq.c           | 330 +++++++++++++--------------------
 src/dns/nm-dns-manager.c           | 581 +++++++++++++++++++++++++++++++++-------------------------
 src/dns/nm-dns-systemd-resolved.c  | 219 ++++++++++++++--------

so, unless we decide to also backport that branch, the backport of the
fix requires reimplementing several non-trivial patches on top of
1.10, with the high risk of breaking things.

If we also backport the th/policy-and-mdns branch, we would need to
backport a total of 30+ commits, including mDNS support and related
libnm API.

Any opinions? This latter approach seems doable, but it is a big
change for a minor release.

[1] https://cgit.freedesktop.org/NetworkManager/NetworkManager/log/?qt=range&q=5eea9be983bd95f5a482e483a438a100b80c716f
Comment 56 Thomas Haller 2018-10-17 16:27:36 UTC
(In reply to Beniamino Galvani from comment #55)
>
> Any opinions? This latter approach seems doable, but it is a big
> change for a minor release.

Backporting th/policy-and-mdns too seems favourable to me. As long as the cherry-picks are trivial, the result will all be code which is on master and has been tested for a while already. I'd rather cherry-pick more patches just to make them apply, instead of reimplementing them. YMMV.
Comment 57 Olivier Tilloy 2018-11-16 18:48:22 UTC
I confirm that the Ubuntu desktop team would be interested in picking up the update if this fix and related branches were to be backported to the 1.10 branch.

Is this something we can reasonably expect?
Comment 59 Olivier Tilloy 2018-11-19 21:51:26 UTC
Excellent, thank you Thomas!
Comment 60 Beniamino Galvani 2018-11-20 10:58:33 UTC
The fix is included also in the 1.10.14 release.
Comment 61 Olivier Tilloy 2018-11-20 21:25:17 UTC
That's handy. The backport to Ubuntu 18.04 is being tracked by https://bugs.launchpad.net/ubuntu/+source/network-manager/+bug/1754671.
Comment 62 David Woodhouse 2019-02-04 22:37:35 UTC
I've finally managed to test the Ubuntu 18.04 backport; apologies for the delay.

At first it seemed to work. I tried an internal domain which isn't the main $COMPANY.com and isn't in the default search domains for the VPN, and it worked after the upgrade.

Some hours later, I was unable to get a Kerberos ticket because various other DNS lookups, even within the main $COMPANY.com domain, were being done on the local network and not the VPN.

I manually set ipv4.dns-priority=-1 and ipv4.dns-search=~. but this is a full-tunnel VPN; surely I shouldn't have needed to do that part manually?

(Sebastian, apologies for the somewhat grumpy comment earlier. I am very happy to be proved wrong.)
Comment 63 David Woodhouse 2019-02-04 22:38:28 UTC
Gr, and apologies too for spelling your name wrong, Sebastien.
Comment 64 Beniamino Galvani 2019-02-05 08:18:22 UTC
Hi,

can you please attach the log file generated in this way:

 # nmcli general logging level trace
 # journalctl -u NetworkManager -f > dns-log.txt &
 # killall -HUP NetworkManager
 # kill %1

when the DNS is wrongly configured? Thanks.
Comment 65 David Woodhouse 2019-02-05 09:44:20 UTC
Before I do that, do we agree that I shouldn't have needed to manually set dns-priority and dns-search as described in comment 62, for a VPN connection which is full-tunnel?
Comment 66 Beniamino Galvani 2019-02-05 10:00:35 UTC
(In reply to David Woodhouse from comment #65)
> Before I do that, do we agree that I shouldn't have needed to manually set
> dns-priority and dns-search as described in comment 62, for a VPN connection
> which is full-tunnel?

Yes, the full-tunnel VPN should have higher priority.

We have an automated CI test to check this scenario:

 https://github.com/NetworkManager/NetworkManager-ci/blob/master/nmcli/features/dns.feature#L403

and it seems to be working well.
Comment 67 David Woodhouse 2019-02-05 11:37:09 UTC
Ack, then I will remove those settings and test again. With dnsmasq, OK? This can never work with plain resolv.conf anyway.

Thanks.
Comment 68 Beniamino Galvani 2019-02-05 11:50:56 UTC
(In reply to David Woodhouse from comment #67)
> Ack, then I will remove those settings and test again. With dnsmasq, OK?

Yes, thanks.
Comment 69 David Woodhouse 2019-08-21 09:03:43 UTC
Without the ipv4.dns-priority and ipv4.dns-search settings being manually set, I see the wired device still being used for its own search domain. This is what causes the problem.

It seems to work for me if I set *only* ipv4.dns-priority=-1 though.

My biggest practical problem here was that we had set ipv4.dns-priority and ipv4.dns-search options in an emergency deployment to our users after Ubuntu shipped the 1.10.14 update to 18.04 to fix this... and then when they *pulled* the update, new installations got the new config but older NM, and that didn't work correctly either.

I'm going to experiment with *just* setting ipv4.dns-priority=-1 and not ipv4.dns-search. Do we expect that to work for everyone, whether they have the updated package or not? 

We should be setting ipv4.dns-priority=1 by default for full-tunnel VPNs, but at least if I can find a way to work around it that works for everyone, that'll be an improvement.
Comment 70 David Woodhouse 2019-08-21 11:33:44 UTC
No, ipv4.dns-priority=-1 breaks older NM because there, there's no default routing domain ~. and not even any way to set it. So lookups outside the VPN domains are just refused instead of being leaked to the local nameservers as they were before.

As it happens, we configure our DHCP client to always override the discovered search domains to our corporate AD domain dom.company.com, so I can work around NM's failure to automatically set ipv4.dns-priority=-1 a different way — by manually configuring the VPN with ipv4.dns-domains=dom.company.com;company.com;. That works for both old and new NetworkManager.
Comment 71 Beniamino Galvani 2019-08-22 07:51:33 UTC
> My biggest practical problem here was that we had set
> ipv4.dns-priority and ipv4.dns-search options in an emergency
> deployment to our users after Ubuntu shipped the 1.10.14 update to
> 18.04 to fix this... and then when they *pulled* the update, new
> installations got the new config but older NM, and that didn't work
> correctly either.

> I'm going to experiment with *just* setting ipv4.dns-priority=-1 and
> not ipv4.dns-search. Do we expect that to work for everyone, whether
> they have the updated package or not?

Yes, this should work the same on 1.10.14 and 1.12+ releases.

> We should be setting ipv4.dns-priority=1 by default for full-tunnel
> VPNs, but at least if I can find a way to work around it that works
> for everyone, that'll be an improvement.

Do you mean -1? Why? That would cause the opposite problem: local
queries leaked to the VPN, e.g. I look up nas.myhome and the query
goes through the VPN.
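For context, a sketch of how dns-priority is documented to behave (the semantics below are from NetworkManager's settings documentation, not from this thread):

```ini
[ipv4]
# 0  -> use the default priority (50 for VPN profiles, 100 for others;
#       lower values win).
# <0 -> exclusive: connections with a higher (less negative) priority are
#       excluded from DNS entirely. This is why a VPN with dns-priority=-1
#       captures *all* lookups, including local ones like nas.myhome.
dns-priority=-1
```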

> No, ipv4.dns-priority=-1 breaks older NM because there is no default routing domain ~. there, nor any way to set one

Which version is 'older'?
Comment 72 David Woodhouse 2019-08-22 10:25:44 UTC
In my case, 'older' is 1.10.6. At this point I'm just trying to get things working for my users on Ubuntu 18.04, so I'm comparing 1.10.6 (the current Ubuntu package) with 1.10.14 (the update that was briefly shipped in 18.04 updates and then withdrawn).

When the 1.10.14 update went out, it broke our users because lookups for the dom.company.com AD domain were now always going to the local network (because of our dhclient config override). We immediately rolled out a manual setting of ipv4.dns-priority=-1 (yes, not =1) to fix it.

Then the Ubuntu 1.10.14 update was withdrawn, and new installations got 1.10.6 which *doesn't* work with ipv4.dns-priority=-1, because there's no ~. routing domain and thus lookups outside the VPN domains don't work at all.

(We don't care about local lookups not working. In fact we'd prefer local lookups *not* to work, which might be part of the reason we've overridden the search domain obtained by DHCP. If we could stop *routing* to the local network from working, we would. It's an open feature request, I believe.)

But *because* I know the search domain on the local physical network will always be just 'dom.company.com', adding that explicitly in the VPN config is enough and I don't need ipv4.dns-priority=-1. Now I have a config that works with both 1.10.6 and 1.10.14. So I can wait for the regressions that caused the 1.10.14 update to be pulled to get fixed, and maybe set ipv4.dns-priority=-1 again later.



Anyway, we now know that my initial testing of the Ubuntu 1.10.14 backport failed because of the search domain on the *local* network. I've been able to work around that, and things are now working OK.
Comment 73 André Klapper 2020-11-12 14:31:19 UTC
bugzilla.gnome.org is being shut down in favor of a GitLab instance. 
We are closing all old bug reports and feature requests in GNOME Bugzilla which have not seen updates for a long time.

If you still use NetworkManager and if you still see this bug / want this feature in a recent and supported version of NetworkManager, then please feel free to report it at https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/

Thank you for creating this report and we are sorry it could not be implemented (workforce and time are unfortunately limited).