Bug 731014 - [review] lr/udev-unmanaged-fd731014: Don't treat veth/vmnet/VirtualBox/etc as NMDeviceEthernet (and create a default wired connection)
Status: RESOLVED FIXED
Product: NetworkManager
Classification: Platform
Component: general
Version: unspecified
OS: Other Linux
Importance: Normal (priority), normal (severity)
Target Milestone: ---
Assigned To: Lubomir Rintel
QA Contact: NetworkManager maintainer(s)
Duplicates: 740526
Depends on:
Blocks: nm-review nm-1-2 743546
 
 
Reported: 2014-05-30 20:45 UTC by Colin Walters
Modified: 2015-05-28 18:40 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
log messages from a single docker run (8.69 KB, text/plain), 2014-06-13 21:36 UTC, Colin Walters
core: make veth devices default-unmanaged for now (1021 bytes, patch), 2014-06-24 14:51 UTC, Dan Winship
mark virtual ethernet devices as unmanaged (2.04 KB, patch), 2014-06-30 00:36 UTC, Michael Biebl
mark virtual ethernet devices as unmanaged (1.98 KB, patch), 2014-06-30 00:37 UTC, Michael Biebl
mark virtual ethernet devices as unmanaged (1.60 KB, patch, flagged needs-work), 2014-06-30 00:38 UTC, Michael Biebl

Description Colin Walters 2014-05-30 20:45:12 UTC
Docker creates a veth device every time I start a container.

It looks like NetworkManager tries to do DHCP on it, and also hits an internal assertion.

Docker basically owns both parts of this link - we need some way to tell NM to ignore it.
Comment 1 Pavel Simerda 2014-05-30 21:41:06 UTC
Which version of NM?
Comment 2 Colin Walters 2014-05-31 12:13:27 UTC
NetworkManager-0.9.9.1-13.git20140326.4dba720.el7.x86_64
Comment 3 Pavel Simerda 2014-06-01 09:41:08 UTC
May be worth considering for 0.9.10.
Comment 4 Dan Williams 2014-06-10 13:54:03 UTC
(In reply to comment #0)
> Docker creates a veth device every time I start a container.
> 
> It looks like NetworkManager tries to do DHCP on it, and also hits an internal
> assertion.
> 
> Docker basically owns both parts of this link - we need some way to tell NM to
> ignore it.

Yeah, NM shouldn't try to control it or touch it, but we do want to still recognize the device and report its IP configuration and such.

There are two ways to deal with this:

1) move NMDeviceVeth from a subclass of NMDeviceEthernet to a subclass of NMDeviceGeneric, which would make the 'veth' device default-unmanaged.  The reason NMDeviceVeth was a subclass of NMDeviceEthernet was that (a) we didn't have generic devices yet, and (b) it was useful for testing or something like that.  But it's really not an ethernet device.

2) enhance no-auto-default to include drivers

Since we want to do #1 anyway, let's just do #1.  It might have some side-effects, such as NM not configuring interfaces by default inside some VMs or something.  Anything else anyone can think of?
Comment 5 Pavel Simerda 2014-06-10 17:54:58 UTC
(In reply to comment #4)
> 1) move NMDeviceVeth from a subclass of NMDeviceEthernet to a subclass of
> NMDeviceGeneric, which would make the 'veth' device default-unmanaged.  The
> reason NMDeviceVeth was a subclass of NMDeviceEthernet was that (a) we didn't
> have generic devices yet, and (b) it was useful for testing or something like
> that.  But it's really not an ethernet device.

Wouldn't it be possible to just make NMDeviceVeth default-unmanaged while keeping it a subclass of Ethernet for all other purposes?
Comment 6 Dan Winship 2014-06-10 18:14:31 UTC
(In reply to comment #4)
> 1) move NMDeviceVeth from a subclass of NMDeviceEthernet to a subclass of
> NMDeviceGeneric, which would make the 'veth' device default-unmanaged.  The
> reason NMDeviceVeth was a subclass of NMDeviceEthernet was that (a) we didn't
> have generic devices yet, and (b) it was useful for testing or something like
> that.  But it's really not an ethernet device.

No, NMDeviceVeth was originally going to be NMDeviceGeneric, but we switched it to being a subclass of NMDeviceEthernet instead because otherwise NM wouldn't work correctly inside containers.

The problem is that both ends of the veth are identical, but we want to treat them differently depending on whether it is the inside-the-container end or the outside-the-container end.

(In reply to comment #5)
> Wouldn't it be possible to just make NMDeviceVeth default-unmanaged while
> keeping it a subclass of Ethernet for all other purposes?

Yes. So, ideally, we would do this iff we were not in a container.
Comment 7 Dan Winship 2014-06-10 18:17:52 UTC
Presumably the inside of the container only sees one end of the veth? So we could say, if the veth's peer exists, then make it default-unmanaged, but if it doesn't, treat it as ethernet
Comment 8 Dan Williams 2014-06-13 20:41:35 UTC
Is there a problem with requiring minimal network configuration inside the container?  I'd presume that systemd requires this already, and while it would be nice to automatically detect that we're inside a container and treat veth differently, why is veth special?  Shouldn't this "inside container" behavior apply to any kind of software device inside a container?

Should part of container setup, even with Docker, drop a quick config file into /etc/NetworkManager/system-connections/ (or an ifcfg file) that explicitly wants DHCP on the interface?

Docker people might be amenable to that... Colin, what's the "this would be our perfect world" behavior that you're looking for on the Docker front for network configuration inside the container?
Comment 9 Dan Williams 2014-06-13 20:48:20 UTC
systemd does various things to detect whether it's running inside a container or not, see src/shared/virt.c::detect_container().  That includes:

- Looking at /proc/vz and /proc/bc
- Looking at /run/systemd/container

which really isn't that complicated.  If we decide to special-case inside-the-container, we should probably keep the same mechanisms that systemd does to detect it.

Also note that systemd sources indicate that udev isn't run inside a container?  So we may need to modify some stuff in nm-platform to account for that.
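The checks listed above can be sketched in shell. The paths come from this comment; the OpenVZ host/container distinction (hosts see both /proc/vz and /proc/bc, containers only /proc/vz) follows systemd's src/shared/virt.c. The root-prefix parameter is not part of any real interface and exists only so the logic can be exercised against a fake filesystem tree.

```shell
# Sketch of the container checks described above. $1 is an optional
# root prefix so the logic can be tested against a fake filesystem.
detect_container() {
    root=${1:-}
    # OpenVZ/Virtuozzo: containers see /proc/vz but not /proc/bc
    if [ -d "$root/proc/vz" ] && [ ! -d "$root/proc/bc" ]; then
        echo openvz
        return 0
    fi
    # systemd-nspawn and similar record the container type here
    if [ -r "$root/run/systemd/container" ]; then
        cat "$root/run/systemd/container"
        return 0
    fi
    echo none
    return 1
}
```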
Comment 10 Colin Walters 2014-06-13 21:05:06 UTC
(In reply to comment #8)
> Is there a problem with requiring minimal network configuration inside the
> container? 

Yes.  Broadly speaking, the Docker world is driving towards containers that are *just* a single app, and don't carry along Unix with them.  Example: Docker containers shouldn't rely on a running cron or logrotate.  If they have periodic jobs, they need to schedule them internally.  Log files should go out to syslog or in general be configured by the ops side at runtime.

So, to answer your question: it's really not expected in the Docker world for containers to run dhclient or to do ip configuration.

The network configuration is handled by the Docker runtime.

> I'd presume that systemd requires this already, and while it would
> be nice to automatically detect that we're inside a container and treat veth
> differently, why is veth special?  shouldn't this "inside container" behavior
> apply to any kind of software device inside a container?

No no, I'm talking about NetworkManager *outside* of a container reacting to the creation of one half of the bridge veth device and doing DHCP on it.

> Should part of container setup, even with Docker, drop a quick config file into
> /etc/NetworkManager/system-connections/ (or an ifcfg file) that explicitly
> wants DHCP on the interface?

I don't think anyone right now is thinking of having containers internally drive network configuration.

> Docker people might be amenable to that... Colin, what's the "this would be our
> perfect world" behavior that you're looking for on the Docker front for network
> configuration inside the container?

There is a lot of potential for better integration of NetworkManager and docker on the *host* - particularly for complex network setups, and most particularly with respect to firewalling/port forwarding.  But that's all host side.
Comment 11 Colin Walters 2014-06-13 21:05:41 UTC
(In reply to comment #9)

> Also note that systemd sources indicate that udev isn't run inside a container?
>  So we may need to modify some stuff in nm-platform to account for that.

Nooo =)  This is about NetworkManager on the host, nothing to do with running NM inside a container.
Comment 12 Colin Walters 2014-06-13 21:36:37 UTC
Created attachment 278427 [details]
log messages from a single docker run
Comment 13 Dan Winship 2014-06-13 23:23:44 UTC
(In reply to comment #10)
> I don't think anyone right now is thinking of having containers internally
> drive network configuration.

"Containers" != "Docker". The original requests for veth support in NetworkManager (bug 700822, bug 687254) were about using NetworkManager in LXC containers, and having NM treat the "inside" end of the veth as an ordinary ethernet device. (We dropped the ball a bit though in that we knew we didn't ever want to treat the "outside" as ordinary ethernet, but we didn't implement that.)

The fact that veths are completely symmetric makes this impossible to do non-heuristically though. The patches on the old bugs used the rule that veths named "veth*" are "outside" and veths with other names are "inside". Maybe we could have a conf variable for specifying that (using "ignore_veth=veth*" as the default value). Then if NM finds a veth, it checks if it's inside or outside a container; if outside, it treats the veth as default-unmanaged no matter what. If inside, it checks the veth's name, and treats it as default-unmanaged only if it matches the ignore_veth setting.

(We could also get a patch into the kernel to allow differentiating the two ends somehow so we could do this better in the future.)

(In reply to comment #9)
> If we decide to special-case
> inside-the-container, we should probably keep the same mechanisms that systemd
> does to detect it.

It doesn't export that information somehow?
Comment 14 Colin Walters 2014-06-14 14:42:38 UTC
(In reply to comment #13)
> (In reply to comment #10)
> > I don't think anyone right now is thinking of having containers internally
> > drive network configuration.
> 
> "Containers" != "Docker". 

True.

> The original requests for veth support in
> NetworkManager (bug 700822, bug 687254) were about using NetworkManager in LXC
> containers, and having NM treat the "inside" end of the veth as an ordinary
> ethernet device. (We dropped the ball a bit though in that we knew we didn't
> ever want to treat the "outside" as ordinary ethernet, but we didn't implement
> that.)

I see.

> The fact that veths are completely symmetric makes this impossible to do
> non-heuristically though.

True, though NM does know if it's *not* in a container.  Perhaps the best way to do this right now is to check whether /sys is mounted read-only.  That's how udev does it:

[Unit]
Description=udev Kernel Device Manager
...
ConditionPathIsReadWrite=/sys

Unfortunately systemd is unaware of docker containerization right now:

$ docker run fedora:20 systemd-detect-virt -c
none
$

The /sys thing isn't ideal for a variety of reasons; for example, if NM shouldn't be writing to /sys then it might be denied by SELinux policy.  A better way to check writability would be via mount flags, as I do in:
https://git.gnome.org/browse/ostree/commit/?id=cb43d2942f99bb9200b212e0b76397ba66ab70ee
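Parsing the mount options out of /proc/self/mounts is one way to do the mount-flag check without attempting a write (a sketch; the linked ostree commit does the equivalent in C). The second argument overriding the mounts file is purely for testability and not part of any real interface.

```shell
# Sketch: decide whether a mount point is read-only from its options
# in /proc/self/mounts, without writing to it. $2 optionally points
# at a canned mounts file for testing.
mount_is_readonly() {
    target=$1
    mounts=${2:-/proc/self/mounts}
    # Field 2 is the mount point, field 4 the comma-separated options
    awk -v t="$target" '
        $2 == t {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "ro")
                    found = 1
        }
        END { exit !found }' "$mounts"
}
```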

> The patches on the old bugs used the rule that veths
> named "veth*" are "outside" and veths with other names are "inside".

I'm not enthusiastic about naming heuristics.

> (We could also get a patch into the kernel to allow differentiating the two
> ends somehow so we could do this better in the future.)

Differentiating how?  You could create two containers with a private bridge between them.  In that case there's no explicit "host" side.

> It doesn't export that information somehow?

There's ConditionVirtualization in the unit file, but I don't think an exposed API.
Comment 15 Dan Winship 2014-06-14 16:50:37 UTC
(In reply to comment #14)
> > The fact that veths are completely symmetric makes this impossible to do
> > non-heuristically though.
> 
> True, though NM does know if it's *not* in a container.

Right. I guess once we figure out the right way to answer "am I in a container", then we can fix NM to always ignore veths outside containers, and worry about the inside-container case separately.

> The /sys thing isn't ideal for a variety of reasons; for example, if NM
> shouldn't be writing to /sys then it might be denied by SELinux policy.

Yeah, is udev really checking /sys writability because it's using it as a heuristic for "in a container", or is it just checking it because it can't do anything useful if /sys isn't writable, regardless of the reason why it's non-writable?

bug 687254 comment 25 suggests that there is not any single heuristic that would detect all container types. Fun!

> > (We could also get a patch into the kernel to allow differentiating the two
> > ends somehow so we could do this better in the future.)
> 
> Differentiating how?  You could create two containers with a private bridge
> between them.  In that case there's no explicit "host" side.

Yes, I was thinking there'd be a flag you could set on either or both peers, giving four possibilities ("A is host, B is container", "B is host, A is container", "Neither peer has the flag set, so we don't know what the intent was" (the default state), "Both peers have the flag set, so [insert useful semantic here]"). Alternatively, the kernel could just enforce that the flag can be set on 0 or 1 of the peers, but not both.
Comment 16 Pavel Simerda 2014-06-19 10:24:17 UTC
(In reply to comment #14)
> True, though NM does know if it's *not* in a container.  Perhaps the best way
> to do this right now is to check whether /sys is mounted read-only.  That's how
> udev does it:

What if you have a container inside a container? Isn't it bad practice to check whether you're in a container and then assume you cannot run containers?

> > The patches on the old bugs used the rule that veths
> > named "veth*" are "outside" and veths with other names are "inside".
> 
> I'm not enthusiastic about naming heuristics.

I would guess it's more reliable than /sys, given the issue above.

> > (We could also get a patch into the kernel to allow differentiating the two
> > ends somehow so we could do this better in the future.)
> 
> Differentiating how?  You could create two containers with a private bridge
> between them.  In that case there's no explicit "host" side.

It would be best if the virtualization tools could somehow *mark* the host side if the link is between a host and a guest. After all it's the host side that's expected to be treated differently than ordinary ethernet.
Comment 17 Colin Walters 2014-06-19 11:38:48 UTC
(In reply to comment #16)

> Would be best if the virtualization tools could somehow *mark* the host side if
> the link is between a host and a guest. After all it's the host side that's
> expected to be treated different than ordinary ethernet.

From what I can tell of the kernel veth side, there's no ability to attach any metadata at all; basically you can specify the peer side, that's it.  We could add something, but that would then require an updated docker to set that property, which is a lot of deployment latency.

So...

(In reply to comment #7)
> Presumably the inside of the container only sees one end of the veth? So we
> could say, if the veth's peer exists, then make it default-unmanaged, but if it
> doesn't, treat it as ethernet

Right.  That may work best as an immediate term fix.  We'll still end up with NM trying to do DHCP on the host docker0 veth side, but that's less embarrassing as it only happens once on boot.
Comment 18 Dan Winship 2014-06-19 13:04:04 UTC
(In reply to comment #16)
> What if you have a container inside a container?

a) is that actually possible?

b) containers and veths are currently implemented in such a way that they don't give us enough information to be able to do the right thing in every possible configuration. Given that we have no reports of people running NM inside a container inside a container, I'm not going to worry about that case for now.

(In reply to comment #17)
> > Presumably the inside of the container only sees one end of the veth? So we
> > could say, if the veth's peer exists, then make it default-unmanaged, but if it
> > doesn't, treat it as ethernet
> 
> Right.  That may work best as an immediate term fix.

As mentioned later on, comments in the other bug suggest that this isn't how it works; both the inside and the outside of the container see both sides of the veth.

> We'll still end up with
> NM trying to do DHCP on the host docker0 veth side, but that's less
> embarrassing as it only happens once on boot.

No, the suggestion is that on the host side we'd ignore all veths.
Comment 19 Colin Walters 2014-06-19 13:36:23 UTC
(In reply to comment #18)
>
> As mentioned later on, comments in the other bug suggest that this isn't how it
> works; both the inside and the outside of the container see both sides of the
> veth.

Not with Docker containers at least:

# docker run fedora:20 ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
14: eth0: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether fa:61:c5:a8:71:c7 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.3/16 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::f861:c5ff:fea8:71c7/64 scope link tentative 
       valid_lft forever preferred_lft forever
#

> No, the suggestion is that on the host side we'd ignore all veths.

That does seem reasonable, *if* we can determine the "host" side reliably.
Comment 20 Dan Williams 2014-06-19 21:21:24 UTC
Updated title to reflect that this isn't just veth, it's also a problem with VMWare and VirtualBox software interfaces; they are also treated as Ethernet.

Perhaps we can broaden the reach here and just check whether they are virtual (e.g. stat /sys/class/virtual/<ifname>) or something.
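A check along those lines can be sketched in shell; note it uses the /sys/devices/virtual/net/ location that comment 32 later confirms, reached through the /sys/class/net/<ifname> symlink. The sysfs-root parameter is only there so the logic can be tested against a fake tree.

```shell
# Sketch: treat an interface as a software/virtual device when its
# sysfs symlink resolves under /devices/virtual/net/. $2 optionally
# substitutes a fake sysfs root for testing.
iface_is_virtual() {
    ifname=$1
    sysroot=${2:-/sys}
    # /sys/class/net/<ifname> is a symlink into the device tree;
    # virtual devices resolve to .../devices/virtual/net/<ifname>
    case "$(readlink -f "$sysroot/class/net/$ifname")" in
        */devices/virtual/net/*) return 0 ;;
        *) return 1 ;;
    esac
}
```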
Comment 21 Dan Williams 2014-06-19 21:27:06 UTC
(In reply to comment #18)
> As mentioned later on, comments in the other bug suggest that this isn't how it
> works; both the inside and the outside of the container see both sides of the
> veth.

Not with network namespaces, potentially.  If the container is running in its own network namespace, then I think best practice is to move the container-owned veth into the container's namespace, otherwise obviously the container couldn't see it.  In that case, all NM sees is that *something* is attached to the other side of the veth interface, but it gets an ifindex that doesn't point to any known netdev that NM can see (since it's in the other namespace).
Comment 22 Dan Williams 2014-06-19 21:36:53 UTC
Regardless of above discussion, I think we certainly should:

1) move the Veth interface to be NMDeviceGeneric not NMDeviceWired
2) recognize that things we think are NMDeviceEthernet might be "virtual" and treat those as Generic devices
3) re-evaluate whether some things that used to be NMDeviceWired subclasses (like veth) might want Default Wired connections even though they are no longer NMDeviceEthernet subclasses, and possibly generalize default wired connections to cover those too

For example, we have some special casing for "easytether" Android devices, which is an out-of-kernel module that lots of people used to use for tethering; in that case a Default Wired connection is pretty useful, since the MAC address and interface name of that device change every time you plug the phone in.  We might also want to create a default wired connection for veth devices inside a container if we detect we're in one.

But doing these things would prevent the issues Colin is seeing with Docker too.

Thoughts?
Comment 23 Dan Winship 2014-06-23 18:52:38 UTC
(In reply to comment #21)
> (In reply to comment #18)
> > As mentioned later on, comments in the other bug suggest that this isn't how it
> > works; both the inside and the outside of the container see both sides of the
> > veth.
> 
> Not with network namespaces, potentially.

OK, so yeah, if you use network namespaces, then the container would only see one side of the veth. Unfortunately, the host *also* only sees one side of the veth. So "can only see one side" is only a useful heuristic if we also can tell if we're in or out.

(In reply to comment #22)
> Regardless of above discussion, I think we certainly should:
> 
> 1) move the Veth interface to be NMDeviceGeneric not NMDeviceWired

Why? We can make it default-unmanaged while still having it as a subclass of NMDeviceWired.


For now (0.9.10), I think maybe we want to just make veths always default-unmanaged. This will make NM-inside-containers more awkward again, but it will fix the docker case (and even the outside-the-container side of the NM-inside-containers case).
Comment 24 Dan Winship 2014-06-24 14:51:15 UTC
Created attachment 279123 [details] [review]
core: make veth devices default-unmanaged for now

We only want to treat a veth device as ethernet if (a) NM is running
inside a container, and (b) the veth in question is the "inside" end
of the veth, not the outside. Unfortunately, we don't have good
heuristics for this at the moment, so just ignore veths for now.
Comment 25 Dan Williams 2014-06-26 19:45:34 UTC
(In reply to comment #23)
> (In reply to comment #21)
> > (In reply to comment #18)
> > > As mentioned later on, comments in the other bug suggest that this isn't how it
> > > works; both the inside and the outside of the container see both sides of the
> > > veth.
> > 
> > Not with network namespaces, potentially.
> 
> OK, so yeah, if you use network namespaces, then the container would only see
> one side of the veth. Unfortunately, the host *also* only sees one side of the
> veth. So "can only see one side" is only a useful heuristic if we also can tell
> if we're in or out.

Yeah, which is why we need to add some namespace and container detection at some point here, so we're smarter about this kind of stuff.

> (In reply to comment #22)
> > Regardless of above discussion, I think we certainly should:
> > 
> > 1) move the Veth interface to be NMDeviceGeneric not NMDeviceWired
> 
> Why? We can make it default-unmanaged while still having it as a subclass of
> NMDeviceWired.

What does keeping it as NMDeviceWired bring us then, when it's really a software device?  Possibly some additional D-Bus properties in the API, but other than that?

> For now (0.9.10), I think maybe we want to just make veths always
> default-unmanaged. This will make NM-inside-containers more awkward again, but
> it will fix the docker case (and even the outside-the-container side of the
> NM-inside-containers case).

This is fine with me if it fixes the problem.  Colin?

Also, I think one reason veth was originally treated as NMDeviceWired was so that people could test NM more reasonably in a VM because veth was almost-ethernet but still software that could simulate real hardware to some degree.  We'll lose that too.
Comment 26 Colin Walters 2014-06-27 12:43:22 UTC
(In reply to comment #25)

> Yeah, which is why we need to add some namespace and container detection at
> some point here, so we're smarter about this kind of stuff.

I'm a bit uncertain about the degree to which this should be built-in versus configuration.
 
> This is fine with me if it fixes the problem.  Colin?
 
It likely will; if you remind me next week I can test it.

> Also, I think one reason veth was originally treated as NMDeviceWired was so
> that people could test NM more reasonably in a VM because veth was
> almost-ethernet but still software that could simulate real hardware to some
> degree.  We'll lose that too.

For VMs there's virtio-net.  I assume you mean test NM in a container, right?
Comment 27 Michael Biebl 2014-06-27 20:01:15 UTC
The proposed patch doesn't cover virtual network devices in general, afaics.

At least with the patch applied, NM still tries to manage virtualbox and vmware network interfaces.
Comment 28 Dan Williams 2014-06-27 20:46:12 UTC
(In reply to comment #24)
> Created an attachment (id=279123) [details] [review]
> core: make veth devices default-unmanaged for now
> 
> We only want to treat a veth device as ethernet if (a) NM is running
> inside a container, and (b) the veth in question is the "inside" end
> of the veth, not the outside. Unfortunately, we don't have good
> heuristics for this at the moment, so just ignore veths for now.

One thing I forgot to ask about veth vs. NMDeviceWired, why is veth special here compared to VirtualBox or VMWare software interfaces?  It feels like they should all be treated the same way, either they are all subclasses of NMDeviceWired, or they are all not...  this question is relevant for git master at least, but probably not 0.9.10.

As always there's a simple workaround for these problems, though it doesn't require manual intervention:

no-auto-default=*
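For reference, that setting belongs in the [main] section of NetworkManager.conf (typically /etc/NetworkManager/NetworkManager.conf); a minimal fragment:

```
[main]
# Never create automatic "Wired connection N" profiles for any interface
no-auto-default=*
```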
Comment 29 Dan Williams 2014-06-27 20:46:35 UTC
(In reply to comment #28)
> As always there's a simple workaround for these problems, though it doesn't

That should read "though it *does*"...
Comment 30 Dan Winship 2014-06-29 15:07:49 UTC
(In reply to comment #25)
> What does keeping it as NMDeviceWired bring us then, when it's really a
> software device?  Possibly some additional D-Bus properties in the API, but
> other than that?

If it's always default-unmanaged then it doesn't really matter. But once we make it sometimes not default-unmanaged, it means that it would look like an ordinary ethernet interface to clients. Eg, it would show up in nm-applet. Also, it would get an auto-default connection. Whereas if we made it NMDeviceGeneric, then either it wouldn't, or else we'd have to change the auto-default code in order to make it happen.

> Also, I think one reason veth was originally treated as NMDeviceWired was so
> that people could test NM more reasonably in a VM because veth was
> almost-ethernet but still software that could simulate real hardware to some
> degree.  We'll lose that too.

Right, but anyone trying to do that would already be running into problems with NM trying to auto-default one or both ends of the veth on the outside too, so it's not like that feature was really working anyway.
Comment 31 Dan Winship 2014-06-29 15:13:58 UTC
(In reply to comment #27)
> The proposed patch doesn't cover virtual network devices in general, afaics.
> 
> At least with the patch applied, NM still tries to manage virtualbox and vmware
> network interfaces.

NM does not currently have any support for recognizing virtualbox and vmware devices. Patches for that are welcome.

IIRC, with vmware at least, the inside-the-vm interface actually looks like a real ethernet card, and so we don't need to do anything special there; all we need to do is just ignore the outside-the-vm interface, right? I assume virtualbox is probably the same. So NMPlatform just needs to recognize them and give them their own NMLinkType values, and then NMManager should automatically make them NMDeviceGeneric.
Comment 32 Michael Biebl 2014-06-29 23:55:11 UTC
(In reply to comment #31)
> (In reply to comment #27)
> > The proposed patch doesn't cover virtual network devices in general, afaics.
> > 
> > At least with the patch applied, NM still tries to manage virtualbox and vmware
> > network interfaces.
> 
> NM does not currently have any support for recognizing virtualbox and vmware
> devices. Patches for that are welcome.

The virtualbox and vmware ethernet interfaces show up under 
/sys/devices/virtual/net/

vbox:
udevadm info --query=path --path=/sys/class/net/vboxnet0
/devices/virtual/net/vboxnet0

vmware:
udevadm info --query=path --path=/sys/class/net/vmnet1
/devices/virtual/net/vmnet1


Might be a good idea to treat all virtual interfaces (besides lo, well lo is special anyway) as unmanaged.
Comment 33 Michael Biebl 2014-06-30 00:35:45 UTC
Dan, I've attached a PoC patch which marks devices that match /devices/virtual/net/ as unmanaged.
Comment 34 Michael Biebl 2014-06-30 00:36:25 UTC
Created attachment 279566 [details] [review]
mark virtual ethernet devices as unmanaged
Comment 35 Michael Biebl 2014-06-30 00:37:28 UTC
Created attachment 279567 [details] [review]
mark virtual ethernet devices as unmanaged
Comment 36 Michael Biebl 2014-06-30 00:38:41 UTC
Created attachment 279568 [details] [review]
mark virtual ethernet devices as unmanaged
Comment 37 Colin Walters 2014-07-01 22:31:43 UTC
(In reply to comment #24)
> Created an attachment (id=279123) [details] [review]
> core: make veth devices default-unmanaged for now

This patch did *not* work for me.  I'm as yet unsure why.  My testing methodology was to add this patch to the Fedora rawhide NetworkManager, recompose an Atomic Docker host, reboot, docker run fedora:20 echo hello world.

Jul 01 17:49:29 localhost.localdomain NetworkManager[704]: <info> Activation (veth57da) Stage 2 of 5 (Device Configure) complete.
Jul 01 17:49:29 localhost.localdomain NetworkManager[704]: <info> Activation (veth57da) Stage 2 of 5 (Device Configure) successful.
Jul 01 17:49:29 localhost.localdomain kernel: SELinux: initialized (dev devpts, type devpts), uses mountpoint labeling
Jul 01 17:49:29 localhost.localdomain kernel: SELinux: initialized (dev tmpfs, type tmpfs), uses mountpoint labeling
Jul 01 17:49:29 localhost.localdomain kernel: SELinux: initialized (dev tmpfs, type tmpfs), uses mountpoint labeling
Jul 01 17:49:29 localhost.localdomain kernel: SELinux: initialized (dev sysfs, type sysfs), uses genfs_contexts
Jul 01 17:49:29 localhost.localdomain NetworkManager[704]: <info> (veth57da): device state change: prepare -> config (reason 'none') [40 50 0]
Jul 01 17:49:29 localhost.localdomain NetworkManager[704]: <info> Activation (veth57da) Stage 2 of 5 (Device Configure) starting...
Jul 01 17:49:29 localhost.localdomain NetworkManager[704]: <info> Activation (veth57da) Stage 1 of 5 (Device Prepare) complete.
Jul 01 17:49:29 localhost.localdomain NetworkManager[704]: <info> Activation (veth57da) Stage 2 of 5 (Device Configure) scheduled...
Jul 01 17:49:29 localhost.localdomain NetworkManager[704]: <info> (veth57da): device state change: disconnected -> prepare (reason 'none') [30 40 0]
Jul 01 17:49:29 localhost.localdomain NetworkManager[704]: <info> Activation (veth57da) Stage 1 of 5 (Device Prepare) started...
Jul 01 17:49:29 localhost.localdomain docker[1186]: [c9e08627] -job start(e442651b940d6f2a8d58d8d4d254725588261c987a7d2610bf483f5f5df7c431) = OK (0)
Jul 01 17:49:29 localhost.localdomain kernel: docker0: port 1(veth57da) entered forwarding state
Jul 01 17:49:29 localhost.localdomain kernel: docker0: port 1(veth57da) entered forwarding state
Jul 01 17:49:29 localhost.localdomain kernel: IPv6: ADDRCONF(NETDEV_CHANGE): veth57da: link becomes ready
Jul 01 17:49:29 localhost.localdomain NetworkManager[704]: <info> Activation (veth57da) Stage 1 of 5 (Device Prepare) scheduled...
Jul 01 17:49:29 localhost.localdomain NetworkManager[704]: <info> Activation (veth57da) starting connection 'veth57da'
Jul 01 17:49:29 localhost.localdomain NetworkManager[704]: <info> (veth57da): device state change: unavailable -> disconnected (reason 'connection-assumed') [20 30 41]
Jul 01 17:49:29 localhost.localdomain NetworkManager[704]: <info> (docker0): link connected
Jul 01 17:49:29 localhost.localdomain NetworkManager[704]: <info> (veth57da): link connected
Jul 01 17:49:29 localhost.localdomain kernel: IPv6: ADDRCONF(NETDEV_UP): veth57da: link is not ready
Jul 01 17:49:29 localhost.localdomain kernel: device veth57da entered promiscuous mode
Jul 01 17:49:29 localhost.localdomain kernel: netlink: 1 bytes leftover after parsing attributes in process `docker'.
Jul 01 17:49:29 localhost.localdomain NetworkManager[704]: <info> (veth57da): device state change: unmanaged -> unavailable (reason 'connection-assumed') [10 20 41]
Jul 01 17:49:29 localhost.localdomain NetworkManager[704]: <info>     read connection 'veth57da'
Jul 01 17:49:29 localhost.localdomain NetworkManager[704]: <info> (docker0): IPv6 config waiting until carrier is on
Jul 01 17:49:29 localhost.localdomain NetworkManager[704]: <info> (veth57da): enslaved to docker0
Jul 01 17:49:29 localhost.localdomain NetworkManager[704]: <info> (docker0): bridge port veth57da was attached
Jul 01 17:49:29 localhost.localdomain NetworkManager[704]: <info> (veth57da): exported as /org/freedesktop/NetworkManager/Devices/4
Jul 01 17:49:29 localhost.localdomain NetworkManager[704]: <info> (veth57da): new Veth device (driver: 'unknown' ifindex: 5)
Jul 01 17:49:29 localhost.localdomain NetworkManager[704]: <info> (veth57da): carrier is OFF
Jul 01 17:49:29 localhost.localdomain systemd-udevd[1299]: Could not apply link config to veth659f
Jul 01 17:49:29 localhost.localdomain systemd-udevd[1300]: Could not apply link config to veth57da
Jul 01 17:49:29 localhost.localdomain NetworkManager[704]: <info> (veth659f): exported as /org/freedesktop/NetworkManager/Devices/3
Jul 01 17:49:29 localhost.localdomain NetworkManager[704]: <info> (veth659f): new Veth device (driver: 'unknown' ifindex: 4)
Jul 01 17:49:29 localhost.localdomain NetworkManager[704]: <info> (veth659f): carrier is OFF
Comment 38 Dan Williams 2014-07-02 19:47:46 UTC
(In reply to comment #37)
> (In reply to comment #24)
> > Created an attachment (id=279123) [details] [review] [details] [review]
> > core: make veth devices default-unmanaged for now
> 
> This patch did *not* work for me.  I'm as yet unsure why.  My testing
> methodology was to add this patch to the Fedora rawhide NetworkManager,
> recompose an Atomic Docker host, reboot, docker run fedora:20 echo hello world.

(Took me a second to figure out the logs are reversed, oldest entries are at the bottom and newest at the top...)

Are you sure it did not work?  Remember, we *do* expect NM to notice and seamlessly manage (without touching!) all interfaces on the system.

It looks like the patch correctly suppresses the "Wired connection 1" creation, which is the bug we all talked about originally.  In the logs you show here, NM is correctly "assuming" the connection of the docker veth interfaces:

NetworkManager[704]: <info> (veth57da): device state change: unmanaged -> unavailable (reason 'connection-assumed') [10
20 41]
NetworkManager[704]: <info>     read connection 'veth57da'

which means that NM is reading the existing configuration and not touching it.

Does the behavior with the patch actually cause any problems?  It should have caused problems *without* the patch, which you would see by the presence of "Wired connection 1" connections for the veth interfaces.
Comment 39 Dan Williams 2014-07-02 20:20:52 UTC
(In reply to comment #24)
> Created an attachment (id=279123) [details] [review]
> core: make veth devices default-unmanaged for now
> 
> We only want to treat a veth device as ethernet if (a) NM is running
> inside a container, and (b) the veth in question is the "inside" end
> of the veth, not the outside. Unfortunately, we don't have good
> heuristics for this at the moment, so just ignore veths for now.

I pushed this patch to git master and nm-0-9-10.

Leaving this bug open to discuss mbiebl's additional patches and further issues around virtual interfaces.
Comment 40 Colin Walters 2014-07-03 12:53:18 UTC
Ok, the "Activating..." log entries made me think there was something else going on.

So...all of these log messages are basically NM telling us it's not doing anything?  =/
Comment 41 Dan Williams 2014-07-03 16:19:04 UTC
(In reply to comment #40)
> Ok, the "Activating..." log entries made me think there was something else
> going on.
> 
> So...all of these log messages are basically NM telling us it's not doing
> anything?  =/

Yeah, basically.  I suppose they are more relevant when NM *is* actually configuring the interface.  "Assumed" connections go through the same steps as configured connections, so the same log messages get printed.
Comment 42 John Conroy 2014-07-20 00:37:38 UTC
(In reply to comment #36)
> Created an attachment (id=279568) [details] [review]
> mark virtual ethernet devices as unmanaged

I'm using VMware Workstation and have applied this patch against 0.9.9.95 as a test.  The patch did flag the interface as "unmanaged" upon creation but NM was still attempting to take over the interface.  Turns out, the ifcfg-rh plugin (I'm running CentOS 7) was resetting the flag.

Creating the appropriate config files in /etc/sysconfig/network-scripts/ (ifcfg-vmnet1, ifcfg-vmnet8, etc) with the following lines fixed the issue for me (but the plugin should probably ignore virtual interfaces when no configs are present):

DEVICE=vmnet1
NAME=vmnet1
ONBOOT=no
NM_CONTROLLED=no
Comment 43 Dan Winship 2014-07-25 13:38:18 UTC
Comment on attachment 279568 [details] [review]
mark virtual ethernet devices as unmanaged

You don't need to manually fetch its sysfs path from gudev; NMDevice already has it. Just call nm_device_get_udi().
Comment 44 Michael Biebl 2014-07-25 13:43:34 UTC
Thanks for the review, Dan.

I also noticed that the patch is not sufficient.
The assume functionality still seems to kick in *after* the device has been marked as unmanaged, so the device is shown to the user and can be disabled.

See e.g. https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=755015
Comment 45 John Conroy 2014-07-25 17:29:39 UTC
(In reply to comment #44)
> Thanks for the review, Dan.
> 
> I also noticed that the patch is not sufficient.
> The assume functionality still seems to kick in *after* the device has been
> marked as unmanaged. So the device is shown to the user and can be disabled
> 
> See e.g. https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=755015


See comment #42: the assumption of control (at least for me) was performed as the plugins read their configuration files.  The plugins would also need updating to prevent them from assuming control of unmanaged devices; otherwise, each such device will require a config file turning off NM control.
Comment 46 Dan Winship 2014-11-15 14:00:22 UTC
doesn't make sense for this to be blocking the nm-0.9.10 tracker...
Comment 47 Domen Kožar 2014-11-27 14:02:35 UTC
The attached patch does not remove unmanaged interfaces from nm-applet. In NixOS, we have 50 vboxnet interfaces that bloat the nm-applet window and thus prevent us from upgrading until this is fixed. Happy to test any changes :)
Comment 48 Dan Williams 2014-12-03 22:39:02 UTC
(In reply to comment #47)
> Attached patch does not remove unmanaged interfaces from nm-applet. In NixOS,
> we have 50 vboxnet interfaces that bloat the nm-applet window and thus prevent
> us from upgrading until this is fixed. Happy to test any changes :)

Instead of having NM ignore them, the appropriate fix for this is to selectively ignore certain interfaces in the applet.  The decision needs to be pushed to UI clients, not the NM core.  So another bug for nm-applet to ignore specific kinds of interfaces would be good, if you can file one!  (or clone this one using the tiny link at the bottom-right corner of this page and set the component to "applet").
Comment 49 Domen Kožar 2014-12-16 15:35:51 UTC
See https://bugzilla.gnome.org/show_bug.cgi?id=705197
Comment 50 Lubomir Rintel 2015-01-22 17:00:05 UTC
As the veth and software-devices part is now taken care of, I'm hijacking this for the rest (that is -- out-of-tree vbox/vmware drivers that masquerade as ordinary ethernets).

After some discussion with Thomas it seems that making this configurable with udev properties and blacklisting the drivers via udev rules seems to be a straightforward and flexible fix.

Please review/comment.

http://cgit.freedesktop.org/NetworkManager/NetworkManager/log/?h=lr/udev-unmanaged-fd731014

Thank you!
Comment 51 Thomas Haller 2015-01-22 17:19:04 UTC
>> platform: give the platform an opportunity to override default-unmanaged
    
fake platform should return FALSE for link_get_unmanaged().

otherwise LGTM
Comment 52 Michael Biebl 2015-01-22 18:17:15 UTC
(In reply to comment #50)
> As the veth and software-devices part is now taken care of, I'm hijacking this
> for the rest (that is -- out-of-tree vbox/vmware drivers that masquerade as
> ordinary ethernets).
> 
> After some discussion with Thomas it seems that making this configurable with
> udev properties and blacklisting the drivers via udev rules seems to be a
> straightforward and flexible fix.
> 
> Please review/comment.
> 
> http://cgit.freedesktop.org/NetworkManager/NetworkManager/log/?h=lr/udev-unmanaged-fd731014
> 

I like the idea. This would work for me.
I've tested the (two) patches, and they seem to do the trick.

Thanks
Comment 53 Dan Williams 2015-01-22 23:01:56 UTC
So this looks OK in general to me, but let's get danw's opinion too...

Should we also revert the veth change (ebeaeaed4d3180c27cc059fe0ff199153fa9ec24) and just ship udev rules for it in a separate package?  That would allow containers to *not* ship the rule and achieve default-DHCP inside the container, but NM on the host to ship the rule and leave the docker interface alone.

However, one drawback of this approach is that if we bundle all the rules into a single udev file, users cannot selectively enable/disable this for certain classes of interfaces, because the NM package owns the udev rules file and it will get overwritten on the next update on a package-managed system.  Also, at least in RPM land, we shouldn't mark the file %config either, because then we could never update the file if the user touches it.

Maybe each interface type should have a separate rules file?
Comment 54 Dan Williams 2015-01-22 23:12:12 UTC
mbiebl points out that customization can be done by the user adding a rule to /etc/udev/rules.d that overrides the NM-shipped rule, which sets NM_UNMANAGED to 0.  So ignore my comment about that.

So the only question I have is whether we should revert ebeaeaed4d3180c27cc059fe0ff199153fa9ec24 and convert that to a udev rule that could be disabled by user overrides.
Comment 55 Michael Biebl 2015-01-23 12:41:25 UTC
As mentioned on IRC, extending the list is quite easy by shipping a file as /etc/udev/rules.d/99-my-custom-interfaces.rules.

Overriding the complete list is easy by creating a file with the same name in /etc, i.e. /etc/udev/rules.d/70-nm-unmanaged.rules.


Overriding specific entries from /lib/udev/rules.d/70-nm-unmanaged.rules to mark them managed again should be simple as well. For example, if you wanted the VMware interfaces to be managed by NM again, you could ship a file like /etc/udev/rules.d/99-vmware-managed.rules, containing

SUBSYSTEM=="net", ENV{INTERFACE}=="vmnet[0-9]*", ENV{NM_UNMANAGED}="0"
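As an aside: udev's `==` comparisons use shell-style glob patterns, not regexes, so `vmnet[0-9]*` requires at least one digit after the prefix. A quick sketch of the same matching semantics using a shell `case` statement (the interface names are made up for illustration):

```shell
# udev match patterns are shell-style globs:
# "vmnet[0-9]*" means "vmnet", then one digit, then anything.
for ifname in vmnet1 vmnet8 eth0 vmnet; do
  case "$ifname" in
    vmnet[0-9]*) echo "$ifname: rule would match" ;;
    *)           echo "$ifname: rule would not match" ;;
  esac
done
```

Note that a bare `vmnet` does not match, since the pattern demands a digit.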


I think it's important that this NM_UNMANAGED variable be documented in the NetworkManager man page.
Comment 56 Pavel Simerda 2015-01-23 14:21:34 UTC
Ah, the udev way looks neat!
Comment 57 Dan Winship 2015-01-23 15:21:29 UTC
I don't think we should use default-unmanaged for this. Add a new NM_UNMANAGED_PLATFORM or NM_UNMANAGED_UDEV bit. (And update both the code and the commit messages accordingly.)

>+# These are out-of-tree drivers that create Ethernet devices, however
>+# they ship with their own management tools.
>+SUBSYSTEM=="net", ENV{INTERFACE}=="vboxnet[0-9]*", ENV{NM_UNMANAGED}="1"
>+SUBSYSTEM=="net", ENV{INTERFACE}=="vmnet[0-9]*", ENV{NM_UNMANAGED}="1"

Maybe add comments clarifying that the first is VirtualBox and the second is VMWare.
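For illustration, the commented rules might read something like this (a sketch only; the wording that actually landed in the rules file may differ):

```
# VirtualBox host-only networking interfaces (out-of-tree vboxnet* drivers);
# VirtualBox ships its own tools to configure them.
SUBSYSTEM=="net", ENV{INTERFACE}=="vboxnet[0-9]*", ENV{NM_UNMANAGED}="1"

# VMware host networking interfaces (out-of-tree vmnet driver);
# configured by VMware's own management tools.
SUBSYSTEM=="net", ENV{INTERFACE}=="vmnet[0-9]*", ENV{NM_UNMANAGED}="1"
```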

Also, would it be better to do ENV{NM_CONTROLLED}="no" rather than ENV{NM_UNMANAGED}="1", so that it matches the syntax in ifcfg-rh? (I'm not convinced it would, I'm just wondering.)

(In reply to comment #53)
> Should we also revert the veth change
> (ebeaeaed4d3180c27cc059fe0ff199153fa9ec24) and just ship udev rules for it in a
> separate package?

NMPlatform doesn't wait for udev rules to be processed on software devices, so that wouldn't work, would it? (I don't remember why NMPlatform behaves that way though...)
Comment 58 Dan Williams 2015-01-23 16:18:31 UTC
(In reply to comment #57)
> I don't think we should use default-unmanaged for this. Add a new
> NM_UNMANAGED_PLATFORM or NM_UNMANAGED_UDEV bit. (And update both the code and
> the commit messages accordingly.)
> 
> >+# These are out-of-tree drivers that create Ethernet devices, however
> >+# they ship with their own management tools.
> >+SUBSYSTEM=="net", ENV{INTERFACE}=="vboxnet[0-9]*", ENV{NM_UNMANAGED}="1"
> >+SUBSYSTEM=="net", ENV{INTERFACE}=="vmnet[0-9]*", ENV{NM_UNMANAGED}="1"
> 
> Maybe add comments clarifying that the first is VirtualBox and the second is
> VMWare.
> 
> Also, would it be better to do ENV{NM_CONTROLLED}="no" rather than
> ENV{NM_UNMANAGED}="1", so that it matches the syntax in ifcfg-rh? (I'm not
> convinced it would, I'm just wondering.)
> 
> (In reply to comment #53)
> > Should we also revert the veth change
> > (ebeaeaed4d3180c27cc059fe0ff199153fa9ec24) and just ship udev rules for it in a
> > separate package?
> 
> NMPlatform doesn't wait for udev rules to be processed on software devices, so
> that wouldn't work, would it? (I don't remember why NMPlatform behaves that way
> though...)

I recall it works that way because udev processing actually does take a while, and when NM creates a software interface it needs the interface to be available in the platform immediately so it can continue with interface setup and dependency stuff (like system_create_virtual_device() in the activation paths when creating the parent).

Also, previously udev was only used for detecting hardware details like DEVTYPE or the driver, and for software devices that's all known via netlink, so udev rules were unnecessary.

But yes, you bring up a great point and I think this would be a problem, but wouldn't it also be a problem for *any* software device including virtualbox/VMWare too?  If they create an interface, the kernel event will come in long before the udev event, which means NM might completely miss the fact that the interface should be unmanaged...

Possibly the only way around that is to use config options instead of udev rules, since we have those available immediately?
Comment 59 Thomas Haller 2015-01-23 17:14:19 UTC
(In reply to comment #58)
> (In reply to comment #57)

> > NMPlatform doesn't wait for udev rules to be processed on software devices, so
> > that wouldn't work, would it? (I don't remember why NMPlatform behaves that way
> > though...)
> 
> I recall it works that way because udev processing actually does take a while,
> and when NM creates a software interface it needs the interface to be available
> in the platform immediately so it can continue with interface setup and
> dependency stuff (like system_create_virtual_device() in the activation paths
> when creating the parent).
> 
> Also, previously udev was only used for detecting hardware stuff like DEVTYPE
> or the driver, and for software devices that's all known via netlink so udev
> rules were unecessary.
> 
> But yes, you bring up a great point and I think this would be a problem, but
> wouldn't it also be a problem for *any* software device including
> virtualbox/VMWare too?  If they create an interface, the kernel event will come
> in long before the udev event, which means NM might completely miss the fact
> that the interface should be unmanaged...
> 
> Possibly the only way around that is to use config options instead of udev
> rules, since we have those available immediately?

I think at least vboxnet0 looks like an ethernet device.

Also, for software devices we could leave them default-unmanaged by default
 1.) unless we created them ourselves,
 2.) or until we have a response from udev.
The latter would mean we might clear the unmanaged flag even after construction of the device instance (previously we didn't).


(In reply to comment #57)
> I don't think we should use default-unmanaged for this. Add a new
> NM_UNMANAGED_PLATFORM or NM_UNMANAGED_UDEV bit. (And update both the code and
> the commit messages accordingly.)

NM_UNMANAGED_DEFAULT behaves differently from other unmanaged flags, in that a device can still be activated on user request. In that sense, NM_UNMANAGED_PLATFORM should behave more like NM_UNMANAGED_DEFAULT.
For now, I think that differentiation is unnecessary.
Comment 60 Dan Winship 2015-01-23 18:53:38 UTC
(In reply to comment #58)
> But yes, you bring up a great point and I think this would be a problem, but
> wouldn't it also be a problem for *any* software device including
> virtualbox/VMWare too?

As Thomas said, NM thinks those devices are hardware (which is why we need this special casing in the first place; if we could already recognize that they were software, then we could just treat them like veth).

(In reply to comment #59)
> > I don't think we should use default-unmanaged for this. Add a new
> > NM_UNMANAGED_PLATFORM or NM_UNMANAGED_UDEV bit. (And update both the code and
> > the commit messages accordingly.)
> 
> NM_UNMANAGED_DEFAULT behaves differently from other unmanaged flags, in that a
> device still can be activated on user-request. In that sense,
> NM_UNMANAGED_PLATFORM should behave more like NM_UNMANAGED_DEFAULT.
> For now, I think that differentiation is unnecessary.

But I don't think we want that behavior for these devices; the discussion above was "if the user wants NM to configure the devices, they can override/uninstall the udev files". So just make them fully-unmanaged, and if the user doesn't want that, they can make them fully-managed.

(Also, unmanaged-default is pure concentrated evil.)
Comment 61 Thomas Haller 2015-01-24 18:04:08 UTC
(In reply to comment #60)
> (In reply to comment #58)
> > But yes, you bring up a great point and I think this would be a problem, but
> > wouldn't it also be a problem for *any* software device including
> > virtualbox/VMWare too?
> 
> As Thomas said, NM thinks those devices are hardware (which is why we need this
> special casing in the first place; if we could already recognize that they were
> software, then we could just treat them like veth).
> 
> (In reply to comment #59)
> > > I don't think we should use default-unmanaged for this. Add a new
> > > NM_UNMANAGED_PLATFORM or NM_UNMANAGED_UDEV bit. (And update both the code and
> > > the commit messages accordingly.)
> > 
> > NM_UNMANAGED_DEFAULT behaves differently from other unmanaged flags, in that a
> > device still can be activated on user-request. In that sense,
> > NM_UNMANAGED_PLATFORM should behave more like NM_UNMANAGED_DEFAULT.
> > For now, I think that differentiation is unnecessary.
> 
> But I don't think we want that behavior for these devices; the discussion above
> was "if the user wants NM to configure the devices, they can override/uninstall
> the udev files". So just make them fully-unmanaged, and if the user doesn't
> want that, they can make them fully-managed.
> 
> (Also, unmanaged-default is pure concentrated evil.)

A fully unmanaged device also does not assume connections. But assuming external connections is a great feature to expose runtime information and run dispatcher scripts.
I think we really want the UNMANAGED_DEFAULT behavior.

-- on the other hand, probably we also would like to assume connections for fully unmanaged devices, to get these features. But that should be done in a later step and is a different discussion.

For now, setting UNMANAGED_DEFAULT seems good to me -- but eventually, I think we should refactor the code to separate the UNMANAGED_DEFAULT flag from the other UNMANAGED flags (as they have different meanings).
Comment 62 Lubomir Rintel 2015-01-26 16:51:02 UTC
(In reply to comment #51)
> >> platform: give the platform an opportunity to override default-unmanaged
> 
> fake platform should return FALSE for link_get_unmanaged().
> 
> otherwise LGTM

Whoops, it was get_managed() before; I decided to reverse it to match "default-unmanaged" but forgot to reverse the fake platform's return value.

Fixed.

(In reply to comment #53)
> So this looks OK in general to me, but lets get danw's opinion too...
> 
> Should we also revert the veth change
> (ebeaeaed4d3180c27cc059fe0ff199153fa9ec24) and just ship udev rules for it in a
> separate package?  That would allow containers to *not* ship the rule and
> achieve default-DHCP inside the container, but NM on the host to ship the rule
> and leave the docker interface alone.

Yes. I've reverted it in an updated version of the branch.

Also, there's no udev in the containers, which has a couple of implications:

* No device ever reaches the "initialized" state, thus the g_udev_enumerator_add_match_is_initialized() match never works.  I've tried to address that in bug #740526; I've now included a modified version of the patch here.

* udev rules don't work inside a container. That means we get the desired behavior (unmanaged outside, managed inside) by default. Whether this is good enough or just dumb luck remains questionable to me; I tend to think it's fine.

(In reply to comment #54)
> SUBSYSTEM=="net", ENV{INTERFACE}=="vmnet[0-9]*", ENV{NM_UNMANAGED}="0"
> 
> I think, it's important that this NM_UNMANAGED variable is documented in the
> NetworkManager man page

Yes, that is a good idea. Added.

(In reply to comment #57)
> >+# These are out-of-tree drivers that create Ethernet devices, however
> >+# they ship with their own management tools.
> >+SUBSYSTEM=="net", ENV{INTERFACE}=="vboxnet[0-9]*", ENV{NM_UNMANAGED}="1"
> >+SUBSYSTEM=="net", ENV{INTERFACE}=="vmnet[0-9]*", ENV{NM_UNMANAGED}="1"
> 
> Maybe add comments clarifying that the first is VirtualBox and the second is
> VMWare.

A good idea.  Added some rather extensive comments.

> Also, would it be better to do ENV{NM_CONTROLLED}="no" rather than
> ENV{NM_UNMANAGED}="1", so that it matches the syntax in ifcfg-rh? (I'm not
> convinced it would, I'm just wondering.)

Hmm, I'd like to stick with the negative form, since most other tools seem to use it, and it makes some sense in the context of overriding an inclusive default.

The semantics are a bit different from what NM_CONTROLLED does, at least as long as this uses the default-unmanaged flag.

A quick look at other similar udev properties reveals the following ones: UDISKS_IGNORE, ID_MM_DEVICE_IGNORE, PULSE_IGNORE, COLORD_IGNORE. Should we use NM_IGNORE? We don't really ignore the device...

(In reply to comment #58)
> (In reply to comment #57)
> > I don't think we should use default-unmanaged for this. Add a new
> > NM_UNMANAGED_PLATFORM or NM_UNMANAGED_UDEV bit. (And update both the code and
> > the commit messages accordingly.)
> > 
> > >+# These are out-of-tree drivers that create Ethernet devices, however
> > >+# they ship with their own management tools.
> > >+SUBSYSTEM=="net", ENV{INTERFACE}=="vboxnet[0-9]*", ENV{NM_UNMANAGED}="1"
> > >+SUBSYSTEM=="net", ENV{INTERFACE}=="vmnet[0-9]*", ENV{NM_UNMANAGED}="1"
> > 
> > Maybe add comments clarifying that the first is VirtualBox and the second is
> > VMWare.
> > 
> > Also, would it be better to do ENV{NM_CONTROLLED}="no" rather than
> > ENV{NM_UNMANAGED}="1", so that it matches the syntax in ifcfg-rh? (I'm not
> > convinced it would, I'm just wondering.)
> > 
> > (In reply to comment #53)
> > > Should we also revert the veth change
> > > (ebeaeaed4d3180c27cc059fe0ff199153fa9ec24) and just ship udev rules for it in a
> > > separate package?
> > 
> > NMPlatform doesn't wait for udev rules to be processed on software devices, so
> > that wouldn't work, would it? (I don't remember why NMPlatform behaves that way
> > though...)
> 
> I recall it works that way because udev processing actually does take a while,
> and when NM creates a software interface it needs the interface to be available
> in the platform immediately so it can continue with interface setup and
> dependency stuff (like system_create_virtual_device() in the activation paths
> when creating the parent).
> 
> Also, previously udev was only used for detecting hardware stuff like DEVTYPE
> or the driver, and for software devices that's all known via netlink so udev
> rules were unecessary.
> 
> But yes, you bring up a great point and I think this would be a problem, but
> wouldn't it also be a problem for *any* software device including
> virtualbox/VMWare too?  If they create an interface, the kernel event will come
> in long before the udev event, which means NM might completely miss the fact
> that the interface should be unmanaged...
> 
> Possibly the only way around that is to use config options instead of udev
> rules, since we have those available immediately?

I think the actual reason why software devices don't wait for udev is that containers ordinarily do have software devices but don't have udev, and thus the devices would never be initialized.

That said, it looks like a bit of a hack and is not flexible enough for some use cases -- hardware devices in containers and software devices configured by udev properties.  I've included a modified version of a patch from bug #740526 and will close that one as a duplicate.

(In reply to comment #61)
> (In reply to comment #60)
> > (In reply to comment #58)
> > > But yes, you bring up a great point and I think this would be a problem, but
> > > wouldn't it also be a problem for *any* software device including
> > > virtualbox/VMWare too?
> > 
> > As Thomas said, NM thinks those devices are hardware (which is why we need this
> > special casing in the first place; if we could already recognize that they were
> > software, then we could just treat them like veth).

Yes. They live in /virtual/ in the sysfs hierarchy, which means there's no parent hardware device to fetch the DRIVERS key from.

Apart from veth, none of them implement ethtool so ID_NET_DRIVER is not set either.

To RTNETLINK they look like link/ether.

> > (In reply to comment #59)
> > > > I don't think we should use default-unmanaged for this. Add a new
> > > > NM_UNMANAGED_PLATFORM or NM_UNMANAGED_UDEV bit. (And update both the code and
> > > > the commit messages accordingly.)
> > > 
> > > NM_UNMANAGED_DEFAULT behaves differently from other unmanaged flags, in that a
> > > device still can be activated on user-request. In that sense,
> > > NM_UNMANAGED_PLATFORM should behave more like NM_UNMANAGED_DEFAULT.
> > > For now, I think that differentiation is unnecessary.
> > 
> > But I don't think we want that behavior for these devices; the discussion above
> > was "if the user wants NM to configure the devices, they can override/uninstall
> > the udev files". So just make them fully-unmanaged, and if the user doesn't
> > want that, they can make them fully-managed.
> > 
> > (Also, unmanaged-default is pure concentrated evil.)

Why do we use precisely that behavior for generic devices? (as opposed to what NM_CONTROLLED=no does)

> A fully unmanaged device also does not assume connections. But assuming
> external connections is a great feature to expose runtime information and run
> dispatcher scripts.
> I think we really want an UNMANGED_DEFAULT behavior.
> 
> -- on the other hand, probably we also would like to assume connections for
> fully unmanaged devices, to get these features. But that should be done in a
> later step and is a different discussion.
> 
> For now, setting UNMANAGED_DEFAULT seems good to me -- but eventually, I think
> we should refactor the code to separate the UNMANGAED_DEFAULT flag from the
> other UNMANGED flags (as they have different meaning)

I left this UNMANAGED_DEFAULT for now; it makes sense to me too. I think we only want the user to override the udev rules in corner cases, and we still want the user to be able to attach a connection to the device manually if they wish.

Pushed an updated changeset:
http://cgit.freedesktop.org/NetworkManager/NetworkManager/log/?h=lr/udev-unmanaged-fd731014

986a108 contrib: ensure udev rules from RPM package are applied
- Added this one. We should have done this already; we do already ship udev rules.

ace0122 platform: don't wait for udev device initialization if there's no udev
- Carried over from bug #740526. Uses udev consistently for both software and hardware links. A consensus on the proper way to detect whether udev is running was not reached.

06e7120 platform: give the platform an opportunity to override default-unmanaged
- Just added the documentation here

55c3c3b data: add udev rules to make emulated ethernet devices default unmanaged
- Added veth, parallels and a lot of comments

16fea18 data: move OLPC MESH udev rules away from callouts
- No change

337a36f Revert "core: make veth devices default-unmanaged for now"
- Added
Comment 63 Lubomir Rintel 2015-01-26 16:52:16 UTC
*** Bug 740526 has been marked as a duplicate of this bug. ***
Comment 64 Dan Winship 2015-01-26 17:45:45 UTC
(In reply to comment #62)
> > > As Thomas said, NM thinks those devices are hardware (which is why we need this
> > > special casing in the first place; if we could already recognize that they were
> > > software, then we could just treat them like veth).
> 
> Yes. They live in /virtual/ in the sysfs hierarchy, which means there's no
> parent hardware device to fetch DRIVERS key from.

Ah... well, no real ethernet device is going to be in /virtual/, so we could just recognize them based on that then.

> Why do we use precisely that behavior for generic devices? (as opposed to what
> NM_CONTROLLED=no does)

When we added NMDeviceGeneric in 0.9.9, NM consistently interfered too much with managed devices, so we had to make the new devices unmanaged or NM would have completely broken them. But we wanted to provide for the *possibility* of using NM to manage your tun/veth/whatever device. So I invented default-unmanaged as a way of getting NM to ignore the device unless you tried to activate a connection on it.

But in retrospect, it was a bad idea, because it requires too many weird special cases scattered over too many different functions. And it's likely that with the work that has happened in the last year, we no longer need default-unmanaged for NMDeviceGeneric anyway; we're now good enough at not interfering with other people's devices that we should be able to just treat them as managed.
Comment 65 Lubomir Rintel 2015-01-26 18:44:02 UTC
(In reply to comment #64)
> (In reply to comment #62)
> > > > As Thomas said, NM thinks those devices are hardware (which is why we need this
> > > > special casing in the first place; if we could already recognize that they were
> > > > software, then we could just treat them like veth).
> > 
> > Yes. They live in /virtual/ in the sysfs hierarchy, which means there's no
> > parent hardware device to fetch DRIVERS key from.
> 
> Ah... well, no real ethernet device is going to be in /virtual/, so we could
> just recognize them based on that then.

But there are devices in /virtual/net other than these soft-ethernets -- bridges, teams, dummies, loopback. DEVPATH=="*/virtual/*" is just too inclusive.
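A small sketch of why that match is too broad (the device paths here are illustrative):

```shell
# All software netdevs live under /devices/virtual/net/, so a
# DEVPATH=="*/virtual/*" match would also catch bridges, teams,
# dummies and loopback -- not only the vboxnet/vmnet interfaces.
for devpath in \
    /devices/virtual/net/vboxnet0 \
    /devices/virtual/net/br0 \
    /devices/virtual/net/lo \
    /devices/pci0000:00/0000:00:19.0/net/eth0; do
  case "$devpath" in
    */virtual/*) echo "$devpath: matched" ;;
    *)           echo "$devpath: not matched" ;;
  esac
done
```

Only the real PCI ethernet escapes the match; br0 and lo are caught alongside vboxnet0.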

If we keep the logic in udev (and I'd really like to, since it's rather flexible and happens to fix the veth-in-container issue), I think selectively picking the known offending drivers is a better idea.
Comment 66 Michael Biebl 2015-01-26 18:45:55 UTC
(In reply to comment #64)
> (In reply to comment #62)
> > > > As Thomas said, NM thinks those devices are hardware (which is why we need this
> > > > special casing in the first place; if we could already recognize that they were
> > > > software, then we could just treat them like veth).
> > 
> > Yes. They live in /virtual/ in the sysfs hierarchy, which means there's no
> > parent hardware device to fetch DRIVERS key from.
> 
> Ah... well, no real ethernet device is going to be in /virtual/, so we could
> just recognize them based on that then.
> 

That's what I initially proposed and tried to implement in
https://bug731014.bugzilla-attachments.gnome.org/attachment.cgi?id=279568
Comment 67 Dan Williams 2015-02-23 23:33:42 UTC
In general I want to move away from fully-unmanaged and more towards default-unmanaged devices, because NM shouldn't be interfering with the devices anyway.  Maybe we just rename default-unmanaged -> unmanaged and get rid of fully-unmanaged.  But then NM_UNMANAGED would mean "default-unmanaged" rather than fully-unmanaged, which could be confusing for users in the future.

danw is right that default-unmanaged is somewhat of a hack, but I think it's a worthwhile one that we should try to clean up instead of throwing out.

> platform: give the platform an opportunity to override default-unmanaged

In the commit message: "In the ent" -> "In the end" ?

I'm a bit worried about making everything go through udev, but I honestly don't remember why software devices were short-cut before, other than possibly speed of detection.  Since the only reason we cared about udev at the time was for hardware device drivers and properties, only hardware things had to go through udev.  udev can sometimes take a long time, and this may cause some startup concurrency issues; for now it's possible for a bridge port to be detected before the bridge itself when NM starts up, so we need to make sure the port can still correctly assume its connection, and handle other cases like that.

That's all I've got on the code; it looks solid.  It's just the side effects that I'm not 100% sure about, but I think it's probably OK to merge this and work those out later?
Comment 68 Dan Winship 2015-02-25 15:52:02 UTC
(In reply to Dan Williams from comment #67)
> In general I want to move away from fully-unmanaged and more towards
> default-unmanaged devices...
> 
> danw is right that default-unmanaged is somewhat of a hack, but I think it's
> a worthwhile one that we should try to clean up instead of throwing out.

I really think it would be better to kill off default-unmanaged (specifically, the ability to activate a connection from NM_DEVICE_STATE_UNMANAGED), and if you want to bring up a connection on an unmanaged device, you just set the Device.Managed property to TRUE first.
Comment 69 Thomas Haller 2015-02-25 17:54:27 UTC
(In reply to Dan Winship from comment #68)
> (In reply to Dan Williams from comment #67)
> > In general I want to move away from fully-unmanaged and more towards
> > default-unmanaged devices...
> > 
> > danw is right that default-unmanaged is somewhat of a hack, but I think it's
> > a worthwhile one that we should try to clean up instead of throwing out.
> 
> I really think it would be better to kill off default-unmanaged
> (specifically, the ability to activate a connection from
> NM_DEVICE_STATE_UNMANAGED), and if you want to bring up a connection on an
> unmanaged device, you just set the Device.Managed property to TRUE first.

But unless that is implemented, the user cannot bring a device to managed, while the udev rules part is already done.
I agree with the overall goal, but it seems too much for this bug.
Comment 70 Lubomir Rintel 2015-03-21 11:20:43 UTC
Opened bug #746566 for the unmanaged state follow-up. I tend to agree with Winship; making PROP_MANAGED READWRITE and maybe adding a "nmcli d manage <dev>" makes sense to me.

Anything that would block merging the UDev part here?
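
For illustration, once PROP_MANAGED is READWRITE (the follow-up in bug #746566), flipping a device back to managed could be done at the D-Bus level roughly like this; the device object path here is made up:

```
# Sketch: set the Managed property on a device via D-Bus.
# Requires the follow-up work making the property writable;
# /org/freedesktop/NetworkManager/Devices/2 is a hypothetical path.
busctl set-property org.freedesktop.NetworkManager \
    /org/freedesktop/NetworkManager/Devices/2 \
    org.freedesktop.NetworkManager.Device \
    Managed b true
```

A "nmcli d manage <dev>" subcommand would then just be a convenience wrapper around this property set.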
Comment 71 Lubomir Rintel 2015-03-23 14:18:12 UTC
Merged

ca96793 platform: merge branch 'lr/udev-unmanaged-fd731014'
498d45e Revert "core: make veth devices default-unmanaged for now"
ceea3c6 data: move OLPC MESH udev rules away from callouts
7ba30cf data: add udev rules to make emulated ethernet devices default unmanaged
85ee1f4 platform: give the platform an opportunity to override default-unmanaged
4a05869 platform: don't wait for udev device initializaton if there's no udev
b3667af contrib: ensure udev rules from RPM package are applied

Default-unmanaged follow-up here: bug #746566
Comment 72 Vladimír Čunát 2015-04-12 08:29:32 UTC
I can't see these fixes on any stable branches (0.9.10 or 1.0). Is there a reason not to have them in there?
Comment 73 Dan Williams 2015-04-15 19:19:51 UTC
Lubomir, are these easy to backport, along with any additional fixups that we've done for this branch since?
Comment 74 Lubomir Rintel 2015-04-16 18:21:33 UTC
Vladimir, Dan: I'll add them to 1.0 branch after we do the 1.0.2 release.
Comment 75 Damien Cassou 2015-05-20 16:19:39 UTC
Lubomir: 1.0.2 has been released. Could you please do the backport? Thanks
Comment 76 Damien Cassou 2015-05-21 16:15:04 UTC
I've just updated my system to use 1.0.2 and can still see 2 wired interfaces including "vboxnet0" that is named "Ethernet" (as opposed to my actual ethernet interface that is named "PCI Ethernet").
Comment 77 Thomas Haller 2015-05-21 16:26:09 UTC
(In reply to Damien Cassou from comment #76)
> I've just updated my system to use 1.0.2 and can still see 2 wired
> interfaces including "vboxnet0" that is named "Ethernet" (as opposed to my
> actual ethernet interface that is named "PCI Ethernet").

What is the output of
  $ nmcli device

and
  $ nmcli connection

?

Probably the things named "Ethernet" or "PCI Ethernet" are not devices, but connections. A "connection" in NetworkManager is a set of configuration options, a ~profile~.

The effect of this bug's fix would be that the device "vboxnet0" ends up in the unmanaged state (see `nmcli device`).


As you already noted, these changes are not yet backported to the nm-1-0 branch (and are not in 1.0.2). That will happen.
Comment 78 Damien Cassou 2015-05-22 04:25:04 UTC
$ nmcli device
DEVICE    TYPE      STATE         CONNECTION 
wlp2s0    wifi      connected     mybox      
vboxnet0  ethernet  disconnected  --         
enp1s0f0  ethernet  unavailable   --         
lo        loopback  unmanaged     --

$ nmcli connection
NAME                    UUID                                  TYPE             DEVICE 
XX                      XXX-XXX-XXX  802-11-wireless  --     
XX                      XXX-XXX-XXX  802-11-wireless  --     
Wired connection 1      XXX-XXX-XXX  802-3-ethernet   --     
2nd_floor               XXX-XXX-XXX  802-11-wireless  --     
mybox                   XXX-XXX-XXX  802-11-wireless  wlp2s0 
WIFI-AIRPORT            XXX-XXX-XXX  802-11-wireless  --     
Wired connection 2      XXX-XXX-XXX  802-3-ethernet   --     


I thought the bug was fixed in 1.0.2.
Comment 79 Jiri Klimes 2015-05-22 11:27:00 UTC
(In reply to Damien Cassou from comment #76)
> I've just updated my system to use 1.0.2 and can still see 2 wired
> interfaces including "vboxnet0" that is named "Ethernet" (as opposed to my
> actual ethernet interface that is named "PCI Ethernet").

By the "Ethernet" and "PCI Ethernet" names, I guess you mean the GNOME 3 control center/applet naming. The name comes from nma_utils_disambiguate_device_names() in libnm-gtk.
But what is the actual problem?
Comment 80 Lubomir Rintel 2015-05-28 14:04:49 UTC
Merged into nm-1-0.

There are a number of subsequent bugfixes with some dependencies. I've tested it extensively in different setups and it worked fine for me. Hopefully I didn't miss anything important:

8dd973d merge: 'lr/nm-1-0-udev-unmanaged-fd731014' into nm-1-0
95af757 platform: fix root-tests after adding link detection without udev
e802442 platform: keep udev-device in udev_device_added() even if there is no netlink object
e737b86 platform: intern driver string for NMPlatformLink
00525c3 platform: refactor extraction of type-name for link
21621b2 platform: expose nm_link_type_to_string() function
cbe890c platform: detect TUN/TAP device in link_extract_type() independently of platform cache
4623438 core: remove G_GNUC_WARN_UNUSED_RESULT from ASSERT_VALID_PATH_COMPONENT()
197bd60 core: change activation failure messages to debug level
885273c platform: rework link type detection for better fallback (bgo #743209)
649bfff platform: refactor link-type to string conversion
2c707ab platform: don't wait for udev before announcing links
bc500c4 core: add generic NMDevice function to recheck availability
f71e5c2 platform: don't use udev for link type determination
67c0693 platform/trivial: move ethtool code
068fb9d platform/trivial: move udev_detect_link_type_from_device()
48e0ac2 platform: split link_type_from_udev()
4732b06 platform/trivial: move udev code closer to beginning of nm-linux-platform.c
25565e4 platform/trivial: remove unused argument from udev_get_driver()
7fa8f66 platform/trivial: move code around so that libnl related stuff is together
705b7d5 device: turn off "unmanaged unless IFF_UP externally" for veth
1b95eab device: move the decision whether to wait for IFF_UP a virtual function
70db361 bridge/bond/team: device availability shouldn't depend on IFF_UP (bgo #746918)
093f0bc platform: prevent warning when udev is clueless about a device
249b569 Revert "core: make veth devices default-unmanaged for now"
f16c8ca data: move OLPC MESH udev rules away from callouts
68ba16c data: add udev rules to make emulated ethernet devices default unmanaged
b30a9e9 platform: give the platform an opportunity to override default-unmanaged
6810226 platform: don't wait for udev device initializaton if there's no udev
75739b3 contrib: ensure udev rules from RPM package are applied
Comment 81 Damien Cassou 2015-05-28 15:58:08 UTC
(In reply to Jiri Klimes from comment #79)
> With "Ethernet", "PCI Ethernet" names, I guess you mean Gnome3 control
> center/applet naming. The name is get by
> nma_utils_disambiguate_device_names() from libnm-gtk.
> But, what is the actual problem?

The problem is that I see 2 ethernet devices but my computer only has one. And it looks like one of the two comes from VirtualBox.
Comment 82 Thomas Haller 2015-05-28 18:40:53 UTC
(In reply to Damien Cassou from comment #81)
> (In reply to Jiri Klimes from comment #79)
> > With "Ethernet", "PCI Ethernet" names, I guess you mean Gnome3 control
> > center/applet naming. The name is get by
> > nma_utils_disambiguate_device_names() from libnm-gtk.
> > But, what is the actual problem?
> 
> The problem is that I see 2 ethernet devices but my computer only has one.
> And it looks like one of the two comes from VirtualBox.

You said you were using 1.0.2. As mentioned already, 1.0.2 does not have the fixes from this bug.

Also, the fix would not hide the devices; it would make them "unmanaged". Look at the output of `nmcli device show` to see the state of the devices.

Anyway, you don't say what exactly your problem is. If you run VirtualBox (and it creates additional interfaces), then yes, you will see those interfaces in NetworkManager too. That is expected and not a bug.