GNOME Bugzilla – Bug 687254
NetworkManager refuses to connect virtualized eth0 in an LXC container
Last modified: 2013-05-24 19:12:09 UTC
Description of problem:

I'm trying to set up a virtual server in an LXC container (Linux 3.5.0 vanilla). Debian is the host and Fedora-ARM is the guest. When I restart NetworkManager.service (using systemd), I get the following line in the logs (journalctl):

<warn> /sys/devices/virtual/net/eth0: couldn't determine device driver; ignoring...

I also see that eth0 is never brought up and configured. I suppose the line above is the cause.

Version-Release number:
NetworkManager-0.9.4.0-9.git20120521.fc17.armv5tel

How reproducible: always

Steps to Reproduce:
1. Create an LXC environment with the configuration below (virtualized network using veth) on a 3.5 kernel
2. Configure NetworkManager as below
3. Start NetworkManager

Actual results: eth0 is not up

Expected results: eth0 should be up and configured using DHCP

Additional info:

/etc/NetworkManager/NetworkManager.conf:

[main]
plugins=keyfile

/etc/NetworkManager/system-connections/eth0.conf:

[connection]
id=Auto eth0
uuid=09c8085c-2331-11e2-9b3b-485b391674a7
type=802-3-ethernet
autoconnect=true
timestamp=0

[ipv4]
method=auto
dns=127.0.0.1
dns-search=mildred.fr
ignore-auto-dns=true

[ipv6]
method=auto
dns=::1
dns-search=mildred.fr
ignore-auto-dns=true

LXC configuration:

lxc.utsname = fashley
lxc.rootfs = /srv/lxc2/root
lxc.tty = 4
lxc.network.type = veth
lxc.network.link = br0
lxc.network.flags = up

Log:

Oct 31 09:25:47 fedora-arm NetworkManager[1486]: <info> NetworkManager (version 0.9.4.0-9.git20120521.fc17) is starting...
Oct 31 09:25:47 fedora-arm NetworkManager[1486]: <info> Read config file /etc/NetworkManager/NetworkManager.conf
Oct 31 09:25:47 fedora-arm NetworkManager[1486]: <info> WEXT support is enabled
Oct 31 09:25:47 fedora-arm NetworkManager[1486]: <info> Loaded plugin keyfile: (c) 2007 - 2010 Red Hat, Inc. To report bugs please us...g list.
Oct 31 09:25:47 fedora-arm NetworkManager[1486]: keyfile: parsing eth0.conf ...
Oct 31 09:25:47 fedora-arm NetworkManager[1486]: keyfile: read connection 'Auto eth0'
Oct 31 09:25:47 fedora-arm NetworkManager[1486]: <info> modem-manager is now available
Oct 31 09:25:47 fedora-arm NetworkManager[1486]: <info> monitoring kernel firmware directory '/lib/firmware'.
Oct 31 09:25:47 fedora-arm NetworkManager[1486]: <info> WiFi enabled by radio killswitch; enabled by state file
Oct 31 09:25:47 fedora-arm NetworkManager[1486]: <info> WWAN enabled by radio killswitch; enabled by state file
Oct 31 09:25:47 fedora-arm NetworkManager[1486]: <info> WiMAX enabled by radio killswitch; enabled by state file
Oct 31 09:25:47 fedora-arm NetworkManager[1486]: <info> Networking is enabled by state file
Oct 31 09:25:47 fedora-arm NetworkManager[1486]: <warn> /sys/devices/virtual/net/eth0: couldn't determine device driver; ignoring...
Oct 31 09:25:47 fedora-arm NetworkManager[1486]: <warn> /sys/devices/virtual/net/ip6tnl0: couldn't determine device driver; ignoring...
Oct 31 09:25:47 fedora-arm NetworkManager[1486]: <warn> /sys/devices/virtual/net/lo: couldn't determine device driver; ignoring...
Oct 31 09:25:47 fedora-arm NetworkManager[1486]: <warn> /sys/devices/virtual/net/sit0: couldn't determine device driver; ignoring...
Oct 31 09:25:47 fedora-arm NetworkManager[1486]: <warn> /sys/devices/virtual/net/eth0: couldn't determine device driver; ignoring...
Oct 31 09:25:47 fedora-arm NetworkManager[1486]: <warn> /sys/devices/virtual/net/ip6tnl0: couldn't determine device driver; ignoring...
Oct 31 09:25:47 fedora-arm NetworkManager[1486]: <warn> /sys/devices/virtual/net/lo: couldn't determine device driver; ignoring...
Oct 31 09:25:47 fedora-arm NetworkManager[1486]: <warn> /sys/devices/virtual/net/sit0: couldn't determine device driver; ignoring...
Oct 31 09:25:47 fedora-arm NetworkManager[1486]: <warn> bluez error getting default adapter: The name org.bluez was not provided by an...e files
Oct 31 09:26:19 fedora-arm systemd[1]: NetworkManager-wait-online.service: main process exited, code=exited, status=1
Oct 31 09:26:19 fedora-arm systemd[1]: Unit NetworkManager-wait-online.service entered failed state.
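[Editor's note] For background on what that warning means, here is a quick check one can run inside the container. This is an editor's sketch, not part of the original report; the interface name and the exact detection path used by NM 0.9.4 are assumptions. Physical NICs expose a device/driver symlink in sysfs, whereas interfaces under /sys/devices/virtual/net (such as this veth-backed eth0) do not, so NetworkManager cannot determine a driver for them even though the kernel itself reports one via ethtool:

  # Sketch: why driver detection fails for the virtualized eth0.
  # A physical NIC would resolve to something like ".../drivers/e1000e";
  # a virtual interface has no "device" directory at all.
  readlink /sys/class/net/eth0/device/driver 2>/dev/null \
      || echo "eth0: no sysfs driver symlink"

  # The kernel still reports a driver name through ethtool, however:
  ethtool -i eth0    # typically prints "driver: veth" for the guest end of a veth pair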
I just saw this thread that might be relevant to this bug: https://mail.gnome.org/archives/networkmanager-list/2012-July/msg00073.html
This might actually be a problem in the kernel, as we should always be able to retrieve a driver for an ethernet device. And virtualized ethernet should be nothing more than plain ethernet from the virtualized system's point of view. KVM ethernet drivers work without problems.
I also suggest reading the follow-up:
https://mail.gnome.org/archives/networkmanager-list/2012-July/msg00143.html

I wonder if they should be labeled like that when they are basically just (virtual) ethernet devices. The patch looks OK. Does it support all the usual VLAN and bridging functionality? I remember OpenVZ's implementation didn't support those features well.
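[Editor's note] To make the VLAN/bridging question concrete, a rough manual test inside the guest could look like this. This is an editor's sketch, not taken from the patch or the thread; the interface names are placeholders.

  # Hypothetical check: does the guest-side veth behave like ethernet for 802.1q
  # and bridging? (Enslaving eth0 will disrupt its existing IP configuration.)
  ip link add link eth0 name eth0.100 type vlan id 100   # VLAN sub-interface on eth0
  ip link set eth0.100 up

  brctl addbr testbr0                                    # simple bridge on top of eth0
  brctl addif testbr0 eth0
  ip link set testbr0 up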
(In reply to comment #1)
> I just saw this thread that might be relevant to this bug:
> https://mail.gnome.org/archives/networkmanager-list/2012-July/msg00073.html

Is there still demand to apply the patch?
We'll have explicit veth support as part of the more-virtual-device-types stuff I'm doing. (They're not actually "ethernet" devices though, they're just generic network devices.)
(In reply to comment #5)
> We'll have explicit veth support as part of the more-virtual-device-types stuff
> I'm doing.

Good.

> (They're not actually "ethernet" devices though, they're just generic network
> devices.)

This bug report is specifically about the guest side of the veth link, so the LXC behavior differs from that of other virtualization technologies like KVM. From the administrator's point of view, the guest virtual ethernet is always just ethernet: it's a replacement for the physical ethernet when you move your system from a physical computer to a virtual environment. Therefore, if NetworkManager doesn't treat these devices as ethernet (whether the problem lies in NetworkManager or in the kernel), this creates a user-experience problem: a system that works in a physical environment stops working in a virtual one.
(In reply to comment #6)
> (In reply to comment #5)
> > (They're not actually "ethernet" devices though, they're just generic network
> > devices.)
>
> This bug report is specifically about the guest side of the veth link.

(Both sides look the same.)

> Therefore, if NetworkManager doesn't treat these devices as ethernet (whether the
> problem lies in NetworkManager or in the kernel), this creates a user-experience
> problem: a system that works in a physical environment stops working in a
> virtual one.

We can allow NMSettingWired connections to match against NMDeviceVeth... that ought to make NM-under-LXC work at least as well as non-NM networking under LXC. (i.e., as long as you're not doing anything ethernet-L2-specific, it will work.)
(In reply to comment #7)
> (In reply to comment #6)
> > (In reply to comment #5)
> > > (They're not actually "ethernet" devices though, they're just generic network
> > > devices.)
> >
> > This bug report is specifically about the guest side of the veth link.
>
> (Both sides look the same.)

And that is what this bug report is about. Both sides look the same, but one is typically used by the guest system as ethernet while the other is typically configured by the virtualization management system. Mildred was trying to achieve the expected guest-system behavior and didn't care about the host side, as that one already works (the tools handle it and NetworkManager ignores it).

> > Therefore, if NetworkManager doesn't treat these devices as ethernet (whether the
> > problem lies in NetworkManager or in the kernel), this creates a user-experience
> > problem: a system that works in a physical environment stops working in a
> > virtual one.
>
> We can allow NMSettingWired connections to match against NMDeviceVeth... that
> ought to make NM-under-LXC work at least as well as non-NM networking under
> LXC. (i.e., as long as you're not doing anything ethernet-L2-specific, it will
> work.)

Possibly. But I think we should discuss with some kernel/LXC people what the intended behavior of veth is from their side: whether, for example, they were trying to implement a complete virtual ethernet and would want the guest NIC to appear as ethernet, or not. In my opinion, treating it as ethernet would put LXC networking on par with e.g. KVM networking for practical purposes.
*** Bug 700822 has been marked as a duplicate of this bug. ***
Bug 700822 has an updated version of the original patch, although it conflicts with / is obsoleted by bug 700087, aka the danw/moredevs branch.

I guess the veth driver behaves close enough to a real ethernet device that, for current NM purposes, we could fake it and pretend it is one. (It possibly does not support ETHTOOL_GPERMADDR, but NMDeviceEthernet copes with that anyway.)

I don't think we want to always behave that way, though. Certainly we don't want to ignore veth0 and treat eth1 as real when we're on the host side. Presumably there is some way we can detect whether we're in a container, and behave accordingly?

(Of course, this presents a minor problem with using LXC as the basis for the testing system, since it means we'll be testing NM's ability to emulate veths as eths, not testing its support for actual eth devices like we would if we used qemu or something. But as Pavel notes above, maybe this should really be considered veth's problem, not ours.)
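[Editor's note] As an aside, the ETHTOOL_GPERMADDR point is easy to check by hand. An editor's sketch, not from the thread; the interface name is a placeholder:

  # Hedged example: ask for the permanent MAC address. A real NIC prints its
  # burned-in address; a veth end typically reports all zeros, which is the
  # case NMDeviceEthernet already tolerates.
  ethtool -P eth0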
(In reply to comment #10)
> I don't think we want to always behave that way, though. Certainly we don't want
> to ignore veth0 and treat eth1 as real when we're on the host side. Presumably
> there is some way we can detect whether we're in a container, and behave
> accordingly?

Wrt. the patch I attached to bug 700822 (sorry, I searched for an existing bug, but didn't find this one), I'm happy to drop the 'if has_prefix("veth")' check; for the test suite it's no issue at all, and even in a container you can add blacklists to NetworkManager.conf.
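[Editor's note] For reference, such a blacklist entry for the keyfile plugin could look roughly like this in /etc/NetworkManager/NetworkManager.conf. This is an editor's sketch; the MAC address is a placeholder for the host-side veth end's address, and the section should be merged with an existing [keyfile] section if the file already has one:

  [keyfile]
  unmanaged-devices=mac:00:16:3e:00:00:01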
(In reply to comment #10)
> (Of course, this presents a minor problem with using LXC as the basis for the
> testing system, since it means we'll be testing NM's ability to emulate veths
> as eths, not testing its support for actual eth devices like we would if we
> used qemu or something. But as Pavel notes above, maybe this should really be
> considered veth's problem, not ours.)

Yes, that's indeed the case with the moredevs branch, as it introduces a completely new class for veth devices instead of just reusing the standard ethernet one. But oh well, we'd at least cover that path then. :-)

(On current master, with the patch on bug 700822, the test actually exercises the NMDeviceEthernet class.)
My point wasn't about whether we were testing NMDeviceEthernet or NMDeviceVeth; it was just that there are bits of current/possible NMDeviceEthernet functionality that we'd be unable to test using veths, because the veth driver doesn't support them. E.g., we currently accidentally disable wake-on-lan on NM-managed ethernet devices. If we fix that, we can't make a regression test for it using veths, because veth doesn't implement that part of the ethtool API.

(I don't know if you've looked at the kernel code at all, but the veth driver is only a teeny tiny bit more ethernet-like than the tun driver is. It's not like mac80211_hwsim, where it really tries to implement the entire API.)
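[Editor's note] To illustrate the wake-on-lan gap, an editor's sketch (not from the thread; interface name is a placeholder):

  # A real NIC's ethtool output includes "Supports Wake-on:" / "Wake-on:" lines;
  # a veth end exposes nothing for them, so there is nothing to regression-test.
  ethtool eth0 | grep -i wake-on || echo "eth0: no wake-on-lan support exposed"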
I'm afraid that unless we have a simulation driver for ethernet, we will have to live with that for tests that are not run on physical machines with an ethernet device dedicated to testing.
(In reply to comment #14)
> I'm afraid that unless we have a simulation driver for ethernet, we will have
> to live with that for tests that are not run on physical machines with an
> ethernet device dedicated to testing.

Yes, I agree. At least we cover the logic for automatic connection creation, IPv4/6, link-beat detection, and the like. Still better than nothing. We could perhaps also create a test case which exercises a real ethernet device, if one is found.
(In reply to comment #14)
> I'm afraid that unless we have a simulation driver for ethernet, we will have
> to live with that for tests that are not run on physical machines with an
> ethernet device dedicated to testing.

Hm... I was thinking that virtio_net did more ethernet emulation than this, but apparently it doesn't; it just shows up as a very minimally featureful ethernet card rather than a slightly-ethernet-like virtual device.

OK, so. I guess I should switch NMDeviceVeth to be a subclass of NMDeviceEthernet rather than NMDeviceGeneric. Then it would effectively behave exactly like ethernet, to the extent possible (except it would export o.fd.NM.Device.Veth in addition to o.fd.NM.Device.Wired, though that won't be visible in the libnm-glib API right away).

It would be nice to have the guest device be default-unmanaged on the host side, and vice versa, so that we don't accidentally activate connections on them and mess things up. Can we do that?
(In reply to comment #16)
> OK, so. I guess I should switch NMDeviceVeth to be a subclass of
> NMDeviceEthernet rather than NMDeviceGeneric.

That sounds great.

> It would be nice to have the guest device be default-unmanaged on the host
> side, and vice versa, so that we don't accidentally activate connections on
> them and mess things up. Can we do that?

As far as I can see, a veth pair is fairly symmetrical. So far the LXC convention seems to be that the guest interface gets named "eth*" and the host interface gets named "veth*", e.g. with

  # ip link add name eth1 type veth peer name veth0

Hence the current Ubuntu patch considers veth* as unmanaged. Admittedly that's a bit of a hack, and only a convention.

It would certainly be more elegant if one could configure the visibility of network interfaces per container, so that the respective other endpoint isn't even visible in the host/the container. But I guess containers are not that tightly isolated from a kernel/sysfs point of view, so maybe we have to live with the naming heuristics, or alternatively require blacklisting.
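[Editor's note] Spelling the convention out end to end, the host-side setup amounts to roughly the following. An editor's sketch of what LXC does internally; the names, the bridge (taken from the report's LXC config), and $CONTAINER_PID are illustrative.

  # Create the pair: veth0 stays on the host and joins the bridge from the LXC
  # config above; the peer is pushed into the container's network namespace.
  # (In practice LXC creates the peer under a temporary name and renames it to
  # eth0 once it is inside the container.)
  ip link add name veth0 type veth peer name guest0
  brctl addif br0 veth0
  ip link set veth0 up
  ip link set guest0 netns "$CONTAINER_PID"   # $CONTAINER_PID: the container's init PID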
Oh, right, I saw that part, but I meant, how does NM detect that it's inside a container?
Created attachment 245053 [details] [review]
core: use NMPlatform to figure out device types, where possible

Rather than having a bunch of udev-based tests, use nm_platform_link_get_type() to categorize devices. Incomplete, as NMPlatform still categorizes most hardware types as "ETHERNET", so we still need udev-based tests for those.
Created attachment 245054 [details] [review]
platform, devices: add support for veth devices
cherry-picked from danw/moredevs. doesn't try to detect/unmanage anything yet
(In reply to comment #21)
> doesn't try to detect/unmanage anything yet

Perhaps that doesn't really matter, since containers are used in fairly controlled situations, where it's reasonable to expect the config on both host and guest to just know what it's supposed to be doing.
Review of attachment 245053 [details] [review]:

Patch looks fine to me.
Review of attachment 245053 [details] [review]:

Looks good.
(In reply to comment #18) > Oh, right, I saw that part, but I meant, how does NM detect that it's inside a > container? In an LXC container, /proc/1/environ (i. e. the init environment) has "container=lxc": http://lxc.git.sourceforge.net/git/gitweb.cgi?p=lxc/lxc;a=blob;f=src/lxc/start.c;h=aefccd6505008dc7681f90d5b271287ebd13f1b5;hb=HEAD#l684 There are also other types of containers. Taking the shell code below from Ubuntu's /etc/init/container-detect.conf: * OpenVZ: [ -d /proc/vz ] && [ ! -d /proc/bc ] && echo "OpenVZ container" * VServer: VXID="$(cat /proc/self/status | grep ^VxID | cut -f2)" || true [ "${VXID:-0}" -gt 1 ] && echo "vserver container"
(In reply to comment #25)
> (In reply to comment #18)
> > Oh, right, I saw that part, but I meant, how does NM detect that it's inside a
> > container?
>
> In an LXC container, /proc/1/environ (i.e. the init environment) has
> "container=lxc":

Can you use an LXC guest as an LXC (or other) host at the same time? If yes, then this check is flawed.
(In reply to comment #26)
> Can you use an LXC guest as an LXC (or other) host at the same time?

Potentially. (I'm not sure.)

> If yes, then this check is flawed.

Well, it just tells you whether you are in a container or not, nothing more, nothing less. I mostly wrote that because Dan asked.

Note that I'm not advocating actually doing any kind of "am I in a container" detection in NM. I think that would somewhat circumvent the idea of virtualization, and that NM should behave the same in all cases.
(In reply to comment #27)
> Well, it just tells you whether you are in a container or not, nothing more,
> nothing less. I mostly wrote that because Dan asked.
>
> Note that I'm not advocating actually doing any kind of "am I in a
> container" detection in NM. I think that would somewhat circumvent the idea of
> virtualization, and that NM should behave the same in all cases.

I agree. It should behave the same unless there's a specific requirement. One such specific requirement might be detecting the guest side of a veth link. Checking that by testing whether we are in a container should be a last-resort workaround, used only if there's no other way.
Review of attachment 245054 [details] [review]:

I'm not wild about the nm_manager_get() stuff in the veth device; I feel like we should keep the devices as ignorant of NMManager as we can. Perhaps a GInterface which has an nm_device_provider_get_devices() function, and DeviceAdded/DeviceRemoved signals that each NMDevice can listen for? Or just have device class methods for device_added/device_removed and pass the device list from the manager to the devices when they get created. Either way.

The OLPC Mesh device should get converted to use this interface for finding its companion wifi device too.
Updated code is in danw/veth.

I went with "NMDeviceManager" rather than "NMDeviceProvider" because, from everyone else's point of view, it behaves just like any other NMFooManager type. It just happens to be implemented as part of NMManager.
After discovering some bugs in the devicemanager stuff, we agreed to push the original veth patches to master; the devicemanager stuff is in danw/devicemanager, to be revisited after dcbw/add-unsaved lands.