GNOME Bugzilla – Bug 725647
[review] dcbw/external-managed-iffup-rh1030947: don't manage external software devices until IFF_UP
Last modified: 2015-01-12 15:51:03 UTC
Just installed git master and tried to use libvirt and its default natted network (on Gentoo) with no success. NetworkManager takes over the virbr0 as soon as it's created by libvirt and apparently breaks its configuration. # nmcli dev DEVICE TYPE STATE CONNECTION ... virbr0 bridge connecting (getting IP configuration) virbr0 ... I tried the following workaround in /etc/NetworkManager/NetworkManager.conf: [main] plugins=keyfile [keyfile] unmanaged-devices=interface-name:virbr0 But it doesn't make any change.
Steps to reproduce (simple version): 1) Make sure NetworkManager is started 2) run 'ip link add somename type bridge' 3) run 'nmcli dev' and see whether the device is managed Expected result: Not managed. Actual result: Managed.
I'm using the following patch as a temporary workaround specific to virbr0: diff --git a/src/devices/nm-device.c b/src/devices/nm-device.c index 3c069e0..b11560a 100644 --- a/src/devices/nm-device.c +++ b/src/devices/nm-device.c @@ -6405,6 +6405,7 @@ nm_device_state_changed (NMDevice *device, priv->in_state_changed = TRUE; g_return_if_fail (NM_IS_DEVICE (device)); + g_return_if_fail (!!strcmp(nm_device_get_iface(device), "virbr0")); /* Do nothing if state isn't changing, but as a special case allow * re-setting UNAVAILABLE if the device is missing firmware so that we @@ -6961,6 +6962,7 @@ nm_device_set_manager_managed (NMDevice *device, gboolean was_managed, now_managed; g_return_if_fail (NM_IS_DEVICE (device)); + g_return_if_fail (!!strcmp(nm_device_get_iface(device), "virbr0")); priv = NM_DEVICE_GET_PRIVATE (device);
NM manages all devices now. Some guy named Pavel wrote code for that. However, it's supposed to manage virbr0 in such a way that it doesn't interfere with it. Right now there's a bug that it tries to bring the bridge up, when it shouldn't. https://bugzilla.redhat.com/show_bug.cgi?id=1030947
For the record, this seems related too: https://bugzilla.redhat.com/show_bug.cgi?id=1064441
(In reply to comment #4) > For the record, this seems related too: > https://bugzilla.redhat.com/show_bug.cgi?id=1064441 I mean, it seems related if you expect NM to manage brigeds, but not interfere with them. ... I guess, that is our current expectation. It is unrelated, if you expect NM to keep such devices unmanaged.
(In reply to comment #3) > NM manages all devices now. Some guy named Pavel wrote code for that. The code I delivered explicitly avoided management of virtual devices not configured in NetworkManager. I'm not the one to be blamed for later modifications that weren't in line with the original idea. (In reply to comment #5) > It is unrelated, if you expect NM to keep such devices unmanaged. Well, the whole point of managed/unmanaged devices was to distribute the network configuration management among multiple tools so that they don't interfere. Saying an interface is not managed by NetworkManager used to mean that NetworkManager won't modify the configuration of such a device, at least that was always my interpretation. As far as I understand, you decided to change the meaning to actually mean nothing (all devices are managed) and instead introduce functionality that some interfaces won't be modified. The new way lacks: 1) A simple term describing devices that won't be tampered by NetworkManager 2) A simple way to list those devices To make sure my virtualization devices aren't being tampered with, I'm now using the following workaround: diff --git a/src/devices/nm-device.c b/src/devices/nm-device.c index 3c069e0..159dcb4 100644 --- a/src/devices/nm-device.c +++ b/src/devices/nm-device.c @@ -6405,6 +6405,8 @@ nm_device_state_changed (NMDevice *device, priv->in_state_changed = TRUE; g_return_if_fail (NM_IS_DEVICE (device)); + g_return_if_fail (!!strncmp(nm_device_get_iface(device), "virbr", 5)); + g_return_if_fail (!!strncmp(nm_device_get_iface(device), "veth", 4)); /* Do nothing if state isn't changing, but as a special case allow * re-setting UNAVAILABLE if the device is missing firmware so that we @@ -6961,6 +6963,8 @@ nm_device_set_manager_managed (NMDevice *device, gboolean was_managed, now_managed; g_return_if_fail (NM_IS_DEVICE (device)); + g_return_if_fail (!!strncmp(nm_device_get_iface(device), "virbr", 5)); + g_return_if_fail (!!strncmp(nm_device_get_iface(device), "veth", 4)); priv = NM_DEVICE_GET_PRIVATE (device); Will be glad for any improvements on that.
Created attachment 274613 [details] [review] Workaround for specific device name prefixes. A new version of the workaround. This one works as expected with the current code base: asdfqwer bridge connecting (getting IP configuration) virbr0zxcv bridge unmanaged virbr0zxcvxxx bridge unmanaged
1. If you refer to other bugzilla reports, make sure that other individuals can access them ... https://bugzilla.redhat.com/show_bug.cgi?id=1064441 is restricted?? 2. So, NM is now suppose to manage all (network) devices. How about the case where is is a NIC which I DO NOT want NM to touch in any manner because that NIC will be accessed/controlled/used by a qemu-kvm virtual system or a secure container.
(In reply to comment #8) > 1. If you refer to other bugzilla reports, make sure that other individuals can > access them ... https://bugzilla.redhat.com/show_bug.cgi?id=1064441 is > restricted?? It might have been actually public at the time of linking.
(In reply to comment #8) > 2. So, NM is now suppose to manage all (network) devices. How about the case > where is is a NIC which I DO NOT want NM to touch in any manner because that > NIC will be accessed/controlled/used by a qemu-kvm virtual system or a secure > container. +1 And when you especially want to query which devices are configured not to be touched in any manner. A statement that some changes shouldn't happen is much weaker if you cannot query NetworkManager which devices are so configured.
Marking this to be considered for 1.0 at the last minute, feel free to remove it once decided.
(In reply to comment #8) > 2. So, NM is now suppose to manage all (network) devices. How about the case > where is is a NIC which I DO NOT want NM to touch in any manner because that > NIC will be accessed/controlled/used by a qemu-kvm virtual system or a secure > container. If there is no NetworkManager configuration for the interface with autoconnect=yes, and you have NetworkManager-config-server installed or "no-auto-default=*" set (which is what NetworkManager-config-server does), then NetworkManager will not configure the interface until you it to. If something else configures that NIC with addresses or adds it as a bridge port or bond slave, then NetworkManager will intentionally notice that change, and reflect those changes in its D-Bus interface, but NetworkManager should *not* touch the interface's configuration. If it does, that's a bug and we will fix it.
(In reply to comment #12) > If there is no NetworkManager configuration for the interface with > autoconnect=yes, and you have NetworkManager-config-server installed or > "no-auto-default=*" set (which is what NetworkManager-config-server does), then > NetworkManager will not configure the interface until you it to. This isn't true according to bug #705329. NetworkManager apparently performs configuration tasks. Also I'm more talking about a runtime way to determine the management status of the interface. More specifically a nmcli command that would tell me for each interface: * That the interface is not to be touched by NetworkManager at all. This was formerly called "unmanaged". * That the interface will be prepared by NetworkManager for applying a connection. For example wired interfaces are upped by NM as described in bug #705329, wifi interfaces are used to do even more like scanning, etc... * That the interface is in some of the connection phases. This is now supported by "nmcli dev". This way I would know which interfaces are being just watched and which ones are being used for pre-connection tasks like watching the carrier, scanning available networks, etc... > If something else configures that NIC with addresses or adds it as a bridge > port or bond slave, then NetworkManager will intentionally notice that change, > and reflect those changes in its D-Bus interface, but NetworkManager should > *not* touch the interface's configuration. If it does, that's a bug and we > will fix it. It's very hard to report such bugs if I cannot even check whether the device is marked so in NetworkManager. In the past, it was a trivial task to list which devices are managed (in the sense described above) and which are not (and are only watched). If I could check this property of the interface easily (as it was possible before), I would quickly know whether the configuration action from NetworkManager is a bug (and I would report it or even post a patch) or whether it is a result of wrong configuration (which I want to fix locally, not report to bugzilla to add a useless bug in addition to the ~600 bug reports open for NetworkManager).
(In reply to comment #13) > (In reply to comment #12) > > If there is no NetworkManager configuration for the interface with > > autoconnect=yes, and you have NetworkManager-config-server installed or > > "no-auto-default=*" set (which is what NetworkManager-config-server does), then > > NetworkManager will not configure the interface until you it to. > > This isn't true according to bug #705329. NetworkManager apparently performs > configuration tasks. It is true that NM does IFF_UP the interface, but I would not call this a "configuration task". IFF_UP is not any kind of configuration. We do know it sometimes interferes with bridge ports, and we are working through ways to handle that better. But NM will not unparent the interface, it will not set other flags, it will not apply IP configuration, and it will not remove any IP configuration (or L2 configuration) unless the user explicitly tells NM to do that. > Also I'm more talking about a runtime way to determine the management status of > the interface. More specifically a nmcli command that would tell me for each > interface: > > * That the interface is not to be touched by NetworkManager at all. This was > formerly called "unmanaged". These interfaces are still unmanaged in the previous sense of the word. That sense is "do not do anything with this interface unless the user tells NM to do so". With the exception of IFF_UP as described above. There is no change in that. > * That the interface will be prepared by NetworkManager for applying a > connection. For example wired interfaces are upped by NM as described in bug > #705329, wifi interfaces are used to do even more like scanning, etc... This is any state >= DISCONNECTED as reported by 'nmcli dev'. In states lower than that (UNAVAILABLE or UNMANAGED) NetworkManager is not actively using the interface. Only UNMANAGED state implies that NM is not touching the interface, as described above. > * That the interface is in some of the connection phases. This is now > supported by "nmcli dev". > > This way I would know which interfaces are being just watched and which ones > are being used for pre-connection tasks like watching the carrier, scanning > available networks, etc... nmcli dev and the device states should give you that information, as described above. > > If something else configures that NIC with addresses or adds it as a bridge > > port or bond slave, then NetworkManager will intentionally notice that change, > > and reflect those changes in its D-Bus interface, but NetworkManager should > > *not* touch the interface's configuration. If it does, that's a bug and we > > will fix it. > > It's very hard to report such bugs if I cannot even check whether the device is > marked so in NetworkManager. In the past, it was a trivial task to list which > devices are managed (in the sense described above) and which are not (and are > only watched). More specifically, what you'd probably like to see as an enhancement is to expose the "assumed" property on the NMActiveConnection object via the D-Bus interface. That's certainly something we can take patches for and should be fairly straightforward.
I'
I've pushed dcbw/external-managed-iffup-rh1030947 to address some of these issues. Copying commit message here: ==== core: don't manage externally created software devices until IFF_UP (rh #1030947) (bgo #725647) Externally created software devices would be managed/assumed immediately upon creation, which includes setting them IFF_UP and possibly turning on NM-managed IPv6LL. With this branch, expected behavior is: 1) external software device created: unmanaged state 2) external software device IP address added but !IFF_UP: no action 3) external software device set IFF_UP: existing connection assumed This branch ensures that external software devices are not set IFF_UP by NetworkManager when they are discovered, and that their connections (if any) are not assumed until they are set IFF_UP externally. https://bugzilla.gnome.org/show_bug.cgi?id=725647 https://bugzilla.redhat.com/show_bug.cgi?id=1030947 ====
> With this branch, expected behavior is: Should the commit message be referring to itself as a branch? *shrug* > This branch ensures that external software devices are not set IFF_UP > by NetworkManager when they are discovered, and that their connections not set IFF_UP by NetworkManager *ever*, in fact, right? Patch looks good
(In reply to comment #17) > > With this branch, expected behavior is: > > Should the commit message be referring to itself as a branch? *shrug* Fixed. > > This branch ensures that external software devices are not set IFF_UP > > by NetworkManager when they are discovered, and that their connections > > not set IFF_UP by NetworkManager *ever*, in fact, right? NM might later, because they are not default-unmanaged. NM will set the bridge up in this case: 1) brctl addbr br32 2) ip addr add 1.2.3.4/24 dev br32 3) ip link set dev br32 up <nm assumes connection> 4) nmcli dev disconnect br32 5) ip link set dev br32 down 6) ip addr add 1.2.3.4/24 dev br32 <nm assumes connection and sets bridge IFF_UP> We could prevent this by making external software devices default-unmanaged I suppose, but that's a larger behavior change than I was hoping to make...
Pushed branch with a new patch that prevents the above behavior, by unmanaging an external software device when it is DISCONNECTED and set !IFF_UP.
>> core: don't manage externally created software devices until IFF_... commit message: This branch ensures that external software devices are not set IFF_UP ^^^^^^ { - return NM_DEVICE_GET_PRIVATE (self)->unmanaged_flags & flag; + return !!(NM_DEVICE_GET_PRIVATE (self)->unmanaged_flags & flag); } maybe NM_FLAGS_ANY()? - if (nm_device_get_unmanaged_flag (device, NM_UNMANAGED_USER) || - nm_device_get_unmanaged_flag (device, NM_UNMANAGED_EXTERNAL_DOWN) || - nm_device_get_unmanaged_flag (device, NM_UNMANAGED_INTERNAL) || - nm_device_get_unmanaged_flag (device, NM_UNMANAGED_PARENT)) + if (nm_device_get_unmanaged_flag (device, NM_UNMANAGED_USER | + NM_UNMANAGED_EXTERNAL_DOWN | + NM_UNMANAGED_INTERNAL | ? Sidenote: I'd say for now this approach looks right. But it also means, that we won't expose runtime information (e.g. IP addresses) until we assume the connection. So, there is an IP address, but user API doesn't show it. This too indicates that we should in the future: - treat unmanaged devices more like "assumed" should be now -- but with the guarantee that we really really don't mess with the interface. E.g. no IFF_UP, no sysctl setting, etc. BUT: do assume a connection, show it as activated in the UI, expose runtime-information for the ActiveConnection, etc. This is also useful for the devices that are configured un-managed. The user really doesn't want us to mess with the interface. But we still want to show runtime-info. - "assumed" devices should have the notion that we might change some thing (e.g. IFF_UP, extend lease times when DHCP returns, set sysctl). But this is more like a very careful, non-disruptive, managed mode where NM does not takes full ownership of the interface. I didn't the commits, but LGTM
basically looks good >+is_software_external (NMDevice *self) >+{ >+ return nm_device_is_software (self) >+ && !nm_device_get_is_nm_owned (self) >+ && NM_DEVICE_GET_PRIVATE (self)->ifindex > 0; Is that last clause really part of being "software external", or should it be an additional "&&" clause on the line that calls is_software_external() below? >+ nm_device_set_unmanaged (self, >+ NM_UNMANAGED_EXTERNAL_DOWN, >+ TRUE, >+ NM_DEVICE_STATE_REASON_UNKNOWN); NM_DEVICE_STATE_REASON_USER_REQUESTED?
(In reply to comment #21) > basically looks good > > >+is_software_external (NMDevice *self) > >+{ > >+ return nm_device_is_software (self) > >+ && !nm_device_get_is_nm_owned (self) > >+ && NM_DEVICE_GET_PRIVATE (self)->ifindex > 0; > > Is that last clause really part of being "software external", or should it be > an additional "&&" clause on the line that calls is_software_external() below? > > >+ nm_device_set_unmanaged (self, > >+ NM_UNMANAGED_EXTERNAL_DOWN, > >+ TRUE, > >+ NM_DEVICE_STATE_REASON_UNKNOWN); > > NM_DEVICE_STATE_REASON_USER_REQUESTED? Both fixed, pushed a fixup. Please re-review!
looks good
LGTM too. I think this branch brings quite an improvement.
I merged nm_device_update_initial_unmanaged_flags() into nm_device_finish_init() since setting the initial unmanaged flags doesn't matter before or after the device is exported, and these functions are called about the same time. There is no functional difference to that change. git master: b86c511db5b755187c9443a89e06197b3966f91b core: don't bring up devices using assumed connections (bgo #725647) (rh #1030947) 5fb20d80389ca42eb3d655c292e0b517f3a83ea8 core: don't manage externally created software devices until IFF_UP (rh #1030947) (bgo #725647) 5f04324f8bef43ebe831a6fa20ea8577dcad0ac5 core: don't assume connections for INTERNAL or PARENT unmanaged devices nm-1-0: 80f15f5284efb147ce8c204699bd9cbc38ef2158 core: don't bring up devices using assumed connections (bgo #725647) (rh #1030947) fa41627152df08216dc21a85a9412c83b0a5dcb8 core: don't manage externally created software devices until IFF_UP (rh #1030947) (bgo #725647) dc4d39a2d4582eb5092651f5a86f6e77488dafa0 core: don't assume connections for INTERNAL or PARENT unmanaged devices