Bug 733641 - Cannot bring up a bridge via ifup without causing an error ('waiting for slaves before proceeding')
Status: RESOLVED FIXED
Product: NetworkManager
Classification: Platform
Component: nmcli
Version: 0.9.x
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: NetworkManager maintainer(s)
QA Contact: NetworkManager maintainer(s)
Depends on:
Blocks:
 
 
Reported: 2014-07-23 23:49 UTC by Adam Williamson
Modified: 2014-08-19 13:21 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
cli: let activation of master connection succeed when device state reaches IP_CONFIG (5.72 KB, patch)
2014-07-28 12:44 UTC, Thomas Haller
Status: none
[PATCH] the same patch as in comment #8; except a change to apply for current master (5.88 KB, patch)
2014-08-19 07:55 UTC, Jiri Klimes
Status: reviewed

Description Adam Williamson 2014-07-23 23:49:47 UTC
Since this commit:

http://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/cli/src/connections.c?id=4358f5cc39a8ca20df97ad0a5947dd7eb5ccddb9

it has not been possible to bring up a bridged connection using ifup without causing an error.

Say you have a bridge device 'br0' and a slaved interface 'eth0'. If you try and bring up the slaved interface first:

# ifup eth0

you'll get:

Error: Connection activation failed: Master connection not found or invalid

so obviously you have to bring up the bridge device first. But if you do that:

# ifup br0

You get this error:

Error: Device 'br0' is waiting for slaves before proceeding with activation.

But that's not really an error, is it? It's more just a status note. Really, ifup/NM *did* bring up the bridge - you can see it there in 'brctl show' and 'nmcli con show', just waiting for slaves to be brought up. And indeed, if you *now* run:

# ifup eth0

then the slave device comes up (no "Master connection not found" error) and all is working. But you cannot avoid the error.

This is just an annoyance for interactive work, but it's critical for non-interactive use. I found this while debugging creation of bridges in virt-manager, see https://bugzilla.redhat.com/show_bug.cgi?id=1122729 . virt-manager uses ifup to initialize connections, and it bails out when ifup returns an error, so you actually cannot successfully bring up a bridge from virt-manager.

It's fine for NM to provide this information, but it's not really an error, and shouldn't be treated as one.
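
To illustrate why this matters for non-interactive callers, here is a minimal Python sketch of a script that wraps ifup and looks only at the exit status. It is not virt-manager's actual code; the function name `bring_up` and the injectable `runner` parameter are hypothetical, chosen so the logic can be exercised without a real ifup.

```python
import subprocess


def bring_up(interface, runner=subprocess.run):
    """Run `ifup <interface>`; return (ok, message) as a script would see it."""
    result = runner(["ifup", interface], capture_output=True, text=True)
    # Prefer stderr for the message, fall back to stdout.
    message = (result.stderr or result.stdout).strip()
    # Only the exit status decides success here, so a status note that
    # arrives with a non-zero status looks exactly like a hard failure.
    return result.returncode == 0, message
```

A caller written this way has no means of telling "waiting for slaves" apart from a real activation failure, which is why the exit status, not the message text, is the part that needs to change.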
Comment 1 Adam Williamson 2014-07-24 00:02:04 UTC
I tested with nmcli direct, as well. If you bring up the bridge connection before bringing up the slave connection, you see the same error.

Unlike the ifup case, nmcli will allow you to bring up the slave connection before the bridge - it does not throw an equivalent of the "Master connection not found or invalid" error if you do so. But I found that bringing up the slave device and then the bridge in quick succession still showed the error message; it seems it takes a bit of time for everything to come up properly (it's like 20-25 seconds before br0 actually gets an IP address), and nmcli times out during that period.
Comment 2 Adam Williamson 2014-07-24 00:23:21 UTC
Actually, if you bring the slave profile up first with 'nmcli' and then the bridge, something rather odd happens. Filed that as https://bugzilla.gnome.org/show_bug.cgi?id=733644 .
Comment 3 Thomas Haller 2014-07-24 19:11:14 UTC
When you have autoconnect bridge-slaves, at certain instances they might autoconnect. For example, when another connection gets deactivated, NM searches for activatable connections and might decide to up the bridge-slave.
Upping a bridge slave also brings up the master (if it is not yet active).

On the other hand, when you up the master instead, no slaves are activated (even if they are autoconnect). This is on purpose.


So, upping a master does not actually fully activate the bridge, hence we print a warning (and even fail nmcli).

Given the above, I think that upping a master *cannot* ever succeed, and nmcli will always fail after timeout. If that is correct, nmcli should behave differently for master devices and not wait until they are fully activated, but return success once they reach state "connecting (getting IP configuration)".

Given the above, upping a master does not lead to a fully activated device (ever), hence the behavior should be different.


I guess that is also relevant to other UI clients (nm-applet). Probably they should show a special icon for such slave-less masters -- not the "connecting" icon.
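
The proposed rule can be sketched as follows. This is an illustrative Python model, not nmcli's C code: the numeric values are those of NetworkManager's NMDeviceState enum, while the names `DeviceState` and `activation_succeeded` are hypothetical.

```python
from enum import IntEnum


class DeviceState(IntEnum):
    # Numeric values follow NetworkManager's NMDeviceState enum.
    DISCONNECTED = 30
    PREPARE = 40
    CONFIG = 50
    NEED_AUTH = 60
    IP_CONFIG = 70   # shown as "connecting (getting IP configuration)"
    IP_CHECK = 80
    SECONDARIES = 90
    ACTIVATED = 100
    DEACTIVATING = 110
    FAILED = 120


def activation_succeeded(state, is_master):
    """Decide whether the client may report success for this device state."""
    # A master without slaves never gets past IP_CONFIG, so reaching that
    # state counts as success; other devices must be fully ACTIVATED.
    threshold = DeviceState.IP_CONFIG if is_master else DeviceState.ACTIVATED
    return threshold <= state <= DeviceState.ACTIVATED
```

With this rule, a bridge sitting in IP_CONFIG reports success immediately instead of running into the timeout, while an ordinary ethernet connection in the same state is still considered to be in progress.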
Comment 4 Adam Williamson 2014-07-24 19:20:12 UTC
yeah, that's more or less what I figured. I think it'd be fine (even good) to still print the *message* "is waiting for slaves before proceeding with activation", it's just the fact that it's treated as an error condition that's the problem. As you say it should be treated as success with an informational message.
Comment 5 Dan Williams 2014-07-25 19:11:56 UTC
Yeah, nmcli should probably just exit once a master is set up instead of waiting a long time.

I'll note that the Fedora initscripts *also* don't automatically bring up slaves when you start a bridge, at least if I'm reading ifup-eth correctly.  Which is one reason NM never did that either.  But for bonding, initscripts *do* bring up slaves. For NM, I/we assumed back then that initscripts did the same thing for both bridge and bond and didn't bother to confirm, so NM doesn't bring up any slaves.  We've also got a task on the agenda to add an option to bring up slaves when the master is started, since this seems to trip people up.
Comment 6 Adam Williamson 2014-07-25 19:18:26 UTC
one case that might be interesting there is the case that's actually fairly typical if you just deploy a system then set up a bridge. Let's say you've just set up the bridge. You now have:

1. the bridge connection
2. the original, 'normal', non-bridge slave connection for the ethernet adapter
3. the new bridge-slave connection for the ethernet adapter

now let's say profile 2 is active on the ethernet adapter, and you now bring up the bridge connection (with nmcli or whatever) - "hey, I just created my bridge, let's bring it up". what should NM do? I guess my instinctive answer is 'as well as bringing up the bridge connection, also take connection 2 down and bring connection 3 up' (i.e. switch the ethernet adapter from being directly connected to being the bridge slave), but I guess there may be reasons you might not want to do that? just trying to consider possibilities...
Comment 7 Adam Williamson 2014-07-26 19:29:09 UTC
oh, one more somewhat-related note - another thing I saw when poking into this is that the GNOME Network control panel does not display inactive bridge slave adapter connections. That's https://bugzilla.gnome.org/show_bug.cgi?id=733634 . It may be that GNOME expects NM to bring up the slave connection when you bring up the bridge. As things stand, you can't actually bring up a bridge fully using the GUI.
Comment 8 Thomas Haller 2014-07-28 12:44:27 UTC
Created attachment 281865
cli: let activation of master connection succeed when device state reaches IP_CONFIG

When connecting a master connection, no slave devices will be activated
automatically. The user is supposed to activate them individually.

Hence nmcli should not wait for the connection to be fully activated
because that is not going to happen (unless the user connects a slave
connection from another terminal).

Instead, for master connections behave differently and signal success
once the master device reaches IP_CONFIG state.

This revises behavior introduced by commit
47710f8211f178cddf5f84c1a50146d7476115a7.

https://bugzilla.gnome.org/show_bug.cgi?id=733641

Signed-off-by: Thomas Haller <thaller@redhat.com>
Comment 9 Thomas Haller 2014-07-28 13:15:20 UTC
(In reply to comment #6)
> one case that might be interesting there is the case that's actually fairly
> typical if you just deploy a system then set up a bridge. Let's say you've just
> set up the bridge. You now have:
> 
> 1. the bridge connection
> 2. the original, 'normal', non-bridge slave connection for the ethernet adapter
> 3. the new bridge-slave connection for the ethernet adapter
> 
> now let's say profile 2 is active on the ethernet adapter, and you now bring up
> the bridge connection (with nmcli or whatever) - "hey, I just created my
> bridge, let's bring it up". what should NM do? I guess my instinctive answer is
> 'as well as bringing up the bridge connection, also take connection 2 down and
> bring connection 3 up' (i.e. switch the ethernet adapter from being directly
> connected to being the bridge slave), but I guess there may be reasons you
> might not want to do that? just trying to consider possibilities...

I think it is important that the behavior of initscripts stays identical regardless of whether the device is controlled by NM.


Regardless of that, I think it is reasonable not to activate any slaves by default. Usually you have several interfaces that you want to bridge/bond, so I think NM should not guess which (of possibly several) to activate. This is especially the case when another connection is already active on a device. In that case, activating a master device potentially breaks your connectivity by attaching an unwanted device.

Still, the behavior seems indeed useful. We should add this possibility (at least client-side to nmcli), so that we could also behave correctly for initscripts backward compatibility.
Comment 10 Adam Williamson 2014-07-28 22:14:54 UTC
So I just ran some tests comparing NM+nmcli, NM+ifup, and network+ifup in a very basic config (ifcfg-bridge , ifcfg-slave , no other connections). This bug is confirmed. Interestingly, in this case, I don't hit the "Master connection not found or invalid" error when trying to bring up the slave first in the NM case. Both 'ifup slave' and 'nmcli con up slave' with NetworkManager in charge work, and bring up the entire bridge (both slave and bridge profiles). So I'll have to try and recreate exactly how I hit that error in the initial testing, as I guess it's a separate bug with a particular config.

FWIW the network.service behaviour is actually, well, rather worse than NM's. So if you want to be compatible, you're going to have to break NM a bit. :P But that's outside the scope of this bug.
Comment 11 Adam Williamson 2014-07-28 22:56:40 UTC
aha. I figured out the slightly subtle corner case that triggers the "Master connection not found" error and filed it as https://bugzilla.gnome.org/show_bug.cgi?id=733890 .
Comment 12 Adam Williamson 2014-07-29 00:52:20 UTC
Fix looks good in a quick test; I did a scratch build with the patch (http://koji.fedoraproject.org/koji/taskinfo?taskID=7205403 ) and did a test as I've been testing so far, and now 'ifup bridge' works as the patch intends (returns successfully very fast, with an informational message that it's waiting for slaves). 'ifup slave' then brings up the slave correctly and the bridge functions.
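
The test sequence above (bridge first, then slave) could be scripted roughly like this. It is a sketch only: `bring_up_bridge` and the injectable `runner` are hypothetical names, and the interface names are whatever your ifcfg files define.

```python
import subprocess


def bring_up_bridge(bridge, slaves, runner=subprocess.run):
    """Bring up the bridge first, then each slave; stop at the first failure.

    Returns the interface names that came up successfully, in order.
    """
    done = []
    for name in [bridge, *slaves]:
        result = runner(["ifup", name], capture_output=True, text=True)
        if result.returncode != 0:
            # Before the fix, the bridge step itself ended here.
            break
        done.append(name)
    return done
```

With the patched nmcli, `bring_up_bridge("bridge", ["slave"])` would be expected to get past the bridge step and continue with the slave, assuming nothing else fails.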
Comment 13 Dan Williams 2014-08-14 22:17:04 UTC
Patch looks good to me.

And doesn't this patch actually provide the same behavior that the network scripts used to have anyway?  ISTR they just started the bridge and exited without waiting for anything else to happen.  Then the ports would get up-ed in a second round in network.service.

For bonds though, the initscripts do bring up the slaves when bringing up the master, so NM still isn't the same as the initscripts here.
Comment 14 Jiri Klimes 2014-08-19 07:55:16 UTC
Created attachment 283862
[PATCH] the same patch as in comment #8; except a change to apply for current master

The patch in comment #8 looks good to me. I have just changed it to apply for current master, and tested.
Comment 15 Thomas Haller 2014-08-19 12:34:14 UTC
(In reply to comment #14)
> Created an attachment (id=283862)
> [PATCH] the same patch as in comment #8; except a change to apply for current
> master
> 
> The patch in comment #8 looks good to me. I have just changed it to apply for
> current master, and tested.

Merged patch:

master:
http://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=00af27f0ba3fb8dcd58bc73934e98c44cd3f43b9

nm-0-9-10:
http://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=2e59e66416b5739ed3a7ef717bbeadc66c630750
Comment 16 Thomas Haller 2014-08-19 13:21:37 UTC
I think this bug is fixed. For the remaining question about differences in initscripts with bonds, I opened bug 735052.