Bug 723178 - Subnet routes on the same subnet have the same metric
Status: RESOLVED FIXED
Product: NetworkManager
Classification: Platform
Component: IP and DNS config
Version: git master
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: NetworkManager maintainer(s)
QA Contact: NetworkManager maintainer(s)
Duplicates: 723730
Depends on: 735512
Reported: 2014-01-28 17:17 UTC by Dan Williams
Modified: 2014-12-11 09:09 UTC

Description Dan Williams 2014-01-28 17:17:54 UTC
Connect a wired and wifi interface to the same network.  Their subnet routes will both get the same metric, so the routing will not be deterministic.  This used to work (probably because NM was removing and re-adding the subnet routes, which causes other problems and has been stopped) but was broken at some point after the platform merge.

If at all possible, we want to avoid removing the route and re-adding it with the new metric, since that causes netlink add/remove events and ripples throughout various network stuff, breaking things like IPSec.  Ideally we can tell the kernel what metric to use when it adds the subnet route.
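
To make the ambiguity concrete, here is a minimal sketch in C that models only the tie between equal-metric routes. It is not kernel or NetworkManager code; `struct subnet_route` and `pick_unique_best()` are hypothetical names chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one subnet route; not a kernel structure. */
struct subnet_route {
    const char *ifname;  /* outgoing interface, e.g. "eth0" */
    uint32_t metric;     /* lower is preferred */
};

/*
 * Return the index of the only route with the lowest metric, or -1 if
 * the lowest metric is shared (the choice is ambiguous) or the list is
 * empty.
 */
int pick_unique_best(const struct subnet_route *routes, size_t n)
{
    int best = -1;
    int tied = 0;

    for (size_t i = 0; i < n; i++) {
        if (best < 0 || routes[i].metric < routes[best].metric) {
            best = (int)i;   /* strictly better: new unique winner */
            tied = 0;
        } else if (routes[i].metric == routes[best].metric) {
            tied = 1;        /* same metric as the current winner */
        }
    }
    return tied ? -1 : best;
}
```

With eth0 and wlan0 both at metric 0 the function reports a tie; giving the two routes different metrics makes the result well defined.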
Comment 1 Dan Williams 2014-01-28 17:32:29 UTC
In the kernel, the address addition (inet_rtm_newaddr) triggers a NETDEV_UP event internally:

		rtmsg_ifa(RTM_NEWADDR, ifa, nlh, NETLINK_CB(skb).portid);
		blocking_notifier_call_chain(&inetaddr_chain, NETDEV_UP, ifa);

which the IPv4 routing code in fib_frontend.c receives in fib_netdev_event().  That code then walks all addresses on the interface:

	case NETDEV_UP:
		for_ifa(in_dev) {
			fib_add_ifaddr(ifa);
		} endfor_ifa(in_dev);

and adds a subnet route in fib_add_ifaddr():

		fib_magic(RTM_NEWROUTE,
			  dev->flags & IFF_LOOPBACK ? RTN_LOCAL : RTN_UNICAST,
			  prefix, ifa->ifa_prefixlen, prim);

but this does not allow setting the metric at all.

WRT modifying the existing route to set a metric, we need to be very careful here (if we can do it at all) that the route doesn't get removed first, just modified in place.  fib_table_insert() is the place that happens, and it's not clear from the code whether the NLM_F_REPLACE section will trigger a removal or not.
Comment 2 Ferry Huberts 2014-01-28 19:05:17 UTC
See https://bugzilla.redhat.com/show_bug.cgi?id=1050546

This is a regression
Comment 3 Ferry Huberts 2014-01-28 19:07:28 UTC
if you use netlink, you can do a route 'replace' (atomic remove-add)
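
A route 'replace' over netlink amounts to an RTM_NEWROUTE request carrying NLM_F_CREATE | NLM_F_REPLACE, with the metric in an RTA_PRIORITY attribute. Below is a minimal sketch in C that only builds such a request and does not send it. `struct route_req`, `add_attr()` and `build_route_replace()` are hypothetical names, and the scope and protocol values are assumptions for a plain IPv4 subnet route. The sketch does not settle whether the kernel updates a route in place when only the metric differs.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical request buffer: header, route message, room for attributes. */
struct route_req {
    struct nlmsghdr nlh;
    struct rtmsg rtm;
    char attrs[64];   /* enough for the three 4-byte attributes below */
};

/* Append one attribute at the current end of the message. */
static void add_attr(struct nlmsghdr *nlh, unsigned short type,
                     const void *data, unsigned short len)
{
    struct rtattr *rta =
        (struct rtattr *)((char *)nlh + NLMSG_ALIGN(nlh->nlmsg_len));

    rta->rta_type = type;
    rta->rta_len = RTA_LENGTH(len);
    memcpy(RTA_DATA(rta), data, len);
    nlh->nlmsg_len = NLMSG_ALIGN(nlh->nlmsg_len) + RTA_ALIGN(rta->rta_len);
}

/* dst_be is the IPv4 network address in network byte order. */
void build_route_replace(struct route_req *req, uint32_t dst_be,
                         unsigned char plen, int ifindex, uint32_t metric)
{
    memset(req, 0, sizeof(*req));
    req->nlh.nlmsg_len = NLMSG_LENGTH(sizeof(struct rtmsg));
    req->nlh.nlmsg_type = RTM_NEWROUTE;
    req->nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE | NLM_F_REPLACE;

    req->rtm.rtm_family = AF_INET;
    req->rtm.rtm_dst_len = plen;
    req->rtm.rtm_table = RT_TABLE_MAIN;
    req->rtm.rtm_protocol = RTPROT_STATIC;  /* assumption */
    req->rtm.rtm_scope = RT_SCOPE_LINK;     /* on-link subnet route */
    req->rtm.rtm_type = RTN_UNICAST;

    add_attr(&req->nlh, RTA_DST, &dst_be, sizeof(dst_be));
    add_attr(&req->nlh, RTA_OIF, &ifindex, sizeof(ifindex));
    add_attr(&req->nlh, RTA_PRIORITY, &metric, sizeof(metric));
}
```

The filled buffer would then be written to an AF_NETLINK / NETLINK_ROUTE socket, `nlh.nlmsg_len` bytes long.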
Comment 4 Pavel Simerda 2014-02-02 12:19:20 UTC
As usual, I'll just add a note that you don't actually have to add the worse-metric route at all, as it will never be used. The metric makes no sense except for ad-hoc integration between tools.
Comment 5 Pavel Simerda 2014-02-02 12:43:32 UTC
(In reply to comment #2)
> See https://bugzilla.redhat.com/show_bug.cgi?id=1050546
> 
> This is a regression

Which version was the last one that worked for you?
Comment 6 Ferry Huberts 2014-02-02 13:20:09 UTC
(In reply to comment #4)
> As usual, I'll just add a note that you don't actually have to add the
> worse-metric route at all as it will never be used. The metric makes no
> sense except for ad-hoc integration between tools.

The metric _does_ matter.
Being on the same network with both eth and wlan results in 2 identical routes, with the same metric.
The kernel will then just pick a route for hosts on that network. Usually not the one you want.
You _really_ need a metric.

You don't need a metric for hosts outside that network since NM only configures a single default gateway.

NM _could_ configure multiple default gateways, with those same metrics, but that is another story.
Comment 7 Ferry Huberts 2014-02-02 13:21:17 UTC
(In reply to comment #5)
> (In reply to comment #2)
> > See https://bugzilla.redhat.com/show_bug.cgi?id=1050546
> > 
> > This is a regression
> 
> Which version was the last one that worked for you?

I upgraded to F20 somewhere around week 2 of 2014. The F19 version of NM was working for me (different metrics), the F20 version was not working (always metric 0)
Comment 9 Pavel Simerda 2014-02-03 10:16:02 UTC
(In reply to comment #6)
> (In reply to comment #4)
> > As usual, I'll just add a note that you don't actually have to add the
> > worse-metric route at all as it will never be used. The metric makes no
> > sense except for ad-hoc integration between tools.
> 
> The metric _does_ matter.
> Being on the same network with both eth and wlan results in 2 identical routes,
> with the same metric.
> The kernel will then just pick a route for hosts on that network. Usually not
> the one you want.
> You _really_ need a metric.

As the kernel only uses the best metric, all unprioritized routes are redundant. Adding only the best metric route has the very same effect as adding all of them.

Therefore when one tool (NetworkManager, for example) keeps multiple routes for the same destination, it's purely an implementation detail whether it posts all of them and prioritizes one via a metric, or only pushes the best one to the kernel.

Those are just notes on technical details, not on actual behavior.
Comment 10 Ferry Huberts 2014-02-03 10:41:12 UTC
(In reply to comment #9)
> (In reply to comment #6)
> > (In reply to comment #4)
> > > As usual, I'll just add a note that you don't actually have to add the
> > > worse-metric route at all as it will never be used. The metric makes no
> > > sense except for ad-hoc integration between tools.
> > 
> > The metric _does_ matter.
> > Being on the same network with both eth and wlan results in 2 identical routes,
> > with the same metric.
> > The kernel will then just pick a route for hosts on that network. Usually not
> > the one you want.
> > You _really_ need a metric.
> 
> As the kernel only uses the best metric, all unprioritized routes are
> redundant. Adding only the best metric route has the very same effect of adding
> all of them.


Not really.
In our routers we make sure that existing connections will use the same interface even when a better route becomes available (connection pinning). This is to avoid breaking connections.

Therefore it is very much desired to have correct routes coupled to interfaces.
And with multiple interfaces on the same network you/we really want to have these duplicate routes.
To designate the best route you then have to use metrics.

> 
> Therefore when one tool (NetworkManager, for example) keeps multiple routes
> for the same destination, it's purely an implementation detail whether it posts
> all of them and prioritize one via a metric, or it only pushes the best one to
> the kernel.

There is a substantial difference, see above.

> 
> Those are just notes on technical details, not on actual behavior.

Since we're going this route...

Correct behaviour is:
- each interface has its routes, no 'redundant route removal' is done
- each interface has its own default gateway route
- each route will have a metric that represents its cost (usually inversely proportional to bandwidth)
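
As a rough illustration of a cost that is inversely proportional to bandwidth, here is a minimal sketch in C. The function name `metric_from_bandwidth()` and the 1 Gbit/s reference value are assumptions made for the example, not something olsrd or NetworkManager is known to use.

```c
#include <stdint.h>

/* Hypothetical reference bandwidth: a 1 Gbit/s link maps to metric 1. */
#define REFERENCE_KBPS 1000000u

/* Derive a route metric from link bandwidth; lower metric is preferred. */
uint32_t metric_from_bandwidth(uint32_t kbps)
{
    uint32_t metric;

    if (kbps == 0)
        return UINT32_MAX;            /* unusable link: worst metric */

    metric = REFERENCE_KBPS / kbps;   /* integer division, rounds down */
    return metric == 0 ? 1 : metric;  /* links above the reference get 1 */
}
```

Under these assumptions a 100 Mbit/s link gets metric 10 and a 54 Mbit/s link gets metric 18, so the faster link is preferred.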
Comment 11 Pavel Simerda 2014-02-03 11:19:00 UTC
(In reply to comment #10)
> (In reply to comment #9)
> > (In reply to comment #6)
> > > (In reply to comment #4)
> > > > As usual, I'll just add a note that you don't actually have to add the
> > > > worse-metric route at all as it will never be used. The metric makes no
> > > > sense except for ad-hoc integration between tools.
> > > 
> > > The metric _does_ matter.
> > > Being on the same network with both eth and wlan results in 2 identical routes,
> > > with the same metric.
> > > The kernel will then just pick a route for hosts on that network. Usually not
> > > the one you want.
> > > You _really_ need a metric.
> > 
> > As the kernel only uses the best metric, all unprioritized routes are
> > redundant. Adding only the best metric route has the very same effect of adding
> > all of them.
> 
> 
> Not really.
> In our routers we make sure that existing connections will use the same
> interface even when a better route becomes available (connection pinning).

Could I have more details on that connection pinning on Linux? How does it relate to the following requested NetworkManager feature?

https://bugzilla.gnome.org/show_bug.cgi?id=709478

Feel free to contact me via mail if you see fit, or on IRC pavlix #nm freenode but I might be a bit busy these days.

> This is to avoid breaking connections.

I'm curious whether that could be used to provide source based routing and multipath TCP as well (see the bug report above).

> Therefore is it very much desired to have correct routes coupled to interfaces.

Does it mean you're pinning the actual secondary route records (those with worse metrics) and when those routes are removed, respective connections are interrupted? I'm asking just to be sure I understand.

Please provide links to any resources on that.

> And with multiple interface on the same network you/we really want have these
> duplicate route.
> To designate the best route you then have to use metrics.

When priority changed dynamically that could actually work. But I haven't yet seen it working in the wild and I don't even know whether it works properly at all.

> Correct behaviour is:

s/Correct/Desired/

> - each interface has its routes, no 'redundant route removal' is done

I can stand corrected but AFAIK the linux kernel doesn't allow you to change metric on a route. Instead I believe you need to remove the route and add one with a different metric.

That would interrupt your connection anyway, wouldn't it?

> - each interface has its own default gateway route

That's not currently supported by NetworkManager. Only one device is selected for default routing and default DNS. That would be a feature request for NetworkManager and I'm eager to hear more about that.

> - each route will have a metric that represents its cost (usually inversely
> proportional to bandwidth)

AFAIK proportionality doesn't matter here, just the ordering, but basically yes.

That said, I don't think the topic here is just about adding routes with different metrics. Thanks for the information already provided and looking forward to follow-ups.
Comment 12 Ferry Huberts 2014-02-03 11:39:10 UTC
(In reply to comment #11)
> (In reply to comment #10)
> > (In reply to comment #9)
> > > (In reply to comment #6)
> > > > (In reply to comment #4)
> > > > > As usual, I'll just add a note that you don't actually have to add the
> > > > > worse-metric route at all as it will never be used. The metric makes no
> > > > > sense except for ad-hoc integration between tools.
> > > > 
> > > > The metric _does_ matter.
> > > > Being on the same network with both eth and wlan results in 2 identical routes,
> > > > with the same metric.
> > > > The kernel will then just pick a route for hosts on that network. Usually not
> > > > the one you want.
> > > > You _really_ need a metric.
> > > 
> > > As the kernel only uses the best metric, all unprioritized routes are
> > > redundant. Adding only the best metric route has the very same effect of adding
> > > all of them.
> > 
> > 
> > Not really.
> > In our routers we make sure that existing connections will use the same
> > interface even when a better route becomes available (connection pinning).
> 
> Could I have more details on that connection pinning on Linux? How does it
> relate to the following requested NetworkManager feature?
> 
> https://bugzilla.gnome.org/show_bug.cgi?id=709478

It relates very much :-)

What we do (in olsrd) is a first step towards avoiding breaking connections.
Source based routing and multi-path TCP are the follow-up steps to take.


> 
> Feel free to contact me via mail if you see fit, or on IRC pavlix #nm freenode
> but I might be a bit busy those days.
> 
> > This is to avoid breaking connections.
> 
> I'm curious whether that could be used to provide source based routing and
> multipath TCP as well (see the bug report above).
> 
> > Therefore is it very much desired to have correct routes coupled to interfaces.
> 
> Does it mean you're pinning the actual secondary route records (those with
> worse metrics) and when those routes are removed, respective connections are
> interrupted? I'm asking just to be sure I understand.

basically, yes.
In reality we take down connections when the costs become too high. basically the same.

> 
> Please provide links to any resources on that.

we do this in olsrd, it's called multi-smart-gateway.

see
- http://battlemesh.org/BattleMeshV6/Agenda?action=AttachFile&do=view&target=Mitigation+of+Breaking+Connections+-+Multi-Gateway+and+BRDP.pdf
- http://olsr.org/git/?p=olsrd.git;a=blob;f=README-Olsr-Extensions;h=3a7af3e20539478b60b44973893a08694f6c1ec8;hb=HEAD#l220
- http://olsr.org/git/?p=olsrd.git;a=blob;f=files/sgw_policy_routing_setup.sh;h=7648fe13e374d05dab04b4e4a22e985634c91414;hb=HEAD


> 
> > And with multiple interface on the same network you/we really want have these
> > duplicate route.
> > To designate the best route you then have to use metrics.
> 
> When priority changed dynamically that could actually work. But I haven't yet
> seen it working in the wild and I don't even know whether it works properly at
> all.
> 

we have it working.

> > Correct behaviour is:
> 
> s/Correct/Desired/
> 
> > - each interface has its routes, no 'redundant route removal' is done
> 
> I can stand corrected but AFAIK the linux kernel doesn't allow you to change
> metric on a route. Instead I believe you need to remove the route and add one
> with a different metric.

yes and no. you can do an atomic replace (atomic remove/add) via netlink

> 
> That would interrupt your connection anyway, wouldn't it?

no. atomic replace

> 
> > - each interface has its own default gateway route
> 
> That's not currently supported by NetworkManager. Only one device is selected

I know.
Would be awesome if NM could implement it though.
That would also require that the user be able to manually sort the connections...

> for default routing and default DNS. That would be a feature request for
> NetworkManager and I'm eager to hear more about that.
> 
> > - each route will have a metric that represents its cost (usually inversely
> > proportional to bandwidth)
> 
> AFAIK proportionality doesn't matter here, just the ordering, but basically yes.

for us it matters very much: we have UMTS, Ethernet and satcom on our nodes.
connection status and bandwidth determine the interface costs and therefore which route will be the strongest.
bandwidths vary. this is not your father's ethernet :-)


> 
> That said, I don't think the topic here is just about adding routes with
> different metrics. Thanks for the information already provided and looking
> forward to follow-ups.



I would still like to see this bug resolved asap since it's bothering the hell out of me.
I have the same again as before it got fixed: when on my LAN (with eth and wlan) usually the wlan interface is picked.
This is 'unexpected' and undesirable :-)
Comment 13 Pavel Simerda 2014-02-03 16:43:54 UTC
https://bugzilla.gnome.org/show_bug.cgi?id=709478
Comment 14 Pavel Simerda 2014-02-04 13:24:25 UTC
(In reply to comment #12)
> > Could I have more details on that connection pinning on Linux? How does it
> > relate to the following requested NetworkManager feature?
> > 
> > https://bugzilla.gnome.org/show_bug.cgi?id=709478
> 
> It relates very much :-)

Thanks, indeed.

> What we do (in olsrd) is a first step towards avoiding breaking connections.
> Source based routing and multi-path TCP are the follow-up steps to take.

Sure. You seemed to claim you are specifying source based routing without separate routing tables. I would like to hear about it as I don't know about such a feature in the Linux kernel.

> > Does it mean you're pinning the actual secondary route records (those with
> > worse metrics) and when those routes are removed, respective connections are
> > interrupted? I'm asking just to be sure I understand.
> 
> basically, yes.
> In reality we take down connections when the costs become too high. basically
> the same.

But that sounds like a policy decision, not a limitation of the solution, right?

> > Please provide links to any resources on that.
> 
> we do this in olsrd, it's called multi-smart-gateway.
> 
> see
> -
> http://battlemesh.org/BattleMeshV6/Agenda?action=AttachFile&do=view&target=Mitigation+of+Breaking+Connections+-+Multi-Gateway+and+BRDP.pdf

Thanks, googled that one already

> http://olsr.org/git/?p=olsrd.git;a=blob;f=README-Olsr-Extensions;h=3a7af3e20539478b60b44973893a08694f6c1ec8;hb=HEAD#l220
> -
> http://olsr.org/git/?p=olsrd.git;a=blob;f=files/sgw_policy_routing_setup.sh;h=7648fe13e374d05dab04b4e4a22e985634c91414;hb=HEAD

Those resources seem to be using multiple routing tables for lookups and routing rules so that the tables are actually used.

> > I can stand corrected but AFAIK the linux kernel doesn't allow you to change
> > metric on a route. Instead I believe you need to remove the route and add one
> > with a different metric.
> 
> yes and no. you can do an atomic replace (atomic remove/add) via netlink

Again, I can stand corrected but AFAIK the metric is one of the key attributes and therefore the routes are considered distinct by the kernel and the atomic update is not possible.

> > That's not currently supported by NetworkManager. Only one device is selected
> 
> I know.
> Would be awesome if NM could implement it though.

That would indeed be great.

> That would also require that the user be able to manually sort the
> connections...

That's a long requested feature AFAIK, see:

https://bugzilla.gnome.org/show_bug.cgi?id=580018

> > AFAIK proportionality doesn't matter here, just the ordering, but basically yes.
> 
> for us it matters very much:

But for the kernel? AFAIK the kernel just picks the best one and that's all.

> we have UMTS, Ethernet and satcom on our nodes.
> connection status and bandwidth determine the interface costs and therefore
> which route will be the strongest.

That only confirms what I said, as ordering is sufficient to find the strongest and the relative values are more or less irrelevant.

> > That said, I don't think the topic here is just about adding routes with
> > different metrics. Thanks for the information already provided and looking
> > forward to follow-ups.
> 
> 
> 
> I would still like to see this bug resolved asap since it's bothering the hell
> out of me.

Indeed. We also discussed adding a feature to the kernel so it doesn't create the device routes (Thomas already did that for other reasons to IPv6) and NetworkManager could always add the device routes itself, with the right metric.

Also it would be a step towards being able to fill the multiple tables used for source routing more easily.
Comment 15 Ferry Huberts 2014-02-04 17:17:18 UTC
(In reply to comment #14)
> (In reply to comment #12)
> > > Could I have more details on that connection pinning on Linux? How does it
> > > relate to the following requested NetworkManager feature?
> > > 
> > > https://bugzilla.gnome.org/show_bug.cgi?id=709478
> > 
> > It relates very much :-)
> 
> Thanks, indeed.
> 
> > What we do (in olsrd) is a first step towards avoiding breaking connections.
> > Source based routing and multi-path TCP are the follow-up steps to take.
> 
> Sure. You seemed to claim you are specifying source based routing without
> separate routing tables. I would like to hear about it as I don't know about
> such a feature in the Linux kernel.

I was talking about IPv6 and subtrees. At least I think it's called subtrees, my colleague was experimenting with this.

> 
> > > Does it mean you're pinning the actual secondary route records (those with
> > > worse metrics) and when those routes are removed, respective connections are
> > > interrupted? I'm asking just to be sure I understand.
> > 
> > basically, yes.
> > In reality we take down connections when the costs become too high. basically
> > the same.
> 
> But that sounds like a policy decision, not a limitation of the solution,
> right?
> 

yes

> > > Please provide links to any resources on that.
> > 
> > we do this in olsrd, it's called multi-smart-gateway.
> > 
> > see
> > -
> > http://battlemesh.org/BattleMeshV6/Agenda?action=AttachFile&do=view&target=Mitigation+of+Breaking+Connections+-+Multi-Gateway+and+BRDP.pdf
> 
> Thanks, googled that one already
> 
> > http://olsr.org/git/?p=olsrd.git;a=blob;f=README-Olsr-Extensions;h=3a7af3e20539478b60b44973893a08694f6c1ec8;hb=HEAD#l220
> > -
> > http://olsr.org/git/?p=olsrd.git;a=blob;f=files/sgw_policy_routing_setup.sh;h=7648fe13e374d05dab04b4e4a22e985634c91414;hb=HEAD
> 
> Those resources seem to be using multiple routing tables for lookups and
> routing rules so that the tables are actually used.
> 
> > > I can stand corrected but AFAIK the linux kernel doesn't allow you to change
> > > metric on a route. Instead I believe you need to remove the route and add one
> > > with a different metric.
> > 
> > yes and no. you can do an atomic replace (atomic remove/add) via netlink
> 
> Again, I can stand corrected but AFAIK the metric is one of the key attributes
> and therefore the routes are considered distinct by the kernel and the atomic
> update is not possible.
> 

If that is so then I stand corrected :-)

> > > That's not currently supported by NetworkManager. Only one device is selected
> > 
> > I know.
> > Would be awesome if NM could implement it though.
> 
> That would indeed be great.
> 
> > That would also require that the user be able to manually sort the
> > connections...
> 
> That's a long requested feature AFAIK, see:
> 
> https://bugzilla.gnome.org/show_bug.cgi?id=580018
> 
> > > AFAIK proportionality doesn't matter here, just the ordering, but basically yes.
> > 
> > for us it matters very much:
> 
> But for the kernel? AFAIK the kernel just picks the best one and that's all.
> 

agree.
however...
imagine a situation in which we have 1 plugin per interface that determines its cost.
I would not want the interface plugins to have knowledge of other interface plugins.
Therefore, we need to be able to set an arbitrary cost/metric ('order') on the interface.


> > we have UMTS, Ethernet and satcom on our nodes.
> > connection status and bandwidth determine the interface costs and therefore
> > which route will be the strongest.
> 
> That only confirms what I said, as ordering is sufficient to find the strongest
> and the relative values are more or less irrelevant.

As long as I can set an arbitrary cost/metric/order then I'd be happy

> 
> > > That said, I don't think the topic here is just about adding routes with
> > > different metrics. Thanks for the information already provided and looking
> > > forward to follow-ups.
> > 
> > 
> > 
> > I would still like to see this bug resolved asap since it's bothering the hell
> > out of me.
> 
> Indeed. We also discussed adding a feature to the kernel so it doesn't create
> the device routes (Thomas already did that for other reasons to IPv6) and
> NetworkManager could always add the device routes itself, with the right
> metric.
> 
> Also it would be a step towards being able to fill the multiple tables used for
> source routing more easily.

this could become awesome.
we now have all kinds of custom daemons watching interfaces and determining costs.
having NM handle the bulk of that.... potentially awesome.

now it's back to tuning our multi-smart-gateway setup and then onto mptcp and source based routing.
Comment 16 Pavel Simerda 2014-02-04 19:17:17 UTC
(In reply to comment #15)
> > Sure. You seemed to claim you are specifying source based routing without
> > separate routing tables. I would like to hear about it as I don't know about
> > such a feature in the Linux kernel.
> 
> I was talking about IPv6 and subtrees. At least I think it's called subtrees,
> my colleague was experimenting with this.

Ah, was that the IMO incredibly stupid idea to diverge the IPv4 and IPv6 routing configuration? I would really like to have one preferred way used for both IPv4 and IPv6, whichever one it is.

> > Again, I can stand corrected but AFAIK the metric is one of the key attributes
> > and therefore the routes are considered distinct by the kernel and the atomic
> > update is not possible.
> > 
> 
> If that is so that I stand corrected :-)

Let's investigate later. Already pointed some other interested folks to the bugreport so we'll see.

> > But for the kernel? AFAIK the kernel just picks the best one and that's all.
> > 
> 
> agree.
> however...
> imagine a situation in which we have 1 plugin per interface that determines its
> cost.
> I would not want the interface plugins to have knowledge of other interface
> plugins.

Possibly, yes.

> Therefore, we need to be able to set an arbitrary cost/metric ('order') on the
> interface.

Internally, yes, externally, maybe.

> As long as I can set an arbitrary cost/metric/order then I'd be happy

Sounds like a plan :). You would probably need to assign:

1) A default cost/metric/order
2) Per-type cost/metric/order override
3) Per-connection cost/metric/order override
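
The three levels could be resolved most-specific-first. A minimal sketch in C, assuming a hypothetical `METRIC_UNSET` sentinel and a hypothetical `resolve_metric()` helper (neither is an existing NetworkManager name):

```c
#include <stdint.h>

/* Hypothetical sentinel meaning "no override configured at this level". */
#define METRIC_UNSET (-1)

/*
 * Pick the effective metric: a per-connection override wins over a
 * per-type override, which wins over the global default.
 */
uint32_t resolve_metric(uint32_t default_metric,
                        int64_t type_override,
                        int64_t conn_override)
{
    if (conn_override != METRIC_UNSET)
        return (uint32_t)conn_override;
    if (type_override != METRIC_UNSET)
        return (uint32_t)type_override;
    return default_metric;
}
```

The overrides are 64-bit so that the whole 32-bit metric range stays usable alongside the sentinel.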

That's IMO the subject of:

https://bugzilla.gnome.org/show_bug.cgi?id=580018

> > We also discussed adding a feature to the kernel so it doesn't create
> > the device routes (Thomas already did that for other reasons to IPv6) and
> > NetworkManager could always add the device routes itself, with the right
> > metric.
> > 
> > Also it would be a step towards being able to fill the multiple tables used for
> > source routing more easily.
> 
> this could become awesome.

Indeed!

> we now have all kinds of custom daemons watching interfaces and determining
> costs. having NM handle the bulk of that.... potentially awesome.

So we have a common understanding. Let's hope we can get more active NM developers to step in, as I consider myself just a contributor with limited time.
Comment 17 Thomas Haller 2014-02-06 22:46:07 UTC
*** Bug 723730 has been marked as a duplicate of this bug. ***
Comment 18 Pavel Simerda 2014-06-18 07:51:02 UTC
Hi Ferry,

just two cents from a contributor who made the original change but who can't currently provide much help... as a reaction on recent comments in Fedora bugzilla.

For normal usage with no other tools than NetworkManager and kernel involved, it doesn't matter whether a worse-metric route is installed or not. Therefore the regression doesn't affect such usage.

The bug report includes a sketch of a specialized use case where the worse-metric route is set up first and the running connections can survive the creation of the new route. This was AFAIK only done for non-default routes, so we are talking about local connectivity, right?

I understand you see this as a serious regression. And it will be nice if it gets fixed. But to be honest, I'm more interested in the big picture and especially bug #709478. But I didn't even get my questions answered there, especially on how to set up source based routes without using separate routing tables.

I'm afraid that complaining about time to fix a bug report is the wrong approach when we're not talking about a general use case. If your software and setup depends on some advanced features in NetworkManager that don't apply to most users, IMO the only useful choice is to actively participate in the project and keep track of larger scale changes announced on the mailing list.

Any chance you would be willing to participate in improving NetworkManager to support routing metrics, source based routing, multipath TCP and other techniques used to achieve the least disturbance of connections? Any chance you could visit the following event?

http://wiki.linuxplumbersconf.org/2014:network-management
Comment 19 Ferry Huberts 2014-06-18 08:09:50 UTC
(In reply to comment #18)
> Hi Ferry,
> 
> just two cents from a contributor who made the original change but who can't
> currently provide much help... as a reaction on recent comments in Fedora
> bugzilla.
> 
> For normal usage with no other tools than NetworkManager and kernel involved,
> it doesn't matter whether a worse-metric route is installed or not. Therefore
> the regression doesn't affect such usage.
> 

You need to explain that to me.
To me the situation is extremely basic/simple: both wlan and eth are used/active and controlled by NM.
In this case both LAN routes have the same metric (I'm _not_ talking about default routes!).
The kernel picks one of these routes, usually the wrong one.
So there is nothing special about the situation.

So yes, it (different metrics) _does_ matter because it forces the kernel to choose the route with the lowest metric, instead of a random one from a set with the same metric.

> The bug report includes a sketch of a specialized use case where the
> worse-metric route is set up first and the running connections can survive the
> creation of the new route. This was AFAIK only done for non-default routes, so
> we are talking about local connectivity, right?
> 

see comment just above

> I understand you see this as a serious regression. And it will be nice if it
> gets fixed. But to be honest, I'm more interested in the big picture and
> especially bug #709478. But I didn't even get my questions answered there,
> especially on how to set up source based routes without using separate routing
> tables.
> 
> I'm afraid that complaining about time to fix a bug report is the wrong
> approach when we're not talking about a general use case. If your software and

it _is_ a general use-case.
And it worked before.
So the classification 'regression' seems to be in order.

> setup depends on some advanced features in NetworkManager that don't apply to
> most users, IMO the only useful choice is to actively participate in the
> project and keep track of larger scale changes announced on the mailing list.
> 
> Any chance you would be willing to participate in improving NetworkManager to
> support routing metrics, source based routing, multipath TCP and other
> techniques used to achieve the least disturbance of connections? Any chance you

mmm, maybe. I'm quite busy with olsrd so my time is very limited.
We're bringing multi-gateway into production.
Think of multi-gw as a precursor for full source based routing and multi-path tcp.

> could visit the following event?
> 
> http://wiki.linuxplumbersconf.org/2014:network-management

That depends on where & when it is, couldn't find that on the wiki.
Comment 20 cornel panceac 2014-06-18 08:31:16 UTC
This is a regression for all desktop usage (servers don't have wireless by default). It looks like a good default to choose Ethernet over wireless whenever available. I can't think of any usual scenario where wireless would be better than the existing Ethernet connection.
Comment 21 Thomas Haller 2014-11-11 14:12:11 UTC
pushed branch for review: th/bgo723178_device_route_metric
Comment 22 Thomas Haller 2014-11-13 13:47:46 UTC
Thinking about this again. With th/bgo723178_device_route_metric NM will add the routes for the subnet itself instead of relying on the kernel provided ones.

If you connect two interfaces to the same LAN, there is now a clash because we cannot add the same route (with same metric) to two interfaces.

Indeed this is worse than before, because now we will randomly toggle the route between the two interfaces whenever we sync the address.

The workaround is to adjust the route metric of at least one interface to avoid the clash. I opened bug 740064 for the general issue.



Maybe another way to help against this would be for the default priority (as returned by nm_device_get_priority()) to consider duplicate interface types.

for example, every NMDevice should register to a global index. Duplicate device types get different default metrics assigned.

Downside (1): if you have more then one device types of the same type, you can no longer be sure what the default metric of the interface will be. E.g. em1 will get 20, em2 will get 21 (or vice versa). That is a bit problematic if you want to assign ipv4.route-metric of your connection to be slightly less then em1. Would you choose 21? Clash again.


For other reasons, I think we need per-device configuration (files). For example, ignore-carrier, no-auto-default and unmanaged-devices are all tracked in NetworkManager.conf. This is a strong limitation; it would be great to have per-device configuration files.
With these files we could mitigate downside (1) by allowing the per-device default priority to be configured.
Comment 23 Dan Williams 2014-11-13 22:46:44 UTC
nm-iface-helper does have the default route metric, it's called 'priority'.  So that should get passed to nm_ipX_config_commit() instead of '0'.

Also, is IPv6 not affected by this issue?

Why make METRIC_KERNEL_ROUTE a variable, instead of #defining it?

Will addresses *ever* have a zero-length prefix?  That seems really odd and I don't think it's possible...  (re nm_platform_ip4_address_sync())

Also I don't think we want to add any routes if the plen == 32, but the code would try.

The patch (as noted) also no longer commits the IPv4 config for assumed devices, but that will break assumed DHCP connections, because currently NM re-launches the DHCP client for them.

Something has to renew the lease and right now that's NM...  so we either still need to apply the config even for assumed devices, or (possibly a better long-term solution) we need to save a list of which devices are NM-managed when NM quits so that when NM restarts we know that we can fully manage those devices again.
Comment 24 Thomas Haller 2014-11-14 14:44:40 UTC
(In reply to comment #23)
> nm-iface-helper does have the default route metric, it's called 'priority'.  So
> that should get passed to nm_ipX_config_commit() instead of '0'.

Right.

also added commit:
>> iface-helper: make priority variable guint32


> Also, is IPv6 not affected by this issue?

I don't think so. We add IPv6 addresses with IFA_F_NOPREFIXROUTE. Kernel does not add any routes for us.


> Why make METRIC_KERNEL_ROUTE a variable, instead of #defining it?

because it has local scope so it does not need a proper NM_ name. Anyway, I put the define in nm-platform.h


> Will addresses *ever* have a zero-length prefix?  That seems really odd and I
> don't think it's possible...  (re nm_platform_ip4_address_sync())

with iproute2 you can configure such addresses...


> Also I don't think we want to add any routes if the plen == 32, but the code
> would try.

done


> The patch (as noted) also no longer commits the IPv4 config for assumed
> devices, but that will break assumed DHCP connections, because currently NM
> re-launches the DHCP client for them.
> 
> Something has to renew the lease and right now that's NM...  so we either still
> need to apply the config even for assumed devices, or (possibly a better
> long-term solution) we need to save a list of which devices are NM-managed when
> NM quits so that when NM restarts we know that we can fully manage those
> devices again.

Oh right. That is now avoided by passing NM_PLATFORM_ROUTE_METRIC_IP4_DEVICE_ROUTE.



Repushed
Comment 25 Thomas Haller 2014-11-19 22:07:22 UTC
Branch again rebased on master.
Comment 26 Dan Williams 2014-11-21 23:27:36 UTC
> platform: add paramter to ip4_route_add to set src (RTA_PREFSRC)

In nm_platform_ip4_route_add():

+		       pref_src?" (src: ":"",
+		       pref_src?nm_utils_inet4_ntop (pref_src, pref_src_buf):"",
+		       pref_src?")":"");

spaces needed around ? and :


> core: fix route metrics for subnet routes

In ip4_config_merge_and_apply():

+	guint32 default_route_metric = nm_device_get_ip4_route_metric (self);

const?

--------------

One more thing I just thought of...  With IPSec, there are XFRM table rules (/sbin/ip xfrm state and /sbin/ip xfrm policy) that reference the route and the VPN IP address.  We have to be careful that when we manipulate the routing table, we don't disturb those rules through some action of ours, like changing IP addresses or routes.

I just tested out this branch with a libreswan connection and it appears that we are not doing anything wrong at this time, but we need to be really careful in the future because it's not clear what operations disturb the XFRM rules.  I fixed some of these issues in January in these commits:

de56f28db62d042c2c293867750228d6ac253892
merge: handle interface-less VPNs like open/libre-swan (bgo #721724) (rh #1030068)

8d9bfcdd5a1d8d716e8503f88e7b2e239408f667
platform: don't replace routes that already exist

4c16f3c7e2c5c10d313c2fb4318b245fda5f8bc5
core/platform: preserve external and static route metrics

In this case, I found that replacing a route to change the route metric broke some stuff in the XFRM tables in the kernel, because unlike addresses, the kernel removes the route first and adds it back with the new metric.  That route removal caused problems with the XFRM stuff and screwed up IPSec somehow.  Not touching routes fixed things.

So I'm not sure why the branch doesn't have this problem, given that it changes routes and metrics...  Maybe because it shows up on 'wlan0'/'eth0' first and thus it looks like an externally added route, and we don't change those?
Comment 27 Thomas Haller 2014-11-22 00:16:40 UTC
(In reply to comment #26)
> > platform: add paramter to ip4_route_add to set src (RTA_PREFSRC)
> 
> In nm_platform_ip4_route_add():
> 
> +               pref_src?" (src: ":"",
> +               pref_src?nm_utils_inet4_ntop (pref_src, pref_src_buf):"",
> +               pref_src?")":"");
> 
> spaces needed around ? and :
> 
> 
> > core: fix route metrics for subnet routes
> 
> In ip4_config_merge_and_apply():
> 
> +    guint32 default_route_metric = nm_device_get_ip4_route_metric (self);
> 
> const?

done, fixup pushed.



> So I'm not sure why the branch doesn't have this problem, given that it changes
> routes and metrics...  Maybe because it shows up on 'wlan0'/'eth0' first and
> thus it looks like an externally added route, and we don't change those?

All it touches is:
- remove IPv4 device routes with metric 0.
- add the same device route with a different metric.

(A device route here means the route for the address/plen subnet of an address.)
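
To illustrate what "the address/plen subnet of an address" means, here is a minimal sketch that derives the device route's network from an address and its prefix length. The function names are hypothetical and addresses are kept in host byte order for simplicity; this is not the code from the branch.

```c
#include <assert.h>
#include <stdint.h>

/* Netmask for a prefix length, in host byte order. */
static uint32_t
prefix_to_netmask (unsigned plen)
{
    assert (plen <= 32);
    /* Shifting a 32-bit value by 32 is undefined in C, so handle plen 0 separately. */
    return plen == 0 ? 0 : (uint32_t) 0xFFFFFFFFu << (32 - plen);
}

/* Network part of the device route that belongs to address/plen. */
static uint32_t
device_route_network (uint32_t address, unsigned plen)
{
    return address & prefix_to_netmask (plen);
}
```

For 192.168.1.158/24 (0xC0A8019E) this yields 192.168.1.0 (0xC0A80100), the subnet that the device route covers.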

I don't know whether that interferes with IPSec. Would need a show case of a potential problem...
Comment 28 Dan Williams 2014-11-24 17:00:24 UTC
Looks fine for now.

With IPSec you do get an address from the VPN subnet, and you do get a device route to the VPN subnet.  For example, I have:

LAN:
inet 192.168.1.158/24 brd 192.168.1.255 scope global dynamic wlan0
   valid_lft 411sec preferred_lft 411sec
192.168.1.0/24 dev wlan0  proto kernel  scope link  src 192.168.1.158 

VPN:
inet 10.3.237.133/32 brd 10.3.237.133 scope global wlan0
   valid_lft forever preferred_lft forever
10.0.0.0/8 via 192.168.1.1 dev wlan0  src 10.3.237.133 

# ip xfrm policy

src 10.3.237.133/32 dst 10.0.0.0/8 
	dir out priority 2104 ptype main 
	tmpl src 192.168.1.158 dst 209.132.183.55
		proto esp reqid 16389 mode tunnel

It *might* have been the case with older kernels that removing the 10./8 route (which is what the kernel does when changing a route metric) caused this XFRM policy to be removed which is what broke stuff before.  Apparently that's no longer the case with 3.17, but I worry a bit about older kernels...
Comment 29 Dan Williams 2014-11-25 21:15:32 UTC
Thomas points out that this branch only changes behavior for routes that are created due to *addresses* the interface has.  The 10./8 route that I see is added by libreswan itself, and not touched by NetworkManager because the associated 10.x address is a /32.  Since openswan/libreswan only ever add /32 addresses, this branch does nothing for this route.

This answers my question so I have no further issues with this branch.
Comment 30 Thomas Haller 2014-11-25 22:26:06 UTC
The changes for this branch can however interfere if you have an additional 10.x.y.z/8 address that is managed by NetworkManager -- even on another interface.

Scenario:

- connect *swan plugin, where ipsec installs a route to 10.0.0.0/8 via the 
  secure tunnel
- have any interface also configure an address 10.x.y.z/8 -- be it manually or 
  DHCP.

In this case, whenever syncing the addresses of that other interface, NM will remove and re-add the 10.0.0.0/8 route.


The "workaround" is: just don't do that. After all, what is even the expected behavior?
Another workaround might be to set ipv4.route-metric of that other interface to 0 -- which disables the functionality of this branch.



This is not significantly different from any other route conflicts that NM currently does not handle properly.


I am pushing the current state of the branch, because I think it fixes more problems than it creates.

Later I will submit a follow up patch to only touch those routes if
  - the address we are about to configure does not yet exist,
  - and there exists no such device route before adding the address.
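
The two conditions for the follow-up patch can be written as a single predicate. This is only a sketch of the rule as stated above; the function and parameter names are hypothetical, and how the real patch determines the two inputs is not shown here.

```c
#include <stdbool.h>

/* Decide whether the device route may be touched (removed and re-added with
 * a different metric) while syncing an address. Both inputs describe the
 * state *before* the address is added. */
static bool
should_touch_device_route (bool address_already_configured,
                           bool device_route_existed_before)
{
    return !address_already_configured && !device_route_existed_before;
}
```

Under this rule, a 10.0.0.0/8 route that ipsec installed before the address was configured would be left alone.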
Comment 31 Thomas Haller 2014-11-25 23:22:22 UTC
All three commits merged to master:

http://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=f32075d2fc11252e5661166b2f46c18c017929e9


This should work for most cases.

Then there is a new branch, that contains the fix from comment 30
please review: th/bgo723178_device_route_metric_v2
Comment 32 Ferry Huberts 2014-11-26 06:52:39 UTC
(In reply to comment #30)
> the changes for this branch however can interfere, if you have an additional
> 10.x.y.z/8 address that is managed by NetworkManager -- even on another
> interface.
> 
> Scenario:
> 
> - connect *swan plugin, where ipsec installs a route to 10.0.0.0/8 via the 
>   secure tunnel
> - have any interface also configure an address 10.x.y.z/8 -- be it manually or 
>   DHCP.
> 
> In this case, whenever syncing the addresses of that other interface, NM will
> remove and re-add the 10.0.0.0/8 route.
> 

Which is why the metric/order should be an interface property and NM should _only_ modify routes per interface and not touch other routes.
Comment 33 Thomas Haller 2014-11-28 18:23:25 UTC
(In reply to comment #32)
> (In reply to comment #30)
> > the changes for this branch however can interfere, if you have an additional
> > 10.x.y.z/8 address that is managed by NetworkManager -- even on another
> > interface.
> > 
> > Scenario:
> > 
> > - connect *swan plugin, where ipsec installs a route to 10.0.0.0/8 via the 
> >   secure tunnel
> > - have any interface also configure an address 10.x.y.z/8 -- be it manually or 
> >   DHCP.
> > 
> > In this case, whenever syncing the addresses of that other interface, NM will
> > remove and re-add the 10.0.0.0/8 route.
> > 
> 
> Which is why the metric/order should be an interface property and NM should
> _only_ modify routes per interface and not touch other routes.

Isn't that what patch th/bgo723178_device_route_metric_v2 does?

Note that when you modify the route on one interface, it will conflict when another (unrelated) interface has the same route -- "same" meaning the same "network/plen,metric" pair.
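
A small sketch of what "same" means here: two routes clash when network, prefix length and metric all match, regardless of which interface they are on. The Ip4Route struct and routes_conflict() are hypothetical names for illustration, not the platform API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Minimal stand-in for an IPv4 route; not the NetworkManager platform type. */
typedef struct {
    uint32_t network;   /* host byte order */
    uint8_t  plen;
    uint32_t metric;
    int      ifindex;
} Ip4Route;

/* Two routes conflict when the "network/plen,metric" pair is identical.
 * The ifindex is deliberately not compared. */
static bool
routes_conflict (const Ip4Route *a, const Ip4Route *b)
{
    return a->network == b->network
        && a->plen == b->plen
        && a->metric == b->metric;
}
```

So 192.168.1.0/24 with metric 20 on two different interfaces conflicts, while giving one of them metric 21 avoids the clash.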
Comment 34 Dan Williams 2014-12-10 21:33:02 UTC
nm_platform_ip4_address_sync() and nm_platform_ip4_check_reinstall_device_route() duplicate the same checks for the prefix length.  Let's put them in one place.
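
One possible shape for such a shared check, as a sketch only. The helper name is hypothetical, and the bounds are an assumption drawn from comments 23 and 24 (no device route for a zero-length prefix or for plen == 32); the actual checks in the branch may differ.

```c
#include <stdbool.h>

/* Single place for the prefix-length test, so that both callers agree.
 * Assumed rule: plen 0 and plen 32 do not get a device route. */
static bool
plen_needs_device_route (unsigned plen)
{
    return plen > 0 && plen < 32;
}
```

Both functions could then call this helper instead of repeating the comparison.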

That's all I've got, the rest looks good to me.
Comment 35 Thomas Haller 2014-12-11 09:09:01 UTC
Patch from th/bgo723178_device_route_metric_v2 applied to master as:
http://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=5849c97c0368d51ea35590736aaf40ee522fed0d