Bug 692279 - only one of multiple default gateways from IPv6 router advertisements is duplicated
Status: RESOLVED FIXED
Product: NetworkManager
Classification: Platform
Component: IP and DNS config
Version: git master
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: Pavel Simerda
QA Contact: NetworkManager maintainer(s)
Depends on: 699772
Blocks:
 
 
Reported: 2013-01-22 11:23 UTC by Pavel Simerda
Modified: 2015-03-10 10:55 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Syslog with NetworkManager debug turned on (62.00 KB, text/plain)
2013-01-24 02:19 UTC, Martin Jackson

Description Pavel Simerda 2013-01-22 11:23:56 UTC
From Martin Jackson through NetworkManager mailing list:

Hello,

First, let me thank you for all of your hard work in developing and
maintaining network-manager.  It makes life with a Linux laptop much
simpler.

I have observed a behavior in network-manager in both Ubuntu (12.10) and
Fedora (18).  In the presence of multiple routers advertising defaults,
n-m seems to insert a static default route for one of them, with metric
1.  Here is example output from my Fedora system:

default via fe80::5054:ff:fe01:b6dc dev eth0  proto static  metric 1
default via fe80::5054:ff:fe01:b6dc dev eth0  proto ra  metric 1024  expires 2sec
default via fe80::5054:ff:fe2f:62c2 dev eth0  proto ra  metric 1024  expires 9sec

The Ubuntu system's output looks similar.

I believe that the addition of the "static metric 1" route is incorrect,
as static routes of this type persist even when the router in question
is no longer doing router advertisements.  In that case, the system will
use that route in preference to the other router, which may still be
doing router advertisements.  This creates issues when multiple routers
are correctly advertising on the same link, and the intention is to
provide a failover capability, without the complications of a first hop
redundancy protocol.
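
The failover problem described above can be illustrated with a small model. This is only a sketch under a simplifying assumption: among routes that have not expired, the one with the lowest metric wins. Real kernel route selection also considers router reachability and preference. The `Route` class, `pick_default` function, and the timestamps are hypothetical names and values chosen for illustration; the gateway addresses and metrics come from the output above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Route:
    gateway: str
    proto: str                    # "static" or "ra"
    metric: int
    expires_at: Optional[float]   # None means the route never expires


def live_routes(table: List[Route], now: float) -> List[Route]:
    """Return the routes that have not expired at time `now`."""
    return [r for r in table if r.expires_at is None or r.expires_at > now]


def pick_default(table: List[Route], now: float) -> Optional[Route]:
    """Simplified selection (assumption): the lowest metric among live routes wins."""
    candidates = live_routes(table, now)
    return min(candidates, key=lambda r: r.metric) if candidates else None


ROUTER_A = "fe80::5054:ff:fe01:b6dc"
ROUTER_B = "fe80::5054:ff:fe2f:62c2"

table = [
    Route(ROUTER_A, "static", 1, None),   # duplicate added by NetworkManager
    Route(ROUTER_A, "ra", 1024, 2.0),     # router A stops advertising
    Route(ROUTER_B, "ra", 1024, 9.0),     # router B is still advertising
]

# At t=5 the RA route via router A has expired, but the static copy has not.
chosen = pick_default(table, now=5.0)
print(chosen.gateway, chosen.proto)
```

In this model the static metric 1 route via router A is still selected at t=5, even though only router B is still advertising. With the RA routes alone, the same function falls back to router B.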

I am in control of both of the advertising routers and can enable debug
if needed on either the client systems or the advertising routers (which
are virtual Ubuntu systems running Quagga).

I have filed a bug with Ubuntu
(https://bugs.launchpad.net/ubuntu/+source/network-manager/+bug/1101825), but
this seemed like a better place to address this - I see that recently
there has been some discussion about IPv6 semantics on this list and
some recent patches for IPv6, so this discussion might be timely.

Thanks,
Marty
Comment 1 Pavel Simerda 2013-01-22 11:40:32 UTC
(In reply to comment #0)
> First, let me thank you for all of your hard work in developing and
> maintaining network-manager.  It makes life with a Linux laptop much
> simpler.

Thanks!

> I have observed a behavior in network-manager in both Ubuntu (12.10) and
> Fedora (18).  In the presence of multiple routers advertising defaults,
> n-m seems to insert a static default route for one of them, with metric
> 1.  Here is example output from my Fedora system:
> 
> default via fe80::5054:ff:fe01:b6dc dev eth0  proto static  metric 1
> default via fe80::5054:ff:fe01:b6dc dev eth0  proto ra  metric 1024  expires 2sec
> default via fe80::5054:ff:fe2f:62c2 dev eth0  proto ra  metric 1024  expires 9sec

Yes. This is the current behavior and it is necessary to do it (or something similar) to be able to choose the default routing interface. It might be better to duplicate all local default routes from the default route interface.

Duplicate routes are the way NetworkManager marks the default routing interface in current kernels.
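
A rough sketch of the alternative mentioned above, duplicating every RA default route on the chosen interface instead of only one. This is not NetworkManager code: the `Route` tuple, the `duplicate_defaults` function, and the `wlan0` entry are hypothetical and exist only to illustrate the idea. Metric 1 matches the static route in the reported output.

```python
from typing import List, NamedTuple


class Route(NamedTuple):
    gateway: str
    dev: str
    proto: str
    metric: int


def duplicate_defaults(routes: List[Route], default_dev: str, metric: int = 1) -> List[Route]:
    """Return one static copy per distinct RA gateway on `default_dev`."""
    seen = set()
    copies = []
    for r in routes:
        if r.proto == "ra" and r.dev == default_dev and r.gateway not in seen:
            seen.add(r.gateway)
            copies.append(Route(r.gateway, r.dev, "static", metric))
    return copies


ra_routes = [
    Route("fe80::5054:ff:fe01:b6dc", "eth0", "ra", 1024),
    Route("fe80::5054:ff:fe2f:62c2", "eth0", "ra", 1024),
    Route("fe80::1", "wlan0", "ra", 1024),   # hypothetical second interface
]

copies = duplicate_defaults(ra_routes, default_dev="eth0")
for c in copies:
    print(c)
```

Both eth0 gateways get a low-metric copy, so neither is preferred merely because it was the one duplicated. The copies would still need to be removed when the corresponding advertisement expires.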

> 
> The Ubuntu system's output looks similar.
> 
> I believe that the addition of the "static metric 1" route is incorrect,
> as static routes of this type persist even when the router in question
> is no longer doing router advertisements.

How did you test that?

> In that case, the system will
> use that route in preference to the other router, which may still be
> doing router advertisements.  This creates issues when multiple routers
> are correctly advertising on the same link, and the intention is to
> provide a failover capability, without the complications of a first hop
> redundancy protocol.

Agreed.

> I am in control of both of the advertising routers and can enable debug
> if needed on either the client systems or the advertising routers (which
> are virtual Ubuntu systems running Quagga).
> 
> I have filed a bug with Ubuntu
> (https://bugs.launchpad.net/ubuntu/+source/network-manager/+bug/1101825), but
> this seemed like a better place to address this - I see that recently
> there has been some discussion about IPv6 semantics on this list and
> some recent patches for IPv6, so this discussion might be timely.
> 
> Thanks,
> Marty

The current IPv6 processing works around shortcomings in the Linux kernel, and the same applies to the IPv6 standards.

There is some information about the kernel part in Red Hat Bugzilla:

bugzilla.redhat.com/show_bug.cgi?id=891245
Comment 2 Martin Jackson 2013-01-24 02:19:53 UTC
Created attachment 234272 [details]
Syslog with NetworkManager debug turned on
Comment 3 Martin Jackson 2013-01-24 02:28:07 UTC
To test it, I stopped Quagga on one of the hosts (I have seen the static metric 1 route appear to both of the hosts advertising).  I was curious to see what would happen, since I figured the static metric 1 route would persist, and it does.

I will attach the relevant zebra.conf pieces from my two router VMs, if it will help you.  I recently went through some IPv6 training, and was motivated to see how a Linux client would react. 

I have tried a similar experiment without network-manager; in that scenario I only see the ra metric 1024 routes, not a static metric 1 route.

As a sidebar comment (this probably is how the kernel manages it), but I noticed that on Windows 7, Windows applies a better metric to the route with higher preference, which I think might be useful in Linux.  Apparently the bug linked in RH bugzi is not public as it won't let me view it.  (Though I don't have a Red Hat bugzi account.)
Comment 4 Pavel Simerda 2013-01-25 01:37:47 UTC
(In reply to comment #3)
> To test it, I stopped Quagga on one of the hosts (I have seen the static metric
> 1 route appear to both of the hosts advertising).  I was curious to see what
> would happen, since I figured the static metric 1 route would persist, and it
> does.

Quagga tests are irrelevant.

> I will attach the relevant zebra.conf pieces from my two router VMs, if it will
> help you.  I recently went through some IPv6 training, and was motivated to see
> how a Linux client would react. 

I can very well imagine that :).

> I have tried a similar experiment without network-manager; in that scenario I
> only see the ra metric 1024 routes, not a static metric 1 route.

Experiments without NetworkManager are irrelevant. NetworkManager does that as a workaround for the missing kernel API for choosing the default route interface.

> As a sidebar comment (this probably is how the kernel manages it), but I
> noticed that on Windows 7, Windows applies a better metric to the route with
> higher preference, which I think might be useful in Linux.

This is not possible with Linux until the kernel IPv6 API is useful enough.

> Apparently the bug
> linked in RH bugzi is not public as it won't let me view it.  (Though I don't
> have a Red Hat bugzi account.)

Sorry for that. It should be public now.
Comment 5 Martin Jackson 2013-01-25 02:37:42 UTC
I get the irrelevance comments, at least as far as quagga goes.  Sometimes with things as *practically* new as IPv6, reproduction can be a problem.

I can see the RH bug now, thank you.

Do I understand you correctly that the static route with metric 1 is currently necessary for IPv6 routing to work at all?  (At least in the presence of multiple interfaces - but that seems to be the norm these days.)
Comment 6 Pavel Simerda 2013-01-25 12:46:37 UTC
(In reply to comment #5)
> Do I understand you correctly that the static route with metric 1 is currently
> necessary for IPv6 routing to work at all?  (At least in the presence of
> multiple interfaces - but that seems to be the norm these days.)

Correct. It's the way NetworkManager chooses the outgoing interface. We are talking about nicer solutions, though.
Comment 7 Martin Jackson 2013-01-26 02:08:10 UTC
OK.  Well, I hope this report helps motivate nicer solutions. Thanks for the time and explanations!
Comment 8 Pavel Simerda 2013-01-26 03:01:44 UTC
(In reply to comment #7)
> OK.  Well, I hope this report helps motivate nicer solutions. Thanks for the
> time and explanations!

It certainly will. First, it won't be easily forgotten. Second, it means some people do care about this use case.
Comment 9 Dan Winship 2013-05-02 16:18:44 UTC
NM bugzilla reorganization. Sorry for the bug spam.
Comment 10 Pavel Simerda 2013-05-11 10:13:16 UTC
(In reply to comment #0)
> From Martin Jackson through NetworkManager mailing list:
> default via fe80::5054:ff:fe01:b6dc dev eth0  proto ra  metric 1024  expires 2sec
> default via fe80::5054:ff:fe2f:62c2 dev eth0  proto ra  metric 1024  expires 9sec

From libnl source code (lib/route/route_obj.c):

    .oo_id_attrs        = (ROUTE_ATTR_FAMILY | ROUTE_ATTR_TOS |
                           ROUTE_ATTR_TABLE | ROUTE_ATTR_DST |
                           ROUTE_ATTR_PRIO),

I always understood libnl's oo_id_attrs as a unique key. Therefore I thought that two routes with the same values of id attributes should never exist.

In both cases, the address family is IPv6, the table is main, the destination is default and the priority is 1024. The only remaining id attribute is TOS, which is not shown. I don't see any reason for that one to be different, so I can assume that there are two routes with duplicate netlink oo_id_attrs attributes.
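
To make the concern concrete, here is a small model of what happens if the five id attributes are treated as a unique key. It is a sketch of that reading of oo_id_attrs only, not of libnl's actual cache implementation. The `Route` tuple and `identity` function are hypothetical, and TOS is assumed to be 0 for both routes since it is not shown in the output.

```python
from typing import Dict, NamedTuple, Tuple


class Route(NamedTuple):
    family: str
    tos: int
    table: str
    dst: str
    prio: int
    gateway: str   # not part of the identity key


def identity(route: Route) -> Tuple[str, int, str, str, int]:
    """Key built from family, TOS, table, destination and priority only."""
    return (route.family, route.tos, route.table, route.dst, route.prio)


routes = [
    Route("inet6", 0, "main", "default", 1024, "fe80::5054:ff:fe01:b6dc"),
    Route("inet6", 0, "main", "default", 1024, "fe80::5054:ff:fe2f:62c2"),
]

cache: Dict[Tuple[str, int, str, str, int], Route] = {}
for r in routes:
    cache[identity(r)] = r   # the second route overwrites the first

print(len(cache), cache[identity(routes[0])].gateway)
```

Because the gateway is not part of the key, the two RA default routes collapse into a single entry, and any consumer relying on the key sees only one of the two gateways.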

Fortunately, with libndp we will avoid the whole kernel autoconf mess.
Comment 11 Pavel Simerda 2013-06-18 08:14:13 UTC
This bug report should be solved by development done for bug #699772.
Comment 12 Pavel Simerda 2013-08-13 18:11:36 UTC
In current git master, userspace router discovery (src/rdisc) is used; nm-device takes the list of gateways from it and picks only the first one (as sorted by src/rdisc). Therefore there's no longer a need to duplicate kernel-autoconfigured addresses.
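
A minimal sketch of the "take the first gateway from a sorted list" approach. The actual sort order used by src/rdisc is not stated in this report; ordering by router preference (high, medium, low, as defined in RFC 4191) is an assumption made here, and `Gateway`, `choose_gateway`, and the lifetime values are hypothetical.

```python
from typing import List, NamedTuple, Optional

# Assumed ordering: a lower rank is preferred.
PREFERENCE_RANK = {"high": 0, "medium": 1, "low": 2}


class Gateway(NamedTuple):
    address: str
    preference: str   # "high", "medium" or "low"
    lifetime: int     # remaining router lifetime in seconds


def choose_gateway(gateways: List[Gateway]) -> Optional[Gateway]:
    """Return the first usable gateway after sorting by preference."""
    usable = [g for g in gateways if g.lifetime > 0]
    if not usable:
        return None
    # sorted() is stable, so gateways with equal preference keep their order.
    return sorted(usable, key=lambda g: PREFERENCE_RANK[g.preference])[0]


gateways = [
    Gateway("fe80::5054:ff:fe01:b6dc", "medium", 1800),
    Gateway("fe80::5054:ff:fe2f:62c2", "high", 1800),
]

best = choose_gateway(gateways)
print(best.address)
```

Since the list is re-evaluated from current router discovery state, a router whose lifetime has run out simply drops out of the candidates and the next one is chosen.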
Comment 13 Lorenzo Colitti 2015-03-10 01:52:59 UTC
Pavel, https://bugs.launchpad.net/ubuntu/+source/network-manager/+bug/1101825 links to this issue.

However, that bug is about the fact that when NM receives an RA, it creates a static route that never expires, and when the RA expires, the route is not deleted. That's incorrect, and it will break connectivity if the network default gateway changes. 

In current git master, does receiving an RA also cause NM to create a static route that never expires? (I see this on 0.9.8.8, but don't have an easy way to run master.) Should I file another bug about it here?
Comment 14 Thomas Haller 2015-03-10 10:55:35 UTC
(In reply to Lorenzo Colitti from comment #13)
> Pavel,
> https://bugs.launchpad.net/ubuntu/+source/network-manager/+bug/1101825 links
> to this issue.
> 
> However, that bug is about the fact that when NM receives an RA, it creates
> a static route that never expires, and when the RA expires, the route is not
> deleted. That's incorrect, and it will break connectivity if the network
> default gateway changes. 
> 
> In current git master, does receiving an RA also cause NM to create a static
> route that never expires? (I see this on 0.9.8.8, but don't have an easy way
> to run master.) Should I file another bug about it here?

Since 0.9.10, NM does autoconf in userspace (using libndp), so it works quite differently there. NM 0.9.8 let the kernel handle it.

Routes by themselves don't expire in Linux. But since 0.9.10, NetworkManager keeps track of them and removes them. On NM master, routes should expire. If they don't, it's a separate bug.
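
The userspace bookkeeping described above could look roughly like the following. This is an illustrative sketch, not NetworkManager's implementation: `RouteTracker` and its methods are hypothetical names. The only protocol fact relied on is that a router advertisement with a router lifetime of 0 means the sender should no longer be used as a default router (RFC 4861).

```python
from typing import Dict, List


class RouteTracker:
    """Tracks when each RA-learned default gateway should be removed."""

    def __init__(self) -> None:
        self._expiry: Dict[str, float] = {}

    def on_router_advertisement(self, gateway: str, lifetime: float, now: float) -> None:
        if lifetime == 0:
            # Lifetime 0: the router withdraws itself as a default router.
            self._expiry.pop(gateway, None)
        else:
            # Each new advertisement refreshes the expiry time.
            self._expiry[gateway] = now + lifetime

    def expire(self, now: float) -> List[str]:
        """Drop gateways whose lifetime has run out and return them."""
        dead = [g for g, t in self._expiry.items() if t <= now]
        for g in dead:
            del self._expiry[g]
        return dead

    def gateways(self) -> List[str]:
        return list(self._expiry)


tracker = RouteTracker()
tracker.on_router_advertisement("fe80::5054:ff:fe01:b6dc", lifetime=30, now=0)
tracker.on_router_advertisement("fe80::5054:ff:fe2f:62c2", lifetime=30, now=0)
# Only the second router keeps advertising.
tracker.on_router_advertisement("fe80::5054:ff:fe2f:62c2", lifetime=30, now=20)

removed = tracker.expire(now=35)
print(removed, tracker.gateways())
```

The caller would delete the kernel route for every gateway returned by `expire`, which is the step that the never-expiring static route in 0.9.8 lacked.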


0.9.8 is quite old... in this case a downstream bug seems more appropriate. I don't think that somebody upstream will fix such an old version -- unless it's trivial or somebody wants to work on it.