After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 762626 - Packets read from VLAN include the 802.1Q tag (breaks rp-pppoe)
Packets read from VLAN include the 802.1Q tag (breaks rp-pppoe)
Status: RESOLVED FIXED
Product: NetworkManager
Classification: Platform
Component: IP and DNS config
git master
Other Linux
: Normal normal
: ---
Assigned To: NetworkManager maintainer(s)
NetworkManager maintainer(s)
Depends on:
Blocks: nm-1-2
 
 
Reported: 2016-02-24 16:43 UTC by Léo Gillot-Lamure
Modified: 2016-03-11 18:14 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Wireshark trace (748 bytes, application/x-pcapng)
2016-02-25 12:38 UTC, Léo Gillot-Lamure
Details
nmcli c show for the vlan connection (3.80 KB, text/plain)
2016-02-25 12:39 UTC, Léo Gillot-Lamure
Details
Peer file for pppd (216 bytes, text/plain)
2016-02-25 12:41 UTC, Léo Gillot-Lamure
Details
VLAN connection network-script (294 bytes, text/plain)
2016-02-25 13:36 UTC, Léo Gillot-Lamure
Details

Description Léo Gillot-Lamure 2016-02-24 16:43:33 UTC
Hello.

I tried version 1.1.90-14243.f41aebf897.fc23 (from Fedora copr  lkundrak/NetworkManager) and the VLAN have a weird behavior: the 802.1Q tag is included in inbound packets even on the vlan interface.

This is visible in Wireshark and programs that ask for raw ethernet packet, like rp-pppoe, choke on the tag where they expect to find the start of the upper layer packet (I gdb'd into pppoe to verify that).
For reference and google, the rp-pppoe error is "Bogus PPPoE length field (4359)" and then "Timeout waiting for PADO packets".

I rolled back to Fedora 23 version (1.0.10-2.fc23) and all came back to normal, the tag is gone in wireshark and I can connect to PPPoE again.
Comment 1 Thomas Haller 2016-02-24 17:07:27 UTC
could be due to https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=5feda42813ddf5d51adf51c4e95cea51748a48fd where we now always call nm_platform_link_vlan_change() even if there is no qos-mapping.
Comment 2 Thomas Haller 2016-02-24 17:08:01 UTC
do you see any difference in how the device is configured?

Does `ip -d link show` look any different for them?
Comment 3 Léo Gillot-Lamure 2016-02-24 17:13:12 UTC
Older working version gives:

7: ppp0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1412 qdisc fq_codel state UNKNOWN mode DEFAULT group default qlen 3
    link/ppp  promiscuity 0 addrgenmode eui64

I won't be able to test the newer version today, as I must go.
But I tell you that tomorrow.
Comment 4 Beniamino Galvani 2016-02-24 17:31:37 UTC
(In reply to Léo Gillot-Lamure from comment #3)
> Older working version gives:
> 
> 7: ppp0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1412 qdisc fq_codel
> state UNKNOWN mode DEFAULT group default qlen 3
>     link/ppp  promiscuity 0 addrgenmode eui64
> 
> I won't be able to test the newer version today, as I must go.
> But I tell you that tomorrow.

From a quick test, a VLAN created through NetworkManager works fine here on 1.1.90. When you get the chance, can you also paste the 'ip -d l show' output for the vlan interface?
Comment 5 Léo Gillot-Lamure 2016-02-24 17:39:45 UTC
Oh shh stupid me !
Yeah so here is the vlan:

4: net_orange@enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 44:8a:5b:2b:75:26 brd ff:ff:ff:ff:ff:ff promiscuity 0 
    vlan protocol 802.1Q id 835 <REORDER_HDR> addrgenmode eui64 

And here is the trunk, if that helps:

2: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether 44:8a:5b:2b:75:26 brd ff:ff:ff:ff:ff:ff promiscuity 0 addrgenmode eui64
Comment 6 Beniamino Galvani 2016-02-24 18:06:56 UTC
Also, can you please describe how you create the VLAN connection and the output of 'nmcli c show <conn_name>'? How are launching pppoe over the VLAN?
Thanks!
Comment 7 Léo Gillot-Lamure 2016-02-25 12:37:12 UTC
So, here are the interface params with the new NM:

4: net_orange@enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 44:8a:5b:2b:75:26 brd ff:ff:ff:ff:ff:ff promiscuity 0 
    vlan protocol 802.1Q id 835 addrgenmode eui64 

2: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether 44:8a:5b:2b:75:26 brd ff:ff:ff:ff:ff:ff promiscuity 0 addrgenmode eui64

The most obviously related difference is the lack of <REORDER_HDR>.

I created the VLAN connection with the gnome GUI. I launch pppoe with pppd manually (it's the userland pppoe, not the kernel one), since NM doesn't support PPPoE on a VLAN at the moment (see Bug 724501 ;).
Comment 8 Léo Gillot-Lamure 2016-02-25 12:38:32 UTC
Created attachment 322360 [details]
Wireshark trace

This trace shows the dot1q header present even when sniffing the vlan interface. It is not present with NM 1.0.10.
Comment 9 Léo Gillot-Lamure 2016-02-25 12:39:59 UTC
Created attachment 322361 [details]
nmcli c show for the vlan connection
Comment 10 Léo Gillot-Lamure 2016-02-25 12:41:48 UTC
Created attachment 322362 [details]
Peer file for pppd
Comment 11 Beniamino Galvani 2016-02-25 13:28:54 UTC
It seems that during the upgrade the vlan flags changed
and now don't include REORDER_HDR. What's content of
/etc/sysconfig/network-scripts/ifcfg-net_orange?

The following commands should fix the problem (but before please grab
the content of the file above so that we can investigate why this
happened, thanks):

  nmcli connection mod net_orange vlan.flags 1
  nmcli connection up net_orange
Comment 12 Léo Gillot-Lamure 2016-02-25 13:36:58 UTC
Created attachment 322372 [details]
VLAN connection network-script
Comment 13 Léo Gillot-Lamure 2016-02-25 13:38:31 UTC
Note that I can go back and forth between old and new NM and it consistently reproduce and fix the bug. In both case nmcli shows the vlan.flags key to be 0.

Strangely right now I'm on NM 1.0.10, the key in the ifcfg file is REORDER_HDR=0, and the <REORDER_HDR> appears in ip -d link sh.
Comment 14 Beniamino Galvani 2016-02-25 16:10:18 UTC
(In reply to Léo Gillot-Lamure from comment #13)
> Note that I can go back and forth between old and new NM and it consistently
> reproduce and fix the bug. In both case nmcli shows the vlan.flags key to be
> 0.
>
> Strangely right now I'm on NM 1.0.10, the key in the ifcfg file is
> REORDER_HDR=0, and the <REORDER_HDR> appears in ip -d link sh.

Yeah, that's because there seems to be a bug in the way we set vlan
flags in 1.0.10 and previous version, and so we can't clear flags that
kernel sets by default.

So the problem is: you have REORDER_HDR=0 set in the connection
(because versions before 1.0.8 set that value by default), however
this was ignored in older versions of NM.

Now the value is applied correctly in 1.2 and thus the regression. Can
you confirm that the commands in my previous comment fix the issue?

We should find a workaround for the broken migration path...
Comment 15 Léo Gillot-Lamure 2016-02-25 16:15:44 UTC
Yeah, it fixed it and now I'm running 1.1.90 fine, thanks.

I think REORDER_HDR should be 1 by default, it's more logical for the vlan to be transparent for the clients (even the raw clients: if they want to see the tag, they can open the trunk device) isn't it ?
Comment 16 Beniamino Galvani 2016-02-26 10:08:11 UTC
(In reply to Léo Gillot-Lamure from comment #15)
> Yeah, it fixed it and now I'm running 1.1.90 fine, thanks.
> 
> I think REORDER_HDR should be 1 by default, it's more logical for the vlan
> to be transparent for the clients (even the raw clients: if they want to see
> the tag, they can open the trunk device) isn't it ?

Yes, it's set by default since commit:

https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=687b6515980c08cdbb9734bd112a594166c4d6dd

The problem is for connections created with older NM versions.
Comment 17 Léo Gillot-Lamure 2016-02-26 11:19:04 UTC
Oh, right, but I think I misunderstood something : you said that this patch was in 1.0.8 right ?

Then it's strange, because I'm on a fresh Fedora 23 install (so with NM 1.0.10 from the start).
Comment 18 Thomas Haller 2016-02-26 12:29:37 UTC
(In reply to Léo Gillot-Lamure from comment #17)
> Oh, right, but I think I misunderstood something : you said that this patch
> was in 1.0.8 right ?
> 
> Then it's strange, because I'm on a fresh Fedora 23 install (so with NM
> 1.0.10 from the start).

1.0 and earlier has a bug in that every VLAN added by NetworkManager would have REORDER_HDR set, regardless of the configuration.

At the same time, the configuration would default to REORDER_HDR=0, and NM would write that to all connection files.


That is fixed on nm-1-2, but it means, we must ignore any REORDER_HDR=0 setting, otherwise users will experience the regression on update. Unfortunately, that means we will behave different then initscripts, which correctly honors REORDER_HDR=0.
Comment 19 Beniamino Galvani 2016-02-26 15:48:58 UTC
How about bg/vlan-reorder-hdr-bgo762626 ?
Comment 20 Thomas Haller 2016-02-26 16:21:57 UTC
>> ifcfg-rh: change the handling of REORDER_HDR flag

It seems that make check still fails for test-ifcfg-rh.


I would not use GString to construct the flags. How about the two fixups?



>> libnm-core: change default for missing vlan flags in keyfile reader


The default of the GObject property NM_SETTING_VLAN_FLAGS happens to be NM_VLAN_FLAG_REORDER_HEADERS.

Thus, you could remove:

     if (NM_IS_SETTING_VLAN (setting)) {
          if (!strcmp (property, NM_SETTING_VLAN_FLAGS))
              g_object_set (setting, property, (NMVlanFlags) 
 
and the "Make sure that if [vlan] group was missing we set vlan.flags to REORDER_HDR" block.


Unless you want to keep them to be explicit about it... (which would be fine with me).





and maybe remove the comments:
/* Ensure the VLAN flags are set to REORDER_HDR */




otherwise, lgtm
Comment 21 Beniamino Galvani 2016-02-29 10:21:30 UTC
(In reply to Thomas Haller from comment #20)
> >> ifcfg-rh: change the handling of REORDER_HDR flag
> 
> It seems that make check still fails for test-ifcfg-rh.

Strange, it succeeds here (and also on travis).
 
> I would not use GString to construct the flags. How about the two fixups?

Look good.

> The default of the GObject property NM_SETTING_VLAN_FLAGS happens to be
> NM_VLAN_FLAG_REORDER_HEADERS.
> 
> Thus, you could remove:
> 
>      if (NM_IS_SETTING_VLAN (setting)) {
>           if (!strcmp (property, NM_SETTING_VLAN_FLAGS))
>               g_object_set (setting, property, (NMVlanFlags) 
>  
> and the "Make sure that if [vlan] group was missing we set vlan.flags to
> REORDER_HDR" block.
> 
> Unless you want to keep them to be explicit about it... (which would be fine
> with me).

Good catch, I dropped them.

> and maybe remove the comments:
> /* Ensure the VLAN flags are set to REORDER_HDR */

Ok. Repushed bg/vlan-reorder-hdr-bgo762626.
Comment 22 Thomas Haller 2016-02-29 12:51:10 UTC
(In reply to Beniamino Galvani from comment #21)
> (In reply to Thomas Haller from comment #20)
> > >> ifcfg-rh: change the handling of REORDER_HDR flag
> > 
> > It seems that make check still fails for test-ifcfg-rh.
> 
> Strange, it succeeds here (and also on travis).

NetworkManager-ifcfg-rh:ERROR:test-ifcfg-rh.c:74:_connection_from_file: assertion failed (error == NULL): Could not read file '/data/src/NetworkManager/src/settings/plugins/ifcfg-rh/tests/network-scripts/ifcfg-test-vlan-reorder-hdr-2': No such file or directory (g-file-error-quark, 4)

I guess travis doesn't build ifcfg-rh.
Comment 23 Beniamino Galvani 2016-02-29 13:34:26 UTC
(In reply to Thomas Haller from comment #22)
> NetworkManager-ifcfg-rh:ERROR:test-ifcfg-rh.c:74:_connection_from_file:
> assertion failed (error == NULL): Could not read file
> '/data/src/NetworkManager/src/settings/plugins/ifcfg-rh/tests/network-
> scripts/ifcfg-test-vlan-reorder-hdr-2': No such file or directory
> (g-file-error-quark, 4)

Oops, fixed.
Comment 25 Léo Gillot-Lamure 2016-03-01 19:29:23 UTC
Thanks folks !