GNOME Bugzilla – Bug 762626
Packets read from VLAN include the 802.1Q tag (breaks rp-pppoe)
Last modified: 2016-03-11 18:14:09 UTC
Hello. I tried version 1.1.90-14243.f41aebf897.fc23 (from Fedora copr lkundrak/NetworkManager) and the VLAN have a weird behavior: the 802.1Q tag is included in inbound packets even on the vlan interface. This is visible in Wireshark and programs that ask for raw ethernet packet, like rp-pppoe, choke on the tag where they expect to find the start of the upper layer packet (I gdb'd into pppoe to verify that). For reference and google, the rp-pppoe error is "Bogus PPPoE length field (4359)" and then "Timeout waiting for PADO packets". I rolled back to Fedora 23 version (1.0.10-2.fc23) and all came back to normal, the tag is gone in wireshark and I can connect to PPPoE again.
could be due to https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=5feda42813ddf5d51adf51c4e95cea51748a48fd where we now always call nm_platform_link_vlan_change() even if there is no qos-mapping.
do you see any difference in how the device is configured? Does `ip -d link show` look any different for them?
Older working version gives: 7: ppp0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1412 qdisc fq_codel state UNKNOWN mode DEFAULT group default qlen 3 link/ppp promiscuity 0 addrgenmode eui64 I won't be able to test the newer version today, as I must go. But I tell you that tomorrow.
(In reply to Léo Gillot-Lamure from comment #3) > Older working version gives: > > 7: ppp0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1412 qdisc fq_codel > state UNKNOWN mode DEFAULT group default qlen 3 > link/ppp promiscuity 0 addrgenmode eui64 > > I won't be able to test the newer version today, as I must go. > But I tell you that tomorrow. From a quick test, a VLAN created through NetworkManager works fine here on 1.1.90. When you get the chance, can you also paste the 'ip -d l show' output for the vlan interface?
Oh shh stupid me ! Yeah so here is the vlan: 4: net_orange@enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000 link/ether 44:8a:5b:2b:75:26 brd ff:ff:ff:ff:ff:ff promiscuity 0 vlan protocol 802.1Q id 835 <REORDER_HDR> addrgenmode eui64 And here is the trunk, if that helps: 2: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000 link/ether 44:8a:5b:2b:75:26 brd ff:ff:ff:ff:ff:ff promiscuity 0 addrgenmode eui64
Also, can you please describe how you create the VLAN connection and the output of 'nmcli c show <conn_name>'? How are launching pppoe over the VLAN? Thanks!
So, here are the interface params with the new NM: 4: net_orange@enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000 link/ether 44:8a:5b:2b:75:26 brd ff:ff:ff:ff:ff:ff promiscuity 0 vlan protocol 802.1Q id 835 addrgenmode eui64 2: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000 link/ether 44:8a:5b:2b:75:26 brd ff:ff:ff:ff:ff:ff promiscuity 0 addrgenmode eui64 The most obviously related difference is the lack of <REORDER_HDR>. I created the VLAN connection with the gnome GUI. I launch pppoe with pppd manually (it's the userland pppoe, not the kernel one), since NM doesn't support PPPoE on a VLAN at the moment (see Bug 724501 ;).
Created attachment 322360 [details] Wireshark trace This trace shows the dot1q header present even when sniffing the vlan interface. It is not present with NM 1.0.10.
Created attachment 322361 [details] nmcli c show for the vlan connection
Created attachment 322362 [details] Peer file for pppd
It seems that during the upgrade the vlan flags changed and now don't include REORDER_HDR. What's content of /etc/sysconfig/network-scripts/ifcfg-net_orange? The following commands should fix the problem (but before please grab the content of the file above so that we can investigate why this happened, thanks): nmcli connection mod net_orange vlan.flags 1 nmcli connection up net_orange
Created attachment 322372 [details] VLAN connection network-script
Note that I can go back and forth between old and new NM and it consistently reproduce and fix the bug. In both case nmcli shows the vlan.flags key to be 0. Strangely right now I'm on NM 1.0.10, the key in the ifcfg file is REORDER_HDR=0, and the <REORDER_HDR> appears in ip -d link sh.
(In reply to Léo Gillot-Lamure from comment #13) > Note that I can go back and forth between old and new NM and it consistently > reproduce and fix the bug. In both case nmcli shows the vlan.flags key to be > 0. > > Strangely right now I'm on NM 1.0.10, the key in the ifcfg file is > REORDER_HDR=0, and the <REORDER_HDR> appears in ip -d link sh. Yeah, that's because there seems to be a bug in the way we set vlan flags in 1.0.10 and previous version, and so we can't clear flags that kernel sets by default. So the problem is: you have REORDER_HDR=0 set in the connection (because versions before 1.0.8 set that value by default), however this was ignored in older versions of NM. Now the value is applied correctly in 1.2 and thus the regression. Can you confirm that the commands in my previous comment fix the issue? We should find a workaround for the broken migration path...
Yeah, it fixed it and now I'm running 1.1.90 fine, thanks. I think REORDER_HDR should be 1 by default, it's more logical for the vlan to be transparent for the clients (even the raw clients: if they want to see the tag, they can open the trunk device) isn't it ?
(In reply to Léo Gillot-Lamure from comment #15) > Yeah, it fixed it and now I'm running 1.1.90 fine, thanks. > > I think REORDER_HDR should be 1 by default, it's more logical for the vlan > to be transparent for the clients (even the raw clients: if they want to see > the tag, they can open the trunk device) isn't it ? Yes, it's set by default since commit: https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=687b6515980c08cdbb9734bd112a594166c4d6dd The problem is for connections created with older NM versions.
Oh, right, but I think I misunderstood something : you said that this patch was in 1.0.8 right ? Then it's strange, because I'm on a fresh Fedora 23 install (so with NM 1.0.10 from the start).
(In reply to Léo Gillot-Lamure from comment #17) > Oh, right, but I think I misunderstood something : you said that this patch > was in 1.0.8 right ? > > Then it's strange, because I'm on a fresh Fedora 23 install (so with NM > 1.0.10 from the start). 1.0 and earlier has a bug in that every VLAN added by NetworkManager would have REORDER_HDR set, regardless of the configuration. At the same time, the configuration would default to REORDER_HDR=0, and NM would write that to all connection files. That is fixed on nm-1-2, but it means, we must ignore any REORDER_HDR=0 setting, otherwise users will experience the regression on update. Unfortunately, that means we will behave different then initscripts, which correctly honors REORDER_HDR=0.
How about bg/vlan-reorder-hdr-bgo762626 ?
>> ifcfg-rh: change the handling of REORDER_HDR flag It seems that make check still fails for test-ifcfg-rh. I would not use GString to construct the flags. How about the two fixups? >> libnm-core: change default for missing vlan flags in keyfile reader The default of the GObject property NM_SETTING_VLAN_FLAGS happens to be NM_VLAN_FLAG_REORDER_HEADERS. Thus, you could remove: if (NM_IS_SETTING_VLAN (setting)) { if (!strcmp (property, NM_SETTING_VLAN_FLAGS)) g_object_set (setting, property, (NMVlanFlags) and the "Make sure that if [vlan] group was missing we set vlan.flags to REORDER_HDR" block. Unless you want to keep them to be explicit about it... (which would be fine with me). and maybe remove the comments: /* Ensure the VLAN flags are set to REORDER_HDR */ otherwise, lgtm
(In reply to Thomas Haller from comment #20) > >> ifcfg-rh: change the handling of REORDER_HDR flag > > It seems that make check still fails for test-ifcfg-rh. Strange, it succeeds here (and also on travis). > I would not use GString to construct the flags. How about the two fixups? Look good. > The default of the GObject property NM_SETTING_VLAN_FLAGS happens to be > NM_VLAN_FLAG_REORDER_HEADERS. > > Thus, you could remove: > > if (NM_IS_SETTING_VLAN (setting)) { > if (!strcmp (property, NM_SETTING_VLAN_FLAGS)) > g_object_set (setting, property, (NMVlanFlags) > > and the "Make sure that if [vlan] group was missing we set vlan.flags to > REORDER_HDR" block. > > Unless you want to keep them to be explicit about it... (which would be fine > with me). Good catch, I dropped them. > and maybe remove the comments: > /* Ensure the VLAN flags are set to REORDER_HDR */ Ok. Repushed bg/vlan-reorder-hdr-bgo762626.
(In reply to Beniamino Galvani from comment #21) > (In reply to Thomas Haller from comment #20) > > >> ifcfg-rh: change the handling of REORDER_HDR flag > > > > It seems that make check still fails for test-ifcfg-rh. > > Strange, it succeeds here (and also on travis). NetworkManager-ifcfg-rh:ERROR:test-ifcfg-rh.c:74:_connection_from_file: assertion failed (error == NULL): Could not read file '/data/src/NetworkManager/src/settings/plugins/ifcfg-rh/tests/network-scripts/ifcfg-test-vlan-reorder-hdr-2': No such file or directory (g-file-error-quark, 4) I guess travis doesn't build ifcfg-rh.
(In reply to Thomas Haller from comment #22) > NetworkManager-ifcfg-rh:ERROR:test-ifcfg-rh.c:74:_connection_from_file: > assertion failed (error == NULL): Could not read file > '/data/src/NetworkManager/src/settings/plugins/ifcfg-rh/tests/network- > scripts/ifcfg-test-vlan-reorder-hdr-2': No such file or directory > (g-file-error-quark, 4) Oops, fixed.
Merged to master: https://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=da01b81ff412ab4941025b1cf09d4cd11ba71826
Thanks folks !