GNOME Bugzilla – Bug 742939
NetworkManager should use kernel PPPoE (plugin) not /usr/sbin/pppoe binary
Last modified: 2015-05-13 11:53:23 UTC
The current PPPoE implementation in NetworkManager prefers to use the /usr/sbin/pppoe binary. This is slower than kernel PPPoE, and doesn't support RFC 4638 jumbograms, whereas the rp-pppoe.so plugin exploits kernel support for RFC 4638. I have a patch to use rp-pppoe.so instead of /usr/bin/pppoe, which works for me on Linux. I've not tested on other OSes. It also attempts to raise MTU on the underlying Ethernet interface, so that RFC 4638 will work (and this gets me 1500 MTU on AAISP FTTC). I've not directly tested this patch - I've taken the F21 version of NM, and patched that (see https://bugzilla.redhat.com/show_bug.cgi?id=1177549 for the downstream bug, with direct patch for F21), as that's what I use day-to-day. However, it compiles without error.
Created attachment 294556 [details] [review] kernel PPPoE patch
(In reply to comment #0) > The current PPPoE implementation in NetworkManager prefers to use the > /usr/sbin/pppoe binary. This is slower than kernel PPPoE, and doesn't support > RFC 4638 jumbograms, whereas the rp-pppoe.so plugin exploits kernel support for > RFC 4638. Sigh. NM used to use kernel PPPoE, but we switched to using userland PPPoE in 0.9.10 because of a bug in the kernel implementation. (If the remote end drops the connection in a certain way, the kernel plugin doesn't notice, and so NM thinks the connection is still active even though it's dead.)
(In reply to comment #2) > (In reply to comment #0) > > The current PPPoE implementation in NetworkManager prefers to use the > > /usr/sbin/pppoe binary. This is slower than kernel PPPoE, and doesn't support > > RFC 4638 jumbograms, whereas the rp-pppoe.so plugin exploits kernel support for > > RFC 4638. > > Sigh. NM used to use kernel PPPoE, but we switched to using userland PPPoE in > 0.9.10 because of a bug in the kernel implementation. (If the remote end drops > the connection in a certain way, the kernel plugin doesn't notice, and so NM > thinks the connection is still active even though it's dead.) Sigh. I spent my evening working out how NetworkManager handles PPPoE so that I could get 1500 MTU (hence discussing RFC 4638 in the initial bug report), only to discover that the developers don't care what users might want. I want 1500 MTU on my PPPoE link; it looked to me like using the kernel PPPoE implementation would get this for me, and boost performance to boot (I'm seeing a latency drop of around 1 ms, and a reduction in CPU load). Can you at least take 30 seconds to point me to your upstream bug report for the issue with the kernel PPPoE implementation or the rp-pppoe.so plugin? I can then go and fix that, and ask you to reconsider your use of the incomplete userspace PPPoE implementation once it's fixed.
http://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=7955806a02db64b20079267743056d7d9d45af3b IMO if this gets changed again, it should be made configurable via NetworkManager.conf.
(In reply to comment #3) > only to discover that the developers don't care what users might want. We do care... that's why we fixed the original bug. At the time, it seemed like the userland PPPoE was strictly better than the kernel PPPoE. My "sigh" above was because apparently, no, each of them is better in some ways and worse in other ways. > Can you at least take 30 seconds to point me to your upstream bug report for > the issue with the kernel PPPoE implementation or the rp-pppoe.so plugin? It was a RHEL bug and there's customer info in it so it's not public. The relevant bit is: Problem: not all PPPoE servers send LCP Termination requests before they send PPPoE PADT termination frames. This causes the kernel's pppoe.ko module to terminate the connection, but gives pppd no indication that it has done so. This is because pppd only listens for the PPP LCP frames, and knows nothing about PPPoE. The pppd rp-pppoe.so plugin does know about PPPoE PADT frames, but unfortunately (1) it hands off all read/write operations to pppd after setting up the initial connection, and thus cannot read further PPPoE frames, and (2) the kernel's pppoe.ko eats the PADT frame without passing them up to userspace anyway. Thus, when using kernel PPPoE with pppd, there is no possibility to notice a PPPoE PADT termination frame from the server, and pppd knows nothing about the server terminating the PPPoE session. Usually, a server would send an LCP Termination request, followed by a PPPoE PADT frame, and pppd would see the LCP Termination request and stop the connection locally. Unfortunately, not all servers do this. Short-term solution: to work around this issue, NetworkManager should be changed to use userland PPP, which involves using the /usr/sbin/pppoe binary to wrap pppd. Since /usr/sbin/pppoe wraps pppd and listens to traffic before passing it to pppd, it is able to capture PPPoE frames from the ethernet interface and recognize the PADT without the kernel interfering. The /usr/sbin/pppoe-connect script shows how to call pppd in this way, and some of this logic should be duplicated in NetworkManager.
Created attachment 295901 [details] [review] [PATCH] pppoe: Use workqueue to die properly when a PADT is received When a PADT frame is received, the socket may not be in a good state to close down the PPP interface. Use schedule_work to get to a process context from which we clear down the PPP interface, in a fashion analogous to hangup on a TTY-based PPP interface. Signed-off-by: Simon Farnsworth <simon@farnz.org.uk> --- drivers/net/ppp/pppoe.c | 15 ++++++++++++++- include/linux/if_pppox.h | 2 ++ 2 files changed, 16 insertions(+), 1 deletion(-)
(In reply to comment #6) > Created an attachment (id=295901) [details] [review] > [PATCH] pppoe: Use workqueue to die properly when a PADT is received > > > When a PADT frame is received, the socket may not be in a good state to > close down the PPP interface. Use schedule_work to get to a process > context from which we clear down the PPP interface, in a fashion > analogous to hangup on a TTY-based PPP interface. > > Signed-off-by: Simon Farnsworth <simon@farnz.org.uk> > --- > drivers/net/ppp/pppoe.c | 15 ++++++++++++++- > include/linux/if_pppox.h | 2 ++ > 2 files changed, 16 insertions(+), 1 deletion(-) This patch is compile-tested only, and is for the kernel - if I've understood properly, it fixes the kernel's PADT handling to be the same as hangup on a TTY. I've no easy way to test this; PADTs on my Internet link are rare, and I have no other PPPoE connection available. I'm happy to send it upstream myself, but not until it's been at least looked over by another developer, and ideally tested. The patch is against DaveM's net-next branch, as of: commit 58c11b5faed6913f73f2763d3a85e4a668e8ba2b Author: Karicheri, Muralidharan <m-karicheri2@ti.com> Date: Thu Jan 29 18:15:51 2015 -0500 drivers: net: cpsw: make cpsw_ale.c a module to allow re-use on Keystone
Created attachment 295902 [details] [review] [PATCH] Use kernel PPPoE instead of separate program Considerably more efficient on Linux, and enables RFC4638 jumbogram support Note that this may break PPPoE on non-Linux systems - I haven't checked. Signed-off-by: Simon Farnsworth <simon@farnz.org.uk> --- src/nm-config.c | 10 ++++ src/nm-config.h | 2 + src/ppp-manager/nm-ppp-manager.c | 114 +++++++++++++++++++++++++++++---------- 3 files changed, 97 insertions(+), 29 deletions(-)
Created attachment 295903 [details] [review] [PATCH] ppp: Fix PPPoE MRU and MTU handling Userspace PPPoE can't do MRU or MTU greater than 1492 bytes, but doesn't fail cleanly; it simply doesn't pass PPP frames that are oversized. Cap PPP M[RT]U when using userspace PPPoE, to avoid data loss Conversely, kernel PPPoE caps M[RT]U based on the MTU of the underlying interface. Attempt to increase the MTU of the underlying interface to match the user's requested M[RT]U, ignoring failure. This gets the user their chosen M[RT]U if it's possible on their hardware. Signed-off-by: Simon Farnsworth <simon@farnz.org.uk> --- src/ppp-manager/nm-ppp-manager.c | 63 ++++++++++++++++++++++++++++------------ 1 file changed, 44 insertions(+), 19 deletions(-)
Note that the two NetworkManager patches I've just attached are compile-tested only (it's late here, and I'm not up to replacing and restoring working Internet tonight). The first patch makes userspace versus kernel PPPoE a configuration option, and enables support for kernel PPPoE. I've defaulted the option to FALSE for now, but I'm willing to argue that point. The second fixes M[RT]U handling, which has one straight bug (created by switching to userspace PPPoE), and one feature requirement. The straight bug is that userspace PPPoE won't pass jumbograms to pppd; it simply drops any PPP frame with more than 1492 bytes of payload. However, pppd doesn't know that this is going to happen - it thinks it's on a serial line. Lost data ensues if you're configured to use PPPoE MTU greater than 1492 bytes. Note that with kernel PPPoE, this wouldn't happen; pppd would only be allowed to use a larger MTU if the underlying interface supported it, and would cap to 1492 bytes during LCP negotiation with the default 1500 byte Ethernet MTU. The feature requirement is to ask the kernel to increase the interface MTU if the PPP MTU requires it - this means I can set an MTU of 1500 bytes and, if the remote end supports RFC 4638 and a 1500 byte MTU, I get 1500 byte MTU.
Simon, thanks for working on this. I can try to test out the kernel side patches, but it'll take some time since I'm away from the setup for a week.
(In reply to comment #11) > Simon, thanks for working on this. I can try to test out the kernel side > patches, but it'll take some time since I'm away from the setup for a week. I'm happy to fix any faults you find - the patch is entirely based on code inspection and reading documentation, and is only compile tested, so may need rework anyway.
(In reply to Dan Williams from comment #11) > Simon, thanks for working on this. I can try to test out the kernel side > patches, but it'll take some time since I'm away from the setup for a week. Hello Dan, Have you had a spare moment to test the kernel patch? Also, do you have any feedback on the two NM patches?
Not yet, hopefully this week...
I can confirm that your patches do the trick, both in the kernel and NM. For the kernel patch, three suggestions: 1) maybe add some text to the commit message that indicate what the user-visible problem was before the patch, eg that the PADT gets eaten by the kernel but userspace is unaware that the session has been closed 2) put a newline after pppoe_unbind_sock_work() and before the /******/ line 3) put a space after the 'if' in pppoe_disc_rcv() where the schedule_work gets called. Kernel style has a space between the 'if' and '(', but (as you do correctly) no space between a function name and '('. ----- For the NM side, what I'd like to do is simply revert the original userspace PPPoE patch and return to the previous code, and simply drop the userspace PPPoE option entirely if the problem is solved. Second, if we fail to set the MTU on the underlying interface, is pppd smart enough to clamp the given 'mru' and 'mtu' options? Or should NM be clamping them if setting the MTU fails? Thanks!
Created attachment 297117 [details] [review] 0001-Revert-ppp-manager-don-t-use-kernel-pppoe-rh-1034860.patch
Created attachment 297118 [details] [review] 0002-ppp-Fix-PPPoE-MRU-and-MTU-handling.patch Simon, can you review this port of your patch to the revert I did? Thanks!
(In reply to Dan Williams from comment #16) > Created attachment 297117 [details] [review] [review] > 0001-Revert-ppp-manager-don-t-use-kernel-pppoe-rh-1034860.patch Doesn't this mean, that users of an older kernel will again have the problem? using pppoe they could avoid that problem and instead having missing mtu functionality. How about making it a configure option?
If this was a widespread problem that might make sense, but we only know of a single person who was affected by the original bug
(In reply to Dan Williams from comment #15) > I can confirm that your patches do the trick, both in the kernel and NM. > > For the kernel patch, three suggestions: > > 1) maybe add some text to the commit message that indicate what the > user-visible problem was before the patch, eg that the PADT gets eaten by > the kernel but userspace is unaware that the session has been closed > > 2) put a newline after pppoe_unbind_sock_work() and before the /******/ line > > 3) put a space after the 'if' in pppoe_disc_rcv() where the schedule_work > gets called. Kernel style has a space between the 'if' and '(', but (as you > do correctly) no space between a function name and '('. > I'll clean it up (including better commit message) and send to netdev for the usual mob to shred. > ----- > > For the NM side, what I'd like to do is simply revert the original userspace > PPPoE patch and return to the previous code, and simply drop the userspace > PPPoE option entirely if the problem is solved. > Sounds very good to me; I was just doing the extra work suggested in comment #4 on this bug. > Second, if we fail to set the MTU on the underlying interface, is pppd smart > enough to clamp the given 'mru' and 'mtu' options? Or should NM be clamping > them if setting the MTU fails? > The rp-pppoe.so plugin is smart enough to clamp the given mru and mtu options. plugin.c contains code to reduce them if needed based on interface MTU; discovery.c contains code to reduce them if needed based on the max-payload tag from the AC.
(In reply to Thomas Haller from comment #18) > (In reply to Dan Williams from comment #16) > > Created attachment 297117 [details] [review] [review] [review] > > 0001-Revert-ppp-manager-don-t-use-kernel-pppoe-rh-1034860.patch > > Doesn't this mean, that users of an older kernel will again have the problem? > > using pppoe they could avoid that problem and instead having missing mtu > functionality. > > How about making it a configure option? Note that there's also a workaround for most users; if lcp-echo-failure and lcp-echo-interval are both non-zero, then as long as the PPP endpoint supports LCP echoes, you'll be fine - the echo failure will result in the connection going down. The only people who need userspace PPPoE to make their setup work reliably are those who both have PADTs without a preceding LCP close, *and* have a PPP peer that doesn't respond to LCP echoes.
Review of attachment 297118 [details] [review]: Looks like a correct port of the code to me. Commit message needs fixing, though, as we no longer do userspace PPPoE. Something like: With a default 1500 Ethernet MTU, PPPoE can't do a PPP MTU or MRU greater than 1492 bytes; you need a larger Ethernet MTU to do larger PPP M[RT]U. rp-pppoe.so caps M[RT]U based on the Ethernet MTU of the underlying interface. Attempt to increase the Ethernet MTU of the underlying interface to match the user's requested M[RT]U, ignoring failure. This gets the user their chosen M[RT]U if it's possible on their hardware, and fails gracefully if the user has requested an M[RT]U that cannot be supported.
(In reply to Thomas Haller from comment #18) > (In reply to Dan Williams from comment #16) > > Created attachment 297117 [details] [review] [review] [review] > > 0001-Revert-ppp-manager-don-t-use-kernel-pppoe-rh-1034860.patch > > Doesn't this mean, that users of an older kernel will again have the problem? > > using pppoe they could avoid that problem and instead having missing mtu > functionality. > > How about making it a configure option? I'd rather not, for similar reasons as danw described... for RHEL at least we can get the patch nominated for the kernel, and we can point out the kernel patch git SHA/details in the revert patch for NM. I think that's sufficient.
Message-Id: <1424381068-22252-1-git-send-email-simon@farnz.org.uk> has been sent to netdev@ for review and flaming.
https://patchwork.ozlabs.org/patch/444717/ is (hopefully) the final version of the patch, for DaveM to accept or reject as appropriate.
And, for anyone who needs to track the kernel change too for downstream, https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/?id=287f3a943fef58c5c73e42545169443be379222f is the commit in DaveM's net-next tree. I'll go quiet now, and let you get on with making NM better for everyone.
Thanks again Simon! I think when net-next hits Linus' tree then we can merge the NetworkManager patches with a reference to the kernel patch since we'll have a stable commit ID (stuff might get rebased before then, which changes the commit ID). I'm OK with backporting the NM patches to 1.0 as well, since the lost-PADT thing apparently only affects a very, very small number of people, while the MSS issues are more widely detrimental.
(In reply to Dan Williams from comment #27) > Thanks again Simon! > > I think when net-next hits Linus' tree then we can merge the NetworkManager > patches with a reference to the kernel patch since we'll have a stable > commit ID (stuff might get rebased before then, which changes the commit > ID). I'm OK with backporting the NM patches to 1.0 as well, since the > lost-PADT thing apparently only affects a very, very small number of people, > while the MSS issues are more widely detrimental. I don't think that net-next gets rebased.
(In reply to Dan Williams from comment #27) > Thanks again Simon! > > I think when net-next hits Linus' tree then we can merge the NetworkManager > patches with a reference to the kernel patch since we'll have a stable > commit ID (stuff might get rebased before then, which changes the commit > ID). I'm OK with backporting the NM patches to 1.0 as well, since the > lost-PADT thing apparently only affects a very, very small number of people, > while the MSS issues are more widely detrimental. BTW, when it comes to PADT support, you may also want to be aware of https://github.com/paulusmack/ppp/issues/3 - pppd doesn't send the optional PADT on disconnect, and apparently some PPPoE access concentrators keep the session alive if you disconnect without PADT.
Simon's patch has now hit Linus' tree and queued for 4.1 kernel.
Branch bg/kernel-pppoe-bgo742939 includes the two NM patches with tweaked commit messages and an additional patch to fix MTU handling.
(In reply to Beniamino Galvani from comment #31) > Branch bg/kernel-pppoe-bgo742939 includes the two NM patches with tweaked > commit messages and an additional patch to fix MTU handling. + mru = nm_setting_ppp_get_mru (setting); + mtu = nm_setting_ppp_get_mru (setting); ^^^ mtu + nm_platform_link_set_mtu (nm_platform_link_get_ifindex (priv->parent_iface), mxu + 8); this can cause assertions if the lookup of parent_iface fails. IMO it's a very strange place to set the MTU of the parent device. I would have create_pppd_cmd_line() and nm_ppp_manager_start() return a @out_desired_parent_mtu and the calling device should set the MTU.
While backporting this fix to Ubuntu, I had to revert the following section as well to avoid ppp-manager/nm-ppp-manager.c:659:2: error: 'PPPOE_PATH' undeclared here (not in a function) For some reason, my Kyocera iBurst PPP modem does not work at all with /usr/sbin/pppoe, but works just fine with the kernel PPPoE. --- a/src/ppp-manager/nm-ppp-manager.c +++ b/src/ppp-manager/nm-ppp-manager.c @@ -655,14 +655,6 @@ NULL }; -static const char *pppoe_binary_paths[] = { - PPPOE_PATH, - "/usr/local/sbin/pppoe", - "/usr/sbin/pppoe", - "/sbin/pppoe", - NULL -}; - static inline const char * nm_find_binary (const char *paths[]) {
(In reply to Graham Inggs from comment #33) > While backporting this fix to Ubuntu, I had to revert the following section > as well to avoid > ppp-manager/nm-ppp-manager.c:659:2: error: 'PPPOE_PATH' undeclared here (not > in a function) Right, that chunk was removed in commit 544fc82aa722 "core: consolidate helper program searching (bgo #734131)", so if you are using an older NM version you should revert that part of code as well.
(In reply to Thomas Haller from comment #32) > + nm_platform_link_set_mtu (nm_platform_link_get_ifindex > (priv->parent_iface), mxu + 8); > > this can cause assertions if the lookup of parent_iface fails. Ok. > IMO it's a very strange place to set the MTU of the parent device. I would > have create_pppd_cmd_line() and nm_ppp_manager_start() return a > @out_desired_parent_mtu and the calling device should set the MTU. We need to set the MTU on parent interface before starting pppd, otherwise the PPP MRU and MTU would be capped by pppd to comply with Ethernet MTU. How about changing Ethernet MTU for PPPoE connections according to PPP configuration in stage2 (config) instead?
(In reply to Beniamino Galvani from comment #35) > (In reply to Thomas Haller from comment #32) > > + nm_platform_link_set_mtu (nm_platform_link_get_ifindex > > (priv->parent_iface), mxu + 8); > > > > this can cause assertions if the lookup of parent_iface fails. > > Ok. > > > IMO it's a very strange place to set the MTU of the parent device. I would > > have create_pppd_cmd_line() and nm_ppp_manager_start() return a > > @out_desired_parent_mtu and the calling device should set the MTU. > > We need to set the MTU on parent interface before starting pppd, > otherwise the PPP MRU and MTU would be capped by pppd to comply with > Ethernet MTU. > > How about changing Ethernet MTU for PPPoE connections according to PPP > configuration in stage2 (config) instead? Yeah, that's probably where it should happen.
Created attachment 302898 [details] [review] [PATCH] device: set Ethernet MTU for PPPoE connections in stage2/config
You could use nm_connection_is_type(connection, NM_SETTING_PPPOE_SETTING_NAME) there instead of a strcmp(). The rest looks good.
(In reply to Dan Williams from comment #38) > You could use nm_connection_is_type(connection, > NM_SETTING_PPPOE_SETTING_NAME) there instead of a strcmp(). Pushed fixup to branch bg/kernel-pppoe-mtu-bgo742939.
LGTM
+ if (mxu) { + iface = nm_device_get_iface (device); + ifindex = nm_platform_link_get_ifindex (NM_PLATFORM_GET, iface); is there are reason not to use nm_device_get_ip_ifindex()? If yes, would be worth a code comment :)
(In reply to Thomas Haller from comment #41) > + if (mxu) { > + iface = nm_device_get_iface (device); > + ifindex = nm_platform_link_get_ifindex (NM_PLATFORM_GET, > iface); > > > is there are reason not to use nm_device_get_ip_ifindex()? No particular reason, nm_device_get_ip_ifindex() seems a better choice since it uses the cached value. Maybe nm_device_get_ifindex() is even better as it makes clear that we are interested in the index of underlying device (get_ip_ifindex() and get_ifindex() should be equivalent at this point since the IP layer is not configured yet). Pushed a fixup to bg/kernel-pppoe-mtu-bgo742939.
Fixup looks good; and you're right that everywhere we can, we should use ifindexes instead of interface names (since those can change and it's a short delay until NM knows that they have changed).
Squashed and merged to master: d6ac377 ppp: merge branch 'bg/kernel-pppoe-mtu-bgo742939'