After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 697360 - [review] pavlix/platform2
[review] pavlix/platform2
Status: RESOLVED FIXED
Product: NetworkManager
Classification: Platform
Component: general
git master
Other Linux
: Normal normal
: ---
Assigned To: Pavel Simerda
NetworkManager maintainer(s)
Depends on: 696434
Blocks:
 
 
Reported: 2013-04-05 14:45 UTC by Pavel Simerda
Modified: 2013-04-24 16:59 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Pavel Simerda 2013-04-05 14:45:17 UTC
A set of patches on top of nm-platform branch regarding master and slave devices. Please review.
Comment 1 Dan Winship 2013-04-05 20:42:42 UTC
Mostly looks good to me.

In hack_empty_master_iff_lower_up(), I'm not sure that a bond should be considered "down" if it has slaves, even if all the slaves are down. (It's possible that this is correct, I just don't remember for sure if it is.)

nm-linux-platform.c's link_enslave() and link_release() are written differently for no good reason. (IMHO, link_release's way is nicer.)

For sysctl_set, I had written an optimization to that code a while back but never got around to testing/submitting it. See the danw/sysctl branch.

You don't have implementations of sysctl_get/sysctl_set and master/slave_get/set_option for NMFakePlatform.

>+	return path ? g_strstrip (nm_platform_sysctl_get (path)) : NULL;

you should probably just move the strstrip call into nm_platform_sysctl_get()...
Comment 2 Dan Williams 2013-04-05 22:12:26 UTC
One important point is that none of the bridging netlink stuff is going to work unless we have kernel >= 3.1 and possibly libnl >= 3.2.22.  Thus when we merge this code, technically it's a regression because the old code supported older kernels using the ioctl interface.  I think we simply need to release-note this very well.  The 3.1 kernel came out on Oct 24, 2011, so some 2012-era distros are probably still using it (at least Ubuntu 12.04 is using 3.2).

Also, libnl 3.2.22 now exposes bridge port attribute API, but doesn't expose bridge master API.  So perhaps we should use that instead of writing to sysfs for slave_[get|set]_option in the linux platform?
Comment 3 Dan Winship 2013-04-06 14:31:09 UTC
Can't we just have the ioctl-based fallback code in NMPlatform too? I thought the goal of NMPlatform was to isolate the ugly compat code in one place, not to get rid of it.
Comment 4 Pavel Simerda 2013-04-09 14:58:18 UTC
(In reply to comment #3)
> Can't we just have the ioctl-based fallback code in NMPlatform too?

Before getting to the rest of comments, I'll answer this. Yes, we can have as much compat code in NMPlatform as necessary and there are already tests for that.

> I thought the goal of NMPlatform was to isolate the ugly compat code in one
> place, not to get rid of it.

During the development of NMPlatform, it was decided to bump the version of libnl and get rid of all compat code that wasn't needed. So far no target kernel version was specified. If we can specify the minimum target kernel, I can do the backporting work.
Comment 5 Pavel Simerda 2013-04-09 16:56:17 UTC
(In reply to comment #1)
> Mostly looks good to me.
> 
> In hack_empty_master_iff_lower_up(), I'm not sure that a bond should be
> considered "down" if it has slaves, even if all the slaves are down.

How would it communicate, then?

> (It's
> possible that this is correct, I just don't remember for sure if it is.)
> 
> nm-linux-platform.c's link_enslave() and link_release() are written differently
> for no good reason. (IMHO, link_release's way is nicer.)

For now changed to match link_enslave(). But if you guys prefer having a separate nle variable for just everything, I'll either change that for you consistently in all pavlix/platform and pavlix/platform2 code or anyone can change it after push.

> For sysctl_set, I had written an optimization to that code a while back but
> never got around to testing/submitting it. See the danw/sysctl branch.

Yep, this will save one of the mallocs on the way. What about the others? I might be delaying the optimization after push.

> You don't have implementations of sysctl_get/sysctl_set and
> master/slave_get/set_option for NMFakePlatform.

Those are not yet available. I'm going to add those and amend the testsuite.

> >+	return path ? g_strstrip (nm_platform_sysctl_get (path)) : NULL;
> 
> you should probably just move the strstrip call into
> nm_platform_sysctl_get()...

Maybe. I just didn't want to tinker with sysctl_get() result as I didn't know whether it's always a simple value. But I'll do that if you think it's safe.
Comment 6 Dan Winship 2013-04-10 12:21:19 UTC
(In reply to comment #5)
> (In reply to comment #1)
> > Mostly looks good to me.
> > 
> > In hack_empty_master_iff_lower_up(), I'm not sure that a bond should be
> > considered "down" if it has slaves, even if all the slaves are down.
> 
> How would it communicate, then?

It wouldn't, until one of the slaves came back. But since the whole point of bonding is redundancy, it seems like maybe even if every slave fails at once, you still want to consider the bond to be up, under the assumption that eventually one of the slaves will come back.

> Maybe. I just didn't want to tinker with sysctl_get() result as I didn't know
> whether it's always a simple value. But I'll do that if you think it's safe.

I haven't looked at all the potential callers, so I don't know for sure, but it seems like it probably ought to be safe...
Comment 7 Dan Williams 2013-04-10 13:00:31 UTC
(In reply to comment #6)
> (In reply to comment #5)
> > (In reply to comment #1)
> > > Mostly looks good to me.
> > > 
> > > In hack_empty_master_iff_lower_up(), I'm not sure that a bond should be
> > > considered "down" if it has slaves, even if all the slaves are down.
> > 
> > How would it communicate, then?
> 
> It wouldn't, until one of the slaves came back. But since the whole point of
> bonding is redundancy, it seems like maybe even if every slave fails at once,
> you still want to consider the bond to be up, under the assumption that
> eventually one of the slaves will come back.

Another good case for separating carrier from connectivity perhaps.  You're correct, we don't want to take down the whole bond just because all the slaves fail, but we certainly want to indicate somehow (the 'carrier' property of the bond master interface, and maybe the global State property) that connectivity is no longer possible.
Comment 8 Pavel Simerda 2013-04-10 14:18:25 UTC
(In reply to comment #7)
> Another good case for separating carrier from connectivity perhaps.

Unless there's another notion of connectivity in the kernel, the NMPlatform notion of connectivity is carrier (if anything).

> You're
> correct, we don't want to take down the whole bond just because all the slaves
> fail,

This can be handled in NMDeviceBond specifically, through NMDevice generic class.

> but we certainly want to indicate somehow (the 'carrier' property of the
> bond master interface, and maybe the global State property) that connectivity
> is no longer possible.

Unless there's something that we can do in NMPlatform, I suggest discussing it when making changes to NMDeviceBridge carrier handling.
Comment 9 Pavel Simerda 2013-04-12 15:27:02 UTC
(In reply to comment #5)
> > You don't have implementations of sysctl_get/sysctl_set and
> > master/slave_get/set_option for NMFakePlatform.
> 
> Those are not yet available. I'm going to add those and amend the testsuite.

* Amended pavlix/platform2 with NMFakePlatform implementation and amended the testsuite.

I'm open to improvements.
Comment 10 Dan Winship 2013-04-16 18:25:02 UTC
> platform: link features (carrier-detect and vlans)

>+link_supports_carrier_detect (NMPlatform *platform, int ifindex)

>+	/* We ignore the result and only return FALSE on error */
>+	return name && ethtool_get (name, &edata);

That could be clarified a little more: "We ignore the result; if the ETHTOOL_GLINK call succeeded, then the device supports carrier-detect, and if it failed, then it doesn't."


>+	/* Only ARPHDR_ETHER links can possibly support VLANs. Thanks to Dan Winship

Typo ("ARPHDR"). And you don't really need to thank me there...  :)



> platform: setter/getter of /proc/sys and /sys options

This doesn't really need to be part of the NMPlatform API. At least not yet. It could just be internal to nm-linux-platform.c. (The less blatantly-linux-specific stuff in the API, the better the odds that it might actually be portable to other platforms someday...)
Comment 11 Dan Williams 2013-04-16 21:25:17 UTC
> platform: setter/getter of /proc/sys and /sys options

nm_platform_sysctl_set() - I would use two g_asserts() here:

g_assert (g_path_is_absolute (path));
g_assert (!strstr (path, ".."));

Any path including ".." shouldn't be allowed, as we won't ever be using clever directory traversal anywhere in our paths for sysfs or proc.


Other than that, same comment as danw about the proc/sys option API being public at the moment.  Rest looks good.
Comment 12 Pavel Simerda 2013-04-17 11:18:31 UTC
(In reply to comment #10)
> > platform: link features (carrier-detect and vlans)
> 
> >+link_supports_carrier_detect (NMPlatform *platform, int ifindex)
> 
> >+	/* We ignore the result and only return FALSE on error */
> >+	return name && ethtool_get (name, &edata);
> 
> That could be clarified a little more: "We ignore the result; if the
> ETHTOOL_GLINK call succeeded, then the device supports carrier-detect, and if
> it failed, then it doesn't."

This was already in master via former pavlix/platform, pushed a better explanation to master.

> >+	/* Only ARPHDR_ETHER links can possibly support VLANs. Thanks to Dan Winship
> 
> Typo ("ARPHDR"). And you don't really need to thank me there...  :)

As you wish, master :). Same as above, fixed to master.

> > platform: setter/getter of /proc/sys and /sys options
> 
> This doesn't really need to be part of the NMPlatform API.

It is there to support that setting through NMFakePlatform. While I agree with you that this is partially linux specific and I'm keeping portability in mind all the time, without that (or without transoforming *all* important sysctl/sysclass calls to NMPlatform calls), it is impossible to run fake NetworkManager without root privileges. And with root privileges, the fake NetworkManager wouldn't avoid side effects.

I vote for keeping the sysctl functions in NMPlatform public API until we can replace all occurences of sysctl calls in NetworkManager with separate NMPlatform calls.

We can possibly note this fact in the comments.

> At least not yet. It could just be internal to nm-linux-platform.c.

I'm going to use it instead of nm_utils_do_sysctl() throughout the code to separate core NetworkManager from the kernel with NMPlatform, as I'm doing with all other NMPlatform functionality. Therefore it can't be internal unless we give up NMFakePlatform-based NetworkManager testing

> (The less blatantly-linux-specific stuff in the API, the better
> the odds that it might actually be portable to other platforms someday...)

My idea was that the replacement of all sysctl/sysclass calls would be done when there is direct motivation to port NetworkManager to another platform. Or when someone does that out of pure enthusiasm :).

(In reply to comment #11)
> > platform: setter/getter of /proc/sys and /sys options
> 
> nm_platform_sysctl_set() - I would use two g_asserts() here:
> 
> g_assert (g_path_is_absolute (path));

As I'm already checking for specific path prefixes, so that we don't ever write outside specific known locations, the absolute path check would be redundant.

> g_assert (!strstr (path, ".."));

Changed.

If you agree with keeping sysctl getter/setter (at least for now) in public NMPlatform API, we are ready to push.
Comment 13 Dan Williams 2013-04-17 17:20:46 UTC
(In reply to comment #12)
> If you agree with keeping sysctl getter/setter (at least for now) in public
> NMPlatform API, we are ready to push.

Sure, that's fine.
Comment 14 Pavel Simerda 2013-04-24 16:59:31 UTC
Pushed last week. Looks like we can close this, feel free to file bug reports.