GNOME Bugzilla – Bug 744812
[review] lr/james-bond-rh1183424: Various stuff that blew up while stress-testing bond behavior
Last modified: 2015-05-06 15:42:02 UTC
I've run a reproducer for another bug and for various reasons other than the original issue I couldn't make sure things work well. The script: > nmcli c del bond-bond0 > nmcli c del bond-slave-eth1 > nmcli c del bond-slave-eth2 > nmcli c del vlan-vlan10 > > set -e > > nmcli c add type bond ifname bond0 con-name bond-bond0 mode active-backup > nmcli c mod bond-bond0 ipv4.method disable > nmcli c add type bond-slave ifname eth1 con-name bond-slave-eth1 master bond0 > nmcli c add type bond-slave ifname eth2 con-name bond-slave-eth2 master bond0 > > nmcli c add type vlan ifname vlan10 con-name vlan-vlan10 dev bond0 id 10 > > nmcli c mod vlan-vlan10 ipv4.method manual ipv4.addresses "192.168.10.11/24,192.168.10.1" > > nmcli c up bond-slave-eth1 > nmcli c up bond-slave-eth2 > nmcli c up bond-bond0 > nmcli c up vlan-vlan10 Here's a more-or-less random collection of fixes I had to apply: 3a5636c active-connection: turn off assumed flag when the connection vanishes f4a69b7 utils: match a cloned mac address with a connection does not specify it 0cb0ea3 device: set the reason for when deactivating for another activation 05ef01d manager: don't re-assume generated connections http://cgit.freedesktop.org/NetworkManager/NetworkManager/log/?h=lr/james-bond-rh1183424
(The reason assumed connections kicked in was bug #732703, as well as what f4a69b7 fixes)
I think the first two can go in right now. > utils: match a cloned mac address with a connection does not specify it This will get run for every generated connection, right? So wouldn't it potentially mis-match a non-cloned Ethernet connection? eg, say I have both a cloned Ethernet connection and an un-cloned one. > active-connection: turn off assumed flag when the connection vanishes nm_active_connection_set_connection() takes a reference, so use g_value_dup_object() in set_property() to avoid a reference leak. But this is a bit ugly; I don't think we should clear the assumed tag on the ActiveConnection I think that's part of the immutable state. Instead, what if we move the bits from active_connection_remove() into nm-active-connection.c and make the NMActiveConnection be responsible for deleting the generated NMConnection when it gets disposed? The NMAC could then set priv->connection to NULL when it gets removed and then, of course, wouldn't try to delete it when its NULL.
(In reply to Dan Williams from comment #2) > I think the first two can go in right now. > > > utils: match a cloned mac address with a connection does not specify it > > This will get run for every generated connection, right? So wouldn't it > potentially mis-match a non-cloned Ethernet connection? eg, say I have both > a cloned Ethernet connection and an un-cloned one. The exact matches are prioritized in nm_utils_match_connection(). > > active-connection: turn off assumed flag when the connection vanishes > > nm_active_connection_set_connection() takes a reference, so use > g_value_dup_object() in set_property() to avoid a reference leak. > > But this is a bit ugly; I don't think we should clear the assumed tag on the > ActiveConnection I think that's part of the immutable state. Instead, what > if we move the bits from active_connection_remove() into > nm-active-connection.c and make the NMActiveConnection be responsible for > deleting the generated NMConnection when it gets disposed? The NMAC could > then set priv->connection to NULL when it gets removed and then, of course, > wouldn't try to delete it when its NULL. The problem was that there was a lot of assumptions that connection never vanishes from act request once set. I'll try fixing those cases.
Pushed updated branch.
>> utils: match a cloned mac address with a connection that does not specify it could you add test case(es) that cover this change? >> active-connection: turn off assumed flag when the connection vanishes the first version changes the immutability of nmca-assumed flag. The second version changes the invariant of that an NMCA always references it's connection. Both seem reasonable requirements to me. Instead I would in active_connection_remove(): if ( nm_active_connection_get_assumed (active) && (connection = nm_active_connection_get_connection (active)) + && is_connection_still_not_removed(connection) && nm_settings_connection_get_nm_generated_assumed (NM_SETTINGS_CONNECTION (connection))) g_object_ref (connection);
(In reply to Thomas Haller from comment #5) > >> utils: match a cloned mac address with a connection that does not specify it > > could you add test case(es) that cover this change? Done. > >> active-connection: turn off assumed flag when the connection vanishes > > the first version changes the immutability of nmca-assumed flag. > The second version changes the invariant of that an NMCA always references > it's connection. > > Both seem reasonable requirements to me. > > > Instead I would in active_connection_remove(): > if ( nm_active_connection_get_assumed (active) > && (connection = nm_active_connection_get_connection (active)) > + && is_connection_still_not_removed(connection) > && nm_settings_connection_get_nm_generated_assumed > (NM_SETTINGS_CONNECTION (connection))) > g_object_ref (connection); I would disagree here. I believe it's not too useful to have NMCA hold a connection reference and it seems only two places in the code base assume the reference is always there. I think that dropping the reference when the connection is removed is more straightforward.
The branch seems fine to me. > utils: match a cloned mac address with a connection that does not specify it Just a small nitpick in a comment "A device enslaved to a bond it inherits the bond's MAC address." ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^ > active-connection: drop connection reference if the connection goes away Is it safe to g_clear_object (&priv->connection) in connection_removed?
I'm still not convinced. When I added the remove-of-assumed connection code to active_connection_remove(), I did it on purpose, because it seems like the least-trouble-causing place to do it. At that place, nm-manager (who manages the lifetime of the nm-active-connection) just disposed of nm-a-c. It's a good moment to delete the generated-assumed connection too. Just add a nm_settings_has_connection(): if ( connection + && nm_settings_has_connection (priv->settings, connection)) { nm_log_dbg (LOGD_DEVICE, "Assumed connection disconnected. Deleting generated connection '%s' (%s)", nm_connection_get_id (connection), nm_connection_get_uuid (connection)); nm_settings_connection_delete (NM_SETTINGS_CONNECTION (connection), NULL, NULL); g_object_unref (connection); }
(In reply to Thomas Haller from comment #8) > I'm still not convinced. > > When I added the remove-of-assumed connection code to > active_connection_remove(), I did it on purpose, because it seems like the > least-trouble-causing place to do it. > > At that place, nm-manager (who manages the lifetime of the > nm-active-connection) just disposed of nm-a-c. It's a good moment to delete > the generated-assumed connection too. > > > Just add a nm_settings_has_connection(): > > if ( connection > + && nm_settings_has_connection (priv->settings, connection)) { > nm_log_dbg (LOGD_DEVICE, "Assumed connection disconnected. > Deleting generated connection '%s' (%s)", > nm_connection_get_id (connection), > nm_connection_get_uuid (connection)); > nm_settings_connection_delete (NM_SETTINGS_CONNECTION > (connection), NULL, NULL); > g_object_unref (connection); > } Oh, yes. This indeed looks more straightforward. Updated the branch.
> utils: match a cloned mac address with a connection that does not specify it Sorry to bring this one up again, but maybe we just shouldn't set generated slave connections cloned MAC addresses at all? In any case, patch looks OK. > manager: don't try to delete generated connection if it's already gone The patch looks good. However... :) I wonder if we shouldn't try to fix this in nm-settings (which allows multiple deletions in connection_removed(), which is kinda wrong) and nm-settings-connection.c (which allows nm_dbus_manager_unregister_object() to be called twice) and nm_dbus_manager_unregister_object() itself (which allows itself to be called for things that aren't exported. I those things should get fixed too... Possibly NMSettingsConnection should get a priv->removed variable that prevents double-removal, or possibly g_return_if_fail(). I thought about checking nm_connection_get_path(), but if we set that to NULL when removing, it could allow re-export which we don't want. nm-settings.c::connection_removed() doesn't even check if the connection is priv->connections before doing stuff with it, which is kinda wrong. nm_dbus_manager_unregister_object() doesn't even check if the object is exported before trying to un-export it.
(In reply to Dan Williams from comment #10) > > utils: match a cloned mac address with a connection that does not specify it > > Sorry to bring this one up again, but maybe we just shouldn't set generated > slave connections cloned MAC addresses at all? > > In any case, patch looks OK. > > > manager: don't try to delete generated connection if it's already gone > > The patch looks good. However... :) this LGTM too. > I wonder if we shouldn't try to fix this in nm-settings (which allows > multiple deletions in connection_removed(), which is kinda wrong) and > nm-settings-connection.c (which allows nm_dbus_manager_unregister_object() > to be called twice) and nm_dbus_manager_unregister_object() itself (which > allows itself to be called for things that aren't exported. I those things > should get fixed too... the patch seems still useful, even with that change, because otherwise we would log nm_log_dbg (LOGD_DEVICE, "Assumed connection disconnected..." although the connection is already deleted.
(In reply to Dan Williams from comment #10) > > utils: match a cloned mac address with a connection that does not specify it > > Sorry to bring this one up again, but maybe we just shouldn't set generated > slave connections cloned MAC addresses at all? I don't really know. Both possibly solutions seem equally good to me. > In any case, patch looks OK. > > > manager: don't try to delete generated connection if it's already gone > > The patch looks good. However... :) > > I wonder if we shouldn't try to fix this in nm-settings (which allows > multiple deletions in connection_removed(), which is kinda wrong) and > nm-settings-connection.c (which allows nm_dbus_manager_unregister_object() > to be called twice) and nm_dbus_manager_unregister_object() itself (which > allows itself to be called for things that aren't exported. I those things > should get fixed too... > > Possibly NMSettingsConnection should get a priv->removed variable that > prevents double-removal, or possibly g_return_if_fail(). I thought about > checking nm_connection_get_path(), but if we set that to NULL when removing, > it could allow re-export which we don't want. > > nm-settings.c::connection_removed() doesn't even check if the connection is > priv->connections before doing stuff with it, which is kinda wrong. > > nm_dbus_manager_unregister_object() doesn't even check if the object is > exported before trying to un-export it. The current patch seem like straightforward way to solve the problem to me; your suggestions sound just like hardening to me. Is there any known way for a client to trigger re-export or double deletion of NMSettingsConnection? If not, could I just push this now and maybe leave the ticket open so that the rest is not forgotten?
(In reply to Lubomir Rintel from comment #12) > (In reply to Dan Williams from comment #10) > > > utils: match a cloned mac address with a connection that does not specify it > > > > Sorry to bring this one up again, but maybe we just shouldn't set generated > > slave connections cloned MAC addresses at all? > > I don't really know. Both possibly solutions seem equally good to me. Ok, lets go with your fix. > > In any case, patch looks OK. > > > > > manager: don't try to delete generated connection if it's already gone > > > > The patch looks good. However... :) > > > > I wonder if we shouldn't try to fix this in nm-settings (which allows > > multiple deletions in connection_removed(), which is kinda wrong) and > > nm-settings-connection.c (which allows nm_dbus_manager_unregister_object() > > to be called twice) and nm_dbus_manager_unregister_object() itself (which > > allows itself to be called for things that aren't exported. I those things > > should get fixed too... > > > > Possibly NMSettingsConnection should get a priv->removed variable that > > prevents double-removal, or possibly g_return_if_fail(). I thought about > > checking nm_connection_get_path(), but if we set that to NULL when removing, > > it could allow re-export which we don't want. > > > > nm-settings.c::connection_removed() doesn't even check if the connection is > > priv->connections before doing stuff with it, which is kinda wrong. > > > > nm_dbus_manager_unregister_object() doesn't even check if the object is > > exported before trying to un-export it. > > The current patch seem like straightforward way to solve the problem to me; > your suggestions sound just like hardening to me. Is there any known way for > a client to trigger re-export or double deletion of NMSettingsConnection? Yeah, the current patch will fix the issue and is OK. I pushed two further commits to you branch that do the hardening and another small settings cleanup I noticed along the way. ACK from me on the branch. Can you double-check my patches though before pushing? BTW, are these patches candidates for 1.0 backport?
(In reply to Dan Williams from comment #13) > I pushed two further commits to you branch that do the hardening and another > small settings cleanup I noticed along the way. > > ACK from me on the branch. Can you double-check my patches though before > pushing? They look good & work well. > BTW, are these patches candidates for 1.0 backport? Yes. I'll pull them in after 1.0.2. 2b097fd settings,manager: merge branch 'lr/james-bond-rh1183424' bbcf544 settings: remove 'do_export' argument from claim_connection() fe96dbc settings/dbus: harden connection removal and object unexport 74ed416 manager: don't try to delete generated connection if it's already gone 8a00bb3 nm-settings: add nm_settings_has_connection()
nm-1-0: 1d8a2a2 settings: remove 'do_export' argument from claim_connection() de9a34c settings/dbus: harden connection removal and object unexport 89d96b8 manager: don't try to delete generated connection if it's already gone a4369ef nm-settings: add nm_settings_has_connection()