After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 744812 - [review] lr/james-bond-rh1183424: Various stuff that blew up while stress-testing bond behavior
[review] lr/james-bond-rh1183424: Various stuff that blew up while stress-tes...
Status: RESOLVED FIXED
Product: NetworkManager
Classification: Platform
Component: general
git master
Other Linux
: Normal normal
: ---
Assigned To: NetworkManager maintainer(s)
NetworkManager maintainer(s)
Depends on:
Blocks: nm-review
 
 
Reported: 2015-02-19 18:39 UTC by Lubomir Rintel
Modified: 2015-05-06 15:42 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Lubomir Rintel 2015-02-19 18:39:02 UTC
I've run a reproducer for another bug and for various reasons other than the original issue I couldn't make sure things work well. The script:

> nmcli c del bond-bond0
> nmcli c del bond-slave-eth1
> nmcli c del bond-slave-eth2
> nmcli c del vlan-vlan10
> 
> set -e
> 
> nmcli c add type bond ifname bond0 con-name bond-bond0 mode active-backup
> nmcli c mod bond-bond0 ipv4.method disable
> nmcli c add type bond-slave ifname eth1 con-name bond-slave-eth1 master bond0
> nmcli c add type bond-slave ifname eth2 con-name bond-slave-eth2 master bond0
> 
> nmcli c add type vlan ifname vlan10 con-name vlan-vlan10 dev bond0 id 10
> 
> nmcli c mod vlan-vlan10 ipv4.method manual ipv4.addresses "192.168.10.11/24,192.168.10.1"
>
> nmcli c up bond-slave-eth1
> nmcli c up bond-slave-eth2
> nmcli c up bond-bond0
> nmcli c up vlan-vlan10

Here's a more-or-less random collection of fixes I had to apply:

3a5636c active-connection: turn off assumed flag when the connection vanishes
f4a69b7 utils: match a cloned mac address with a connection does not specify it
0cb0ea3 device: set the reason for when deactivating for another activation
05ef01d manager: don't re-assume generated connections

http://cgit.freedesktop.org/NetworkManager/NetworkManager/log/?h=lr/james-bond-rh1183424
Comment 1 Lubomir Rintel 2015-02-19 18:40:16 UTC
(The reason assumed connections kicked in was bug #732703, as well as what f4a69b7 fixes)
Comment 2 Dan Williams 2015-02-23 23:01:55 UTC
I think the first two can go in right now.

> utils: match a cloned mac address with a connection does not specify it

This will get run for every generated connection, right?  So wouldn't it potentially mis-match a non-cloned Ethernet connection?  eg, say I have both a cloned Ethernet connection and an un-cloned one.

> active-connection: turn off assumed flag when the connection vanishes

nm_active_connection_set_connection() takes a reference, so use g_value_dup_object() in set_property() to avoid a reference leak.

But this is a bit ugly; I don't think we should clear the assumed tag on the ActiveConnection I think that's part of the immutable state.  Instead, what if we move the bits from active_connection_remove() into nm-active-connection.c and make the NMActiveConnection be responsible for deleting the generated NMConnection when it gets disposed?  The NMAC could then set priv->connection to NULL when it gets removed and then, of course, wouldn't try to delete it when its NULL.
Comment 3 Lubomir Rintel 2015-03-31 14:56:00 UTC
(In reply to Dan Williams from comment #2)
> I think the first two can go in right now.
> 
> > utils: match a cloned mac address with a connection does not specify it
> 
> This will get run for every generated connection, right?  So wouldn't it
> potentially mis-match a non-cloned Ethernet connection?  eg, say I have both
> a cloned Ethernet connection and an un-cloned one.

The exact matches are prioritized in nm_utils_match_connection().

> > active-connection: turn off assumed flag when the connection vanishes
> 
> nm_active_connection_set_connection() takes a reference, so use
> g_value_dup_object() in set_property() to avoid a reference leak.
> 
> But this is a bit ugly; I don't think we should clear the assumed tag on the
> ActiveConnection I think that's part of the immutable state.  Instead, what
> if we move the bits from active_connection_remove() into
> nm-active-connection.c and make the NMActiveConnection be responsible for
> deleting the generated NMConnection when it gets disposed?  The NMAC could
> then set priv->connection to NULL when it gets removed and then, of course,
> wouldn't try to delete it when its NULL.

The problem was that there was a lot of assumptions that connection never vanishes from act request once set. I'll try fixing those cases.
Comment 4 Lubomir Rintel 2015-03-31 15:11:29 UTC
Pushed updated branch.
Comment 5 Thomas Haller 2015-04-01 11:08:55 UTC
>> utils: match a cloned mac address with a connection that does not specify it

could you add test case(es) that cover this change?


>> active-connection: turn off assumed flag when the connection vanishes
    
the first version changes the immutability of nmca-assumed flag.
The second version changes the invariant of that an NMCA always references it's connection.

Both seem reasonable requirements to me.


Instead I would in active_connection_remove():
          if (   nm_active_connection_get_assumed (active)
              && (connection = nm_active_connection_get_connection (active))
+             && is_connection_still_not_removed(connection)
              && nm_settings_connection_get_nm_generated_assumed (NM_SETTINGS_CONNECTION (connection)))
               g_object_ref (connection);
Comment 6 Lubomir Rintel 2015-04-03 23:04:16 UTC
(In reply to Thomas Haller from comment #5)
> >> utils: match a cloned mac address with a connection that does not specify it
> 
> could you add test case(es) that cover this change?

Done.

> >> active-connection: turn off assumed flag when the connection vanishes
>     
> the first version changes the immutability of nmca-assumed flag.
> The second version changes the invariant of that an NMCA always references
> it's connection.
> 
> Both seem reasonable requirements to me.
> 
> 
> Instead I would in active_connection_remove():
>           if (   nm_active_connection_get_assumed (active)
>               && (connection = nm_active_connection_get_connection (active))
> +             && is_connection_still_not_removed(connection)
>               && nm_settings_connection_get_nm_generated_assumed
> (NM_SETTINGS_CONNECTION (connection)))
>                g_object_ref (connection);

I would disagree here. I believe it's not too useful to have NMCA hold a connection reference and it seems only two places in the code base assume the reference is always there.

I think that dropping the reference when the connection is removed is more straightforward.
Comment 7 Jiri Klimes 2015-04-15 08:06:47 UTC
The branch seems fine to me.

> utils: match a cloned mac address with a connection that does not specify it
Just a small nitpick in a comment 
"A device enslaved to a bond it inherits the bond's MAC address."
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^

> active-connection: drop connection reference if the connection goes away
Is it safe to g_clear_object (&priv->connection) in connection_removed?
Comment 8 Thomas Haller 2015-04-15 11:48:32 UTC
I'm still not convinced.

When I added the remove-of-assumed connection code to active_connection_remove(), I did it on purpose, because it seems like the least-trouble-causing place to do it.

At that place, nm-manager (who manages the lifetime of the nm-active-connection) just disposed of nm-a-c. It's a good moment to delete the generated-assumed connection too.


Just add a nm_settings_has_connection():

         if (   connection
+            && nm_settings_has_connection (priv->settings, connection)) {
              nm_log_dbg (LOGD_DEVICE, "Assumed connection disconnected. Deleting generated connection '%s' (%s)",
                          nm_connection_get_id (connection), nm_connection_get_uuid (connection));
              nm_settings_connection_delete (NM_SETTINGS_CONNECTION (connection), NULL, NULL);
              g_object_unref (connection);
         }
Comment 9 Lubomir Rintel 2015-04-16 18:22:54 UTC
(In reply to Thomas Haller from comment #8)
> I'm still not convinced.
> 
> When I added the remove-of-assumed connection code to
> active_connection_remove(), I did it on purpose, because it seems like the
> least-trouble-causing place to do it.
> 
> At that place, nm-manager (who manages the lifetime of the
> nm-active-connection) just disposed of nm-a-c. It's a good moment to delete
> the generated-assumed connection too.
> 
> 
> Just add a nm_settings_has_connection():
> 
>          if (   connection
> +            && nm_settings_has_connection (priv->settings, connection)) {
>               nm_log_dbg (LOGD_DEVICE, "Assumed connection disconnected.
> Deleting generated connection '%s' (%s)",
>                           nm_connection_get_id (connection),
> nm_connection_get_uuid (connection));
>               nm_settings_connection_delete (NM_SETTINGS_CONNECTION
> (connection), NULL, NULL);
>               g_object_unref (connection);
>          }

Oh, yes. This indeed looks more straightforward.

Updated the branch.
Comment 10 Dan Williams 2015-04-16 22:06:47 UTC
> utils: match a cloned mac address with a connection that does not specify it

Sorry to bring this one up again, but maybe we just shouldn't set generated slave connections cloned MAC addresses at all?

In any case, patch looks OK.

> manager: don't try to delete generated connection if it's already gone

The patch looks good.  However... :)

I wonder if we shouldn't try to fix this in nm-settings (which allows multiple deletions in connection_removed(), which is kinda wrong) and nm-settings-connection.c (which allows nm_dbus_manager_unregister_object() to be called twice) and nm_dbus_manager_unregister_object() itself (which allows itself to be called for things that aren't exported.  I those things should get fixed too...

Possibly NMSettingsConnection should get a priv->removed variable that prevents double-removal, or possibly g_return_if_fail().  I thought about checking nm_connection_get_path(), but if we set that to NULL when removing, it could allow re-export which we don't want.

nm-settings.c::connection_removed() doesn't even check if the connection is priv->connections before doing stuff with it, which is kinda wrong.

nm_dbus_manager_unregister_object() doesn't even check if the object is exported before trying to un-export it.
Comment 11 Thomas Haller 2015-04-17 07:45:07 UTC
(In reply to Dan Williams from comment #10)
> > utils: match a cloned mac address with a connection that does not specify it
> 
> Sorry to bring this one up again, but maybe we just shouldn't set generated
> slave connections cloned MAC addresses at all?
> 
> In any case, patch looks OK.
> 
> > manager: don't try to delete generated connection if it's already gone
> 
> The patch looks good.  However... :)

this LGTM too.


> I wonder if we shouldn't try to fix this in nm-settings (which allows
> multiple deletions in connection_removed(), which is kinda wrong) and
> nm-settings-connection.c (which allows nm_dbus_manager_unregister_object()
> to be called twice) and nm_dbus_manager_unregister_object() itself (which
> allows itself to be called for things that aren't exported.  I those things
> should get fixed too...

the patch seems still useful, even with that change, because otherwise we would log 
  nm_log_dbg (LOGD_DEVICE, "Assumed connection disconnected..."
although the connection is already deleted.
Comment 12 Lubomir Rintel 2015-04-20 20:05:55 UTC
(In reply to Dan Williams from comment #10)
> > utils: match a cloned mac address with a connection that does not specify it
> 
> Sorry to bring this one up again, but maybe we just shouldn't set generated
> slave connections cloned MAC addresses at all?

I don't really know. Both possibly solutions seem equally good to me.
 
> In any case, patch looks OK.
>
> > manager: don't try to delete generated connection if it's already gone
> 
> The patch looks good.  However... :)
> 
> I wonder if we shouldn't try to fix this in nm-settings (which allows
> multiple deletions in connection_removed(), which is kinda wrong) and
> nm-settings-connection.c (which allows nm_dbus_manager_unregister_object()
> to be called twice) and nm_dbus_manager_unregister_object() itself (which
> allows itself to be called for things that aren't exported.  I those things
> should get fixed too...
> 
> Possibly NMSettingsConnection should get a priv->removed variable that
> prevents double-removal, or possibly g_return_if_fail().  I thought about
> checking nm_connection_get_path(), but if we set that to NULL when removing,
> it could allow re-export which we don't want.
> 
> nm-settings.c::connection_removed() doesn't even check if the connection is
> priv->connections before doing stuff with it, which is kinda wrong.
> 
> nm_dbus_manager_unregister_object() doesn't even check if the object is
> exported before trying to un-export it.

The current patch seem like straightforward way to solve the problem to me; your suggestions sound just like hardening to me. Is there any known way for a client to trigger re-export or double deletion of NMSettingsConnection?

If not, could I just push this now and maybe leave the ticket open so that the rest is not forgotten?
Comment 13 Dan Williams 2015-04-23 14:44:31 UTC
(In reply to Lubomir Rintel from comment #12)
> (In reply to Dan Williams from comment #10)
> > > utils: match a cloned mac address with a connection that does not specify it
> > 
> > Sorry to bring this one up again, but maybe we just shouldn't set generated
> > slave connections cloned MAC addresses at all?
> 
> I don't really know. Both possibly solutions seem equally good to me.

Ok, lets go with your fix.

> > In any case, patch looks OK.
> >
> > > manager: don't try to delete generated connection if it's already gone
> > 
> > The patch looks good.  However... :)
> > 
> > I wonder if we shouldn't try to fix this in nm-settings (which allows
> > multiple deletions in connection_removed(), which is kinda wrong) and
> > nm-settings-connection.c (which allows nm_dbus_manager_unregister_object()
> > to be called twice) and nm_dbus_manager_unregister_object() itself (which
> > allows itself to be called for things that aren't exported.  I those things
> > should get fixed too...
> > 
> > Possibly NMSettingsConnection should get a priv->removed variable that
> > prevents double-removal, or possibly g_return_if_fail().  I thought about
> > checking nm_connection_get_path(), but if we set that to NULL when removing,
> > it could allow re-export which we don't want.
> > 
> > nm-settings.c::connection_removed() doesn't even check if the connection is
> > priv->connections before doing stuff with it, which is kinda wrong.
> > 
> > nm_dbus_manager_unregister_object() doesn't even check if the object is
> > exported before trying to un-export it.
> 
> The current patch seem like straightforward way to solve the problem to me;
> your suggestions sound just like hardening to me. Is there any known way for
> a client to trigger re-export or double deletion of NMSettingsConnection?

Yeah, the current patch will fix the issue and is OK.

I pushed two further commits to you branch that do the hardening and another small settings cleanup I noticed along the way.

ACK from me on the branch.  Can you double-check my patches though before pushing?

BTW, are these patches candidates for 1.0 backport?
Comment 14 Lubomir Rintel 2015-04-27 13:03:13 UTC
(In reply to Dan Williams from comment #13)

> I pushed two further commits to you branch that do the hardening and another
> small settings cleanup I noticed along the way.
> 
> ACK from me on the branch.  Can you double-check my patches though before
> pushing?

They look good & work well.

> BTW, are these patches candidates for 1.0 backport?

Yes. I'll pull them in after 1.0.2.

2b097fd settings,manager: merge branch 'lr/james-bond-rh1183424'
bbcf544 settings: remove 'do_export' argument from claim_connection()
fe96dbc settings/dbus: harden connection removal and object unexport
74ed416 manager: don't try to delete generated connection if it's already gone
8a00bb3 nm-settings: add nm_settings_has_connection()
Comment 15 Lubomir Rintel 2015-05-06 15:42:02 UTC
nm-1-0:

1d8a2a2 settings: remove 'do_export' argument from claim_connection()
de9a34c settings/dbus: harden connection removal and object unexport
89d96b8 manager: don't try to delete generated connection if it's already gone
a4369ef nm-settings: add nm_settings_has_connection()