Bug 741336 – [review] jk/libnm-cli-fixes: a few fixes for libnm and nmcli

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 741336 - [review] jk/libnm-cli-fixes: a few fixes for libnm and nmcli


Summary:	[review] jk/libnm-cli-fixes: a few fixes for libnm and nmcli


Status:	RESOLVED OBSOLETE

Product:	NetworkManager
Classification:	Platform
Component:	general
Version:	git master
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	---
Assigned To:	NetworkManager maintainer(s)
QA Contact:	NetworkManager maintainer(s)

URL:
Whiteboard:

Depends on:
Blocks:	nm-patch

Reported:	2014-12-10 14:20 UTC by Jiri Klimes
Modified:	2020-11-12 14:26 UTC

See Also:
GNOME target:	---
GNOME version:	---

Description Jiri Klimes 2014-12-10 14:20:08 UTC

jk/libnm-cli-fixes branch contains a few fixes for libnm and nmcli, mainly for reporting errors.

Comment 1 Lubomir Rintel 2014-12-10 16:37:49 UTC

+               /* Ensure that nm_device_get_state() will return the right value even if
+                * we haven't processed the corresponding PropertiesChanged yet.
+                */
+               priv->state = tmp->new_state;
+               
  ^^^^^ trailing whitespace

> cli: monitor device when activating connections to capture errors

+       if ((new_state == NM_DEVICE_STATE_FAILED || new_state == NM_DEVICE_STATE_DEACTIVATING))
+               nmc_activate_error = nmc_device_reason_to_string (reason);

What if the callback fires more than once? Do we leak the old reason string? How do we determine the better reason (secrets missing as opposed to none)?

Maybe we should just send the NM_DEVICE_STATE_REASON_NO_SECRETS reason for the transition from NM_DEVICE_STATE_FAILED to NM_DEVICE_STATE_DISCONNECTED and avoid the 100ms wait trickery altogether?

Comment 2 Dan Williams 2014-12-10 23:34:37 UTC

> cli: return empty secrets instead of unregistering the agent

Just curious, why return empty secrets instead of the NM_SECRET_AGENT_ERROR_NO_SECRETS error?  I think NM handles that error OK already by going to the  next agent.


> libnm: emit all NMDevice::state-changed signals

Does emit_state_changed() really need the 'data' argument?  I think 'data' will always be in the queue, no? (though I coudl be wrong).  It looks to me like we can just keep adding the events to the end of the reload queue, and then when the reload is done just emit them all.

> cli: monitor device when activating connections to capture errors

Yeah, the idle is a kinda icky.

Maybe nmc_activate_connection() could pass a data struct/closure to store_device_state_cb() and active_connection_state_cb().  store_device_state_cb() would store the reason code into the struct/closure, and call a completion function.  active_connection_state_cb() would also call that completion function.  The completion function would wait for both calls to be done (or just active_connection_state_cb() in the case of a VPN) and then print the message when they had both completed, using the device reason it received earlier.  Would that be cleaner?

Comment 3 Jiri Klimes 2014-12-11 17:12:57 UTC

(In reply to comment #1)
> +               /* Ensure that nm_device_get_state() will return the right
> value even if
> +                * we haven't processed the corresponding PropertiesChanged
> yet.
> +                */
> +               priv->state = tmp->new_state;
> +               
>   ^^^^^ trailing whitespace
 
I will fix that.

> > cli: monitor device when activating connections to capture errors
> 
> +       if ((new_state == NM_DEVICE_STATE_FAILED || new_state ==
> NM_DEVICE_STATE_DEACTIVATING))
> +               nmc_activate_error = nmc_device_reason_to_string (reason);
> 
> What if the callback fires more than once? Do we leak the old reason string?
> How do we determine the better reason (secrets missing as opposed to none)?
> 
It is const string.



(In reply to comment #2)
> > cli: return empty secrets instead of unregistering the agent
> 
> Just curious, why return empty secrets instead of the
> NM_SECRET_AGENT_ERROR_NO_SECRETS error?  I think NM handles that error OK
> already by going to the  next agent.
> 
There no way to do it now. You mean updating nm_secret_agent_simple_response() to return NM_SECRET_AGENT_ERROR_NO_SECRETS instead of empty dict?

Anyway, src/settings/nm-agent-manager.c:get_done_cb() behaves the same for both cases (calls next agent).

>
> > libnm: emit all NMDevice::state-changed signals
> 
> Does emit_state_changed() really need the 'data' argument?  I think 'data' will
> always be in the queue, no? (though I coudl be wrong).  It looks to me like we
> can just keep adding the events to the end of the reload queue, and then when
> the reload is done just emit them all.
> 
data parameter is used to find the current signal and only emit the signals before that including the current one (they are old anyway). 
If we would remove the parameter and emit all the signals, would not it cause problems? I mean emitting signals before reloading the properties.

> > cli: monitor device when activating connections to capture errors
> 
> Yeah, the idle is a kinda icky.
> 
> Maybe nmc_activate_connection() could pass a data struct/closure to
> store_device_state_cb() and active_connection_state_cb(). 
> store_device_state_cb() would store the reason code into the struct/closure,
> and call a completion function.  active_connection_state_cb() would also call
> that completion function.  The completion function would wait for both calls to
> be done (or just active_connection_state_cb() in the case of a VPN) and then
> print the message when they had both completed, using the device reason it
> received earlier.  Would that be cleaner?

Yes, the idles and synchronization is tricky. I will think it over again...

Comment 4 Dan Williams 2014-12-12 17:43:38 UTC

> (In reply to comment #2)
> > > cli: return empty secrets instead of unregistering the agent
> > 
> > Just curious, why return empty secrets instead of the
> > NM_SECRET_AGENT_ERROR_NO_SECRETS error?  I think NM handles that error OK
> > already by going to the  next agent.
> > 
> There no way to do it now. You mean updating nm_secret_agent_simple_response()
> to return NM_SECRET_AGENT_ERROR_NO_SECRETS instead of empty dict?
> 
> Anyway, src/settings/nm-agent-manager.c:get_done_cb() behaves the same for both
> cases (calls next agent).

True.  I think this is fine for now.

> > > libnm: emit all NMDevice::state-changed signals
> > 
> > Does emit_state_changed() really need the 'data' argument?  I think 'data' will
> > always be in the queue, no? (though I coudl be wrong).  It looks to me like we
> > can just keep adding the events to the end of the reload queue, and then when
> > the reload is done just emit them all.
> > 
> data parameter is used to find the current signal and only emit the signals
> before that including the current one (they are old anyway). 
> If we would remove the parameter and emit all the signals, would not it cause
> problems? I mean emitting signals before reloading the properties.

Ah good point, since there could potentially be multiple in-flight calls to _nm_object_reload_properties_async().

> > > cli: monitor device when activating connections to capture errors
> > 
> > Yeah, the idle is a kinda icky.
> > 
> > Maybe nmc_activate_connection() could pass a data struct/closure to
> > store_device_state_cb() and active_connection_state_cb(). 
> > store_device_state_cb() would store the reason code into the struct/closure,
> > and call a completion function.  active_connection_state_cb() would also call
> > that completion function.  The completion function would wait for both calls to
> > be done (or just active_connection_state_cb() in the case of a VPN) and then
> > print the message when they had both completed, using the device reason it
> > received earlier.  Would that be cleaner?
> 
> Yes, the idles and synchronization is tricky. I will think it over again...

Ok.

Comment 5 Dan Winship 2014-12-15 13:04:27 UTC

> libnm: add NM_CLIENT_ERROR_ACTIVE_CONNECTION_REMOVED error

If an AC is created and then immediately removed, it's basically a race condition whether you end up here or in object_creation_failed(), so we should return the same error (NM_CLIENT_ERROR_OBJECT_CREATION_FAILED) in both cases.


> libnm: emit all NMDevice::state-changed signals

This seems wrong. Why is this needed? (And why only for NMDevice::state-changed, and not every other signal?)

Hm... well, ok, I guess it's needed to make the next patch work... but it's weird that ::state-changed works this way, but ::notify::state doesn't, etc.

Comment 6 Jiri Klimes 2014-12-15 15:02:34 UTC

(In reply to comment #5)
> > libnm: add NM_CLIENT_ERROR_ACTIVE_CONNECTION_REMOVED error
> 
> If an AC is created and then immediately removed, it's basically a race
> condition whether you end up here or in object_creation_failed(), so we should
> return the same error (NM_CLIENT_ERROR_OBJECT_CREATION_FAILED) in both cases.
> 
Changed to NM_CLIENT_ERROR_OBJECT_CREATION_FAILED. However, is it really always a creation failure?

> 
> > libnm: emit all NMDevice::state-changed signals
> 
> This seems wrong. Why is this needed? (And why only for
> NMDevice::state-changed, and not every other signal?)
> 
> Hm... well, ok, I guess it's needed to make the next patch work... but it's
> weird that ::state-changed works this way, but ::notify::state doesn't, etc.

Well, I guess ::notify::state should be emitted for all changes as well (if it is not currently). However, the problem was that before we explicitly skipped state-changed signal which is not nice.

(In reply to comment #4)
> > > 
> > > Maybe nmc_activate_connection() could pass a data struct/closure to
> > > store_device_state_cb() and active_connection_state_cb(). 
> > > store_device_state_cb() would store the reason code into the struct/closure,
> > > and call a completion function.  active_connection_state_cb() would also call
> > > that completion function.  The completion function would wait for both calls to
> > > be done (or just active_connection_state_cb() in the case of a VPN) and then
> > > print the message when they had both completed, using the device reason it
> > > received earlier.  Would that be cleaner?
> > 
> > Yes, the idles and synchronization is tricky. I will think it over again...
> 
> Ok.

Thinking over that again, it seems to me that the solution with a completion function would not differ much from the current commit. The completion function would implement a kind of timer anyway as a safeguard. Moreover NMDevice::state-changed signal function doesn't accept user_data parameter, so store_device_error_cb() would still need to use a global variable for the error.

Update in jk/libnm-cli-fixes2.

Comment 7 Dan Winship 2014-12-15 15:35:12 UTC

(In reply to comment #6)
> Changed to NM_CLIENT_ERROR_OBJECT_CREATION_FAILED. However, is it really always
> a creation failure?

OBJECT_CREATION_FAILED means "the object was destroyed before libnm could finish creating/populating a wrapper object for it". So any situation where the AC is destroyed before NM can return it to the caller would be this.

Maybe it needs a better name.

> > Hm... well, ok, I guess it's needed to make the next patch work... but it's
> > weird that ::state-changed works this way, but ::notify::state doesn't, etc.
> 
> Well, I guess ::notify::state should be emitted for all changes as well (if it
> is not currently). However, the problem was that before we explicitly skipped
> state-changed signal which is not nice.

So the reason why we don't emit state-changed right away is because gnome-shell used to get confused when we claimed to be CONNECTED, but :ip4-config and :ip6-config were still NULL. Doing the queue-and-replay of state-changed signals brings this problem back; while we won't emit state-changed too soon, we'll sometimes emit it too late, and signal handlers will find that the NMDevice doesn't look the way they expected it to in the given state.

Alternate possibilities:

  - Diverge from libnm-glib and make ::state-changed be immediate in libnm,
    and say you should use ::notify::state if you want state change
    notifications that are synched up with the properties.

  - Keep ::state-changed as-is, but add a second signal that's emitted
    immediately

  - Add a "deferred" argument to the signal that would be set on the
    emissions that don't match the current device state, as a warning to
    signal handlers

  - Add a "missed_states"/"squashed_states" argument (or property, or
    function) to allow signal handlers to find out about any state
    change notifications they missed

Comment 8 André Klapper 2020-11-12 14:26:07 UTC

bugzilla.gnome.org is being shut down in favor of a GitLab instance. 
We are closing all old bug reports and feature requests in GNOME Bugzilla which have not seen updates for a long time.

If you still use NetworkManager and if you still see this bug / want this feature in a recent and supported version of NetworkManager, then please feel free to report it at https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/

Thank you for creating this report and we are sorry it could not be implemented (workforce and time is unfortunately limited).