GNOME Bugzilla – Bug 689741
dispatcher does not properly announce configuration changes (IP, DNS, NTP, ...)
Last modified: 2020-11-12 14:33:33 UTC
Created attachment 230836: dispatcher log

Version: 0.9.6.4 (also reproducible with 0.9.4.0)

I have a DHCPv4 server and radvd running on my network. The NM connection has method=auto for both ipv4 and ipv6. When I enable the connection, both IPv4 and IPv6 are configured correctly and I get an address for each. Yet the dispatcher only runs for the IPv4 configuration. If I disable the IPv4 configuration (method=none), the dispatcher is run for the IPv6 configuration.

Attached is a debug log of NetworkManager and the dispatcher, captured while I started NM, then disabled networking and re-enabled it. As you can see, the dispatcher does not show any IPv6 configuration, although both nm-tool and NetworkManager.log clearly show that NM successfully acquired an IPv6 address.
Created attachment 230837: NetworkManager log
Created attachment 230838: nm-tool output
NM bugzilla reorganization. Sorry for the bug spam.
This bug looks closely related to bug 689742. What's the current status?
The dispatcher should be redesigned with regard to IP and DNS configuration changes. For example, when you connect dynamically to IPv4 and IPv6, the actions should be:

 * IPv4: configuration changed
 * device: state changed
 * IPv6: configuration changed

Or, if IPv6 finishes first:

 * IPv6: configuration changed
 * device: state changed
 * IPv4: configuration changed

When the IPv4/IPv6 configuration later changes because of RD/DHCP events:

 * IPv4: configuration changed
 * IPv6: configuration changed

The problem with dynamic dual-protocol configuration is that it is no longer strictly bound to the device state, so we'd best handle the changes separately. The same applies to DNS configuration. And most importantly, the data format (environment variables) should not differ substantially between VPN and non-VPN connections.
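To make the proposed ordering concrete, here is a minimal Python sketch that generates the action sequences listed above. It is only a model of the proposal: the function names are hypothetical, and the action strings are the labels used in this comment, not actions that NetworkManager actually emits.

```python
from typing import List


def initial_activation_actions(completion_order: List[str]) -> List[str]:
    """Proposed action sequence while a connection is being activated.

    completion_order lists the address families ("IPv4", "IPv6") in the
    order in which their configuration finishes.
    """
    actions = []
    for position, family in enumerate(completion_order):
        actions.append(f"{family}: configuration changed")
        if position == 0:
            # The device state change follows whichever family finished first.
            actions.append("device: state changed")
    return actions


def later_change_actions(changed_families: List[str]) -> List[str]:
    """Later RD/DHCP updates report per-family changes only."""
    return [f"{family}: configuration changed" for family in changed_families]
```

Calling `initial_activation_actions(["IPv6", "IPv4"])` yields the second list above, with the device state change placed between the two per-family notifications.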
*** Bug 694797 has been marked as a duplicate of this bug. ***
The code that causes this problematic behavior is in nm-device.c, in two places.

Part #1 (nm_device_activate_ip4_config_commit):
http://cgit.freedesktop.org/NetworkManager/NetworkManager/tree/src/devices/nm-device.c#n4631

    if (nm_device_get_state (self) == NM_DEVICE_STATE_IP_CONFIG)
        nm_device_state_changed (self, NM_DEVICE_STATE_IP_CHECK, NM_DEVICE_STATE_REASON_NONE);

Part #2 (nm_device_activate_ip6_config_commit):
http://cgit.freedesktop.org/NetworkManager/NetworkManager/tree/src/devices/nm-device.c#n4725

    if (nm_device_get_state (self) == NM_DEVICE_STATE_IP_CONFIG)
        nm_device_state_changed (self, NM_DEVICE_STATE_IP_CHECK, NM_DEVICE_STATE_REASON_NONE);

This code essentially lets the device state transition from NM_DEVICE_STATE_IP_CONFIG to NM_DEVICE_STATE_IP_CHECK as soon as the IPv4 *or* the IPv6 configuration is done. IMHO this is a bug, because the device state transitions to NM_DEVICE_STATE_IP_CHECK before NM_DEVICE_STATE_IP_CONFIG has been completed.

I have two proposals for fixing this issue:

1) Transition to the NM_DEVICE_STATE_IP_CHECK state only if the IPv4 *and* IPv6 configuration are done.
2) Call nm_utils_call_dispatcher before these two if blocks so that the dispatcher runs for each IP configuration update.

Dan, Pavel, what do you think? I'm happy to provide a patch as long as we agree on a way to fix this issue.
(In reply to comment #7)
> This code essentially lets the device state transition from
> NM_DEVICE_STATE_IP_CONFIG to NM_DEVICE_STATE_IP_CHECK if the IPv4 *or* the IPv6
> configuration is done.

Right. That's intentional; otherwise, if you have ip6.method=auto and you're on a network with no IPv6 support, it would take 45 seconds for IP_CONFIG to complete.
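The trade-off between the current "either family" rule and proposal 1 ("both families") can be illustrated with a small Python model. This is a simplified sketch, not NetworkManager's logic: the `Outcome` type and both function names are hypothetical, the 45-second timeout is the figure quoted in this comment, and the 2-second DHCPv4 time is an invented illustrative value.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Outcome:
    seconds: float   # when the method succeeded or gave up
    succeeded: bool  # False means the method timed out or failed


def leave_ip_config_any(outcomes: Dict[str, Outcome]) -> Optional[float]:
    """Current behavior: leave IP_CONFIG at the first successful family."""
    successes = [o.seconds for o in outcomes.values() if o.succeeded]
    return min(successes) if successes else None


def leave_ip_config_all(outcomes: Dict[str, Outcome]) -> Optional[float]:
    """Proposal 1: wait until every family has succeeded or given up."""
    if not any(o.succeeded for o in outcomes.values()):
        return None  # nothing succeeded, so activation would fail instead
    return max(o.seconds for o in outcomes.values())


# IPv4-only network: DHCPv4 answers quickly, IPv6 autoconfiguration times out.
ipv4_only_network = {
    "ipv4": Outcome(seconds=2.0, succeeded=True),
    "ipv6": Outcome(seconds=45.0, succeeded=False),
}
print(leave_ip_config_any(ipv4_only_network))
print(leave_ip_config_all(ipv4_only_network))
```

Under these assumptions the "either family" rule leaves IP_CONFIG as soon as DHCPv4 is done, while the "both families" rule has to sit out the full IPv6 timeout.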
Then I guess that leaves us with fix proposal 2). Dan, what do you think about calling nm_utils_call_dispatcher multiple times during this phase? So instead of calling it once on the transition to the NM_DEVICE_STATE_ACTIVATED state we could introduce two additional dispatcher runs - once the IPv4 configuration is done and once the IPv6 configuration is done. The question would be though what DISPATCHER_ACTIONs should be used for the two new dispatcher runs...
Is this actually still a problem? Note that this bug report is quite old, and there have been changes with respect to dispatcher callouts. In particular,
http://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=11f9855223966c4fd927fe83e2cd9a623a74acad
sounds like it would fix the issue.

I almost think we can close this one as OBSOLETE.
Thomas, this code is indeed similar to what I had in mind, but this bug might not be fully obsolete yet. A dhcp*-change dispatcher event doesn't tell you whether the interface just came up or has been up for a while. This has also been mentioned in the bug related to the commit:
https://bugzilla.gnome.org/show_bug.cgi?id=729284#c2

I don't know whether the initial reporter of this bug cares about this detail. For me it is not important, so I will backport the commit to the NM version we use.

Is there any interest in a patch for NM 0.9.8.8? If so, I will attach the patch to this bug once I've done the work.
(In reply to comment #8)
> Right. That's intentional; otherwise if you have ip6.method=auto, and you're on
> a network with no IPv6 support, it would take 45 seconds for IP_CONFIG to
> complete.

I don't see why IPv4 and IPv6 should be treated any differently. Both protocol versions let the client discover the server/router and wait for it to respond, so either of them can finish immediately or by timeout. The only reasons to treat IPv4 differently are historical and statistical, and we should try to avoid that as much as possible.

(In reply to comment #11)
> Thomas, this code is indeed similar to what I've had in mind but this bug might
> not be fully obsolete by now.

In my opinion, the current git master (not sure about specific versions) is good enough in that it runs the dispatcher.d scripts upon any of the important configuration changes. On the other hand, the dispatcher script API and the potential sequences of events are still weird. See:

https://bugzilla.gnome.org/show_bug.cgi?id=689741#c5

For example, I don't see a direct representation of a finished IPv4 configuration (e.g. via DHCP). The script doesn't get any input argument that would always be the same in this case and always different in others.

In my opinion, an optimal solution would carry all concurrent events in one "message", i.e. one run of a dispatcher script. Example:

Message 1: IPv6 configuration (e.g. via RA) gets finished first

    IPV6_CONFIGURATION=up

Message 2: IPv4 configuration (e.g. via DHCP) gets finished second

    IPV4_CONFIGURATION=up
    CONFIGURATION=up

Other variables could carry information specific to the RA, DHCPv4 and DHCPv6 configuration protocols. A dispatcher script could then choose what to do (and whether to act at all) according to multiple input options. Some scripts would only act when the configuration is considered up by NetworkManager; others could distinguish IP protocol versions.
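A dispatcher script written against the proposed "message" format might look like the following Python sketch. Note that IPV6_CONFIGURATION, IPV4_CONFIGURATION and CONFIGURATION are the variables proposed in this comment; NetworkManager does not set them, and the reaction labels returned by the function are purely illustrative.

```python
import os
from typing import List, Mapping


def planned_reactions(env: Mapping[str, str]) -> List[str]:
    """Decide what to do for one dispatcher run under the proposed format.

    Each proposed variable is checked independently, so a script can react
    to a single address family, to the overall state, or to both.
    """
    reactions = []
    if env.get("IPV6_CONFIGURATION") == "up":   # proposed, not a real NM variable
        reactions.append("ipv6-ready")
    if env.get("IPV4_CONFIGURATION") == "up":   # proposed, not a real NM variable
        reactions.append("ipv4-ready")
    if env.get("CONFIGURATION") == "up":        # proposed, not a real NM variable
        reactions.append("connection-ready")
    return reactions


if __name__ == "__main__":
    for reaction in planned_reactions(os.environ):
        print(reaction)
```

For "Message 1" above the function returns only the IPv6 reaction; for "Message 2" it returns the IPv4 reaction plus the overall one, which is the point where a script that only cares about "the connection is up" would act.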
I would love to see IPv4 and IPv6 treated equally, but I can also understand if some people argue that the time isn't ripe for it yet, as IPv4 is still vastly more important than IPv6...

I also fully agree with Pavel's opinion that the current git master code is weird when it comes to the dispatcher runs while a device is brought up. The current code makes it hard to write well-behaved dispatcher scripts if something is supposed to happen only on interface up and relies on DHCPv4 options.
Created attachment 282700: [PATCH for NM 0.9.10.0] core: emit dhcp4- and dhcp6-change dispatcher events on interface up (Revision 1)
Maybe it is easier to discuss a patch. ;-)
(In reply to comment #14)
> Created an attachment (id=282700)
> [PATCH for NM 0.9.10.0] core: emit dhcp4- and dhcp6-change dispatcher events on
> interface up (Revision 1)

I kind of feel it would be desirable not to receive a DHCPx_CHANGE event before the UP event. I guess the original patch tried to achieve that, but it actually fails to guarantee it.

So, if we don't insist on that ordering (UP before CHANGE), the patch is just fine. Actually, I think the behavior is more correct with attachment 282700.

Does this fix some kind of race for you? How exactly does this help you to write dispatcher scripts?
I guess it is easier to explain this with an example.

We're setting the hostname of workstations via a dispatcher script on boot, based on data supplied by DHCPv4. Once the hostname has been set, all subsequent dispatcher script runs only log whether a hostname change is pending and recommend a reboot.

Currently the dispatcher script reacts only to the up event. This is problematic because sometimes the IPv6 configuration finishes before the IPv4 configuration, and in that case the hostname data is not available to the dispatcher script.

With commit [1] the dispatcher script can be updated to react to both the up and the dhcp4-change event, so that it finally gets the hostname data in all cases on interface up. BUT this behavior is not documented anywhere, and it is certainly not obvious to someone new to dispatcher scripts.

That's the reason why I've proposed the patch [2]. With it, our dispatcher script would only need to react to the dhcp4-change event, which IMHO would be straightforward.

Even better, IMHO, would be dedicated dhcp4-up and dhcp6-up events.

[1] http://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=11f9855223966c4fd927fe83e2cd9a623a74acad
[2] https://bugzilla.gnome.org/attachment.cgi?id=282700
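The workaround described here (react to both up and dhcp4-change) could be sketched in Python roughly as follows. This is an assumption-laden illustration, not our production script: it assumes the dispatcher passes the interface name and the action as the first two arguments and exposes the DHCPv4 host-name option as DHCP4_HOST_NAME (as the dispatcher section of 'man NetworkManager' describes), and it only prints the offered name instead of setting it.

```python
import os
import sys
from typing import List, Mapping, Optional

# Actions on which DHCPv4 data may become available for the first time.
HANDLED_ACTIONS = {"up", "dhcp4-change"}


def hostname_from_dhcp4(action: str, env: Mapping[str, str]) -> Optional[str]:
    """Return the DHCPv4-supplied hostname, or None if there is nothing to do.

    On "up" the variable may be missing when IPv6 finished first; the later
    "dhcp4-change" run then delivers it.
    """
    if action not in HANDLED_ACTIONS:
        return None
    hostname = env.get("DHCP4_HOST_NAME", "").strip()
    return hostname or None


def main(argv: List[str], env: Mapping[str, str]) -> int:
    if len(argv) < 3:
        return 0  # not invoked by the dispatcher, nothing to do
    interface, action = argv[1], argv[2]
    hostname = hostname_from_dhcp4(action, env)
    if hostname is not None:
        # Placeholder for the real work (set the hostname or log a pending change).
        print(f"{interface}: DHCPv4 offered hostname {hostname}")
    return 0


if __name__ == "__main__":
    main(sys.argv, os.environ)
```

With the proposed patch [2], `HANDLED_ACTIONS` could presumably shrink to just `{"dhcp4-change"}`.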
Thomas, did my example help? I can also drop by in the IRC channel to discuss the issue. ;-)
(In reply to comment #12)
> For example, I don't see a direct representation for a finished IPv4
> configuration (e.g. via DHCP). The script doesn't get any input argument that
> would be always the same in this case and always different in others.
>
> In my opinion, an optimal solution would carry all concurrent events in one
> "message", i.e. one run of a dispatcher script.
>
> Example:
>
> Message 1: IPv6 configuration (e.g. via RA) gets finished first
>
> IPV6_CONFIGURATION=up
>
> Message 2: IPv4 configuration (e.g. via DHCP) gets finished second
>
> IPV4_CONFIGURATION=up
> CONFIGURATION=up
>
> Other variables could carry information specific to RA, DHCPv4 and DHCPv6
> configuration protocols. A dispatcher script could then choose what to do (and
> whether to act at all) according to multiple input options. Some scripts would
> only act when configuration is considered up by NetworkManager, other could
> distinguish IP protocol versions.

I still think that the overall dispatcher behavior could be improved as described above, and that in the long term we could even use a single message per actual event. If someone wants to react to a *finished DHCPv4 transaction*, they should be able to do so in a uniform way, i.e. by checking a single environment variable for a specific value in the dispatcher script. In that case we wouldn't spawn dispatcher scripts unnecessarily often (especially not several times for a single actual event), and the workflow would be as easy as possible for anyone watching for any type of change.

(In reply to comment #17)
> Even better would be IMHO if there would be dedicated dhcp4-up and dhcp6-up
> events.

What I'm proposing goes one step further: you could check for IPV4_CONFIGURATION=up or even IPV4_DHCP=up. A missing environment variable would indicate that the event is not of interest to you, e.g. a finished IPv6 configuration (by whatever means).
Additional variables in the environment would be a nice addition. The main issue, though, is that there is no documentation whatsoever regarding the dispatcher behavior and its environment variables.

How about letting a colleague who has no clue about NM figure out a dispatcher script to log the data supplied by DHCPv4? Watching that should be an eye-opener...

IMHO it would be really nice to have an example dispatcher script along with reasonable documentation to get people started.
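As a starting point for such an example, here is a minimal Python sketch of a dispatcher script that logs the data supplied by DHCPv4. It assumes that the dispatcher passes the interface and action as the first two arguments and exports DHCPv4 options as environment variables prefixed with "DHCP4_" (the convention described in 'man NetworkManager'); the helper names and the log format are made up for illustration.

```python
import os
import sys
from typing import Dict, List, Mapping

DHCP4_PREFIX = "DHCP4_"


def dhcp4_options(env: Mapping[str, str]) -> Dict[str, str]:
    """Collect DHCPv4 options from the environment, sorted by option name."""
    return {
        name[len(DHCP4_PREFIX):]: value
        for name, value in sorted(env.items())
        if name.startswith(DHCP4_PREFIX)
    }


def format_log_lines(interface: str, action: str, env: Mapping[str, str]) -> List[str]:
    """One log line per DHCPv4 option, tagged with interface and action."""
    return [
        f"{interface} {action} {name}={value}"
        for name, value in dhcp4_options(env).items()
    ]


if __name__ == "__main__" and len(sys.argv) >= 3:
    for line in format_log_lines(sys.argv[1], sys.argv[2], os.environ):
        # Written to stderr here; a real script would more likely use syslog.
        print(line, file=sys.stderr)
```

The script deliberately does not filter on the action, so running it shows on which dispatcher runs the DHCPv4 data is actually present.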
(In reply to comment #20)
> Additional variables in the environment would be a nice addition. The main
> issue is though that there is no documentation whatsoever regarding the
> dispatcher behavior and its environment variables.

Both could be fixed at once; at least the docs and the behavior would then be saner than they are now.

> How about letting a colleague who has no clue about NM figure out a dispatcher
> script to log the data supplied by DHCPv4? Watching that should be an eye
> opener...

No need: I was that "colleague" myself, and Thomas also joined the project after all this was already there. The NM docs have improved a lot during the last two years, but there are still many gaps. Any help is of course welcome.
Pavel, thanks for all the hard work. IMHO the code quality improved a lot in the last 2 years. ;-) Maybe out of habit I don't look for docs anymore... :-/ Thomas, I think this bug has enough data to either go ahead and improve the dispatcher behavior or to close the bug as working as intended.
(In reply to comment #20)
> Additional variables in the environment would be a nice addition. The main
> issue is though that there is no documentation whatsoever regarding the
> dispatcher behavior and its environment variables.
>
> How about letting a colleague who has no clue about NM figure out a dispatcher
> script to log the data supplied by DHCPv4? Watching that should be an eye
> opener...
> IMHO it would be really nice if there would be an example dispatcher script
> along a reasonable documentation to get people started.

There is a whole section in 'man NetworkManager' on dispatcher scripts, and if it could use some improvements, we'd love to hear suggestions! We have one example, but we could certainly add more examples that make use of the IP- and DHCP-related variables.
(In reply to comment #17)
> I guess it is easier to explain this with an example.
>
> We're setting the hostname of workstations via a dispatcher script on boot
> based on data supplied by DHCPv4. Once the hostname has been set all successive
> dispatcher script runs only log if a hostname change is pending and recommend a
> reboot.
>
> Currently the dispatcher script reacts only to the up event. This is
> problematic because sometimes the IPv6 configuration is earlier done than the
> IPv4 configuration and because of that the hostname data is not available to
> the dispatcher script.

There is a 'hostname' dispatcher event (see man NetworkManager), the intent of which is to fire whenever the hostname has been changed by some action of NetworkManager. That should include both DHCP-related changes and administrator changes via 'nmcli gen hostname foobar.baz' or via the D-Bus interface.

If the hostname event isn't firing when NM changes the hostname, we should certainly fix that.
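A script that relies on the 'hostname' event instead could be sketched in Python as below. The assumptions here are that the action arrives as the second argument, and that for this action the interface argument is empty and no event-specific environment variables are set, so the new name has to be read from the system; that matches my reading of the man page but should be verified against the NM version in use. The function name and message text are hypothetical.

```python
import socket
import sys
from typing import List, Optional


def describe_hostname_event(action: str, current_hostname: str) -> Optional[str]:
    """Return a log message for a 'hostname' event, or None for other actions."""
    if action != "hostname":
        return None
    return f"hostname is now {current_hostname}"


def main(argv: List[str]) -> None:
    # argv[1] (the interface) is expected to be empty for this action.
    action = argv[2] if len(argv) > 2 else ""
    message = describe_hostname_event(action, socket.gethostname())
    if message is not None:
        print(message)


if __name__ == "__main__":
    main(sys.argv)
```

Unlike the up/dhcp4-change approach, this reacts to the result (the hostname changed) rather than to the DHCPv4 data that may have caused it.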
We should probably add ip4-up/change and ip6-up/change events which would indicate initial configuration and later changes? We'd have to figure out whether those block or not though.
bugzilla.gnome.org is being shut down in favor of a GitLab instance. We are closing all old bug reports and feature requests in GNOME Bugzilla which have not seen updates for a long time.

If you still use NetworkManager and still see this bug or want this feature in a recent and supported version of NetworkManager, please feel free to report it at https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/issues/

Thank you for creating this report. We are sorry it could not be implemented (workforce and time are unfortunately limited).