GNOME Bugzilla – Bug 655773
WIFI unavailable after being disabled and enabled again
Last modified: 2011-09-30 06:08:18 UTC
Machine: ASUS eeepc 1005HA
OS: openSUSE 12.1 Milestone3
NetworkManager version: 0.8.9997
WLAN Card: Atheros AR9285 Wireless Network Adapter (PCI-Express) [168c:002b]

Steps to reproduce:
1. Click the network status applet to show the network menu
2. Switch Wireless to OFF and wait 5 seconds
3. Switch Wireless to ON

Result: Wireless became "unavailable" and stayed that way until NM was restarted. (Replacing steps 2 and 3 with 'Press Fn-F2, the WLAN key' also reproduces it.)

This is caused by the hard killswitch of the WLAN card.

    # rfkill list wlan
    1: eeepc-wlan: Wireless LAN
            Soft blocked: no
            Hard blocked: no
    2: phy0: Wireless LAN
            Soft blocked: no
            Hard blocked: no

When NM switched the eeepc-wlan soft block to ON, phy0 became soft-blocked and then hard-blocked after 2~3 seconds. When NM switched the eeepc-wlan soft block to OFF, the soft block of phy0 was released immediately. However, it took around 3 seconds for the hard block to switch to OFF, and NM could not bring up wlan0 until the hard block was released.

    # ifconfig wlan0 up
    SIOCSIFFLAGS: Operation not possible due to RF-kill

NM 0.8.2 handles this case without any problem, so I assume this is a regression in NM.
I figured out the root cause. It's related to commit 67e092abcbdece45f4753383797000d4ed49f3dc:
http://cgit.freedesktop.org/NetworkManager/NetworkManager/commit/?id=67e092abcbdece45f4753383797000d4ed49f3dc

NM used to determine the state of the WLAN killswitches by scanning all switches, but now it overwrites that state with the state of the platform switch. This is generally correct. However, some machines, like the eeepc 1005HA, do not lift the hard block immediately, and NM gets confused in this case. I think NM has to be notified when a hard block is lifted, even if it's not on a platform switch.
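To make the difference concrete, here is a rough sketch of the two approaches described above. It is not NetworkManager's actual code: the type and function names are hypothetical, and "scanning all switches" is assumed to mean that the most restrictive state among the switches wins. For simplicity the override uses the first platform switch it finds.

```c
/* Hypothetical types and names for illustration only; this is not
 * NetworkManager's actual code. States are ordered by restrictiveness. */
typedef enum {
    RFKILL_UNBLOCKED = 0,
    RFKILL_SOFT_BLOCKED = 1,
    RFKILL_HARD_BLOCKED = 2
} RfKillState;

typedef struct {
    RfKillState state;
    int is_platform;   /* nonzero for a platform switch such as eeepc-wlan */
} KillSwitch;

/* Assumed older behaviour: scan every switch, most restrictive state wins. */
RfKillState
state_by_scanning_all(const KillSwitch *switches, int n)
{
    RfKillState result = RFKILL_UNBLOCKED;

    for (int i = 0; i < n; i++) {
        if (switches[i].state > result)
            result = switches[i].state;
    }
    return result;
}

/* Assumed newer behaviour: a platform switch, if present, overrides the scan. */
RfKillState
state_with_platform_override(const KillSwitch *switches, int n)
{
    for (int i = 0; i < n; i++) {
        if (switches[i].is_platform)
            return switches[i].state;
    }
    return state_by_scanning_all(switches, n);
}
```

With eeepc-wlan already unblocked and phy0 still hard-blocked, the scan reports a hard block while the override reports unblocked, which matches the window in which wlan0 cannot be brought up.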
Created attachment 193398
A hacky patch to ignore rfkill event

I made a quick hack patch for NetworkManager. The idea is to store the state of the non-platform switches and compare it with the poll state. If the poll state is UNBLOCKED and the non-platform state is HARD_BLOCKED, then this rfkill event is ignored and NM waits for the next event that indicates all switches are UNBLOCKED. The patch works for my eeepc.
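The check described above boils down to a single predicate. The sketch below only illustrates the idea and is not the attached patch; the enum and the function name are made up for this example.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; the attached patch may differ. */
typedef enum {
    RFKILL_UNBLOCKED = 0,
    RFKILL_SOFT_BLOCKED = 1,
    RFKILL_HARD_BLOCKED = 2
} RfKillState;

/* Return true when an rfkill event should be ignored: the polled state
 * claims UNBLOCKED while the stored non-platform state is still
 * HARD_BLOCKED, so the radio is not usable yet. */
bool
should_ignore_rfkill_event(RfKillState poll_state,
                           RfKillState non_platform_state)
{
    return poll_state == RFKILL_UNBLOCKED
        && non_platform_state == RFKILL_HARD_BLOCKED;
}
```

An event for which this returns true would be dropped, and the state would only be updated on a later event once the non-platform switch has been released.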
Interesting, thanks for checking this out. I've thought about doing a split between platform switches and device switches before but never quite got there. So what you're saying here is that when NM un-kills all the switches, it takes a second or two for the platform switch to switch state?

Can you grab the NM log output (/var/log/messages or /var/log/daemon.log) while the issue happens? It might work better to run it with:

    NetworkManager --no-daemon --log-level=debug

so we can get more info.
Created attachment 194440
NetworkManager log file

    # rfkill list
    0: phy0: Wireless LAN
            Soft blocked: no
            Hard blocked: no
    1: hci0: Bluetooth
            Soft blocked: no
            Hard blocked: no
    2: asus-wlan: Wireless LAN
            Soft blocked: no
            Hard blocked: no
    3: asus-bluetooth: Bluetooth
            Soft blocked: no
            Hard blocked: no

The names of the platform killswitches changed after a kernel upgrade, but the behavior is the same. When asus-wlan was set to soft-blocked, phy0 was set to soft-blocked (yes-no) and then hard-blocked (yes-yes) after 1~2 seconds. The situation is similar when asus-wlan is set to unblocked:

    phy0 yes-yes -> no-yes (1~2 sec)-> no-no
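For reading the soft/hard pairs above as a single state, a small helper like the following could be used. It is only a sketch: the enum and function name are hypothetical, and it assumes that a hard block takes precedence over a soft block.

```c
/* Hypothetical names for illustration only. */
typedef enum {
    RFKILL_UNBLOCKED = 0,
    RFKILL_SOFT_BLOCKED = 1,
    RFKILL_HARD_BLOCKED = 2
} RfKillState;

/* Collapse the "Soft blocked" / "Hard blocked" flags of one switch into
 * a single state. Assumption: a hard block outranks a soft block. */
RfKillState
state_from_flags(int soft_blocked, int hard_blocked)
{
    if (hard_blocked)
        return RFKILL_HARD_BLOCKED;
    if (soft_blocked)
        return RFKILL_SOFT_BLOCKED;
    return RFKILL_UNBLOCKED;
}
```

Under that assumption, the phy0 sequence yes-yes -> no-yes -> no-no reads as hard-blocked, still hard-blocked, then unblocked, so the intermediate no-yes step is the 1~2 second window in which the device is not usable.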
I figured out a better solution. Currently, we use this rule:

    state = platform_state;

In most cases, the platform switch reflects the real state, especially when the platform switch is blocked. However, an unblocked platform switch doesn't really mean "unblocked". In my case, phy0 was hard-blocked for a few seconds even though asus-wlan was unblocked, and this made the wlan device unusable for a few seconds. So, let's define the rule as such:

    if (platform_state == UNBLOCKED)
        state = non_platform_state;
    else
        state = platform_state;

If the state is UNBLOCKED, it's really unblocked, i.e. both the platform switch and the non-platform switch are unblocked. Otherwise, it's blocked. I think this rule reflects the real state, and my eeepc works fine with it.
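Written out as a standalone C function, the proposed rule could look like the sketch below. The enum and function name are hypothetical and chosen for this example; the code that was actually committed may be organized differently.

```c
/* Hypothetical names for illustration; not the committed NetworkManager code. */
typedef enum {
    RFKILL_UNBLOCKED = 0,
    RFKILL_SOFT_BLOCKED = 1,
    RFKILL_HARD_BLOCKED = 2
} RfKillState;

/* Proposed rule: trust the platform switch when it reports a block;
 * when it reports UNBLOCKED, defer to the non-platform switches. */
RfKillState
combine_rfkill_state(RfKillState platform_state,
                     RfKillState non_platform_state)
{
    if (platform_state == RFKILL_UNBLOCKED)
        return non_platform_state;
    return platform_state;
}
```

For the eeepc case, `combine_rfkill_state(RFKILL_UNBLOCKED, RFKILL_HARD_BLOCKED)` yields `RFKILL_HARD_BLOCKED` while phy0 is still hard-blocked, and `RFKILL_UNBLOCKED` only once both arguments are unblocked.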
Pushed as 339229e4c698c61e20a28bfc33d8501490891427 to git master (0.9) with a few cleanups. Thanks!