GNOME Bugzilla – Bug 650680
Network 'applet' sometimes gets stuck in 'connecting...' state after a resume, when the connection is actually up and working fine
Last modified: 2011-09-11 09:48:18 UTC
On my laptop, sometimes (quite often) when I resume from suspend, the Shell network 'applet' icon is stuck in 'Connecting...' state (three dots) even after the connection to my AP is actually up and working fine. There doesn't appear to be much practical impact of this - NM-dependent apps like Evolution and Firefox know they're online, so that's okay - it's just visually jarring (and may affect the functionality of the network 'applet' itself, I didn't really check).

On IRC, Colin thought this may well be to do with these messages in .xsession-errors:

Window manager warning: Log level 16: _nm_object_get_property: Error getting 'Ssid' for /org/freedesktop/NetworkManager/AccessPoint/375: (19) Method "Get" with signature "ss" on interface "org.freedesktop.DBus.Properties" doesn't exist

I see a _lot_ of those. Restarting Shell - alt-f2, r - fixes the issue (the applet icon shows 'connected'), and I see none of those messages after the restart.

<walters> adamw: file a bug please; but my guess here is this happens when an access point has no name (or an invalid name)

(I'm guessing he means 'any AP in range', not 'the AP to which the system is connecting', because my AP is named - 'Squishy').
It looks to me like libnm-glib will warn about this if an access point goes away; we'll get null from the get_ssid() method. At least a few places in js/ui/status/network.js aren't robust against null.
Actually, all of network.js needs auditing to be safe across suspends.
Created attachment 188230
network.js: Be more robust against access points going away

See patch
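For illustration only, a guard of the kind discussed above might look like the following sketch. It is not the attached patch: ssidToLabel and usableAccessPoints are hypothetical helpers, and the access point is assumed to be any object whose get_ssid() returns either null or an array of byte values.

```javascript
// Hypothetical helpers, not taken from network.js or from attachment 188230.
// Assumption: get_ssid() returns null once the access point has gone away,
// and otherwise an array (or Uint8Array) of byte values.
function ssidToLabel(accessPoint) {
    if (!accessPoint)
        return null;

    const ssid = accessPoint.get_ssid();
    if (ssid === null || ssid === undefined || ssid.length === 0)
        return null;

    // Naive byte-to-character decoding; real SSIDs are not guaranteed to be
    // ASCII, so a real implementation would need a proper decoder.
    return String.fromCharCode.apply(null, ssid);
}

// Drop access points that cannot be labelled instead of failing on them.
function usableAccessPoints(accessPoints) {
    return accessPoints.filter(ap => ssidToLabel(ap) !== null);
}
```

The point of returning null rather than throwing is that callers building the menu can skip a vanished access point and carry on, which is roughly the "don't use a NULL SSID" coping that nm-applet is said to do further down.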
<walters> dcbw1: how did nm-applet deal with https://bugzilla.gnome.org/show_bug.cgi?id=650680#c2 ?
<bebot> Bug 650680: normal, Normal, ---, gnome-shell-maint, UNCONFIRMED, Network 'applet' sometimes gets stuck in 'connecting...' state after a resume, when the connection is actually up and working fine
<walters> it seems rather complicated
<dcbw1> walters: my guess is just a missing trigger somewhere on a state changed signal to update the icon
<walters> no i meant suspending
<-- aruiz has quit (Ping timeout: 600 seconds)
<walters> where the entire world can change, but we may have dbus requests in flight
<dcbw1> walters: yeah, but you'll still get AccessPointRemoved signals for all of those, right?
<walters> of course, but only after getting errors
<dcbw1> walters: NM kills everything right before a suspend and emits the AP removed signal on the wifi device
<dcbw1> walters: and libnm-glib should only try to get the SSID the first time somebody wants it, after that it's on the propschanged signal
<walters> i mean more like what if we're inside _accessPointAdded because NM saw a new one right before the user closed the laptop lid
<dcbw1> walters: this is true; though I think we're just gonna kill those warning messages anyway
<walters> looks to me like we call a bunch of functions, synchronously even =(
<walters> so nm-applet basically would occasionally log errors but not crash?
<walters> i.e. you didn't explicitly try to cope with this
<dcbw1> correct, you'll periodically get those errors in ~/.xsession-errors
<walters> the reason i mention this is because adamw specifically mentioned noticing this in the context of suspending
<dcbw1> well, the coping was basically ensuring that we didn't try to use a NULL SSID and stuff like that
<walters> i could imagine it being triggered quite easily if you're near lots of access points, some with weak signals
<dcbw1> GHashTable cares if you modify it during a g_hash_table_foreach(), but how can I actually trigger that?
<walters> honestly it's one of those problems where i wonder if it would make sense to just have a big GiveMeTheStateOfTheWorld dbus call that gives us a big JSON dump or something
<dcbw1> walters: what you want here is the thing I was talking about earlier
<dcbw1> walters: you want libnm-glib to only say it's initialized after it's read the whole state of the world completely
<dcbw1> the tradeoff there is that's a huge dbus hit on startup while right now properties are fetched on-demand
<walters> yeah, but individual dbus messages have overhead too
<mclasen> dcbw1: gdbusproxy fetches all properties up front, and then keeps up with changes
<mclasen> so porting to gdbus would give you at least part of this
--> felipeborges (~felipe@200.132.100.252) has joined #gnome-os
<walters> gdbus would help yeah
<dcbw1> not really, no
<walters> we should be landing giovanni's code there soon
<dcbw1> it wouldn't help with the sync/async part
<walters> actually davidz's object manager stuff might help more too
<walters> i think one of the core problems it's solving is cancelling outgoing calls when a sub-object like a wifi AP goes away
<dcbw1> the core problem is that if this stuff is going to be async, which it should be in the end, then we need to make behavioral changes to libnm-glib to not expose an object to clients of libnm-glib until all properties are retrieved successfully
<dcbw1> that needs to get fixed regardless of whether gdbus or dbus-glib gets used, and that's the big change
<dcbw1> gdbus or dbus-glib is an implementation detail that doesn't really matter here
<dcbw1> what you want to happen is to have the libnm-glib NMDeviceWifi class keep two AP lists, one for fully-initialized APs, and the other for pending APs that haven't had GetAll return yet, and only emit the 'ap-added' signal when an AP moves from pending -> initialized list internally
<walters> mclasen: so what is the final meeting time?
<dcbw1> but that trickles up further because you don't want to expose the NMDeviceWifi object until all its APs have been retrieved and initialized the first time
<walters> dcbw1: i think in gdbus, the proxy won't be returned until GetAll returns
<davidz> dcbw1, walters: the guarantees provided by the object-manager stuff are described in detail here: http://developer.gnome.org/gio/unstable/GDBusObjectManagerClient.html#GDBusObjectManagerClient.description
<mclasen> dcbw1: why am I having flashbacks to the horrible races between the accountsservice library and gdm ?
<walters> mclasen: well it's a complicated problem
<davidz> and, yes, all properties are loaded before the object is returned
<mclasen> where the accountsservice gets a list of users, but they are not fully populated yet, ...
--> dcbw (~dcbw@50.12.248.61) has joined #gnome-os
<dcbw> walters: right, but then I still have to have two lists, one for completed proxies, and one for pending GCancelables that represent initializing GDBusProxy objects
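To make the two-list idea from the log above concrete, here is a rough sketch in JavaScript. It only illustrates the pattern dcbw describes and is not libnm-glib code: PendingApTracker, fetchAll and onApAdded are all hypothetical names, fetchAll stands in for an asynchronous GetAll call, and the per-request token is an assumed stand-in for something like a GCancellable.

```javascript
// Sketch only: every name here is hypothetical and does not correspond to
// the libnm-glib or GDBus API.
class PendingApTracker {
    // fetchAll(path, callback) is assumed to call callback(error, props)
    // later; onApAdded(path, props) plays the role of the 'ap-added' signal.
    constructor(fetchAll, onApAdded) {
        this._fetchAll = fetchAll;
        this._onApAdded = onApAdded;
        this._pending = new Map();      // path -> token of the in-flight fetch
        this._initialized = new Map();  // path -> fully fetched properties
    }

    // An AP appeared: start fetching its properties, but do not expose it yet.
    accessPointAdded(path) {
        const token = {};
        this._pending.set(path, token);
        this._fetchAll(path, (error, props) => {
            // Ignore the reply if the AP was removed (or re-added) meanwhile.
            if (this._pending.get(path) !== token)
                return;
            this._pending.delete(path);
            // A failed fetch (e.g. the AP vanished across a suspend) means
            // the AP is simply never exposed.
            if (error)
                return;
            this._initialized.set(path, props);
            this._onApAdded(path, props);
        });
    }

    // An AP went away: forget it whether it was pending or initialized.
    accessPointRemoved(path) {
        this._pending.delete(path);
        this._initialized.delete(path);
    }

    // Only fully initialized APs are ever visible to callers.
    get accessPoints() {
        return Array.from(this._initialized.keys());
    }
}
```

With this shape, a client such as network.js would only ever see access points whose properties all arrived, so a reply that errors out or arrives after removal cannot surface as a half-populated object. As noted in the log, the same question then applies one level up, to when the device object itself is exposed.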
Bug 646454 deals with null SSIDs more comprehensively, and includes something more-or-less identical to this patch.

*** This bug has been marked as a duplicate of bug 646454 ***
Perhaps the patch from https://bugzilla.gnome.org/show_bug.cgi?id=651378 might help: https://bugzilla.gnome.org/attachment.cgi?id=195498.