GNOME Bugzilla – Bug 316455
multiload-applet-2 hangs if smbfs mount breaks
Last modified: 2010-01-24 01:06:20 UTC
This bug has been opened here: http://bugzilla.ubuntu.com/show_bug.cgi?id=15532

"I occasionally type:

    sudo mount -t smbfs -o password=,uid=chris //server/dokumenty ~/smb/server/dokumenty

to mount an SMB share into my home directory.

The network socket on my laptop is a bit broken. It doesn't hold the network cable in, so if I move the laptop, the network cable often falls out. This isn't a problem for the most part, I just put it back in. But for SMB mounts, it seems to be a problem - the mount stops working, and doesn't start working again when I plug the cable back in again.

I noticed a few minutes ago that the graphs at the top of my screen had stopped moving. They show CPU usage, network, memory, swap and disk usage and are made by the 'multiload-applet-2' process. I just used 'strace' to see what the process was doing instead of animating the graphs for me:

    chris@chrislap:/$ pgrep multiload
    6156
    chris@chrislap:/$ strace -p 6156
    Process 6156 attached - interrupt to quit
    statfs("/home/chris/smb/server/dokumenty", <unfinished ...>
    Process 6156 detached

It seems it's hanging trying to 'statfs' the SMB mount point. It's been doing that for quite a while now. I'd like it to give up and start drawing the graphs again for me.

Incidentally, if I 'cd' to /home/chris/smb/server and run "ls", I see:

    chris@chrislap:~/smb/server$ ls
    dokumenty

but if I run "ls -l" I see:

    chris@chrislap:~/smb/server$ ls -l
    <flashing cursor, no output, no new shell prompt>
    ls: dokumenty: Input/output error
    total 0

ie. it hangs for a minute or so and then prints an error.

Should this perhaps be filed somewhere else? Or should I just get my laptop fixed?"
Have asked some questions of the filer in the Ubuntu bug (http://bugzilla.ubuntu.com/show_bug.cgi?id=15532).
I don't think this can be fixed. I see this kind of lock-up with SMB and sshfs. It simply hard-locks everything that tries to access the mount: ls, nautilus, etc. I guess there are mount options to detect disconnections earlier instead of hanging for a few minutes. How long did/do you wait? Do you unmount /smb when your cable has fallen out?
The original reporter hasn't given any comments, but I suspect the cause is the process entering the D (uninterruptible sleep) state, so there isn't really a way to 'fix' it. I assume the problem is caused by libgtop making requests to find out how much free space is on the device. Perhaps we can prevent these calls from being made in multiload-applet (since it doesn't need to know that information, AFAIK). That would at least prevent multiload from going into the D state.
Yes, it needs to in order to get I/O stats.
Excellent point. I guess there is no handy, dandy non-blocking mode, is there?
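POSIX does not define a non-blocking statvfs(). One way to approximate it, sketched below, is to run the call in a short-lived child process and wait on a pipe with a timeout. The helper name statvfs_in_child and the reply struct are hypothetical, not part of libgtop or multiload-applet, and this is only a sketch under those assumptions.

```c
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <sys/statvfs.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical message passed from the child back to the parent. */
struct statvfs_reply {
    int err;             /* 0 on success, otherwise the child's errno */
    struct statvfs buf;  /* valid only when err == 0 */
};

/* Hypothetical helper: run statvfs() in a child so the caller never blocks
 * longer than timeout_ms. Returns 0 on success, -1 on failure with errno set
 * (ETIMEDOUT if the child did not answer in time). */
static int statvfs_in_child(const char *path, struct statvfs *out, int timeout_ms)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        int saved = errno;
        close(fds[0]);
        close(fds[1]);
        errno = saved;
        return -1;
    }

    if (pid == 0) {
        /* Child: make the possibly blocking call and report the outcome. */
        struct statvfs_reply reply;
        memset(&reply, 0, sizeof reply);
        close(fds[0]);
        if (statvfs(path, &reply.buf) == -1)
            reply.err = errno;
        if (write(fds[1], &reply, sizeof reply) != (ssize_t)sizeof reply)
            _exit(1);
        _exit(0);
    }

    /* Parent: wait for the reply, but only for timeout_ms. */
    close(fds[1]);
    struct pollfd pfd = { .fd = fds[0], .events = POLLIN };
    struct statvfs_reply reply;
    int result = -1;
    int err = EIO;

    int ready = poll(&pfd, 1, timeout_ms);
    if (ready <= 0) {
        err = (ready == 0) ? ETIMEDOUT : errno;
        /* A child stuck in the kernel only dies once its syscall returns,
         * so do not block here; it may need reaping later. */
        kill(pid, SIGKILL);
        waitpid(pid, NULL, WNOHANG);
    } else {
        if (read(fds[0], &reply, sizeof reply) == (ssize_t)sizeof reply) {
            err = reply.err;
            if (err == 0) {
                *out = reply.buf;
                result = 0;
            }
        }
        waitpid(pid, NULL, 0);
    }

    close(fds[0]);
    if (result == -1)
        errno = err;
    return result;
}
```

A caller would use something like statvfs_in_child("/home/chris/smb/server/dokumenty", &buf, 5000) and treat ETIMEDOUT as "skip this mount for now". Forking from a multithreaded GUI process has its own pitfalls, so this is an illustration of the idea rather than a drop-in fix.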
If I'm right, it's because statvfs blocks on /samba. I don't know how to fix this properly. The only way I know is to use SIGALRM, but that is not reentrant.
I don't have internet at home, so I spent a lot of time this weekend thinking about this "bug" and did a few experiments. I wrote a piece of code to run statvfs (the blocking syscall) with a 5s timeout. Here is my conclusion: on network disconnection, statvfs fails and reports an I/O error after 30s. This is why multiload-applet "freezes": you can actually see that the graphs are updated every 30s (an update takes place each time statvfs fails). The Preferences dialog is unusable, of course. (This 30s timeout may be system-dependent.)

So this is the old story about GUIs and long-running computations. To solve this we could:
- patch libgtop so that statvfs() is called with a 5s timeout
- patch multiload so that glibtop_get_fsusage() is called with a 5s timeout
- run each graph in a separate thread/process

Then I realized that Linux 2.6 does not give read/write stats for network filesystems (nothing about samba or nfs in /proc/diskstats). And by the way, the libgtop Linux 2.6 implementation is the only one that provides read/write usage. Keep in mind that this multiload graph is about disks.

So the fix is easy: ignore "nfs" and "smbfs" partitions in multiload-applet. I've checked the Linux and *BSD glibtop_get_mountlist() and it's fine. I could make a patch.

PS: Maybe I could extend glibtop_get_mountlist(buf, all_fs) to glibtop_get_mountlist(buf, which) (without breaking ABI/API), where "which" could be:
- 0: only real filesystems (everything but stuff like /proc)
- 1: all fs
- 2: only local disk-based filesystems
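The fix proposed in this comment, skipping "nfs" and "smbfs" entries, can be illustrated with a small standalone sketch. It does not use libgtop and is not the attached patch: the function names is_network_fs and count_local_mounts are hypothetical, and the mount table is read with glibc's setmntent()/getmntent() rather than glibtop_get_mountlist().

```c
#include <mntent.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the filesystem type is one this sketch skips, else 0.
 * The list holds only the two types named in the comment above. */
static int is_network_fs(const char *type)
{
    static const char *const skipped[] = { "nfs", "smbfs" };

    for (size_t i = 0; i < sizeof skipped / sizeof skipped[0]; i++) {
        if (strcmp(type, skipped[i]) == 0)
            return 1;
    }
    return 0;
}

/* Hypothetical helper: count the entries in a mount table (for example
 * "/proc/mounts") that would still be polled after filtering.
 * Returns -1 if the table cannot be opened. */
static int count_local_mounts(const char *table)
{
    FILE *fp = setmntent(table, "r");
    if (fp == NULL)
        return -1;

    int count = 0;
    struct mntent *entry;
    while ((entry = getmntent(fp)) != NULL) {
        if (is_network_fs(entry->mnt_type))
            continue; /* never call statvfs() on these mounts */
        count++;
    }

    endmntent(fp);
    return count;
}
```

The point of the filter is that the blocking call is never issued for a mount that cannot contribute disk statistics anyway, so no timeout machinery is needed for this particular graph.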
Created attachment 57363 that simple
Rock'n'roll. Put it in.
thank you for fixing that :)
(In reply to comment #3)
> The original reporter hasn't given any comments

I'm sorry about that. I didn't notice your question, even though it apparently did hit my inbox. Thanks for fixing it though.