GNOME Bugzilla – Bug 600057
nautilus crashed with SIGSEGV in open64()
Last modified: 2009-11-18 15:00:25 UTC
This was reported at https://bugs.edge.launchpad.net/ubuntu/+source/gvfs/+bug/424043. Some users are reporting this crash, and also large memory leaks in gvfsd-metadata (which I think are related). I've attached the stack trace because it is too large to paste inline.
Created attachment 146519
Stacktrace
OK, I had to compress it so Bugzilla would let me attach it. Anyway, this is a good summary of what it looks like (repeated for 2000 frames):
+ Trace 218659
The high memory footprint is not caused by a memory leak but by the code going into an infinite loop (in effect, unbounded recursion). The problem is that meta_tree_refresh_locked () calls meta_tree_needs_rereading (). If the latter returns TRUE, we call meta_tree_init (), which then again calls meta_tree_refresh_locked (). My best guess currently is that the "rotated" bit is wrongly set on a stable file. Of course, that should never be the case. Maybe the file got corrupted, as seems to be the case in bug 598561.
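To make the cycle easier to see, here is a toy model of it. This is only a sketch, not the gvfs source: ToyTree, the toy_* functions and TOY_MAX_DEPTH are made-up names, and the depth cap is just a stand-in for the point where the real stack runs out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of the call cycle; not the gvfs code. */
#define TOY_MAX_DEPTH 64      /* stand-in for "the stack ran out" */

typedef struct {
    bool rotated_on_disk;     /* what the header of the on-disk file says */
    int  depth;               /* nested refresh calls seen so far */
    bool gave_up;             /* set when the toy depth cap is hit */
} ToyTree;

static void toy_refresh(ToyTree *tree);

/* Models meta_tree_needs_rereading (): only looks at the rotated bit. */
static bool toy_needs_rereading(const ToyTree *tree)
{
    return tree->rotated_on_disk;
}

/* Models meta_tree_init (): re-opening by name yields the same file,
 * so the rotated bit is still set when we refresh again. */
static void toy_init(ToyTree *tree)
{
    toy_refresh(tree);
}

/* Models meta_tree_refresh_locked (). */
static void toy_refresh(ToyTree *tree)
{
    assert(tree != NULL);
    if (++tree->depth >= TOY_MAX_DEPTH) {
        tree->gave_up = true; /* the real code has no such cap */
        return;
    }
    if (toy_needs_rereading(tree))
        toy_init(tree);
}
```

With rotated_on_disk set, nothing in the cycle ever clears the flag, so only the artificial cap stops it. With the flag clear, toy_refresh () returns after a single call.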
Created attachment 147555
~/.local/share/gvfs-metadata/home from Kai Lüke, downloaded from the Launchpad bug report
To reproduce the issue, use 'meta-ls' from the gvfs sources:
./meta-ls <metadata file> /
I'm getting a segfault due to the endless loop in meta_tree_init (), just like reported above.
So, this is a pretty serious issue: as long as reading metadata is done in libgio and not in an isolated daemon, any attempt to read from a corrupted database like this will lead to a segfault of the parent application due to stack overflow. Nautilus and Epiphany will both crash on startup. Still, no good repro steps :-((
I have pushed a fix for the crasher (infinite loop) to git master. With this patch we compare the inode numbers and, if they are equal, we stop refreshing. Of course, that doesn't fix the root cause: we have a stable file with the rotated bit set to 1.
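For readers who have not looked at the patch, the idea of an inode guard can be sketched roughly like this. This is an assumption about the general shape, not the code that went into git master: same_file () and should_reopen () are hypothetical names, and comparing st_dev as well as st_ino is my own addition.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical helper: two stat results describe the same file when both
 * the device and the inode number match. */
static bool same_file(const struct stat *a, const struct stat *b)
{
    assert(a != NULL && b != NULL);
    return a->st_dev == b->st_dev && a->st_ino == b->st_ino;
}

/* Hypothetical guard: re-open only if the file is marked rotated AND the
 * path now names a different file than the one we already have open. */
static bool should_reopen(bool rotated_bit, int open_fd, const char *path)
{
    struct stat opened, named;

    if (!rotated_bit)
        return false;
    if (fstat(open_fd, &opened) != 0 || stat(path, &named) != 0)
        return false;         /* cannot tell; do not risk looping */
    return !same_file(&opened, &named);
}
```

If the path still resolves to the inode that is already open, re-opening cannot produce anything new, so the refresh stops even though the rotated bit says otherwise.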
Quick summary of the origin of this bug:
[15:21] alex: Here is the race:
[15:21] alex: we write the temp file with new data
[15:21] alex: fsync it, now all *data* (not file metadata) for the directory is on disk
[15:21] alex: then we rename the new file over the old one
[15:22] alex: then we write the "rotated" bit to the old file
[15:22] alex: wait a bit, then *boom* system died
[15:22] alex: maybe the rotated bit was written to the old file?
[15:22] alex: maybe the metadata from the rename were not?
[15:23] alex: so, on reboot we have the old data left, but its marked as rotated
[15:23] alex: solution: fsync the directory after the rename before writing the rotated bit
In git master there is now a patch (6592ecb3b) that does exactly that. Closing this bug.
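The ordering described in the log can be sketched with plain POSIX calls. Again, this is only an illustration under assumptions and not the contents of the patch: sync_dir () and publish_new_file () are hypothetical names, and a short write is simply treated as an error to keep the sketch small.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: fsync a directory so that a rename inside it is
 * on disk. Returns 0 on success, -1 on failure. */
static int sync_dir(const char *dir_path)
{
    assert(dir_path != NULL);
    int fd = open(dir_path, O_RDONLY);
    if (fd < 0)
        return -1;
    int rc = fsync(fd);
    close(fd);
    return rc;
}

/* Hypothetical sketch of the safe order: write temp file, fsync it,
 * rename it over the old name, fsync the directory. Only after this
 * returns 0 should the caller set the rotated bit in the old file. */
static int publish_new_file(const char *dir, const char *tmp_path,
                            const char *final_path,
                            const void *data, size_t len)
{
    int fd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    /* The file data must be on disk before the rename makes it visible. */
    if (write(fd, data, len) != (ssize_t) len || fsync(fd) != 0) {
        close(fd);
        return -1;
    }
    if (close(fd) != 0)
        return -1;
    if (rename(tmp_path, final_path) != 0)
        return -1;
    /* The rename itself must be on disk before the old file is marked. */
    return sync_dir(dir);
}
```

With this order, a crash before the directory fsync completes leaves the old file unmarked, so the state described in the log (old data on disk, but flagged as rotated) should no longer be reachable.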