GNOME Bugzilla – Bug 349149
Crash in sftp module
Last modified: 2006-08-11 09:42:05 UTC
Distribution: Debian testing/unstable
Package: rhythmbox
Severity: Normal
Version: GNOME2.14.2 0.9.5
Gnome-Distributor: Debian
Synopsis: Crash when SFTP-accessed library loses connection during playback
Bugzilla-Product: rhythmbox
Bugzilla-Component: general
Bugzilla-Version: 0.9.5
BugBuddy-GnomeVersion: 2.0 (2.14.1)

Description:
Description of the crash: Rhythmbox crashes when playing a song from a library accessed over SFTP, to which connectivity is lost.
How often does this happen? Every time.

Debugging Information:
Backtrace was generated from '/usr/bin/rhythmbox'
Using host libthread_db library "/lib/libthread_db.so.1".
[Thread debugging using libthread_db enabled]
[New Thread 47674855786656 (LWP 13681)]
[New Thread 1124362592 (LWP 13754)]
[New Thread 1099184480 (LWP 13690)]
[New Thread 1074006368 (LWP 13683)]
0x00002b5c2655b0ff in waitpid () from /lib/libpthread.so.0
+ Trace 69761
Thread 1 (Thread 47674855786656 (LWP 13681))
------- Bug created by bug-buddy at 2006-07-29 07:39 -------
That looks like it's crashing inside GnomeVFS's sftp module. Is there any chance you could try copying a file with Nautilus or another gnome-vfs-using application, and deliberately making the network connection die (by pulling out the network cord or something)?
I tried reproducing the behavior a couple of ways:

gnomevfs-cat: had it pull a large file across an sftp connection, then killed the corresponding sshd child on the remote host. This produced some error reporting (which, on retrying with RB, I find is produced there also) but a clean exit without crashing:

(process:14862): gnome-vfs-modules-CRITICAL **: Could not read 1 bytes
(process:14862): gnome-vfs-modules-CRITICAL **: Could not read 4 bytes
(process:14862): gnome-vfs-modules-CRITICAL **: ID mismatch (4286513152 != 3069)
(process:14862): gnome-vfs-modules-CRITICAL **: Expected SSH2_FXP_STATUS(101) packet, got 0
(process:14862): gnome-vfs-modules-CRITICAL **: Could not read 4 bytes
close `sftp://jezebel/home/aqua/lyrics.tar.gz': Generic error
$ echo $?
1

nautilus: same repro procedure. Nautilus popped up an I/O error dialog offering options to retry or cancel. To be fair to RB, Nautilus did spin the CPU while doing this, something I've seen RB do as well but have not been able to reproduce on a debug build.
Looking through gnomevfs' bug list, this might be a case of Bug#332028, which going purely on the submitter's description could account for both behaviors.
It looks like you're running rhythmbox with G_DEBUG=fatal_warnings or fatal_criticals, which will cause it to abort when the gnome-vfs sftp method logs a critical message. When I tried to reproduce this, I got the same set of messages you got from gnomevfs-cat, but no crash. I think nautilus is using different gnome-vfs methods, so it might get different results.
Okay, I see it. It's not $G_DEBUG, it's a malloc failure, which triggers an abort regardless of what $G_DEBUG is set to. The failure is induced by g_new() being called with an uninitialized int32. On my amd64 box, this tends to hold a negative value, which is converted into a very large positive one when it is used as an allocation size. Here's the top of the trace:
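To make the negative-to-huge conversion concrete, here is a minimal sketch in plain C. It does not use GLib; length_as_alloc_size() is a hypothetical helper, introduced only to mimic what happens when a signed 32-bit length is implicitly converted to an unsigned allocation size.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of gnome-vfs or GLib): mimics the
 * implicit conversion that happens when a signed 32-bit length is
 * passed where an unsigned size is expected. */
static size_t length_as_alloc_size(int32_t len)
{
    /* Signed-to-unsigned conversion wraps modulo SIZE_MAX + 1, so
     * any negative length becomes an enormous request. */
    return (size_t) len;
}
```

With this, a length of -1 maps to SIZE_MAX, which no allocator can satisfy, while a sane length such as 16 is passed through unchanged.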
+ Trace 70228
The faulty routines are buffer_read_block(), buffer_read_gint32() and buffer_read() in gnome-vfs, modules/sftp-method.c.

buffer_read_block() includes this (p_len is passed on the stack):

    *p_len = buffer_read_gint32 (buf);
    data = g_new (gchar, *p_len);
    buffer_read (buf, data, *p_len);

buffer_read_gint32() calls buffer_read, with a bit of debugging and casting afterward:

    static gint32
    buffer_read_gint32 (Buffer *buf)
    {
            gint32 data;
            [...]
            buffer_read (buf, &data, sizeof (gint32));
            [...]
            return GINT32_TO_BE (data);
    }

and buffer_read(), attempting to avoid reading beyond the end of its buffer, performs a bounds check, but has no way of signalling to its caller that the read has in fact failed:

    static void
    buffer_read (Buffer *buf, gpointer data, guint32 size)
    {
            guint32 len;
            [...]
            if (buf->write_ptr - buf->read_ptr < size)
                    g_critical ("Could not read %d bytes", size);

            len = MIN (size, buf->write_ptr - buf->read_ptr);
            memcpy (data, buf->read_ptr, len);
            buf->read_ptr += len;
    }

In this instance, after the ssh connection breaks, buf->write_ptr == buf->read_ptr, and hence the memcpy() becomes a no-op. buffer_read_gint32() has no way of checking for this and so it returns a uint32's worth of data off the stack to buffer_read_block(), which has no way of knowing either, and calls g_new(char, *p_len) where *p_len has just been filled with garbage. g_malloc raises an abort and RB dies.

So... either the read should never have happened following the SIGPIPE, or these gnome-vfs read routines should be returning actually-read counts, not just values. This looks like a gnome-vfs bug, unless there's a peculiarity about the vfs failure-mode interface concerning signals I'm unaware of.
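For illustration, here is a rough sketch of what read routines that report failure could look like. This is not the gnome-vfs code or any actual patch: the Buffer struct is a simplified stand-in with only the two pointers discussed above, the *_checked names are hypothetical, and plain C library calls are used in place of GLib so the example stands alone.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for the Buffer in sftp-method.c (assumption:
 * only the read and write pointers matter for this sketch). */
typedef struct {
    const unsigned char *read_ptr;
    const unsigned char *write_ptr;
} Buffer;

/* Copies exactly `size` bytes or nothing at all; returns false when
 * the buffer holds fewer than `size` unread bytes. */
static bool buffer_read_checked(Buffer *buf, void *data, size_t size)
{
    size_t available = (size_t) (buf->write_ptr - buf->read_ptr);
    if (available < size)
        return false;
    memcpy(data, buf->read_ptr, size);
    buf->read_ptr += size;
    return true;
}

/* Reads a big-endian 32-bit integer; *out is written only on success. */
static bool buffer_read_int32_checked(Buffer *buf, int32_t *out)
{
    unsigned char raw[4];
    if (!buffer_read_checked(buf, raw, sizeof raw))
        return false;
    uint32_t v = ((uint32_t) raw[0] << 24) | ((uint32_t) raw[1] << 16) |
                 ((uint32_t) raw[2] << 8)  |  (uint32_t) raw[3];
    *out = (int32_t) v;
    return true;
}

/* Reads a length-prefixed block. Returns NULL, leaving *p_len
 * untouched, if the length cannot be read, is negative, or exceeds
 * the bytes still available. The caller frees the result. */
static unsigned char *buffer_read_block_checked(Buffer *buf, int32_t *p_len)
{
    int32_t len;
    if (!buffer_read_int32_checked(buf, &len))
        return NULL;
    if (len < 0 || (size_t) len > (size_t) (buf->write_ptr - buf->read_ptr))
        return NULL;
    unsigned char *data = malloc(len > 0 ? (size_t) len : 1);
    if (data == NULL)
        return NULL;
    if (!buffer_read_checked(buf, data, (size_t) len)) {
        free(data);
        return NULL;
    }
    *p_len = len;
    return data;
}
```

The point of the design is that the length is validated against the bytes actually present before any allocation happens, so an empty buffer (write_ptr == read_ptr, as after the dropped connection) yields an error return rather than a garbage-sized allocation.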
Marking as a dupe of bug 332028, since it looks to be the same.

*** This bug has been marked as a duplicate of 332028 ***