Bug 547568 – gvfsd-trash crashed with SIGSEGV in g_main_context_dispatch()


Summary:	gvfsd-trash crashed with SIGSEGV in g_main_context_dispatch()


Status:	RESOLVED FIXED

Product:	gvfs
Classification:	Core
Component:	trash backend
Version:	0.99.x
Hardware:	Other Linux

Importance:	Normal critical
Target Milestone:	---
Assigned To:	gvfs-maint
QA Contact:	gvfs-maint

URL:
Whiteboard:

Duplicates:	547567 547726
Depends on:
Blocks:

Reported:	2008-08-13 10:35 UTC by Sebastien Bacher
Modified:	2008-09-23 19:17 UTC

See Also:
GNOME target:	2.24.x
GNOME version:	2.23/2.24

Attachments
Test Program (696 bytes, text/plain) 2008-09-12 19:44 UTC, Christian Kellner
don't leave dangling idles (7.29 KB, patch) 2008-09-13 02:23 UTC, Matthias Clasen

Description Sebastien Bacher 2008-08-13 10:35:19 UTC

This bug was originally reported at https://bugs.launchpad.net/ubuntu/+source/gvfs/+bug/254479

"I think this is similar to #252174 but that one was marked as invalid. Maybe there will be something useful in this core dump. I had just logged into a Gnome session after using a KDE4 session."

Trace 205087

#0 _g_dbus_bus_list_names_with_prefix
at gdbusutils.c line 683
#1 add_timeout
at gdbusutils.c line 1006
#2 g_main_context_dispatch
from /usr/lib/libglib-2.0.so.0
#3 ??
from /usr/lib/libglib-2.0.so.0
#4 g_main_loop_run
from /usr/lib/libglib-2.0.so.0
#5 daemon_main
at daemon-main.c line 270
#6 main
at daemon-main-generic.c line 39


This could be the same issue as bug #547567, but the stack traces are different.

Comment 1 Sebastien Bacher 2008-08-15 16:47:36 UTC

Valgrind reports errors like this too:

"==28416== Jump to the invalid address stated on the next line
==28416==    at 0x519F0D9: ???
==28416==    by 0x41996E3: g_main_context_prepare (gmain.c:2392)
==28416==    by 0x4199B69: g_main_context_iterate (gmain.c:2685)
==28416==    by 0x419A3A1: g_main_loop_run (gmain.c:2928)
==28416==    by 0x8051772: daemon_main (daemon-main.c:270)
==28416==    by 0x8051A44: main (daemon-main-generic.c:39)
==28416==  Address 0x519f0d9 is not stack'd, malloc'd or (recently) free'd"

Comment 2 Matthias Clasen 2008-08-25 04:49:09 UTC

Curious stacktrace, given that _g_dbus_bus_list_names_with_prefix doesn't seem to be called at all anywhere inside gvfs...

Comment 3 Sebastien Bacher 2008-09-11 10:17:22 UTC

*** Bug 547567 has been marked as a duplicate of this bug. ***

Comment 4 Sebastien Bacher 2008-09-11 10:17:34 UTC

*** Bug 547726 has been marked as a duplicate of this bug. ***

Comment 5 Sebastien Bacher 2008-09-11 10:25:29 UTC

gicmo thinks this happens because dbus is used in a different thread. Judging by the number of duplicates Ubuntu is getting, a lot of users are running into this crasher. GNOME probably doesn't get those reports since bug-buddy is not running on 2.23.

Comment 6 Christian Kellner 2008-09-12 19:44:31 UTC

Created attachment 118615
Test Program

I can get a crash with that program roughly 1 out of 7 times.

The stack trace then looks like this:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f1c88265770 (LWP 16604)]
0x00007f1c85a15960 in ?? () from /lib/libdbus-1.so.3
(gdb) thread apply all bt full

Trace 206700

Thread 2 (Thread 0x41eb9950 (LWP 16607))

#0 read
from /lib/libpthread.so.0
#1 g_key_file_load_from_fd
at /usr/include/bits/unistd.h line 45
#2 IA__g_key_file_load_from_file
at /build/buildd/glib2.0-2.18.0/glib/gkeyfile.c line 502
#3 ??
from /usr/lib/gio/modules/libgioremote-volume-monitor.so
#4 g_io_module_load_module
at /build/buildd/glib2.0-2.18.0/gio/giomodule.c line 176
#5 IA__g_type_module_use
at /build/buildd/glib2.0-2.18.0/gobject/gtypemodule.c line 255
#6 IA__g_io_modules_load_all_in_directory
at /build/buildd/glib2.0-2.18.0/gio/giomodule.c line 267
#7 _g_io_modules_ensure_loaded
at /build/buildd/glib2.0-2.18.0/gio/giomodule.c line 333
#8 get_default_vfs
at /build/buildd/glib2.0-2.18.0/gio/gvfs.c line 187
#9 IA__g_once_impl
at /build/buildd/glib2.0-2.18.0/glib/gthread.c line 190
#10 IA__g_file_new_for_path
at /build/buildd/glib2.0-2.18.0/gio/gfile.c line 4844
#11 thread_func
#12 g_thread_create_proxy
at /build/buildd/glib2.0-2.18.0/glib/gthread.c line 635
#13 start_thread
from /lib/libpthread.so.0
#14 clone
from /lib/libc.so.6
#15 ??

Comment 7 Christian Kellner 2008-09-12 19:45:44 UTC

And another one:

Starting program: /home/gicmo/Devel/temp/crash_test 
[Thread debugging using libthread_db enabled]
[New Thread 0x7fdeb22b3770 (LWP 16861)]
[New Thread 0x41275950 (LWP 16864)]
XX

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fdeb22b3770 (LWP 16861)]
0x00007fdeafa63960 in ?? () from /lib/libdbus-1.so.3
(gdb) thread apply all bt full

Trace 206701

Thread 2 (Thread 0x41275950 (LWP 16864))

#0 _int_malloc
from /lib/libc.so.6
#1 malloc
from /lib/libc.so.6
#2 _IO_str_overflow_internal
from /lib/libc.so.6
#3 _IO_default_xsputn_internal
from /lib/libc.so.6
#4 vfprintf
from /lib/libc.so.6
#5 __vasprintf_chk
from /lib/libc.so.6
#6 IA__g_vasprintf
at /usr/include/bits/stdio2.h line 199
#7 IA__g_strdup_vprintf
at /build/buildd/glib2.0-2.18.0/glib/gstrfuncs.c line 218
#8 IA__g_logv
#9 IA__g_log
at /build/buildd/glib2.0-2.18.0/glib/gmessages.c line 517
#10 ??
from /usr/lib/gio/modules/libgioremote-volume-monitor.so
#11 g_io_module_load_module
at /build/buildd/glib2.0-2.18.0/gio/giomodule.c line 176
#12 IA__g_type_module_use
at /build/buildd/glib2.0-2.18.0/gobject/gtypemodule.c line 255
#13 IA__g_io_modules_load_all_in_directory
at /build/buildd/glib2.0-2.18.0/gio/giomodule.c line 267
#14 _g_io_modules_ensure_loaded
at /build/buildd/glib2.0-2.18.0/gio/giomodule.c line 333
#15 get_default_vfs
at /build/buildd/glib2.0-2.18.0/gio/gvfs.c line 187
#16 IA__g_once_impl
at /build/buildd/glib2.0-2.18.0/glib/gthread.c line 190
#17 IA__g_file_new_for_path
at /build/buildd/glib2.0-2.18.0/gio/gfile.c line 4844
#18 thread_func
#19 g_thread_create_proxy
at /build/buildd/glib2.0-2.18.0/glib/gthread.c line 635
#20 start_thread
from /lib/libpthread.so.0
#21 clone
from /lib/libc.so.6
#22 ??

Comment 8 Christian Kellner 2008-09-12 20:01:37 UTC

Just for the record: removing either the libgioremote-volume-monitor.so module or both .monitor files (gphoto2 and hal) from /usr/share/gvfs/remote-volume-monitors/ fixes the crash here.

Having just one of them leads to crashes again. This one is with only the gphoto2 monitor:

[Thread debugging using libthread_db enabled]
[New Thread 0x7f955fb1d770 (LWP 17327)]
[New Thread 0x41c2b950 (LWP 17330)]
XX

Program exited with code 01.
(gdb) r
Starting program: /home/gicmo/Devel/temp/crash_test 
[Thread debugging using libthread_db enabled]
[New Thread 0x7f9dee695770 (LWP 17331)]
[New Thread 0x425bd950 (LWP 17332)]
XX

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f9dee695770 (LWP 17331)]
0x00007f9debe45960 in ?? () from /lib/libdbus-1.so.3
(gdb) thread apply all bt full

Trace 206702

Thread 2 (Thread 0x425bd950 (LWP 17332))

#0 type_add_flags_W
at /build/buildd/glib2.0-2.18.0/gobject/gtype.c line 3428
#1 IA__g_type_register_dynamic
at /build/buildd/glib2.0-2.18.0/gobject/gtype.c line 2526
#2 IA__g_type_module_register_type
at /build/buildd/glib2.0-2.18.0/gobject/gtypemodule.c line 426
#3 ??
from /usr/lib/gio/modules/libgioremote-volume-monitor.so
#4 g_io_module_load_module
at /build/buildd/glib2.0-2.18.0/gio/giomodule.c line 176
#5 IA__g_type_module_use
at /build/buildd/glib2.0-2.18.0/gobject/gtypemodule.c line 255
#6 IA__g_io_modules_load_all_in_directory
at /build/buildd/glib2.0-2.18.0/gio/giomodule.c line 267
#7 _g_io_modules_ensure_loaded
at /build/buildd/glib2.0-2.18.0/gio/giomodule.c line 333
#8 get_default_vfs
at /build/buildd/glib2.0-2.18.0/gio/gvfs.c line 187
#9 IA__g_once_impl
at /build/buildd/glib2.0-2.18.0/glib/gthread.c line 190
#10 IA__g_file_new_for_path
at /build/buildd/glib2.0-2.18.0/gio/gfile.c line 4844
#11 thread_func
#12 g_thread_create_proxy
at /build/buildd/glib2.0-2.18.0/glib/gthread.c line 635
#13 start_thread
from /lib/libpthread.so.0
#14 clone
from /lib/libc.so.6
#15 ??

(gdb)

Comment 9 Matthias Clasen 2008-09-12 20:24:42 UTC

Adding dbus_threads_init_default() to your test program makes the crashes go away.

Comment 10 Christian Kellner 2008-09-12 22:10:30 UTC

Indeed, couldn't get it to crash yet. Wrong direction ... :-/

Comment 11 Matthias Clasen 2008-09-13 00:19:34 UTC

Are we sure that gvfsd-trash calls dbus_threads_init?

Comment 12 Matthias Clasen 2008-09-13 02:22:18 UTC

Never mind, it does.

So it seems we are back to the earlier hypothesis of what happens:
  GProxyVolumeMonitor queues an idle to emit a signal, and before that is run,
  libgioremote-volume-monitor.so is unloaded, causing a segfault when the
  idle is run, because the callback is no longer present.

If that is the case, here is a patch that might fix it. Unfortunately, I cannot reproduce the crash, so I can't test the patch.
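
For illustration, the hypothesis above can be modelled in plain C. This is a minimal sketch, not the attached patch and not GLib's API: idle_add, idle_remove_by_owner, idle_dispatch_all and the fixed-size queue are all hypothetical names. It only shows the idea of not leaving dangling idles: whoever queued a callback removes it again before the code it points to can go away.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a main loop's idle queue. */
#define MAX_IDLES 8

typedef void (*IdleFunc)(void *user_data);

typedef struct {
    IdleFunc func;    /* callback that lives in the (unloadable) module */
    void *user_data;
    void *owner;      /* object that queued the idle */
    int active;
} Idle;

static Idle idle_queue[MAX_IDLES];

/* Queue an idle on behalf of owner; returns 0 on success, -1 if full. */
int idle_add(void *owner, IdleFunc func, void *user_data)
{
    assert(func != NULL);
    for (size_t i = 0; i < MAX_IDLES; i++) {
        if (!idle_queue[i].active) {
            idle_queue[i] = (Idle){ func, user_data, owner, 1 };
            return 0;
        }
    }
    return -1;
}

/* Drop every pending idle queued by owner; returns how many were removed.
 * A teardown path would call this before the module can be unloaded. */
int idle_remove_by_owner(void *owner)
{
    int removed = 0;
    for (size_t i = 0; i < MAX_IDLES; i++) {
        if (idle_queue[i].active && idle_queue[i].owner == owner) {
            idle_queue[i].active = 0;
            removed++;
        }
    }
    return removed;
}

/* Run and clear all pending idles; returns how many callbacks ran. */
int idle_dispatch_all(void)
{
    int ran = 0;
    for (size_t i = 0; i < MAX_IDLES; i++) {
        if (idle_queue[i].active) {
            idle_queue[i].active = 0;
            idle_queue[i].func(idle_queue[i].user_data);
            ran++;
        }
    }
    return ran;
}

/* Example callback: counts how often it was invoked. */
void count_cb(void *user_data)
{
    ++*(int *)user_data;
}
```

In this model, an owner that is about to disappear calls idle_remove_by_owner() first, so the dispatcher never sees a callback whose code might already be unmapped.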

Comment 13 Matthias Clasen 2008-09-13 02:23:23 UTC

Created attachment 118637
don't leave dangling idles

Comment 14 Matthias Clasen 2008-09-13 05:13:40 UTC

Additionally, if the problem is related to the unloading of modules (as it seems to be), putting

G_DEBUG=resident-modules 

in the environment should work around the problem.

Comment 15 David Zeuthen (not reading bugmail) 2008-09-13 06:03:41 UTC

(In reply to comment #12)
>   libgioremote-volume-monitor.so is unloaded,

Under what circumstances would libgioremote-volume-monitor.so ever be unloaded?

Comment 16 Matthias Clasen 2008-09-13 18:13:51 UTC

All giomodules get loaded in _g_io_modules_ensure_loaded to register their types,
and then they get unloaded again until the registered types are actually instantiated.

But from what I've seen so far, g_proxy_volume_monitor_setup_session_bus_connection and
g_proxy_volume_monitor_teardown_session_bus_connection
_seem_ to be doing the right thing with all the timeouts that are attached to the mainloop.

Comment 17 Alexander Larsson 2008-09-23 09:24:08 UTC

I have a suspicion about this. The helper library in the "common/" subdirectory of gvfs is (statically) linked into both the daemon and the client module. This is a bit tricky in the case of the trash backend, as that loads the client module into a daemon and thus has two copies of the common code.

Since the linking is static there should be no problems with calls to the wrong version of the code, but there might be some global data that points to the other copy and then that somehow gets unmapped when that module is no longer used.

Comment 18 Alexander Larsson 2008-09-23 09:27:26 UTC

One thing the proxy volume monitor does to the trash daemon is integrate with a mainloop (using the common/ code) during module loading, so this is the main suspect.

Comment 19 Alexander Larsson 2008-09-23 13:19:01 UTC

I was able to repeat the crash by doing "killall -9 gvfsd-trash; gvfs-ls trash://" until the gvfs-ls failed with a timeout. That means the automatically spawned gvfsd-trash has crashed. 

I committed some code to the 2.24 branch that uses a single copy of the common code. This means we get a more readable crash, since we're not calling an unloaded address. The output for the crash is now:

process 13106: dbus_watch_handle: Watch is invalid, it should have been removed

This makes sense in that previously we called a dbus idle, causing a crash because the code had been unloaded. Now it's not unloaded, but it's still getting an idle called when it shouldn't.

Comment 20 Alexander Larsson 2008-09-23 14:23:09 UTC

Ok. I see what is happening. We run _g_dbus_connection_integrate_with_main() and then soon thereafter _g_dbus_connection_remove_from_main() on a thread. However, sometimes the mainloop runs, handling the dbus fd, while the other thread calls _g_dbus_connection_remove_from_main(), which frees the IOHandler while it's running.
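
One general way to guard against this kind of free-while-running race is to let the dispatcher hold its own reference for the duration of the callback. The sketch below only illustrates that idea under stated assumptions: Handler, handler_dispatch and the other names are made up, this is not the gvfs IOHandler code, and it is not the fix that was eventually committed for this bug.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

typedef struct Handler Handler;
typedef void (*HandlerFunc)(Handler *h);

/* Hypothetical reference-counted handler, loosely modelled on an I/O watch. */
struct Handler {
    atomic_int ref_count;
    HandlerFunc func;
    int *freed_flag;   /* set to 1 when the memory is released (demo only) */
};

/* Create a handler with one reference, owned by whoever registered it. */
Handler *handler_new(HandlerFunc func, int *freed_flag)
{
    Handler *h = malloc(sizeof *h);
    if (h == NULL)
        return NULL;
    atomic_init(&h->ref_count, 1);
    h->func = func;
    h->freed_flag = freed_flag;
    return h;
}

void handler_ref(Handler *h)
{
    assert(h != NULL);
    atomic_fetch_add(&h->ref_count, 1);
}

/* Drop one reference; free the handler when the last one goes away. */
void handler_unref(Handler *h)
{
    assert(h != NULL);
    if (atomic_fetch_sub(&h->ref_count, 1) == 1) {
        *h->freed_flag = 1;
        free(h);
    }
}

/* The dispatcher keeps its own reference while the callback runs, so a
 * concurrent "remove" cannot free the handler underneath it. In real
 * threaded code the lookup of h and this handler_ref() would also have
 * to happen under the main loop's lock. */
void handler_dispatch(Handler *h)
{
    handler_ref(h);
    h->func(h);
    handler_unref(h);
}

/* Example callback that simulates the removal happening mid-dispatch by
 * dropping the owner's reference while the handler is still running. */
void remove_during_dispatch(Handler *h)
{
    handler_unref(h);
}
```

With this shape, the removal that arrives during dispatch only drops the owner's reference; the memory is released once the dispatcher drops its own reference after the callback returns.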

Comment 21 Christian Kellner 2008-09-23 15:05:03 UTC

_g_dbus_connection_integrate_with_main () was my main suspect the other day as well, but then I probably ran out of time or was blind ;-):
"<gicmo> hmm might it be that its _g_dbus_connection_integrate_with_main () being called from the thread oder something"

Comment 22 Alexander Larsson 2008-09-23 19:17:14 UTC

This should be fixed with the latest commit:
2008-09-23  Alexander Larsson  <alexl@redhat.com>

        * monitor/proxy/gproxyvolumemonitor.[ch]:
        * monitor/proxy/gproxyvolumemonitor.h:
        * monitor/proxy/remote-volume-monitor-module.c:
	Only call the IsSupported dbus call when the class
	is actually needed instead of on gio init.
	Don't integrate internal session bus with mainloop
	during is_support code, as that is not necessary yet, and
	it caused problem if done in a thread.
	
	This fixes the trash crash issue in bug #547568.
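
The "only when the class is actually needed" part of this change is a lazy-initialisation pattern. A rough sketch follows, with hypothetical names (MonitorClass, probe) standing in for the real class and for the IsSupported dbus call. It is an illustration only, not the committed code, and it is not thread-safe as written.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { SUPPORT_UNKNOWN, SUPPORT_NO, SUPPORT_YES } SupportState;

/* Hypothetical class record; probe stands in for the expensive
 * IsSupported round trip. */
typedef struct {
    SupportState state;
    int (*probe)(void);
    int probe_calls;   /* how often the probe actually ran (demo only) */
} MonitorClass;

/* Module load time: only remember how to probe, do not probe yet. */
void monitor_class_init(MonitorClass *klass, int (*probe)(void))
{
    klass->state = SUPPORT_UNKNOWN;
    klass->probe = probe;
    klass->probe_calls = 0;
}

/* First real use: run the probe once and cache the answer. */
int monitor_class_is_supported(MonitorClass *klass)
{
    assert(klass->probe != NULL);
    if (klass->state == SUPPORT_UNKNOWN) {
        klass->probe_calls++;
        klass->state = klass->probe() ? SUPPORT_YES : SUPPORT_NO;
    }
    return klass->state == SUPPORT_YES;
}

/* Example probe that always reports support. */
int always_supported(void)
{
    return 1;
}
```

In this model nothing expensive happens when the module is loaded just to register its types; the probe runs only once, when something first asks whether the class is supported.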