GNOME Bugzilla – Bug 712235
gvfsd-network SIGSEGV in g_vfs_monitor_emit_event()
Last modified: 2016-12-01 08:30:03 UTC
The bug was reported on https://bugs.launchpad.net/ubuntu/+source/gvfs/+bug/971209
The description is not very descriptive, but it seems it might have been to do with local dnssd shares.
Stacktrace from gvfs 1.12.0:
#0  0x00007ff0b27a151e in g_vfs_monitor_emit_event (monitor=0x1fe7190, event_type=G_FILE_MONITOR_EVENT_CREATED, file_path=0x1fda220 "/dnssd-domain-EPSON5BCB78%20(WorkForce%20630)._smb._tcp", other_file_path=0x0) at gvfsmonitor.c:298
        l = <optimized out>
        subscriber = <optimized out>
        message = <optimized out>
        iter = {dummy1 = 0x1fee400, dummy2 = 0x38, dummy3 = 33481792, dummy4 = 0, dummy5 = -1311459810, dummy6 = 32752, dummy7 = 0, dummy8 = 0, dummy9 = 24, dummy10 = 32752, dummy11 = -265066544, pad1 = 32767, pad2 = -265066624, pad3 = 0x1fe3890}
        event_type_dbus = 0
The bug is still happening in 1.18 (it's also the most reported segfault on gvfs in Ubuntu 13.10).
Similar Fedora bug report: https://bugzilla.redhat.com/show_bug.cgi?id=902541
Any idea how to reproduce this? Both linked bug reports appear to have occurred when the network backend is announcing to any subscribed listeners that an Epson printer has just been connected.
The line that is failing AFAICT is gvfsmonitor.c:357
    for (l = monitor->priv->subscribers; l != NULL; l = l->next)
From my reading of the backtrace and coredump, it would appear that priv is invalid. I have no idea why this would be.
Not sure how to trigger the issue, no. Some of the Ubuntu reports' descriptions suggest it's happening when browsing smb shares (well, I guess it's not specific to those, but that happens to be what those users were doing). Is there a way to "emulate" new local shares (e.g. avahi-advertised ones)? It seems that running the gvfs process under valgrind while doing that might give some results...
(In reply to comment #2)
> Is there a way to "emulate" new local shares (e.g avahi advertized ones)? It
> seems that maybe running the gvfs process under valgrind while doing that might
> give some results...
avahi-publish-service testserv _smb._tcp 445
This didn't trigger a segfault but I haven't tried running under valgrind.
Thanks, I tried running it under valgrind for a bit but without luck. There are very few shares here I can connect to, though; I'm going to have another look and try again later.
This is still happening:
https://retrace.fedoraproject.org/faf/problems/?component_names=gvfs&function_names=recompute_files
There are also other backtraces, but I suppose they are duplicates...
Truncated backtrace:
Thread no. 1 (6 frames)
 #0 g_type_check_instance_is_fundamentally_a at gtype.c:4033
 #1 g_object_ref at gobject.c:3045
 #2 network_file_new at gvfsbackendnetwork.c:105
 #3 recompute_files at gvfsbackendnetwork.c:441
 #4 idle_add_recompute at gvfsbackendnetwork.c:504
 #9 daemon_main at daemon-main.c:396
Truncated backtrace:
Thread no. 1 (10 frames)
 #0 recompute_files at gvfsbackendnetwork.c:361
 #1 mount_smb_done_cb at gvfsbackendnetwork.c:522
 #2 g_simple_async_result_complete at gsimpleasyncresult.c:801
 #3 _g_simple_async_result_complete_with_cancellable at gvfsdaemondbus.c:633
 #4 mount_reply at gdaemonfile.c:2012
 #5 g_task_return_now at gtask.c:1104
 #6 g_task_return at gtask.c:1162
 #7 reply_cb at gdbusproxy.c:2569
 #8 g_task_return_now at gtask.c:1104
 #9 g_task_return at gtask.c:1162
I finally managed to find out what is wrong there. You are right that valgrind is silent when you just mount the backend or publish new services. However, you might see interesting logs when you execute multiple mount operations concurrently and mount registration fails. _finalize is called in this case. Some idle sources and signal handlers are not removed, and consequently some functions might be called with already freed backend private data... (Maybe we should not allow concurrent mount operations for one scheme...)
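To illustrate the failure mode described in this comment, here is a minimal plain-C model of it. It is only a sketch and not the gvfs code: it does not use GLib, and every name in it (PendingSource, source_add, source_remove, dispatch_all, Backend, recompute_cb, backend_finalize) is hypothetical. The point it tries to show is that finalize has to cancel any callback still queued against the object before freeing it, otherwise the callback later runs with freed private data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a main loop: a fixed table of queued callbacks. */
typedef void (*Callback)(void *user_data);

typedef struct {
    Callback func;
    void *user_data;
} PendingSource;

#define MAX_SOURCES 8
static PendingSource pending[MAX_SOURCES];

/* Queue a callback; returns a non-zero id, or 0 when the table is full. */
static unsigned source_add(Callback func, void *user_data) {
    for (unsigned i = 0; i < MAX_SOURCES; i++) {
        if (pending[i].func == NULL) {
            pending[i].func = func;
            pending[i].user_data = user_data;
            return i + 1;
        }
    }
    return 0;
}

/* Cancel a queued callback so that it can never run. */
static void source_remove(unsigned id) {
    assert(id >= 1 && id <= MAX_SOURCES);
    pending[id - 1].func = NULL;
    pending[id - 1].user_data = NULL;
}

/* Run and clear every queued callback; returns how many were run. */
static int dispatch_all(void) {
    int ran = 0;
    for (unsigned i = 0; i < MAX_SOURCES; i++) {
        if (pending[i].func != NULL) {
            Callback func = pending[i].func;
            void *data = pending[i].user_data;
            pending[i].func = NULL;
            pending[i].user_data = NULL;
            func(data);
            ran++;
        }
    }
    return ran;
}

/* Hypothetical backend with one queued "recompute" callback. */
typedef struct {
    int recompute_count;
    unsigned idle_id; /* 0 when nothing is queued */
} Backend;

static void recompute_cb(void *user_data) {
    Backend *backend = user_data;
    backend->idle_id = 0; /* the source is consumed once it has run */
    backend->recompute_count++;
}

static Backend *backend_new(void) {
    Backend *backend = calloc(1, sizeof *backend);
    assert(backend != NULL);
    backend->idle_id = source_add(recompute_cb, backend);
    return backend;
}

/* The step this model is about: drop the queued callback before freeing.
 * Without it, dispatch_all() would call recompute_cb on freed memory. */
static void backend_finalize(Backend *backend) {
    if (backend->idle_id != 0) {
        source_remove(backend->idle_id);
        backend->idle_id = 0;
    }
    free(backend);
}
```

In this model, calling backend_finalize() right after backend_new() leaves nothing for dispatch_all() to run. In GLib-based code the analogous cleanup would typically go through g_source_remove() and g_signal_handler_disconnect(), but how the actual patch does it is not shown in this report.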
Created attachment 319807 [details] [review] network: Fix crashes when mount failed This patch fixes the crashes and also some other memory leaks.
(In reply to Ondrej Holy from comment #6)
> I finally managed what is wrong there. You are right, that valgrind is
> silent, when you just mount the backend, or publish new services. However
> you might see interesting logs, when you execute multiple mount operations
> concurrently and mount registering fails.
e.g.:
valgrind --track-origins=yes /usr/libexec/gvfsd-network & sleep 2; gvfs-mount network://
Comment on attachment 319807 [details] [review]
network: Fix crashes when mount failed
master: commit 45c4dcc3bcd40df923ef1e6152eb3502f0030960
gnome-3-18: commit 428adf56bbca298c0a92091de6dc10369d020a83
gnome-3-16: commit fdfa790f7435332fa7942a209d436e4b854909c
Obviously, not all crashes have been fixed: https://bugzilla.redhat.com/show_bug.cgi?id=1391518
I can reproduce it using the command from Comment 8 while changing the corresponding gsettings properties in parallel...
Created attachment 339438 [details] [review]
network: Disconnect all signal handlers in finalize
Not all signal handlers were removed in finalize by commit 45c4dcc. Disconnect the rest of the signal handlers in order to avoid potential crashes...
Created attachment 339439 [details] [review]
network: Leak mutex on finalize to avoid crash
The mutex may still be locked when finalize is called. Leak the mutex in order to avoid a potential crash.
Created attachment 340233 [details] [review]
network: Fix crashes when finalize
An SMB backend mount operation might still be running when finalize is called. Increase the backend reference count when calling g_file_mount_enclosing_volume in order to be sure that finalize is called only after the operation is done.
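The idea in this patch description, holding a reference on the backend for as long as an asynchronous operation is in flight, can be sketched in plain C as follows. This is an illustrative model only and not the patch itself: it uses a hand-rolled reference count instead of GObject, and all names (NetBackend, MountOp, net_backend_ref, net_backend_unref, mount_op_start, mount_op_complete, finalized_count) are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted backend. */
typedef struct {
    int ref_count;
    int mounts_completed;
} NetBackend;

/* Counts how many backends have been finalized (for observation only). */
static int finalized_count;

static NetBackend *net_backend_new(void) {
    NetBackend *backend = calloc(1, sizeof *backend);
    assert(backend != NULL);
    backend->ref_count = 1; /* reference owned by the creator */
    return backend;
}

static NetBackend *net_backend_ref(NetBackend *backend) {
    assert(backend->ref_count > 0);
    backend->ref_count++;
    return backend;
}

/* Drops one reference; finalizes the backend when the last one goes away. */
static void net_backend_unref(NetBackend *backend) {
    assert(backend->ref_count > 0);
    if (--backend->ref_count == 0) {
        finalized_count++;
        free(backend);
    }
}

/* Hypothetical in-flight asynchronous mount operation. */
typedef struct {
    NetBackend *backend; /* owned reference, released on completion */
} MountOp;

/* Starting the operation takes its own reference on the backend, so the
 * backend cannot be finalized while the operation is still pending. */
static MountOp *mount_op_start(NetBackend *backend) {
    MountOp *op = malloc(sizeof *op);
    assert(op != NULL);
    op->backend = net_backend_ref(backend);
    return op;
}

/* Completion callback: the backend is guaranteed to be alive here.
 * The operation's reference is released only after the backend is used. */
static void mount_op_complete(MountOp *op) {
    op->backend->mounts_completed++;
    net_backend_unref(op->backend);
    free(op);
}
```

In this model, if the creator drops its reference while a MountOp is pending, finalization is deferred until mount_op_complete() releases the last reference, which is the ordering the patch description asks for.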
Attachment 339438 [details] pushed as ae33e47 - network: Disconnect all signal handlers in finalize
Attachment 340233 [details] pushed as 5f7a36a - network: Fix crashes when finalize
Pushed also for gnome-3-22 branch.
*** Bug 524515 has been marked as a duplicate of this bug. ***