Bug 712235 - gvfsd-network SIGSEGV in g_vfs_monitor_emit_event()
Status: RESOLVED FIXED
Product: gvfs
Classification: Core
Component: network backend
Version: 1.18.x
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: gvfs-maint
QA Contact: gvfs-maint
Duplicates: 524515
Depends on:
Blocks:
 
 
Reported: 2013-11-13 16:27 UTC by Sebastien Bacher
Modified: 2016-12-01 08:30 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
network: Fix crashes when mount failed (2.46 KB, patch)
2016-01-27 08:44 UTC, Ondrej Holy
committed
network: Disconnect all signal handlers in finalize (1.48 KB, patch)
2016-11-10 09:53 UTC, Ondrej Holy
committed
network: Leak mutex on finalize to avoid crash (925 bytes, patch)
2016-11-10 09:53 UTC, Ondrej Holy
none
network: Fix crashes when finalize (1.32 KB, patch)
2016-11-18 13:51 UTC, Ondrej Holy
committed

Description Sebastien Bacher 2013-11-13 16:27:36 UTC
The bug was reported on https://bugs.launchpad.net/ubuntu/+source/gvfs/+bug/971209

The description is not very descriptive, but it seems the crash might be related to local dnssd shares.

Stacktrace from gvfs 1.12.0:

#0  0x00007ff0b27a151e in g_vfs_monitor_emit_event (monitor=0x1fe7190, event_type=G_FILE_MONITOR_EVENT_CREATED, file_path=0x1fda220 "/dnssd-domain-EPSON5BCB78%20(WorkForce%20630)._smb._tcp", other_file_path=0x0) at gvfsmonitor.c:298
        l = <optimized out>
        subscriber = <optimized out>
        message = <optimized out>
        iter = {dummy1 = 0x1fee400, dummy2 = 0x38, dummy3 = 33481792, dummy4 = 0, dummy5 = -1311459810, dummy6 = 32752, dummy7 = 0, dummy8 = 0, dummy9 = 24, dummy10 = 32752, dummy11 = -265066544, pad1 = 32767, pad2 = -265066624, pad3 = 0x1fe3890}
        event_type_dbus = 0
#1  update_from_files at gvfsbackendnetwork.c:208
#2  recompute_files at gvfsbackendnetwork.c:402
#3  idle_add_recompute at gvfsbackendnetwork.c:410
#4  g_main_dispatch at /build/buildd/glib2.0-2.32.0/./glib/gmain.c:2515
#5  g_main_context_dispatch at /build/buildd/glib2.0-2.32.0/./glib/gmain.c:3052
#6  g_main_context_iterate at /build/buildd/glib2.0-2.32.0/./glib/gmain.c:3123
#7  g_main_context_iterate at /build/buildd/glib2.0-2.32.0/./glib/gmain.c:3060
#8  g_main_loop_run at /build/buildd/glib2.0-2.32.0/./glib/gmain.c:3317
#9  daemon_main at daemon-main.c:300
#10 main at daemon-main-generic.c:39


The bug is still happening in 1.18 (it's also the most reported segfault on gvfs in Ubuntu 13.10)

Similar Fedora bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=902541
Comment 1 Ross Lagerwall 2013-11-18 13:26:59 UTC
Any idea how to reproduce this?

Both linked bug reports appear to have occurred while the network backend was announcing to its subscribed listeners that an Epson printer had just been connected.

The line that is failing AFAICT is gvfsmonitor.c:357
  for (l = monitor->priv->subscribers; l != NULL; l = l->next)

From my reading of the backtrace and coredump, it would appear that priv is invalid.  I have no idea why this would be.
Comment 2 Sebastien Bacher 2013-11-18 15:05:37 UTC
No, I'm not sure how to trigger the issue. Some of the Ubuntu reports' descriptions suggest it happens when browsing smb shares (well, I guess it's not specific to those, but that happens to be what those users were doing).

Is there a way to "emulate" new local shares (e.g. Avahi-advertised ones)? It seems that running the gvfs process under valgrind while doing that might give some results...
Comment 3 Ross Lagerwall 2013-11-18 15:27:50 UTC
(In reply to comment #2)
> Is there a way to "emulate" new local shares (e.g avahi advertized ones)? It
> seems that maybe running the gvfs process under valgrind while doing that might
> give some results...

avahi-publish-service testserv _smb._tcp 445

This didn't trigger a segfault but I haven't tried running under valgrind.
Comment 4 Sebastien Bacher 2013-11-18 16:03:17 UTC
Thanks, I tried running it under valgrind for a bit, but without luck.
There are very few shares here that I can connect to, so I'm going to have another look and try again later.
Comment 5 Ondrej Holy 2015-11-16 12:31:20 UTC
This is still happening:
https://retrace.fedoraproject.org/faf/problems/?component_names=gvfs&function_names=recompute_files

There are also other backtraces, but I suppose they are duplicates...

Truncated backtrace:
Thread no. 1 (6 frames)
 #0 g_type_check_instance_is_fundamentally_a at gtype.c:4033
 #1 g_object_ref at gobject.c:3045
 #2 network_file_new at gvfsbackendnetwork.c:105
 #3 recompute_files at gvfsbackendnetwork.c:441
 #4 idle_add_recompute at gvfsbackendnetwork.c:504
 #9 daemon_main at daemon-main.c:396

Truncated backtrace:
Thread no. 1 (10 frames)
 #0 recompute_files at gvfsbackendnetwork.c:361
 #1 mount_smb_done_cb at gvfsbackendnetwork.c:522
 #2 g_simple_async_result_complete at gsimpleasyncresult.c:801
 #3 _g_simple_async_result_complete_with_cancellable at gvfsdaemondbus.c:633
 #4 mount_reply at gdaemonfile.c:2012
 #5 g_task_return_now at gtask.c:1104
 #6 g_task_return at gtask.c:1162
 #7 reply_cb at gdbusproxy.c:2569
 #8 g_task_return_now at gtask.c:1104
 #9 g_task_return at gtask.c:1162
Comment 6 Ondrej Holy 2016-01-27 08:39:43 UTC
I finally figured out what is wrong there. You are right that valgrind is silent when you just mount the backend or publish new services. However, you might see interesting logs when you execute multiple mount operations concurrently and mount registration fails; _finalize is called in this case. Some idle sources and signal handlers are not removed, and consequently some functions might be called with already freed backend private data...

(Maybe we should not allow concurrent mount operations for one scheme...)
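
To illustrate the failure mode described above, here is a minimal sketch in plain C. It does not use GLib and is not the gvfs code: `Emitter`, `Backend`, `emitter_connect`, `emitter_disconnect`, and the other names are hypothetical stand-ins for a signal source and the backend's private data. The point it tries to show is only that finalize has to remove every callback that still points at the object before the object is freed, otherwise a later emission would call into freed memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_HANDLERS 8

typedef void (*handler_fn)(void *user_data);

/* Hypothetical stand-in for a signal source (e.g. a settings object). */
typedef struct {
    handler_fn fn[MAX_HANDLERS];
    void *user_data[MAX_HANDLERS];
} Emitter;

/* Hypothetical stand-in for the backend and its private data. */
typedef struct {
    Emitter *emitter;
    int handler_id;   /* slot returned by emitter_connect, -1 if none */
    int events_seen;
} Backend;

/* Register a callback; returns its slot index, or -1 if the table is full. */
static int emitter_connect(Emitter *e, handler_fn fn, void *user_data)
{
    for (int i = 0; i < MAX_HANDLERS; i++) {
        if (e->fn[i] == NULL) {
            e->fn[i] = fn;
            e->user_data[i] = user_data;
            return i;
        }
    }
    return -1;
}

/* Remove a previously registered callback. */
static void emitter_disconnect(Emitter *e, int id)
{
    if (id >= 0 && id < MAX_HANDLERS) {
        e->fn[id] = NULL;
        e->user_data[id] = NULL;
    }
}

/* Invoke every registered callback; returns how many were called. */
static int emitter_emit(Emitter *e)
{
    int called = 0;
    for (int i = 0; i < MAX_HANDLERS; i++) {
        if (e->fn[i] != NULL) {
            e->fn[i](e->user_data[i]);
            called++;
        }
    }
    return called;
}

/* The callback dereferences the backend, so it must never outlive it. */
static void on_event(void *user_data)
{
    Backend *backend = user_data;
    backend->events_seen++;
}

static Backend *backend_new(Emitter *e)
{
    Backend *backend = calloc(1, sizeof *backend);
    if (backend == NULL)
        return NULL;
    backend->emitter = e;
    backend->handler_id = emitter_connect(e, on_event, backend);
    return backend;
}

/* Disconnect first, then free: no callback keeps a dangling pointer. */
static void backend_finalize(Backend *backend)
{
    emitter_disconnect(backend->emitter, backend->handler_id);
    free(backend);
}

/* Returns handlers_called_before * 10 + handlers_called_after finalize. */
int demo_finalize_disconnects(void)
{
    Emitter e;
    memset(&e, 0, sizeof e);

    Backend *backend = backend_new(&e);
    if (backend == NULL)
        return -1;

    int before = emitter_emit(&e);   /* the backend's handler runs */
    backend_finalize(backend);
    int after = emitter_emit(&e);    /* nothing left to call */
    return before * 10 + after;
}
```

If the `emitter_disconnect` call were dropped from `backend_finalize`, the second `emitter_emit` would invoke `on_event` on freed memory, which is the kind of access that shows up under valgrind.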
Comment 7 Ondrej Holy 2016-01-27 08:44:43 UTC
Created attachment 319807
network: Fix crashes when mount failed

This patch fixes the crashes and also some other memory leaks.
Comment 8 Ondrej Holy 2016-01-27 13:14:11 UTC
(In reply to Ondrej Holy from comment #6)
> I finally managed what is wrong there. You are right, that valgrind is
> silent, when you just mount the backend, or publish new services. However
> you might see interesting logs, when you execute multiple mount operations
> concurrently and mount registering fails.

e.g.:
valgrind --track-origins=yes /usr/libexec/gvfsd-network & sleep 2; gvfs-mount network://
Comment 9 Ondrej Holy 2016-01-29 12:25:49 UTC
Comment on attachment 319807
network: Fix crashes when mount failed

master:
commit 45c4dcc3bcd40df923ef1e6152eb3502f0030960 

gnome-3-18:
commit 428adf56bbca298c0a92091de6dc10369d020a83
Comment 10 Ondrej Holy 2016-02-18 13:29:44 UTC
gnome-3-16:
commit fdfa790f7435332fa7942a209d436e4b854909c
Comment 11 Ondrej Holy 2016-11-10 09:51:54 UTC
Obviously, not all crashes have been fixed:
https://bugzilla.redhat.com/show_bug.cgi?id=1391518

I can reproduce it using the command from Comment 8 while changing the corresponding gsettings properties in parallel...
Comment 12 Ondrej Holy 2016-11-10 09:53:22 UTC
Created attachment 339438
network: Disconnect all signal handlers in finalize

Not all signal handlers have been removed in finalize by commit 45c4dcc.
Disconnect the rest of the signal handlers in order to avoid potential
crashes...
Comment 13 Ondrej Holy 2016-11-10 09:53:43 UTC
Created attachment 339439
network: Leak mutex on finalize to avoid crash

The mutex may still be locked when finalize is called. Leak the mutex in
order to avoid a potential crash.
Comment 14 Ondrej Holy 2016-11-18 13:51:45 UTC
Created attachment 340233
network: Fix crashes when finalize

The SMB backend mount operation might still be running when finalize is called.
Increase the backend reference count when calling g_file_mount_enclosing_volume
in order to be sure that finalize is called only after the operation is done.
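
The idea of holding a reference for the duration of an asynchronous operation can be sketched as follows. This is a plain C model under stated assumptions, not the committed patch: it does not call g_file_mount_enclosing_volume or any GLib API, and `Backend`, `PendingOp`, `start_mount`, `mount_done`, and `complete_op` are hypothetical names. The pending operation takes its own reference when it starts and drops it in the completion callback, so the object cannot be finalized while the callback may still run.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted backend object. */
typedef struct {
    int ref_count;
    int mounts_completed;
    int *finalized_flag;   /* set to 1 when the object is finalized */
} Backend;

typedef void (*done_cb)(Backend *backend);

/* Hypothetical stand-in for an in-flight asynchronous mount. */
typedef struct {
    done_cb callback;
    Backend *backend;
} PendingOp;

static Backend *backend_ref(Backend *backend)
{
    backend->ref_count++;
    return backend;
}

/* Finalize (record it and free) only when the last reference goes away. */
static void backend_unref(Backend *backend)
{
    if (--backend->ref_count == 0) {
        *backend->finalized_flag = 1;
        free(backend);
    }
}

/* Completion callback: touches the backend, then drops the operation's ref. */
static void mount_done(Backend *backend)
{
    backend->mounts_completed++;
    backend_unref(backend);
}

/* Starting the operation takes a reference on behalf of the callback. */
static void start_mount(Backend *backend, PendingOp *op)
{
    op->callback = mount_done;
    op->backend = backend_ref(backend);
}

/* Simulates the main loop delivering the operation's result later. */
static void complete_op(PendingOp *op)
{
    op->callback(op->backend);
}

/* Returns finalized_before_completion * 10 + finalized_after_completion. */
int demo_ref_held_across_mount(void)
{
    int finalized = 0;
    Backend *backend = calloc(1, sizeof *backend);
    if (backend == NULL)
        return -1;
    backend->ref_count = 1;            /* the owner's reference */
    backend->finalized_flag = &finalized;

    PendingOp op;
    start_mount(backend, &op);         /* ref_count is now 2 */
    backend_unref(backend);            /* owner lets go while the op is pending */
    int finalized_early = finalized;   /* still 0: the op holds a reference */

    complete_op(&op);                  /* callback runs on a live object */
    return finalized_early * 10 + finalized;
}
```

Without the `backend_ref` in `start_mount`, the owner's `backend_unref` would free the object and `mount_done` would then run on freed memory.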
Comment 15 Ondrej Holy 2016-11-18 13:54:29 UTC
Attachment 339438 pushed as ae33e47 - network: Disconnect all signal handlers in finalize
Attachment 340233 pushed as 5f7a36a - network: Fix crashes when finalize
Comment 16 Ondrej Holy 2016-11-18 13:58:37 UTC
Also pushed to the gnome-3-22 branch.
Comment 17 Ondrej Holy 2016-12-01 08:30:03 UTC
*** Bug 524515 has been marked as a duplicate of this bug. ***