GNOME Bugzilla – Bug 755805
proxy volume monitor: Guard access to the internal caches and properly handle failure to create a remote proxy
Last modified: 2015-10-16 15:11:33 UTC
Here is a downstream bug: https://bugzilla.redhat.com/show_bug.cgi?id=1199821 With a stacktrace that looks like this: Thread no. 1 (10 frames) #0 g_hash_table_iter_next at ghash.c:791 #1 get_mounts at gproxyvolumemonitor.c:154 #3 mounts_setup at storage.c:698 #4 storage_init at storage.c:165 #5 g_type_create_instance at gtype.c:1870 #6 g_object_new_internal at gobject.c:1774 #9 storage_new at storage.c:849 #10 media_art_process_initable_init at extract.c:192 #11 g_initable_new_valist at ginitable.c:228 #12 g_initable_new at ginitable.c:146 The assertion that is triggered in g_hash_table_iter_next means that the hash table was modified while you are iterating over it, despite get_mounts() taking the proxy_vm lock. Looking around in the code shows that name_owner_appeared and name_owner_vanished both access the hash table without taking the lock, and name_owner_vanished removes all mounts from the hash table. And sure enough, the log file excerpt in the downstream bug has: Mar 08 13:32:07 localhost.localdomain gnome-session[1909]: (tracker-extract:2304): GVFS-RemoteVolumeMonitor-WARNING **: Owner of volume monitor org.gtk.Private.UDisks2VolumeMonitor disconnected from the bus; removing drives/volumes/mounts Mar 08 13:32:07 localhost.localdomain gnome-session[1909]: (tracker-extract:2304): GVFS-RemoteVolumeMonitor-WARNING **: Owner of volume monitor org.gtk.Private.MTPVolumeMonitor disconnected from the bus; removing drives/volumes/mounts Mar 08 13:32:07 localhost.localdomain gnome-session[1909]: (tracker-extract:2304): GVFS-RemoteVolumeMonitor-WARNING **: Owner of volume monitor org.gtk.Private.GPhoto2VolumeMonitor disconnected from the bus; removing drives/volumes/mounts Mar 08 13:32:07 localhost.localdomain gnome-session[1909]: (tracker-extract:2304): GVFS-RemoteVolumeMonitor-WARNING **: Owner of volume monitor org.gtk.Private.GoaVolumeMonitor disconnected from the bus; removing drives/volumes/mounts Mar 08 13:32:07 localhost.localdomain gnome-session[1909]: (tracker-extract:2304): GVFS-RemoteVolumeMonitor-WARNING **: Owner of volume monitor org.gtk.Private.AfcVolumeMonitor disconnected from the bus; removing
Created attachment 312417 [details] [review] proxy volume monitor: Guard access to the internal caches
Review of attachment 312417 [details] [review]: Looks right to me.
(In reply to Matthias Clasen from comment #0) This looks like the user was logging out: > Mar 08 13:32:07 localhost.localdomain gnome-session[1909]: > (tracker-extract:2304): GVFS-RemoteVolumeMonitor-WARNING **: Owner of volume > monitor org.gtk.Private.UDisks2VolumeMonitor disconnected from the bus; > removing drives/volumes/mounts > Mar 08 13:32:07 localhost.localdomain gnome-session[1909]: > (tracker-extract:2304): GVFS-RemoteVolumeMonitor-WARNING **: Owner of volume > monitor org.gtk.Private.MTPVolumeMonitor disconnected from the bus; removing > drives/volumes/mounts > Mar 08 13:32:07 localhost.localdomain gnome-session[1909]: > (tracker-extract:2304): GVFS-RemoteVolumeMonitor-WARNING **: Owner of volume > monitor org.gtk.Private.GPhoto2VolumeMonitor disconnected from the bus; > removing drives/volumes/mounts > Mar 08 13:32:07 localhost.localdomain gnome-session[1909]: > (tracker-extract:2304): GVFS-RemoteVolumeMonitor-WARNING **: Owner of volume > monitor org.gtk.Private.GoaVolumeMonitor disconnected from the bus; removing > drives/volumes/mounts > Mar 08 13:32:07 localhost.localdomain gnome-session[1909]: > (tracker-extract:2304): GVFS-RemoteVolumeMonitor-WARNING **: Owner of volume > monitor org.gtk.Private.AfcVolumeMonitor disconnected from the bus; removing Which brings me to this CRITICAL in the logs: Mar 08 13:33:51 localhost.localdomain gnome-session[4152]: (tracker-extract:4361): GVFS-CRITICAL **: fill_mountable_info: assertion 'proxy != NULL' failed Which is caused by the inability to connect to org.gtk.vfs.Daemon:/org/gtk/vfs/mounttracker Is there any reason to use g_printerr in create_mount_tracker_proxy instead of g_warning or g_debug? That way the actual error message would end up in the logs, which is better than that CRITICAL from g_return_if_fail. Secondly, the g_return_if_fail seems too harsh. CRITICALs are usually for programmer errors, but in this case creating a proxy can fail. How about demoting it to a if (proxy == NULL) return?
Created attachment 312429 [details] [review] client: Use g_warning if proxy creation failed (In reply to Debarshi Ray from comment #3) > (In reply to Matthias Clasen from comment #0) > (snip) > > Is there any reason to use g_printerr in create_mount_tracker_proxy instead > of g_warning or g_debug? That way the actual error message would end up in > the logs, which is better than that CRITICAL from g_return_if_fail. I don't think there is a reason for using g_printerr..
Comment on attachment 312417 [details] [review] proxy volume monitor: Guard access to the internal caches Pushed to master, gnome-3-16 and gnome-3-14.
All done ?
Comment on attachment 312429 [details] [review] client: Use g_warning if proxy creation failed commit eef594d24918d28ad5f0f736825a333df955c6c4
Done, thanks...
Reopening, it seems the bug isn't fixed as per the downstream bug report...
Taking a second look since people are still hitting this. (In reply to Matthias Clasen from comment #0) > Thread no. 1 (10 frames) > #0 g_hash_table_iter_next at ghash.c:791 > #1 get_mounts at gproxyvolumemonitor.c:154 > #3 mounts_setup at storage.c:698 > #4 storage_init at storage.c:165 > #5 g_type_create_instance at gtype.c:1870 > #6 g_object_new_internal at gobject.c:1774 > #9 storage_new at storage.c:849 > #10 media_art_process_initable_init at extract.c:192 > #11 g_initable_new_valist at ginitable.c:228 > #12 g_initable_new at ginitable.c:146 > > The assertion that is triggered in g_hash_table_iter_next means that the > hash table was modified while you are iterating over it That bug might have been there in the code, but I don't think that is why it is crashing. We have this in the logs just before it crashes: GLib-CRITICAL **: g_hash_table_iter_init: assertion 'hash_table != NULL' failed And then the top two frames look like this:
+ Trace 235590
Taken together, I think the call to g_hash_table_iter_init failed because monitor->mounts == NULL, and then the first call to g_hash_table_iter_next crashes at this line: g_return_val_if_fail (ri->version == ri->hash_table->version, FALSE); ... not because the versions don't match, but because ri->hash_table == NULL. If it was just failing the version check then we would have seen another CRITICAL in the logs, and things would continue without a crash. Plus if you look at this: hash_iter = {dummy1 = 0x0, dummy2 = 0x2, dummy3 = 0x0, dummy4 = 2, dummy5 = 0, dummy6 = 0x0} ... and compare it with the RealIter struct, then you will see that ri->hash_table is hash_iter.dummy1, ri->position is hash_iter.dummy4 and ri->version is (int) hash_iter.dummy6.
Now the question is: "how can the mounts GHashTable be NULL". Since GProxyVolumeMonitor instances are never meant to be destroyed (they have dummy finalize and dispose methods), it is not a ref counting issue. The only way the mounts GHashTable can be NULL is if we failed to create a GDBusProxy for the remote volume monitor in g_proxy_volume_monitor_constructor. That is not surprising because the logs do indicate that the session is being closed, which means the connection to the bus might have been closed. I think the session logs are somehow missing a message like this: "The connection is closed (g-io-error-quark, 18)" ... which was present in the logs that Vadim gave me from his reproducer.
Created attachment 313454 [details] [review] proxy volume monitor: Properly handle failure to create a remote proxy
Review of attachment 313454 [details] [review]: Thanks for the patch. It looks good to me. Let's wait for test results from Vadim before pushing.
Created attachment 313463 [details] [review] proxy volume monitor: Properly handle failure to create a remote proxy We should also ensure that the reference counting stays consistent with the non-error paths. Currently we are not taking the extra reference when we failed to create a GDBusProxy, and that can lead to the big fat g_warning in the finalize.
From IRC: <oholy> rishi: please push it also upstream
Pushed to master, gnome-3-16 and gnome-3-14.