GNOME Bugzilla – Bug 678555
Crash on unmount
Last modified: 2016-02-15 08:31:19 UTC
Got this while a demo Core was generated by `/usr/libexec/gvfs-udisks2-volume-monitor'. Program terminated with signal 11, Segmentation fault.
+ Trace 230399
Thread 1 (Thread 0x7fece2aad800 (LWP 1880))
Would be interesting to know how this had happened. Reading the code it seems sane to me, though the private data structure is passed there a lot and a forgotten non-cancelled callback may still use the freed memory.
It seems this is pretty common crash, see: https://retrace.fedoraproject.org/faf2/problems/890220/
I've got a patch for this and it may possibly fix bug 751038 as well. I just need to do some testing on it.
Created attachment 305624 [details] [review] udisks2: Prevent race between unmount reply and retry timer Currently it is possible for the unmount op reply and the retry unmount timer to race. A udisks2 unmount operation (or umount spawned command) is started via the timer. In the meantime, a "cancel" or "force unmount" reply is received which completes the gvfs unmount operation and frees the private data. When the udisks2 unmount operation started by the timer completes, it tries to use the freed data and segfaults. To fix this, prevent starting an unmount operation when another is already in progress. If a timer callback is received while an unmount operation is in progress, simply ignore it. If an unmount op reply is received while an unmount operation is in progress, store the result of the reply and handle it once the unmount operation has completed.
(In reply to Ross Lagerwall from comment #3) > I've got a patch for this and it may possibly fix bug 751038 as well. I just > need to do some testing on it. Looking again, I don't think this is related to bug 751038. I've done some basic testing and I think it fixes the issue. I'd appreciate some more testing though.
Review of attachment 305624 [details] [review]: It is really good point, Ross! I've not tested it yet, but... I think that also following change should be done (imagine the case when on_mount_op_reply with "force unmount" choice is invoked immediately before lsof_command_cb due to timeout)... @@ -726,7 +726,7 @@ lsof_command_cb (GObject *source_object, } out: - if (!data->completed) + if (!data->completed && !data->reply_set) { gboolean is_eject; gboolean is_stop; ::: monitor/udisks2/gvfsudisks2mount.c @@ +817,3 @@ + return; + } + It seems enough to set data->in_progress = FALSE once here and other occurrences of data->in_progress = FALSE can be removed, don't you think?
Created attachment 305737 [details] [review] udisks2: Prevent race between unmount reply and retry timer Currently it is possible for the unmount op reply and the retry unmount timer to race. A udisks2 unmount operation (or umount spawned command) is started via the timer. In the meantime, a "cancel" or "force unmount" reply is received which completes the gvfs unmount operation and frees the private data. When the udisks2 unmount operation started by the timer completes, it tries to use the freed data and segfaults. To fix this, prevent starting an unmount operation when another is already in progress. If a timer callback is received while an unmount operation is in progress, simply ignore it. If an unmount op reply is received while an unmount operation is in progress, store the result of the reply and handle it once the unmount operation has completed.
Created attachment 305738 [details] [review] udisks2: Prevent race between unmount reply and retry timer Currently it is possible for the unmount op reply and the retry unmount timer to race. A udisks2 unmount operation (or umount spawned command) is started via the timer. In the meantime, a "cancel" or "force unmount" reply is received which completes the gvfs unmount operation and frees the private data. When the udisks2 unmount operation started by the timer completes, it tries to use the freed data and segfaults. To fix this, prevent starting an unmount operation when another is already in progress. If a timer callback is received while an unmount operation is in progress, simply ignore it. If an unmount op reply is received while an unmount operation is in progress, store the result of the reply and handle it once the unmount operation has completed.
(In reply to Ondrej Holy from comment #6) > Review of attachment 305624 [details] [review] [review]: > > It is really good point, Ross! I've not tested it yet, but... > > I think that also following change should be done (imagine the case when > on_mount_op_reply with "force unmount" choice is invoked immediately before > lsof_command_cb due to timeout)... I don't think this is a problem. unmount_show_busy holds an extra ref to data so it can't be freed before lsof_command_cb runs. And updating the list of processes should be OK during force unmount (even force-unmount may return busy in which case we need to update the list again). > @@ -726,7 +726,7 @@ lsof_command_cb (GObject *source_object, > } > > out: > - if (!data->completed) > + if (!data->completed && !data->reply_set) > { > gboolean is_eject; > gboolean is_stop; > > ::: monitor/udisks2/gvfsudisks2mount.c > @@ +817,3 @@ > + return; > + } > + > > It seems enough to set data->in_progress = FALSE once here and other > occurrences of data->in_progress = FALSE can be removed, don't you think? OK, the updated patch sets in_progress to FALSE either in unmount_data_complete or unmount_show_busy.
Review of attachment 305738 [details] [review]: Looks good, thanks!
Pushed to master as c014b64fa17f5b725f1f13d7952ec479e45e6031. Thanks for the reviews!
Ross, please push this also into the stable branches...
OK, pushed to gnome-3-14 and gnome-3-16.
Thanks!
It seems we introduced another bug, or didn't fix all the races here, see: https://bugzilla.redhat.com/show_bug.cgi?id=1276236 Truncated backtrace: Thread no. 1 (10 frames) #0 g_simple_async_result_set_error at gsimpleasyncresult.c:719 #1 mount_op_reply_handle at gvfsudisks2mount.c:666 #2 unmount_show_busy at gvfsudisks2mount.c:819 #3 unmount_cb at gvfsudisks2mount.c:869 #4 g_simple_async_result_complete at gsimpleasyncresult.c:763 #5 reply_cb at gdbusproxy.c:2623 #6 g_simple_async_result_complete at gsimpleasyncresult.c:763 #7 g_dbus_connection_call_done at gdbusconnection.c:5502 #8 g_simple_async_result_complete at gsimpleasyncresult.c:763 #9 complete_in_idle_cb at gsimpleasyncresult.c:775
*** Bug 731814 has been marked as a duplicate of this bug. ***
Created attachment 320606 [details] valgrind output I have finally reproduce the crash in my environment and attaching valgrind output...
Review of attachment 305738 [details] [review]: ::: monitor/udisks2/gvfsudisks2mount.c @@ +690,3 @@ + data->reply_choice = choice; + data->reply_set = TRUE; + if (!data->completed || !data->in_progress) The problem is that this condition is always true... there should be && instead of ||. However it seems to me that data->completed can't be true in this place ever...
Created attachment 320607 [details] [review] udisks2: Avoid crashes during unmount Commit c014b64 was pushed to prevent race between unmount reply and retry timer. Result of mount operation reply should be stored if unmount operation is in progress, however it isn't, because the conditional statement is always true. Fix the condition accordingly.
Review of attachment 320607 [details] [review]: Looks good. Thanks for working this out.
(In reply to Ondrej Holy from comment #18) > Review of attachment 305738 [details] [review] [review]: > > ::: monitor/udisks2/gvfsudisks2mount.c > @@ +690,3 @@ > + data->reply_choice = choice; > + data->reply_set = TRUE; > + if (!data->completed || !data->in_progress) > > The problem is that this condition is always true... there should be && > instead of ||. However it seems to me that data->completed can't be true in > this place ever... I think data->completed may be true here if unmount_data_complete() is called and then the "reply" signal is handled. This can only happen if there's still a ref for the UnmountData preventing the handler from being disconnected.
Comment on attachment 320607 [details] [review] udisks2: Avoid crashes during unmount Thanks for review! Pushed to master as commit 1f28d65, gnome-3-18 as commit 12f104b, and gnome-3-16 as commit 88d13b4...