GNOME Bugzilla – Bug 757747
[Regression] Since version 3.18, moving many files in Nautilus often but not always takes a very long time to complete
Last modified: 2016-12-16 14:07:37 UTC
Created attachment 315063 [details] test script

Steps to reproduce:
1. Run the attached script.
2. Select and drag-and-drop the text files into a subfolder.

Screencast: https://www.dropbox.com/s/rfika08z6x3gp49/Screencast_Nautilus.webm?dl=0
I’m also seeing this on multiple machines (all Fedora Workstation 23 x86-64 with Nautilus 3.18.5, ext4 on LUKS if that matters).

Steps:
1) Create a large number of files.
2) Select them in Nautilus.
3) Drag them into another directory (on the same volume).

Expected behaviour: near-instantaneous move.

What happens: usually (but not always!), Nautilus will appear to freeze, with the user interface partially responsive (as in gedgon’s video). Some files may be jerkily moved before this occurs. Sometimes the move eventually completes after a few minutes (maybe always, if one waits long enough). Monitoring the directories (e.g. ls in a terminal) confirms that the files are being moved at the rate Nautilus shows in its windows: this isn’t just a user-interface lag.

I guess this could be GIO/GVFS rather than Nautilus itself. If there are any tests I can run to help pin it down, I’m happy to try them.
I’ve filed a Fedora bug at https://bugzilla.redhat.com/show_bug.cgi?id=1321684
Can you try with gvfs-move on the command line?
I just tried with 100 empty files moved into an empty directory. Nautilus got stuck at number 43 all ten times I tried. “gvfs-move File* Dir/” worked perfectly all ten times I ran it. I’ve no idea if there’s any significance in the number 43, but I tried a few different file basenames and directory names.
-> 3.20. We are blocking the threads somewhere. It's true that we changed some thread-shared memory access (before, we were accessing shared memory without a mutex), but I cannot see how that could be the problem. Anyway, the performance disparity is really big. Not sure how this went unnoticed until now. I guess not many people move > 50 files at once (to be clear, 50 selected files, not a folder which contains > 50 files; the latter is instantaneous).
After some testing, I could reproduce it only 1 in 20 times. Also, it blocks on g_file_move. So my guess is that there are too many threads in the context of nautilus. It's surprising to me, though, that this somehow changed in 3.18; the number of threads and the use of g_file_move have been there forever. gvfs-move doesn't use any threads and performs the operation synchronously, so that removes most of the overhead.
After some more testing, this seems like a general issue. I could reproduce it with copy operations, where the copy took a few minutes while the command-line mv took a few seconds. This needs further investigation.
Created attachment 327347 [details] [review]
file-undo-operations: use queue for storing sources and destinations

When copying, moving or linking files, original sources and their destinations are stored in an information structure used for undoing the operation. The sources and destinations were appended at the end of a list. Due to append taking linear time, there was a noticeable decrease in speed when operating with many files. In order to fix this, the sources and destinations are stored in a queue that allows appending in constant time.
Review of attachment 327347 [details] [review]:

Cool, LGTM! Do you have any numbers (average improvement in seconds or so)? Would be cool to know. Also, it's worth mentioning in the commit message that the problem is not linearity itself, but the quadratic behaviour resulting from the linear g_list_append combined with the loop that adds files one by one. Commit to master and 3.20, please.
Created attachment 327593 [details] [review]
file-undo-operations: use queue for storing sources and destinations

When copying, moving or linking files, original sources and their destinations are stored in an information structure used for undoing the operation. The sources and destinations were appended at the end of a list. Due to append taking linear time, these file operations would have a resulting quadratic complexity depending on the number of files. When operating on thousands of files, this would lead to a significant decrease in speed. In order to fix this, the sources and destinations are stored in a queue that allows appending in constant time.
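For illustration only (plain C, no GLib dependency; this is not the Nautilus code), the difference the commit message describes can be sketched like this: a g_list_append-style append must walk to the tail on every call, so n appends do quadratic work, while a GQueue-style structure keeps a tail pointer and appends in constant time.

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal singly linked list with a head pointer only, like GList. */
typedef struct Node { int value; struct Node *next; } Node;

/* O(n) append: walks to the tail every time, like g_list_append. */
static Node *list_append(Node *head, int value, long *steps) {
    Node *n = malloc(sizeof *n);
    n->value = value;
    n->next = NULL;
    if (head == NULL)
        return n;
    Node *cur = head;
    while (cur->next != NULL) {   /* this walk makes n appends O(n^2) */
        cur = cur->next;
        (*steps)++;
    }
    cur->next = n;
    return head;
}

/* Head + tail pointers, like GQueue: pushing at the tail is O(1). */
typedef struct { Node *head, *tail; } Queue;

static void queue_push_tail(Queue *q, int value, long *steps) {
    Node *n = malloc(sizeof *n);
    n->value = value;
    n->next = NULL;
    if (q->tail != NULL)
        q->tail->next = n;
    else
        q->head = n;
    q->tail = n;
    (*steps)++;                   /* constant work per append */
}

/* Nodes are intentionally not freed: these sketches only count steps. */
long count_list_append_steps(int n) {
    long steps = 0;
    Node *head = NULL;
    for (int i = 0; i < n; i++)
        head = list_append(head, i, &steps);
    return steps;                 /* (n-1)(n-2)/2 tail-walk steps */
}

long count_queue_append_steps(int n) {
    long steps = 0;
    Queue q = { NULL, NULL };
    for (int i = 0; i < n; i++)
        queue_push_tail(&q, i, &steps);
    return steps;                 /* exactly n */
}
```

Counting steps for 1000 appends gives 498501 tail-walk steps for the list against 1000 for the queue, which is the quadratic-vs-linear gap the patch removes.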
(In reply to Carlos Soriano from comment #9) > Review of attachment 327347 [details] [review] [review]: > > Cool, LGTM! > Do you have any numbers (average of seconds improvement or so)? Would be > cool to know. > Also, worth to mention in the commit message that the problem is not being > linear, but being quadratic because of the linearity of g_list_append and > the loop to add files here one by one. > Commit to master and 3.20 please. For 60000 files we are talking about 5 minutes vs 6 seconds.
(In reply to Razvan Chitu from comment #11)
> For 60000 files we are talking about 5 minutes vs 6 seconds.

Awesome, thanks! Unfortunately, it doesn't resolve the issue.
(In reply to gedgon from comment #12)
> (In reply to Razvan Chitu from comment #11)
> > For 60000 files we are talking about 5 minutes vs 6 seconds.
>
> Awesome, thanks! Unfortunately, it doesn't resolve the issue.

This fixes an efficiency issue in Nautilus alone. I've tracked another issue down to gvfs; let's hope the next patches will fix it!
Review of attachment 327593 [details] [review]:

Why did I not review this? Sorry for the delay, I thought we already had this on master. LGTM, thanks Razvan!
Comment on attachment 327593 [details] [review] file-undo-operations: use queue for storing sources and destinations Not marking as fixed since we need to figure out the other issues. Attachment 327593 [details] pushed as e9fe639 - file-undo-operations: use queue for storing sources and destinations
(In reply to Carlos Soriano from comment #15) > Not marking as fixed since we need to figure out the other issues. And yet it's marked as fixed :/
*** Bug 771519 has been marked as a duplicate of this bug. ***
(In reply to Razvan Chitu from comment #13)
> (In reply to gedgon from comment #12)
> > (In reply to Razvan Chitu from comment #11)
> > > For 60000 files we are talking about 5 minutes vs 6 seconds.
> >
> > Awesome, thanks! Unfortunately, it doesn't resolve the issue.
>
> This fixes an efficiency issue in Nautilus alone. I've tracked another issue
> down to gvfs; let's hope the next patches will fix it!

Hello Razvan, have you made progress on the second patch? Thanks!
Well, I have a lot of pictures to sort and move, and it's utterly annoying to use Nautilus for this quite simple use case. I hope it can be fixed. For now I will do it on my test partition with Manjaro, KDE Plasma and Dolphin.
(In reply to jeremy9856 from comment #19)
> Well, I have a lot of pictures to sort and move, and it's utterly annoying
> to use Nautilus for this quite simple use case. I hope it can be fixed. For
> now I will do it on my test partition with Manjaro, KDE Plasma and Dolphin.

Please keep comments strictly to technical details that could help fix this bug.
Sorry, but maybe that wasn't out of context, because I just tried with Manjaro / KDE / Dolphin and moving pictures is really slow there too! I don't know if it's related.
My point was about comment #19, which had no information at all. Dolphin being slow, as you point out in comment #21, is relevant, but AFAIK they don't use gvfs.
I don't know what happened, but I installed Thunar on my Fedora 24 install and it works perfectly well. I have found many users on the web who have had this kind of problem with Nautilus or Nautilus forks for years. I don't know if this can help:

https://bugs.launchpad.net/ubuntu/+source/nautilus/+bug/1208993
https://github.com/mate-desktop/caja/issues/528
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/388595
https://bugs.launchpad.net/ubuntu/+source/nautilus/+bug/1133477
https://ubuntuforums.org/showthread.php?t=1618281
I re-tried on Manjaro with Dolphin and today it worked... Probably not related to this problem. I also tried Nautilus 3.22 on the KDE desktop on Manjaro and that didn't work.
Created attachment 339105 [details] Debug log

I made a debug log while I was moving some pictures. The error message "Le délai d'attente est dépassé" translates to "Timeout was reached". I hope it helps.
-> gvfs, based on the comment 25 log
As said on IRC, those are native files, so it's probably not gvfs but glib, right?
po/fr.po-#: ../gio/gdbusconnection.c:1879
po/fr.po-msgid "Timeout was reached"
po/fr.po:msgstr "Le délai d'attente est dépassé"

Those error messages are probably something else, or a side effect... GVfs should not be involved in this case, so it's rather GLib. But Comment 4 says that gvfs-move works correctly, and Comment 23 says it also works correctly in Thunar, which is based on GIO/GVfs. So it seems to me this is rather something in Nautilus, not GLib...

Razvan, you said in Comment 13 that you tracked some issue down to GVfs; can you be more descriptive?
(In reply to Ondrej Holy from comment #28)
> GVfs should not be involved in this case, so it's rather GLib. But Comment 4
> says that gvfs-move works correctly, and Comment 23 says it also works
> correctly in Thunar, which is based on GIO/GVfs. So it seems to me this is
> rather something in Nautilus, not GLib...

+1. I have seen this in ordinary non-gvfs folders when using nautilus, so I think it is a nautilus bug only, not in glib or gvfs. As far as I can tell, the file transactions (copy, move) work fine and get completed; it is just nautilus that doesn't catch up, probably missing some "operation finished" signal.
Created attachment 339315 [details] [review]
metabuilder: use sequences instead of lists

The metabuilder stores files and data in sorted lists. An insertion into a sorted list is done in linear time, which highly affects efficiency. In order to fix this, use GSequence instead, which does insertion and deletion in logarithmic time.
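As a sketch of what this patch addresses (illustrative plain C, not the actual metabuilder code): keeping a sorted singly linked list means each insertion scans linearly for its position, so n insertions cost O(n²) comparisons in the worst case, whereas a balanced-tree container like GSequence does each insertion in O(log n).

```c
#include <assert.h>
#include <stdlib.h>

/* Sorted singly linked list, like the old metabuilder lists. */
typedef struct SNode { int key; struct SNode *next; } SNode;

/* Insert while keeping ascending order, counting key comparisons.
 * The linear scan for the insertion point is what made each insert O(n). */
static SNode *sorted_insert(SNode *head, int key, long *comps) {
    SNode *n = malloc(sizeof *n);
    n->key = key;
    n->next = NULL;
    if (head == NULL)
        return n;
    (*comps)++;
    if (key < head->key) {        /* new smallest element: prepend */
        n->next = head;
        return n;
    }
    SNode *cur = head;
    while (cur->next != NULL) {
        (*comps)++;
        if (key < cur->next->key)
            break;
        cur = cur->next;
    }
    n->next = cur->next;
    cur->next = n;
    return head;
}

/* Inserting n ascending keys is the worst case: every insert scans the
 * whole list, for n(n-1)/2 comparisons in total. A balanced structure
 * like GSequence needs only O(n log n). Nodes are not freed here: this
 * sketch only counts comparisons. */
long sorted_insert_comparisons(int n) {
    SNode *head = NULL;
    long comps = 0;
    for (int i = 0; i < n; i++)
        head = sorted_insert(head, i, &comps);
    return comps;
}
```

For 1000 inserts this counts 499500 comparisons; an O(log n)-per-insert structure would need on the order of ten thousand.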
Created attachment 339316 [details] [review]
metabuilder: use queues instead of lists

Some of the lists used by the metabuilder have items appended to them. Appending to a list is done in linear time, and this highly affects efficiency. In order to fix this, use GQueue, which supports appending in constant time.
Thank you Razvan. I hope that will do the trick!
It is helping a bit, in terms of speed at least. Moving the whole Mozilla source tree between two internal hard disks (both ext4) now starts instantly and copies faster than before. Nautilus used 20% CPU, though. The memory used by nautilus gradually increases: I saw an extra 64MB of memory that was not freed at the end. gvfsd-metadata used 7 to 10% CPU with a negligible increase in memory consumption.

Deleting the files from the new location takes 10 seconds for "preparing to delete", which used to be longer. nautilus used ~27% CPU on average, and its memory usage increased by only 8MB this time. gvfsd-metadata went up from 0.4MB to 4.1MB and stayed there.

In both file operations, the speed is somewhat faster with this patch, but the memory spikes are an issue.
I forgot to say thank you for the patches and the investigation. It would be nice to improve the performance and resource usage of gio/gvfs/nautilus.
Review of attachment 339315 [details] [review]:

Thanks for those patches. It looks good except for unused variables, although I haven't tested it yet...

::: metadata/metabuilder.c
@@ +183,2 @@
{
  GList *l;

metabuilder.c:184:10: warning: unused variable ‘l’

@@ +290,3 @@
{
  MetaFile *src_child, *dest_child;
  GList *l;

metabuilder.c:292:10: warning: unused variable ‘l’

@@ +398,3 @@
                 const char *key)
{
  MetaData *data;

metabuilder.c:400:13: warning: unused variable ‘data’
Review of attachment 339316 [details] [review]:

It looks good except for one detail, although I haven't tested it yet... Maybe it would be worth replacing the GList *values in the MetaData struct also...

::: metadata/metabuilder.c
@@ +788,3 @@
   append_time_t (out, child->last_changed, builder);

+  if (child->children)

This should be fixed in a separate patch.
I have been debugging it a bit. Nautilus has over 80 threads when the problems start occurring for me. I don't really think that this is the right way; Nautilus should use some thread pool with a reasonable limit. The attached patches don't solve the issue for me.

gvfs_metadata_call_get_tree_from_device_sync times out with "Error: Timeout was reached" for some reason. Nautilus gets stuck later anyway, even if I hardcode the result or add a bigger timeout...

The slowdown seems to be caused by g_dbus_connection_flush_sync, which is called from g_daemon_vfs_local_file_moved. The move operation is fast if I comment out this function...

By the way, is it necessary to set the following empty attributes for each file?
$ gvfs-info -a "metadata::*" /tmp/test/subfolder/File0000.txt
uri: file:///tmp/test/subfolder/File0000.txt
attributes:
  metadata::nautilus-icon-position:
  metadata::screen:
(In reply to Ondrej Holy from comment #37)
> I have been debugging it a bit. Nautilus has over 80 threads when the
> problems start occurring for me. I don't really think that this is the right
> way; Nautilus should use some thread pool with a reasonable limit.

Yeah, this has been known for quite a long time. Of course it's something I would like to change (hopefully this next GSoC), but it has been like this forever, so it's unlikely to be the cause of this problem, no?

> The attached patches don't solve the issue for me.

I was expecting that; this looks like something different.

> gvfs_metadata_call_get_tree_from_device_sync times out with "Error: Timeout
> was reached" for some reason. Nautilus gets stuck later anyway, even if I
> hardcode the result or add a bigger timeout...

You mean nautilus gets stuck even if you return early or something?

> The slowdown seems to be caused by g_dbus_connection_flush_sync, which is
> called from g_daemon_vfs_local_file_moved. The move operation is fast if I
> comment out this function...

So this is the root of the problem?

> By the way, is it necessary to set the following empty attributes for each
> file?
> $ gvfs-info -a "metadata::*" /tmp/test/subfolder/File0000.txt
> uri: file:///tmp/test/subfolder/File0000.txt
> attributes:
>   metadata::nautilus-icon-position:
>   metadata::screen:

I don't have this for any file except files on the desktop. I guess this file is not on the desktop for you, right? Or did you create this file from templates or something?
(In reply to Carlos Soriano from comment #38)
> (In reply to Ondrej Holy from comment #37)
> > (snip)
> > The slowdown seems to be caused by g_dbus_connection_flush_sync, which is
> > called from g_daemon_vfs_local_file_moved. The move operation is fast if I
> > comment out this function...
>
> So this is the root of the problem?

Yes, probably; it might be caused by commit b78cc85, or something might have changed in dbus. I will debug it further...

> > By the way, is it necessary to set the following empty attributes for each
> > file?
> > $ gvfs-info -a "metadata::*" /tmp/test/subfolder/File0000.txt
> > uri: file:///tmp/test/subfolder/File0000.txt
> > attributes:
> >   metadata::nautilus-icon-position:
> >   metadata::screen:
>
> I don't have this for any file except files on the desktop. I guess this
> file is not on the desktop for you, right? Or did you create this file from
> templates or something?

The files are not on the desktop. The original files were created by the attached script, but the attributes appear only on the moved files...
(In reply to Ondrej Holy from comment #39)
> (In reply to Carlos Soriano from comment #38)
> > (In reply to Ondrej Holy from comment #37)
> > > (snip)
> > > The slowdown seems to be caused by g_dbus_connection_flush_sync, which is
> > > called from g_daemon_vfs_local_file_moved. The move operation is fast if I
> > > comment out this function...
> >
> > So this is the root of the problem?
>
> Yes, probably; it might be caused by commit b78cc85, or something might have
> changed in dbus. I will debug it further...

Nope, the commit makes it slightly worse; however, it is not the culprit... It works nicely if I remove g_dbus_connection_flush_sync, and also if I use gvfs_metadata_call_move_sync... This needs to be discussed with some gdbus guys...
Thanks a lot Ondrej for digging into it. I will try to investigate further why desktop metadata is set on non-desktop files; that's strange :/ And if I can do something to help with the dbus issue... feel free to ask.
I was convinced that this happens because g_dbus_connection_flush_sync is called from many threads, but I am not sure about it right now. I made a script which calls g_file_move from a thread pool, and I am not able to reproduce the issue this way. I wonder what the difference is here...
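The shape of that experiment can be reconstructed roughly like this (my reconstruction, not Ondrej's actual script): create many files, then move them from a fixed pool of worker threads. Plain C with pthreads is used here, and rename(2) stands in for g_file_move, so this sketch cannot reproduce the gvfs-side hang; it only shows the test structure.

```c
#define _DEFAULT_SOURCE           /* for mkdtemp */
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

#define NFILES   100
#define NTHREADS 8

static char srcdir[] = "/tmp/bug757747_XXXXXX";
static char dstdir[64];
static int next_index;            /* both counters protected by lock */
static int moved;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Each worker grabs the next file index and moves that file, roughly
 * like Nautilus dispatching file moves from worker threads. */
static void *worker(void *arg) {
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        int i = (next_index < NFILES) ? next_index++ : -1;
        pthread_mutex_unlock(&lock);
        if (i < 0)
            return NULL;
        char from[128], to[128];
        snprintf(from, sizeof from, "%s/File%04d.txt", srcdir, i);
        snprintf(to, sizeof to, "%s/File%04d.txt", dstdir, i);
        if (rename(from, to) == 0) {   /* stand-in for g_file_move */
            pthread_mutex_lock(&lock);
            moved++;
            pthread_mutex_unlock(&lock);
        }
    }
}

/* Create NFILES empty files (like the reproducer script), then move them
 * into a subfolder from a pool of threads. Returns the number moved.
 * Cleanup of the temporary directory is omitted for brevity. */
int run_move_test(void) {
    if (mkdtemp(srcdir) == NULL)
        return -1;
    snprintf(dstdir, sizeof dstdir, "%s/subfolder", srcdir);
    mkdir(dstdir, 0700);
    for (int i = 0; i < NFILES; i++) {
        char path[128];
        snprintf(path, sizeof path, "%s/File%04d.txt", srcdir, i);
        FILE *f = fopen(path, "w");
        if (f != NULL)
            fclose(f);
    }
    pthread_t tids[NTHREADS];
    for (int t = 0; t < NTHREADS; t++)
        pthread_create(&tids[t], NULL, worker, NULL);
    for (int t = 0; t < NTHREADS; t++)
        pthread_join(tids[t], NULL);
    return moved;
}
```

With plain rename(2) all 100 moves complete immediately, which matches the observation that the bare thread-pool approach alone does not reproduce the hang.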
Ok, the difference is that the additional metadata (i.e. metadata::nautilus-icon-position, metadata::screen) is set concurrently, which somehow blocks the threads when flushing the connection...

WORKAROUND: It seems that Ctrl+x + Ctrl+v, or Cut + Paste, works instantly! I don't see the additional metadata in this case! That's the reason I didn't see this issue earlier: I don't use drag & drop...
(In reply to Ondrej Holy from comment #43) > WORKAROUND: It seems that Ctrl+x + Ctrl+v, or Cut + Paste works instantly! That's exactly what I'm seeing for weeks now.
Created attachment 340663 [details] [review]
metadata: Use sequences instead of lists

I modified this patch as per the review. It gives a really significant speedup in some cases...
Created attachment 340664 [details] [review]
metadata: Use queues instead of lists

Ditto.
Created attachment 340665 [details] [review]
metadata: Fix bogus condition

The file->children condition is always true at this point; child->children should be there instead in order to speed up processing. This fix doesn't affect functionality, it just slightly improves processing.
Created attachment 340666 [details] [review]
client: Use sync methods instead of flushing dbus

g_dbus_connection_flush_sync causes trouble in certain cases when using multiple threads. The problem occurs when there are ongoing gvfs_metadata_call_set_sync calls. It blocks threads, e.g. when moving files in Nautilus, and Nautilus sometimes freezes because of it. Use sync methods instead of flushing dbus. According to my testing, I don't see any significant slowdown when using sync methods instead of flushing. It fixes the hangs, and moving files in Nautilus is almost instant again.

I am not sure about the exact reasons why the issue happens, but it may be fixed this way. Guys, can you please help test these fixes?
Alex, can you please take a look at this? Do you see any drawbacks to this solution?
Btw, the progress bar doesn't seem to work as expected when moving... the total number of files decreases and ends at 0/0.
(In reply to Ondrej Holy from comment #43)
> WORKAROUND: It seems that Ctrl+x + Ctrl+v, or Cut + Paste, works instantly!

(In reply to Christian Stadelmann from comment #44)
> That's exactly what I'm seeing for weeks now.

Strange, the workaround doesn't work for me. Screencast, 1.5MB: https://www.dropbox.com/s/esasu067rcef0yw/757747.mp4

But with the patches, d&d and cut & paste work instantly here. Thanks!
(In reply to Ondrej Holy from comment #50)
> Btw, the progress bar doesn't seem to work as expected when moving... the
> total number of files decreases and ends at 0/0.

I moved 158k files between two disks. The total number in the progress went from 500 to 158700 (no 0/0), but the indicator circle was only around ~90% full when it completed. This was slow, but that is likely due to my computer's motherboard being slow. Moving files within the same disk was instant.

Then I deleted the files from the new location. This was slow, but faster than with unpatched gvfs.

So yeah, apart from the indicator circle being off by a bit, this is pretty good. Thank you very much for the patches :)
> By the way, is it necessary to set the following empty attributes for each file? > $ gvfs-info -a "metadata::*" /tmp/test/subfolder/File0000.txt > uri: file:///tmp/test/subfolder/File0000.txt > attributes: > metadata::nautilus-icon-position: > metadata::screen: WOA! We should only do that if the target is the desktop, as we no longer support manual positioning on any other view.
(In reply to Alexander Larsson from comment #53)
> > By the way, is it necessary to set the following empty attributes for each file?
> > $ gvfs-info -a "metadata::*" /tmp/test/subfolder/File0000.txt
> > uri: file:///tmp/test/subfolder/File0000.txt
> > attributes:
> >   metadata::nautilus-icon-position:
> >   metadata::screen:
>
> WOA! We should only do that if the target is the desktop, as we no longer
> support manual positioning on any other view.

BTW, it works nicely without changes in the gvfs client if I remove the nautilus_file_set_metadata calls from nautilus-directory.c...
(In reply to Ondrej Holy from comment #48)
> Created attachment 340666 [details] [review] [review]
> client: Use sync methods instead of flushing dbus

Bare g_file_move for 10000 files:
_async + flush: 3.5 sec (1000 threads), 6.5 sec (linear)
sync: 3.5 sec (1000 threads), 8.5 sec (linear)

So it is slightly worse for the linear approach; however, I think it is acceptable (Nautilus uses threads, and it seems it is not able to handle so many files anyway...). In some cases, the flushing causes the files to move in groups of 1000 in the case of 1000 threads, which isn't really wanted...
Thank you so much, Ondrej!
Also, the last set of patches heavily improves gvfsd-metadata memory usage. Previously, it would increase by 10MB every time I deleted 100k files. Now it doesn't.
Attachment 340663 [details] pushed as 0e001c9 - metadata: Use sequences instead of lists
Attachment 340664 [details] pushed as fe33512 - metadata: Use queues instead of lists
Attachment 340665 [details] pushed as 1224147 - metadata: Fix bogus condition
Attachment 340666 [details] pushed as 57acc80 - client: Use sync methods instead of flushing dbus
Pushed to master, not sure about gnome-3-22...

Leaving this bug open due to the unwanted metadata attributes set by Nautilus... Carlos, can you please take a look?
I hope the fixes can make it into gnome-3-22, to avoid waiting for Fedora 26. Again, thank you Ondrej!
(In reply to Ondrej Holy from comment #59)
> Pushed to master, not sure about gnome-3-22...
>
> Leaving this bug open due to the unwanted metadata attributes set by
> Nautilus... Carlos, can you please take a look?

Thanks Ondrej! The debugging you did is invaluable. Sure, sorry for taking so long; I just came back from a hackfest. I had this on my personal blocker list for 3.22, so it will be good to have it there too.

I will make the change to not write metadata when the target is not the desktop (this is actually buried quite deep inside Nautilus). However, if I understood correctly, this can happen with any metadata, right? So there is still a problem somewhere that blocks threads if multiple of them are setting metadata, right?
> However, if I understood correctly, this can happen with any metadata, right?

Yes.

> So there is still a problem somewhere that blocks threads if multiple of
> them are setting metadata, right?

This should be fixed by attachment 340666 [details] [review] (on the other hand, the move/remove operation might be blocked by journal rotation due to this change; however, it shouldn't be significant).
Let's see if this works fine...
Whoops, I commented on the wrong bug report. So here is the Nautilus counterpart (fixed): https://bugzilla.gnome.org/show_bug.cgi?id=775910
Should it be reopened?
I think I spoke too fast :) I thought that since you commented on the wrong bug report, you might have closed this one instead of the other. I didn't see that the fix was pushed to gvfs and that we are waiting for the Nautilus fixes. Anyway, thank you very much Ondrej and Carlos!!!
By the way Ondrej, do you plan to make a point release of gvfs to make the fixes available (GNOME 3.20, 3.22)? For now they are just in master. Thanks!
Sorry, I was busy... I have pushed all the patches to gnome-3-22 (I really hope we don't break anything with the sequences/queues), and only attachment 340666 [details] [review] to gnome-3-20/gnome-3-18 (it won't be as fast as with sequences/queues, but it will fix the hangs)... I hope I will make releases soon...
Just a note that I made the 1.30.3 release for gnome-3-22 including this. Sorry for the delay; I wanted to also include the patch from Bug 773300...