Bug 591363 - Performance issue when trashing files
Status: RESOLVED FIXED
Product: glib
Classification: Platform
Component: gio
Version: 2.20.x
Hardware: Other Linux
Importance: Normal major
Target Milestone: ---
Assigned To: gtkdev
QA Contact: gtkdev
Depends on:
Blocks:
 
 
Reported: 2009-08-10 19:41 UTC by Mark
Modified: 2009-08-11 19:19 UTC
See Also:
GNOME target: ---
GNOME version: 2.25/2.26



Description Mark 2009-08-10 19:41:12 UTC
Hi,

A few days ago I deleted a few thousand files in Nautilus. It was extremely slow, while rm -rf *.jpg does the exact same thing in a fraction of the time. So today I started digging in the code and wrote a simple sample program that deletes all the files in the folder you set in the variable "dirpath".
You can find the code here: http://codepad.org/mV0sgWIv

If you need to generate thousands of files run something like this:
for i in `seq 10000`; do touch $i.txt; done

My results for trashing (running the code provided in the above codepad link):
...
snip // file info
...
Number of files: 1927

real	1m50.061s
user	0m0.344s
sys	0m0.596s

That is horribly slow!
Deleting the files (not trashing them) takes 0.8 seconds for the same operation and the same number of files.

In case you wonder: all files were thumbnail-sized .jpg files of roughly 50 KiB each.

So far I have narrowed this down to GVFS's trash function, whichever function that is. It has a few layers, and I'm not advanced enough in this library to figure all of that out and patch it. It would be faster, and better for the quality of the patch, if someone more knowledgeable about GVFS could take a look at it and patch it.

A rough guess is that the real trash function (the one that needs to be fixed to make it fast) is in the file glocalfile.c on line 1701 (g_local_file_trash), but I could be way off.

I marked this bug as major since every user will encounter this at some point, and not only under GNOME but in every application that uses GVFS for file operations. Perhaps it should be a blocker for version 2.21 of glib?

I hope I provided enough information to get this bug fixed.
If not, feel free to ask.

Good luck,
Mark.
Comment 1 Mark 2009-08-11 14:10:27 UTC
// copy of the message I posted on gtk-devel-list: http://mail.gnome.org/archives/gtk-devel-list/2009-August/msg00037.html

Hi,

A few days ago I deleted thousands of files with just Nautilus. That went fine but was horribly slow. Doing the same with the rm command was way faster.
I made a bug report about it yesterday with my results from that moment: http://bugzilla.gnome.org/show_bug.cgi?id=591363

So I tried to trace this issue down. First I added dozens of debug messages to the function g_local_file_trash in the file glocalfile.c, and that pointed to one call that was eating up the time.
The call was: g_file_set_contents (infofile, data, -1, NULL);

Thankfully Alexander pointed me to where that function ends up (which probably saved me a lot of time backtracing): write_to_temp_file in the file gfileutils.c.
Then he gave a valuable suggestion: "can you try disabling the whole #ifdef HAVE_FSYNC bloc", and I did that. Sure enough, that removed the performance hit, but it might have other unexpected side effects you would rather avoid, like loss of data.

Anyway here are benchmarks with and without that single fsync block.

With #ifdef HAVE_FSYNC
----------------------------------------------
...
snip // file info
...
Number of files: 1927

real    1m50.061s
user    0m0.344s
sys     0m0.596s


Without #ifdef HAVE_FSYNC
----------------------------------------------
...
snip // file info
...
Number of files: 1927

real	0m0.902s
user	0m0.180s
sys	0m0.392s

With fsync, file trashing is ~120x slower than without fsync! That is way too big a difference if you ask me.

And to be perfectly clear about the lines that I removed for the "Without #ifdef HAVE_FSYNC" benchmark,
this is all of it:

#ifdef HAVE_FSYNC
  errno = 0;
  /* If the final destination exists, we want to sync the newly written
   * file to ensure the data is on disk when we rename over the destination.
   * otherwise if we get a system crash we can lose both the new and the
   * old file on some filesystems. (I.E. those that don't guarantee the
   * data is written to the disk before the metadata.)
   */
  if (g_file_test (dest_file, G_FILE_TEST_EXISTS) &&
      fsync (fileno (file)) != 0)
    { 
      save_errno = errno;
      
      g_set_error (err,
		   G_FILE_ERROR,
		   g_file_error_from_errno (save_errno),
		   _("Failed to write file '%s': fsync() failed: %s"),
		   display_name, 
		   g_strerror (save_errno));

      g_unlink (tmp_name);
      
      goto out;
    }
#endif
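
To look at the cost of that fsync in isolation, here is a minimal plain-POSIX sketch of the same write-to-temp-then-rename pattern with the fsync made optional. It is not the GLib code: the function name write_then_rename, the ".tmp" suffix and the do_fsync flag are made up for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper, not GLib code: write `data` to "<dest>.tmp",
 * optionally fsync it, then rename it over `dest`.
 * Returns 0 on success, -1 on failure (errno is set). */
static int
write_then_rename (const char *dest, const char *data, int do_fsync)
{
  char tmp_name[4096];
  const char *p = data;
  size_t left = strlen (data);
  int fd, saved_errno;
  int n = snprintf (tmp_name, sizeof tmp_name, "%s.tmp", dest);

  if (n < 0 || (size_t) n >= sizeof tmp_name)
    {
      errno = ENAMETOOLONG;
      return -1;
    }

  fd = open (tmp_name, O_WRONLY | O_CREAT | O_TRUNC, 0600);
  if (fd < 0)
    return -1;

  /* write() may write fewer bytes than requested, so loop. */
  while (left > 0)
    {
      ssize_t written = write (fd, p, left);
      if (written < 0)
        {
          if (errno == EINTR)
            continue;
          goto fail;
        }
      p += written;
      left -= (size_t) written;
    }

  /* This is the expensive step: force the data to disk before the
   * rename makes the new file visible under the final name. */
  if (do_fsync && fsync (fd) != 0)
    goto fail;

  if (close (fd) != 0)
    {
      fd = -1;
      goto fail;
    }

  if (rename (tmp_name, dest) != 0)
    {
      fd = -1;
      goto fail;
    }
  return 0;

fail:
  saved_errno = errno;
  if (fd >= 0)
    close (fd);
  unlink (tmp_name);
  errno = saved_errno;
  return -1;
}
```

Calling this in a loop for a couple of thousand small files, once with do_fsync set to 1 and once with 0, and timing both runs should show whether the filesystem in use makes fsync as expensive as in the benchmarks above. The size of the gap depends on the filesystem and the disk.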

So, the problem is outlined here. Now what would be a solution for this issue?

Thanks,
Mark.
Comment 2 Alexander Larsson 2009-08-11 19:19:19 UTC
Fixed in glib git by avoiding fsyncs in g_file_set_contents when replacing empty files.
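
For readers who want to picture the idea, here is a rough sketch of such a check in plain POSIX C. It is only an illustration under assumptions, not the actual commit: the helper name dest_needs_fsync is hypothetical, and the real change lives inside GLib's own write path. The reasoning is that an empty destination has no old data that a crash between write and rename could destroy, so the sync protects nothing in that case.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper, not the GLib API: decide whether the freshly
 * written temp file should be fsync'ed before renaming it over
 * `dest_file`.  Returns 1 only when the destination already exists
 * and has non-zero size, i.e. when there is old content to lose. */
static int
dest_needs_fsync (const char *dest_file)
{
  struct stat st;

  if (lstat (dest_file, &st) != 0)
    return 0;               /* no destination: nothing to protect */

  return st.st_size > 0;    /* empty destination: skip the fsync */
}
```

With a check like this, the condition from the excerpt quoted in comment 1 would sync only when dest_needs_fsync (dest_file) is true, instead of whenever the destination exists.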