Bug 791457 - Slow transfer rate when writing to smb/cifs
Status: RESOLVED FIXED
Product: glib
Classification: Platform
Component: gio
Version: 2.54.x
OS: Other Linux
Priority: Normal
Severity: normal
Target Milestone: ---
Assigned To: gtkdev
QA Contact: gtkdev
Depends on:
Blocks:
 
 
Reported: 2017-12-10 21:57 UTC by Andrés Souto
Modified: 2018-05-07 05:15 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
gio: bump splice copy buffer size to 256k by default (1.18 KB, patch)
    2017-12-28 16:43 UTC, Andrés Souto (status: none)
gio: bump splice copy buffer size to 1024k (1.20 KB, patch)
    2018-01-20 22:43 UTC, Andrés Souto (status: none)
gio: bump splice copy buffer size to 1024k (1.63 KB, patch)
    2018-02-06 22:04 UTC, Andrés Souto (status: committed)
gio: bump splice copy buffer size to 1024k (2.11 KB, patch)
    2018-02-16 12:02 UTC, Philip Withnall (status: committed)

Description Andrés Souto 2017-12-10 21:57:05 UTC
I have an SMB share on a gigabit Ethernet network, mounted by the system (not using gvfs) with the following options:

vers=3.0,rsize=1048576,wsize=1048576,rw,iocharset=utf8,dir_mode=0777,file_mode=0666

The problem is that when I copy a file from the local disk to the SMB share using nautilus, the data transfer rate never exceeds 9 MB/s. I have also tried thunar and nemo, and they behave exactly the same way. However, when nautilus reads from the SMB share, the transfer reaches 100 MB/s.

Using gvfs (i.e. accessing smb://<host> directly from nautilus) performs better, but is still slower than expected: ~51 MB/s.

Using cp I got a similar transfer rate: ~50 MB/s.

I have also tried KDE's Dolphin and obtained ~40 MB/s.

Finally, I measured the transfer rate for different block sizes using dd; the results are below:

blocksize | transfer rate
----------+---------------
  1.5 kB  |   3.0 MB/s
    4 kB  |   7.2 MB/s
   16 kB  |  25.3 MB/s
   32 kB  |  38.6 MB/s
   64 kB  |  51.2 MB/s
  128 kB  |  64.5 MB/s
  256 kB  |  78.4 MB/s
  512 kB  |  89.0 MB/s
    1 MB  |  90.7 MB/s
    2 MB  |  88.8 MB/s
    4 MB  |  86.1 MB/s
    8 MB  |  88.1 MB/s
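
The same kind of measurement can be reproduced in C with a small program along these lines (a rough sketch, not part of my actual testing; the target path, the 256 MB test size and the buffer sizes are placeholders):

/* Sketch: write a fixed amount of zeroed data with a given buffer size
 * and time it, similar in spirit to the dd runs above. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#define TOTAL_BYTES (256LL * 1024 * 1024)   /* 256 MB per run */

static double
time_write_run (const char *path, size_t buf_size)
{
  char *buf = calloc (1, buf_size);
  int fd = open (path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
  struct timespec t0, t1;
  long long written = 0;

  if (buf == NULL || fd < 0)
    {
      free (buf);
      if (fd >= 0)
        close (fd);
      return -1.0;
    }

  clock_gettime (CLOCK_MONOTONIC, &t0);
  while (written < TOTAL_BYTES)
    {
      ssize_t n = write (fd, buf, buf_size);
      if (n < 0)
        break;
      written += n;
    }
  fsync (fd);   /* flush so the timing includes the actual device/network */
  clock_gettime (CLOCK_MONOTONIC, &t1);

  close (fd);
  free (buf);
  return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int
main (void)
{
  size_t sizes[] = { 4096, 65536, 262144, 1048576 };

  for (size_t i = 0; i < sizeof sizes / sizeof sizes[0]; i++)
    {
      double secs = time_write_run ("/mnt/smbshare/testfile", sizes[i]);
      if (secs > 0)
        printf ("%8zu bytes: %.1f MB/s\n", sizes[i], TOTAL_BYTES / secs / 1e6);
    }
  return 0;
}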

So I thought the problem could be related to writing blocks that are too small, and I started checking what nautilus does. I found out it uses the g_file_copy function (https://developer.gnome.org/gio/stable/GFile.html#g-file-copy).

However, I didn't understand how this function works, and found nothing in the documentation specifying how many bytes are written at once.
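
For reference, this is roughly how an application calls it (a minimal sketch; the paths and progress callback are illustrative, not taken from nautilus; build with `gcc copy.c $(pkg-config --cflags --libs gio-2.0)`):

#include <gio/gio.h>

static void
on_progress (goffset current, goffset total, gpointer user_data)
{
  g_print ("copied %" G_GOFFSET_FORMAT " of %" G_GOFFSET_FORMAT " bytes\n",
           current, total);
}

int
main (void)
{
  GFile *src = g_file_new_for_path ("/home/user/bigfile.bin");     /* placeholder */
  GFile *dst = g_file_new_for_path ("/mnt/smbshare/bigfile.bin");  /* placeholder */
  GError *error = NULL;

  /* g_file_copy() picks the copy strategy (splice or a plain read/write
   * loop) internally; the caller cannot choose the buffer size. */
  if (!g_file_copy (src, dst, G_FILE_COPY_OVERWRITE, NULL,
                    on_progress, NULL, &error))
    {
      g_printerr ("copy failed: %s\n", error->message);
      g_clear_error (&error);
    }

  g_object_unref (src);
  g_object_unref (dst);
  return 0;
}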

Everything was tested using Ubuntu 17.10
Comment 1 Ondrej Holy 2017-12-11 07:49:24 UTC
file_copy_fallback is what you are looking for:
https://git.gnome.org/browse/glib/tree/gio/gfile.c#n3040

and it has several branches, gvfs uses copy_stream_with_progress:
https://git.gnome.org/browse/glib/tree/gio/gfile.c#n2773

and its buffer has been bumped to 256kB quite recently, see:
https://bugzilla.gnome.org/show_bug.cgi?id=773823

but I suppose that splice_stream_with_progress is used in this case, which uses just a 64kB buffer:
https://git.gnome.org/browse/glib/tree/gio/gfile.c#n2893

so maybe we should also bump this buffer to 256kB, or there may be another bottleneck...

It is hard to reach the maximal transfer rate in all cases, but we should reach a speed comparable to cp.
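
Conceptually, the splice path works along these lines (a simplified sketch of the idea, not the actual gio/gfile.c code; progress reporting, cancellation and most error handling are omitted):

/* Sketch of a splice()-based copy loop: data is moved from the source fd
 * into a pipe, then from the pipe into the destination fd, in chunks
 * bounded by buf_size (64 kB in glib 2.54).  Linux-only. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>
#include <stdbool.h>

static bool
splice_copy (int fd_in, int fd_out, size_t buf_size)
{
  int pipefd[2];

  if (pipe2 (pipefd, O_CLOEXEC) != 0)
    return false;

  for (;;)
    {
      /* Move up to buf_size bytes from the source into the pipe... */
      ssize_t n_read = splice (fd_in, NULL, pipefd[1], NULL,
                               buf_size, SPLICE_F_MORE);
      if (n_read == 0)
        break;                          /* EOF */
      if (n_read < 0)
        goto fail;

      /* ...then drain the pipe into the destination. */
      while (n_read > 0)
        {
          ssize_t n_written = splice (pipefd[0], NULL, fd_out, NULL,
                                      n_read, SPLICE_F_MORE);
          if (n_written <= 0)
            goto fail;
          n_read -= n_written;
        }
    }

  close (pipefd[0]);
  close (pipefd[1]);
  return true;

fail:
  close (pipefd[0]);
  close (pipefd[1]);
  return false;
}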
Comment 2 Andrés Souto 2017-12-24 16:50:14 UTC
I increased the buffer for splice_stream_with_progress to 256kB and noticed no change. Then I added a #undef HAVE_SPLICE at the beginning of gfile.c to force it to use read/write (like cp and dd) and achieved a throughput of ~73MB/s. It seems there is a problem in the cifs filesystem when writing using splice. I still have to investigate this further, but it seems the problem is in the kernel, not here.

btw, I also tried copying a file on my local hard disk (an SSD using ext4) and these are the (strange?) results I got with the different system calls:

splice 64kB ~153MB/s
splice 256kB ~160MB/s
read/write 256kB ~168MB/s

It seems strange, as splice is supposed to be faster than plain read/write...
Comment 3 Andrés Souto 2017-12-28 16:32:00 UTC
I have submitted a patch to the kernel that fixes the first bottleneck: https://patchwork.kernel.org/patch/10134653/

Now the transfer rate I obtain using nautilus with the patch applied is:
64kB buffer: ~56MB/s (current value in glib)
256kB buffer: ~80MB/s
512kB buffer: ~87MB/s
1024kB buffer: ~97MB/s

So bumping this buffer to at least 256kB seems desirable. (I'll attach a patch with this change.)


Just for fun, I ran the same benchmark on my local SSD with the following results:
64kB buffer: ~153MB/s
256kB buffer: ~160MB/s
1024kB buffer: ~195MB/s

This is nearly a 22% improvement from a 256kB buffer to a 1MB buffer in both cases.

So now I wonder: is there any drawback to increasing the buffer to 1MB? And if there is, could it make sense to define the buffer size dynamically based on some parameter?
Comment 4 Andrés Souto 2017-12-28 16:43:39 UTC
Created attachment 366049 [details] [review]
gio: bump splice copy buffer size to 256k by default
Comment 5 Philip Withnall 2018-01-03 11:43:01 UTC
I can’t immediately think of any problems with using a 1MB buffer size. Ondrej, can you?

The original commit which introduced splice() support (bb4f63d6390fe5efd183f259e5bd891f89de9e24) doesn’t mention anything about the choice of buffer size, and neither does the bug about it (bug #604086).
Comment 6 Colin Walters 2018-01-03 13:28:21 UTC
See also https://github.com/coreutils/coreutils/blob/master/src/ioblksize.h
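
For reference, the heuristic in that header is roughly the following (a paraphrased sketch, not the actual coreutils source; the 128 KiB figure is the one comment #8 below refers to):

/* Use a generously sized fixed buffer, but never smaller than the
 * filesystem's preferred I/O block size for the file being copied. */
#include <sys/stat.h>
#include <stddef.h>

enum { IO_BUFSIZE = 128 * 1024 };

static size_t
io_blksize (const struct stat *st)
{
  size_t fs_blocksize = st->st_blksize > 0 ? (size_t) st->st_blksize : 512;
  return fs_blocksize > IO_BUFSIZE ? fs_blocksize : IO_BUFSIZE;
}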
Comment 7 Ondrej Holy 2018-01-03 15:42:07 UTC
(In reply to Philip Withnall from comment #5)
> I can’t immediately think of any problems with using a 1MB buffer size.
> Ondrej, can you?

I don't know...

The only thing I have in mind is that progress may be reported slowly for network mounts in the case of big buffers and poor connectivity...
Comment 8 Philip Withnall 2018-01-04 11:15:31 UTC
OK, well it looks like we should definitely be increasing the buffer size to 128KiB as per comment #6, and keeping an eye on ioblksize.h in future.

Andrés, what file system are you using on your local SSD (comment #3)? I assume it’s ext4. Could you also do some tests with copying to/from FAT USB sticks and to/from DAV network shares with various buffer sizes?

I’m currently tempted to go with a 1MB buffer size, since it provides measurable improvements for CIFS and your local SSD, but before we go with that I want to verify that it doesn’t degrade performance for other common file transfer use cases. (Hopefully it’ll improve them too.)
Comment 9 Philip Withnall 2018-01-04 11:34:12 UTC
A coworker recently ran the ioblksize.h tests on a Yoga 900 and got the following results. tl;dr: 128KiB is best for syscall overhead (/dev/zero → /dev/null), but 256KiB is best for copying files on an SSD (ext4). There’s not much between them though. He decided to go with a 1MiB buffer size for the task he was working on.

Syscall overhead:

blocksize (B) | transfer rate (GB/s)
--------------+---------------------
         1024 |  3.4
         2048 |  5.9
         4096 |  9.1
         8192 | 11.1
        16384 | 13.3
        32768 | 14.2
        65536 | 15.7
       131072 | 16.1 (highest)
       262144 | 15.9
       524288 | 15.5
      1048576 | 15.3

Writing to disk:

blocksize (B) | transfer rate (MB/s)
--------------+---------------------
         1024 |  37.0
         2048 |  40.1
         4096 | 465
         8192 | 447
        16384 | 432
        32768 | 458
        65536 | 458
       131072 | 443
       262144 | 466 (highest)
       524288 | 450
      1048576 | 463
      2097152 | 452
      4194304 | 438
      8388608 | 457
     16777216 | 435
     33554432 | 451
Comment 10 Philip Withnall 2018-01-04 11:36:00 UTC
He also says:
 * 1MiB is an upper bound on the buffer size which could be used (in his case) because it’s the largest pipe buffer size which an unprivileged process can set (by default)
 * coreutils ensures its buffers are page-aligned, but he didn’t find that this made a significant difference in throughput
Comment 11 Andrés Souto 2018-01-07 19:56:24 UTC
Yes, my local SSD is using ext4. Here are some benchmarks:


Copying a 1GB file from local SSD to USB stick:

blocksize (kB)| transfer rate (MB/s)
--------------+----------------------
           64 | 2.54
          128 | 2.47
          256 | 2.57
          512 | 2.41
         1024 | 2.48

Note: I did not empty the cache between executions in this test, but I assumed it wouldn't make a difference. I also did some tests with a smaller file and the results are nearly the same for every block size (2.71 MB/s at 64kB, 2.79 MB/s at 1024kB).


In the following benchmarks, reads and writes are done against a tmpfs; the cache was emptied before each test with "sync; echo 3 > /proc/sys/vm/drop_caches", and I ran most of them 3 times and calculated the mean value and the standard deviation.

I used "time -f %e" to obtain the time needed by "gio copy" to copy a 2GB file.


Read from ext4 SSD

blocksize (kB)| transfer rate (MB/s)
--------------+----------------------
           64 | 389.85 σ=1.13
          128 | 386.91 σ=2.49
          256 | 392.09 σ=1.89
          512 | 402.67 σ=5.42
         1024 | 424.90 σ=0.88


Write to ext4 SSD

blocksize (kB)| transfer rate (MB/s)
--------------+----------------------
           64 | 285.66 σ=3.41
          256 | 287.55 σ=20.92
         1024 | 282.84 σ=9.75


Read from cifs

blocksize (kB)| transfer rate (MB/s)
--------------+----------------------
           64 | 106.23 σ=1.03
          128 | 109.07 σ=1.58
          256 | 106.52 σ=4.50
          512 | 109.01 σ=1.36
         1024 | 108.58 σ=1.09


Write to cifs

blocksize (kB)| transfer rate (MB/s)
--------------+----------------------
           64 | 51.52 σ=0.99
          128 | 65.41 σ=0.57
          256 | 76.64 σ=0.16
          512 | 85.72 σ=1.33
         1024 | 90.21 σ=0.72


Read from nfs

blocksize (kB)| transfer rate (MB/s)
--------------+----------------------
           64 | 108.80 σ=1.61
          128 | 107.96 σ=0.83
          256 | 108.85 σ=0.96
          512 | 107.49 σ=0.36
         1024 | 108.67 σ=0.67


Write to nfs

blocksize (kB)| transfer rate (MB/s)
--------------+----------------------
           64 | 85.12
          128 | 85.25 σ=0.13
          256 | 84.51 σ=0.84
          512 | 85.01 σ=0.61
         1024 | 84.81 σ=0.65

So apparently the buffer size makes no difference except when writing to cifs and when reading from the ext4 SSD. The first makes sense because the cifs module is not smart enough to wait for more data in order to minimize control traffic, but I have no idea about the second.

It also caught my attention that reading/writing from/to the SSD makes a bit of noise with a 64kB, 128kB or 256kB buffer, but it doesn't with a 512kB or 1MB buffer.
Comment 12 Philip Withnall 2018-01-09 12:35:14 UTC
Seems like 1MiB buffers are a safe choice to go with, given they beat other buffer sizes, or are not particularly slower, on all of those benchmarks. Thanks for doing the benchmarks. :-)

Andrés, do you want to update your patch to go with a 1MiB buffer size? Please make sure it includes a comment which points to the analysis here.
Comment 13 Philip Withnall 2018-01-09 12:44:09 UTC
Review of attachment 366049 [details] [review]:

(Marking as needs-work accordingly.)
Comment 14 Andrés Souto 2018-01-20 22:43:01 UTC
Created attachment 367162 [details] [review]
gio: bump splice copy buffer size to 1024k
Comment 15 Philip Withnall 2018-01-22 11:09:00 UTC
Review of attachment 367162 [details] [review]:

Please include a comment in the code pointing to the analysis in this bug.

::: gio/gfile.c
@@ +2909,3 @@
   if (!g_unix_open_pipe (buffer, FD_CLOEXEC, error))
     return FALSE;
+  fcntl(buffer[1], F_SETPIPE_SZ, 1024*1024);

I suspect you need to check the return value of this call. If the kernel pipe size limits are lower than 1MiB, the do_splice() call below will end up trying to splice 1MiB of data into a pipe which is not big enough, which will probably fail. (I haven’t analysed the failure mode though.)

fcntl() returns the actual capacity set for the pipe, so you should pass that to the do_splice() call. If fcntl() fails with EPERM or EBUSY, it’s probably safest to call fcntl(F_GETPIPE_SZ) and use the pipe size from that in the call to do_splice() below.
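
In other words, something along these lines (a sketch of the suggested approach, not the exact patch that was eventually committed):

/* Try to enlarge the pipe to 1 MiB and fall back to whatever capacity
 * the kernel actually gives us. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stddef.h>

#define SPLICE_BUF_SIZE (1024 * 1024)   /* 1 MiB target */
#define DEFAULT_PIPE_SIZE (64 * 1024)   /* conservative fallback */

static size_t
negotiate_pipe_size (int pipe_write_fd)
{
  /* F_SETPIPE_SZ returns the capacity actually set, or -1 on error
   * (e.g. EPERM if the request exceeds /proc/sys/fs/pipe-max-size). */
  int size = fcntl (pipe_write_fd, F_SETPIPE_SZ, SPLICE_BUF_SIZE);

  if (size <= 0)
    size = fcntl (pipe_write_fd, F_GETPIPE_SZ);  /* query current capacity */

  if (size <= 0)
    size = DEFAULT_PIPE_SIZE;   /* last resort if both fcntl() calls fail */

  return (size_t) size;   /* pass this as the chunk size to do_splice() */
}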
Comment 16 Andrés Souto 2018-02-06 22:04:15 UTC
Created attachment 367966 [details] [review]
gio: bump splice copy buffer size to 1024k
Comment 17 Andrés Souto 2018-02-07 10:40:17 UTC
This is an update with the latest findings:

The cifs module patch has already been accepted: https://github.com/torvalds/linux/commit/cd1aca29fa020b6e6edcd3d5b3e49ab877d1bed7

The slow transfer rate is directly related to how small the payload carried by each SMB message is. The theoretical transfer rate is reached when the payload is 1 MB per message.

If the server uses samba 4.6, each message carries 1 MB of payload independently of the buffer size used. Neither the kernel patch nor this patch makes a difference here.

If the server uses samba 4.4 (the case for the server where I discovered the problem), only 4096 bytes are sent per message. The kernel patch fixes this and allows each message to carry as much data as the buffer size. Finally, increasing the buffer to 1 MB (the glib patch) makes it behave as expected in this situation too.

I have no idea why everything works perfectly with samba 4.6 but not with 4.4.
Comment 18 Steve Christen 2018-02-13 05:56:26 UTC
What about samba 4.7?
Comment 19 Andrés Souto 2018-02-15 20:15:00 UTC
(In reply to Steve Christen from comment #18)
> What about samba 4.7?

It behaves like samba 4.6 (i.e. 1 MB per message with a 64kB buffer).
Comment 20 Philip Withnall 2018-02-16 11:50:43 UTC
Review of attachment 367966 [details] [review]:

::: gio/gfile.c
@@ +2915,3 @@
+  if (buffer_size < 0)
+    {
+      buffer_size = fcntl(buffer[1], F_GETPIPE_SZ);

Having another think about it, I worry what will happen if fcntl() returns -1 (error) here. I’m going to rework this patch locally and push it.
Comment 21 Philip Withnall 2018-02-16 12:02:12 UTC
I pushed a modified version of the patch which adds a bit more error handling, in case the fcntl() calls fail or return 0 (for some reason).

The following fix has been pushed:
a5778ef gio: bump splice copy buffer size to 1024k
Comment 22 Philip Withnall 2018-02-16 12:02:19 UTC
Created attachment 368412 [details] [review]
gio: bump splice copy buffer size to 1024k

This change increases throughput when copying files on some filesystems.

(Modified by Philip Withnall <withnall@endlessm.com> to add more error
handling.)