GNOME Bugzilla – Bug 776339
large file throughput via gvfs-smb is over 2x slower than smbclient or fstab mount.cifs
Last modified: 2018-09-21 18:04:40 UTC
Similar issue to this: https://bugzilla.gnome.org/show_bug.cgi?id=688079 but at higher speeds.

Full gigabit network => expect ~100MB/s throughput
Server: FreeBSD 10.3, Samba 4.3, fast ZFS pool capable of > 150MB/s reads/writes
Client: Ubuntu 16.04 LTS, fast RAID-0 disk array capable of > 150MB/s reads/writes
Network: Client => Switch => Server (all 1Gbps rated with Intel NICs etc)
testfile: 619.98 MB mp4 file => incompressible garbage

WORKING CONTROL CASE #1
-----------------------
fstab: //server/share /mnt/share cifs uid=oliver,username=...,password=...,iocharset=utf8,sec=ntlm 0 0

  mount -a
  cp /mnt/share/testfile .

(620MB file in 6.2s => 100MB/s, totally consistent over multiple runs)

WORKING CONTROL CASE #2
-----------------------
  smbclient //server/share pw.. -U ... -c 'get testfile'

(620MB file in 5.6s => 110.7MB/s, totally consistent over multiple runs)

SLOW GVFS CASE:
---------------
  # create gvfs mount using nautilus
  cp /var/run/user/1000/gvfs/smb-share\:server\=server\,share\=share/testfile .

(620MB file in 14.0s => 44.2MB/s, quite consistent over multiple runs, 0.5s variation)

(I also tested write (i.e. upload to the samba server) performance. Results for the working and slow gvfs cases are very similar to read/download.)

THEORY: BLOCKSIZE
-----------------
I understand that the gvfs-smb backend uses libsmbclient and that smbget also uses that lib. So let's try smbget:

  smbget -u ... -p ... 'smb://server/share/testfile'

(620MB file in 13.5s => 46MB/s .. SURPRISINGLY SIMILAR TO GVFS)

From the smbget man page:

  -b, --blocksize
  Number of bytes to download in a block. Defaults to 64000.

Try a bigger blocksize (multiple trials show performance tops out at a 16MB block size):

  smbget --blocksize 16777216 -u ... -p ... 'smb://server/share/testfile'

(620MB file in 6.2s => 100MB/s .. performance now roughly equivalent to the control cases)

There was this commit in 2013, which added the "-o big_writes" option:
https://git.gnome.org/browse/gvfs/commit/?id=8835238

Confirm we are using that:

  $ ps waux | grep fuse
  .. /usr/lib/gvfs/gvfsd-fuse /run/user/1000/gvfs -f -o big_writes

Yes, we are. The above commit hard-codes "conn->max_write" to 64kB if -o big_writes is passed.

I DO NOT KNOW IF conn->max_write is related to "smbget --blocksize" via the underlying libsmbclient. If they are related, then making conn->max_write even bigger (16MB?) or making it end-user tunable via a command line option to gvfsd-fuse might resolve the issue?

Please consider. I am available for testing/compiling/feedback... Thank you.
Some src code forensics on block size:

In smbget (where I had success in improving throughput by increasing blocksize) I could find that:

  # source3/utils/smbget.c:L585
  bytesread = smbc_read(remotehandle, readbuf, opt.blocksize);

where opt.blocksize comes directly from the --blocksize command line option. So blocksize is the 3rd param to smbc_read (which is from libsmbclient).

grep'ing the source of gvfs, I only find 2 calls to smbc_read:

  ./daemon/gvfsbackendsmb.c:1137:
  static gboolean copy_file
  ...
  char buffer[4096];
  ...
  res = smbc_read (backend->smb_context, from_file, buffer, sizeof(buffer));
  ...

and the second call is in:

  ./daemon/gvfsbackendsmb.c:829:
  static void do_read
  ....
  /* libsmbclient limits blocksize to (64*1024)-2 for Windows servers,
   * let's do the same here to achieve reasonable performance (#588391)
   * TODO: port to pull mechanism (#592468) */
  if (bytes_requested > 65534)
    bytes_requested = 65534;

  res = smbc_read (op_backend->smb_context, (SMBCFILE *)handle, buffer, bytes_requested);
  ...

and there seems to be some recent discussion about the code limiting bytes_requested to 65534 here:
https://lists.samba.org/archive/samba/2016-October/204225.html

That discussion makes it sound like some devs are aware that limiting the block size in gvfs do_read is not the right place (it should be done in libsmbclient, which can do so intelligently), and maybe this can be solved, given all that code is trying to deal with legacy upstream bugs...

Is there a concrete bug report dealing with the block size selection when calling libsmbclient->smbc_read? This one? https://bugzilla.gnome.org/show_bug.cgi?id=592468 (very old)

I couldn't find whether conn->max_write relates to this at all (probably not).
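For anyone wanting to reproduce the blocksize effect outside of smbget/gvfs, here is a minimal read loop against the libsmbclient compatibility API (the same smbc_open/smbc_read calls smbget uses). The URL and credentials are placeholders and error handling is trimmed; a sketch for measurement, not production code:

  /* Minimal libsmbclient read loop over the compat API (as smbget uses).
   * Build sketch: gcc smbread.c $(pkg-config --cflags --libs smbclient)
   * URL/credentials are placeholders; error handling is trimmed. */
  #include <sys/types.h>
  #include <fcntl.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <libsmbclient.h>

  static void
  auth_fn (const char *server, const char *share,
           char *workgroup, int wglen,
           char *username, int unlen,
           char *password, int pwlen)
  {
    snprintf (username, unlen, "%s", "oliver");   /* placeholder */
    snprintf (password, pwlen, "%s", "...");      /* placeholder */
  }

  int
  main (void)
  {
    /* The interesting knob: the 3rd argument to smbc_read(). */
    size_t blocksize = 16 * 1024 * 1024;   /* vs smbget's 64000 default */
    char *buf = malloc (blocksize);
    ssize_t n;
    int fd;

    if (smbc_init (auth_fn, 0) < 0)
      return 1;

    fd = smbc_open ("smb://server/share/testfile", O_RDONLY, 0);
    if (fd < 0)
      return 1;

    /* At small block sizes each smbc_read() waits out a network round
     * trip, so fewer, larger reads mean fewer waits on latency. */
    while ((n = smbc_read (fd, buf, blocksize)) > 0)
      ;  /* discard the data; we only care about throughput */

    smbc_close (fd);
    free (buf);
    return 0;
  }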
Thanks for your bug report. It is not really fair to compare the performance of the SMB backend over the fuse mount point. GVfs is designed to be used over the GIO API, so you should use e.g. gvfs-copy instead (fuse may introduce additional slowdown)...

I am not sure, but you can maybe override conn->max_write if you start gvfsd-fuse with -o max_write=... conn->max_write relates to write speed over the fuse mount point, but it would be better to test over the GIO API first to see where the problem is. This option should not affect download speed from the smb share at all..

Good point with the buffer size. I closed the following bugs, because smbget had the same speed (without additional options):
https://bugzilla.gnome.org/show_bug.cgi?id=738535
https://bugzilla.gnome.org/show_bug.cgi?id=762384

Yes, some improvements were proposed/pushed recently:
https://bugzilla.gnome.org/show_bug.cgi?id=773632
https://bugzilla.gnome.org/show_bug.cgi?id=773826
https://bugzilla.gnome.org/show_bug.cgi?id=773823

Can you test with GVfs master, which includes the fix from Bug 773632 removing the read buffer limitation from the backend? Can you also rebuild it with the fix from Bug 773826?
(In reply to Ondrej Holy from comment #2)
> It is not really fair to compare the performance of the SMB backend over the
> fuse mount point. GVfs is designed to be used over the GIO API, so you should
> use e.g. gvfs-copy instead (fuse may introduce additional slowdown)...

Thanks for your response. Fair point about fuse, so I tried directly with gvfs-copy:

  rm -f testfile; time gvfs-copy -p 'smb://server/share/testfile' .; ls -l testfile
  Transferred 650.1 MB out of 650.1 MB (50.0 MB/s)
  real    0m13.108s
  user    0m0.080s
  sys     0m1.448s
  ... 650100878 Jun 22 22:21 testfile

Results vary somewhat, between 12.7 and 14s. So broadly in line with the "via fuse" results, maybe very slightly faster. But still around 47MB/s rather than 100MB/s.

So there is not much point fiddling around with the fuse-related conn->max_write. Thanks for clarifying what that is for.

NOTE: the "-p" option to gvfs-copy reports in "millions of bytes" (i.e. 10^6), NOT in Megabytes (i.e. 2^20) like my previous report. Hence the ls -l to make clear the actual size of the file.

I will look into your other bugs/patches in a bit...
Sorry it took me a while to set up a build environment, because I was running U16.04. Now in a VM on U16.10, having followed this:
https://wiki.gnome.org/Newcomers/BuildGnome

Please let me know if I am making any rookie errors here.

Because I am now on a VM the performance is not quite as reliable, but by giving the VM 4 CPU cores, using a bridged NIC and piping the output of smbget to /dev/null, I was able to get "near-metal" performance using smbget with a large blocksize:

  rm -f testfile; time smbget --blocksize 16000000 -O 'smb://murrays/videos/testfile' > /dev/null
  smb://murrays/videos/testfileMB (98.45%) at 87.19MB/s ETA: 00:00:00
  Downloaded 619.98MB in 7 seconds
  real    0m6.829s
  user    0m0.208s
  sys     0m4.340s

Now for gvfs-copy. First the U16.10 installed version:

  oliver@U16-10gvfs:~/Desktop$ gvfs-copy --version
  gvfs 1.28.2

(actually the same as U16.04??)

  oliver@U16-10gvfs:~/Desktop$ gvfs-copy -p 'smb://murrays/videos/testfile' .
  Transferred 650.1 MB out of 650.1 MB (9.0 MB/s)

That's really bad. 5x slower than on 16.04 real metal and 10x slower than smbget. Not sure if it's the VM or the newer version of gvfs / OS.

Now the freshly compiled git-master version:

  oliver@U16-10gvfs:~/jhbuild/checkout/gvfs$ git status
  On branch master
  Your branch is up-to-date with 'origin/master'.
  nothing to commit, working tree clean

  oliver@U16-10gvfs:~/jhbuild/checkout/gvfs$ jhbuild run gvfs-copy --version
  This tool has been deprecated, use 'gio copy' instead.

  oliver@U16-10gvfs:~/jhbuild/checkout/gvfs$ jhbuild run gio version
  2.51.0

  oliver@U16-10gvfs:~/jhbuild/checkout/gvfs$ jhbuild run gio copy -p 'smb://murrays/videos/testfile' ~/Desktop/
  Transferred 650.1 MB out of 650.1 MB (8.7 MB/s)

Not good... no point going further at the moment.

Any ideas of where I am going wrong here? I seem to have gone backwards by 5x, even though I can get smbget (using the same libsmbclient) to operate as close to wire speed as I would expect given it's a VM...
OK, testing with smbget and the default 64,000 blocksize:

  oliver@U16-10gvfs:~/Desktop$ rm -f testfile; time smbget -U'oliver%hQnmbWL6' -O 'smb://murrays/videos/testfile' > /dev/null
  (99.99%) at 9.12MB/s ETA: 00:00:00533
  Downloaded 619.98MB in 68 seconds
  real    1m7.754s
  user    0m0.724s
  sys     0m1.836s

which is the same BAD performance that the installed gvfs-copy and the git-master gio copy binaries provide. It's just that all 3 libsmbclient binaries perform 5x worse (with 64kB block size) than on real metal and U16.04 LTS (not sure which is the culprit)...

btw the libsmbclient is the same version on 16.10 and 16.04 as well:
/usr/lib/x86_64-linux-gnu/libsmbclient.so.0.2.3

I have noticed that with anything less than the 16MB block size, the "progress indicator" during the transfer literally freezes regularly for fractions of a second... buffer empty...?
Now trying the patch to gvfsreadchannel.c:
https://bug773826.bugzilla-attachments.gnome.org/attachment.cgi?id=341541&action=diff&collapsed=&context=patch&format=raw&headers=1

  oliver@U16-10gvfs:~/jhbuild/checkout/gvfs$ git diff
  diff --git a/daemon/gvfsreadchannel.c b/daemon/gvfsreadchannel.c
  index f219ded..b6b6a93 100644
  --- a/daemon/gvfsreadchannel.c
  +++ b/daemon/gvfsreadchannel.c
  @@ -118,16 +118,18 @@ modify_read_size (GVfsReadChannel *channel,
         real_size = 16*1024;
       else if (channel->read_count <= 4)
         real_size = 32*1024;
  -    else
  +    else if (channel->read_count <= 5)
         real_size = 64*1024;
  +    else
  +      real_size = 128*1024;

     if (requested_size > real_size)
       real_size = requested_size;

     /* Don't do ridicoulously large requests as this
        is just stupid on the network */
  -  if (real_size > 128 * 1024)
  -    real_size = 128 * 1024;
  +  if (real_size > 256 * 1024)
  +    real_size = 256 * 1024;

     return real_size;
   }

  $ jhbuild buildone gvfs
  ...
  *** success *** [1/1]

  oliver@U16-10gvfs:~/jhbuild/checkout/gvfs$ jhbuild run gio copy -p 'smb://murrays/videos/testfile' ~/Desktop/
  Transferred 650.1 MB out of 650.1 MB (8.3 MB/s)

No change...

Now hacking the patch to make a "stupidly large" max requested size of 16MB..

  oliver@U16-10gvfs:~/jhbuild/checkout/gvfs$ git diff
  diff --git a/daemon/gvfsreadchannel.c b/daemon/gvfsreadchannel.c
  index f219ded..424ddfa 100644
  --- a/daemon/gvfsreadchannel.c
  +++ b/daemon/gvfsreadchannel.c
  @@ -118,16 +118,18 @@ modify_read_size (GVfsReadChannel *channel,
         real_size = 16*1024;
       else if (channel->read_count <= 4)
         real_size = 32*1024;
  -    else
  +    else if (channel->read_count <= 5)
         real_size = 64*1024;
  +    else
  +      real_size = 16*1024*1024;

     if (requested_size > real_size)
       real_size = requested_size;

     /* Don't do ridicoulously large requests as this
        is just stupid on the network */
  -  if (real_size > 128 * 1024)
  -    real_size = 128 * 1024;
  +  if (real_size > 16*1024*1024)
  +    real_size = 16*1024*1024;

     return real_size;
   }

  oliver@U16-10gvfs:~/jhbuild/checkout/gvfs$ jhbuild run gio copy -p 'smb://murrays/videos/testfile' ~/Desktop/
  Transferred 650.1 MB out of 650.1 MB (8.2 MB/s)

No change again... somehow I am not confident that

  jhbuild buildone gvfs
  jhbuild run gio copy ....

is actually running a newly compiled binary?
OK, PROGRESS!

Thanks to this:
http://stackoverflow.com/questions/36717024/running-gvfs-after-building
I figured out how to run my customised build of gvfsd:

  $ jhbuild run ~/jhbuild/install/libexec/gvfsd -r

  oliver@U16-10gvfs:~/jhbuild/checkout/gvfs$ time jhbuild run gio copy -p 'smb://murrays/videos/testfile' ~/Desktop/
  get to line: 153 in gio-tool-copy.c
  Transferred 650.1 MB out of 650.1 MB (92.9 MB/s)
  real    0m8.409s
  user    0m0.208s
  sys     0m0.748s

BETTER! Will play more to see how much buffer I really need.
Now that I know a bit more about what I am doing with the jhbuild environment vis-a-vis gvfsd, retest the various cases:

Using straight git master:

  $ time jhbuild run gio copy -p 'smb://murrays/videos/testfile' ~/Desktop/
  Transferred 650.1 MB out of 650.1 MB (8.9 MB/s)
  real    1m13.405s

and with the patch from 773826:

  $ patch --strip 1 < 773826.patch
  $ jhbuild buildone -n gvfs
  $ jhbuild run ~/jhbuild/install/libexec/gvfsd -r
  $ jhbuild run gio mount 'smb://murrays/videos'
  $ time jhbuild run gio copy -p 'smb://murrays/videos/testfile' ~/Desktop/
  Transferred 650.1 MB out of 650.1 MB (9.4 MB/s)
  real    1m10.037s

No real improvement at 128kB buffer size.

Try 1MB max:

  $ time jhbuild run gio copy -p 'smb://murrays/videos/testfile' ~/Desktop/
  Transferred 650.1 MB out of 650.1 MB (31.0 MB/s)
  real    0m22.134s

Try 2MB max:

  $ time jhbuild run gio copy -p 'smb://murrays/videos/testfile' ~/Desktop/
  Transferred 650.1 MB out of 650.1 MB (46.4 MB/s)
  real    0m14.864s

Try 4MB max:

  $ time jhbuild run gio copy -p 'smb://murrays/videos/testfile' ~/Desktop/
  Transferred 650.1 MB out of 650.1 MB (81.3 MB/s)
  real    0m8.678s

Try 8MB max:

  $ time jhbuild run gio copy -p 'smb://murrays/videos/testfile' ~/Desktop/
  Transferred 650.1 MB out of 650.1 MB (92.9 MB/s)
  real    0m8.683s

Try 16MB max:

  $ time jhbuild run gio copy -p 'smb://murrays/videos/testfile' ~/Desktop/
  Transferred 650.1 MB out of 650.1 MB (81.3 MB/s)
  real    0m8.426s

(NOTE: the MB/s shown is from the progress indicator and is not accurate; look at the elapsed wall clock time "real...")

For my situation, after 4MB buffer size the improvement is marginal / non-existent. 8.6s elapsed time is equivalent to 72MB/s. Note that the initial ramp-up logic in modify_read_size() is quite visible by eye, and that this is a VM. For a bigger file, where the ramp-up is irrelevant, and on "real metal", I suspect that a 4-8MB buffer size would give me "near-wire-speed".

My patch is below. Same as #773826 with bigger numbers and some comments. As per those comments, what is the best way to expose this max-buffer size as a tunable to the end user, given that ~8MB is perhaps not reasonable for all? (One possible shape for such a tunable is sketched after the patch.)

Modified patch:

  diff --git a/daemon/gvfsreadchannel.c b/daemon/gvfsreadchannel.c
  index f219ded..d219905 100644
  --- a/daemon/gvfsreadchannel.c
  +++ b/daemon/gvfsreadchannel.c
  @@ -118,16 +118,24 @@ modify_read_size (GVfsReadChannel *channel,
         real_size = 16*1024;
       else if (channel->read_count <= 4)
         real_size = 32*1024;
  -    else
  +    else if (channel->read_count <= 5)
         real_size = 64*1024;
  +    else
  +      /* see below regarding tunable huge buffer sizes */
  +      real_size = 8 * 1024 * 1024;

     if (requested_size > real_size)
       real_size = requested_size;

     /* Don't do ridicoulously large requests as this
  -     is just stupid on the network */
  -  if (real_size > 128 * 1024)
  -    real_size = 128 * 1024;
  +     is just stupid on the network.
  +     8MB is much larger than the previous 256kB in patch #773826.
  +     Is this unreasonably large as a default? Probably yes.
  +     Would it be possible to create a tunable setting so users with
  +     large/fast machines/networks can optimize this? */
  +  if (real_size > 8 * 1024 * 1024)
  +    real_size = 8 * 1024 * 1024;

     return real_size;
   }
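One low-effort shape for that tunable, sketched against modify_read_size() in gvfsreadchannel.c: read the cap once from an environment variable. GVFS_MAX_READ_SIZE is a made-up name for illustration, not an existing gvfs option, and the glib functions are in scope because this fragment would live in gvfsreadchannel.c:

  /* Sketch only: make the read-size cap tunable.
   * GVFS_MAX_READ_SIZE is a hypothetical knob, not an existing one. */
  static gsize
  get_max_read_size (void)
  {
    static gsize max_read_size = 0;

    if (max_read_size == 0)
      {
        const char *env = g_getenv ("GVFS_MAX_READ_SIZE");

        max_read_size = env ? g_ascii_strtoull (env, NULL, 10) : 128 * 1024;
        if (max_read_size < 64 * 1024)
          max_read_size = 64 * 1024;   /* never go below the old cap */
      }

    return max_read_size;
  }

  /* ...and in modify_read_size():
   *   if (real_size > get_max_read_size ())
   *     real_size = get_max_read_size ();
   */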
Just for the record, the ramp-up (smallish file size) does not appear to be the problem in achieving wire speed. More of an issue was the VM-modelled HDD, which is clearly not as fast as the real metal. By writing to a tmpfs location, I can get better performance with the 16MB buffer:

  $ time jhbuild run gio copy -p 'smb://murrays/videos/testfile' /dev/shm
  Transferred 650.1 MB out of 650.1 MB (108.4 MB/s)
  real    0m6.552s

6.552s is equivalent to 94MB/s, which is close enough to wire speed for me.
On a broader note, it bugs me that such huge buffers are required to make gvfs-smb work well. I believe the reason is the underlying libsmbclient, as the same symptoms hold for smbget.

Some small experimentation suggests it might be protocol support:
https://www.samba.org/samba/docs/man/manpages/libsmbclient.7.html

  [global]
  client min protocol = SMB2

results in:

  $ gio mount 'smb://murrays/videos'
  Error mounting location: Failed to mount Windows share: Connection timed out

i.e. it seems that libsmbclient does not support SMB2... might explain some of it... that's for upstream..
Sorry, that last theory is incorrect. If you put:

  [global]
  client max protocol = SMB3
  client min protocol = SMB3

it works.. i.e. all the client tools support SMB2/3 since samba 4.1:
https://www.samba.org/samba/history/samba-4.1.0.html

One issue, certainly in my VM, is that "gio copy" AND smbclient max out one CPU core. On bare metal they consume about 35% CPU. That's VMs for you.

Anyway, over to you. Hopefully a tunable SMB_MAX_REQUEST_BYTES or similar can be added so that "gio copy" can perform at wire speed on largish hardware/networks.
Sorry to be a pain: all the above is great and actionable, I think, for READ, but for WRITE it solves nothing (obviously):

  $ time jhbuild run gio copy -p /dev/shm/testfile 'smb://murrays/videos/testfile.bak'
  Transferred 650.1 MB out of 650.1 MB (27.1 MB/s)
  real    0m24.697s

I didn't manage to trace the write calls very well. I got as far as:

  gvfsbackendsmb.c:L1161
  static void
  do_write (GVfsBackend *backend,
            GVfsJobWrite *job,
            GVfsBackendHandle _handle,
            char *buffer,
            gsize buffer_size)
  {
    GVfsBackendSmb *op_backend = G_VFS_BACKEND_SMB (backend);
    SmbWriteHandle *handle = _handle;
    ssize_t res;
    smbc_write_fn smbc_write;

    smbc_write = smbc_getFunctionWrite (op_backend->smb_context);
    res = smbc_write (op_backend->smb_context, handle->file,
                      buffer, buffer_size);
    if (res == -1)

called from...

  gvfsjobwrite.c:L127
  static void
  run (GVfsJob *job)
  {
    GVfsJobWrite *op_job = G_VFS_JOB_WRITE (job);
    GVfsBackendClass *class = G_VFS_BACKEND_GET_CLASS (op_job->backend);

    if (class->write == NULL)
      {
        g_vfs_job_failed (job, G_IO_ERROR, G_IO_ERROR_NOT_SUPPORTED,
                          _("Operation not supported by backend"));
        return;
      }

    class->write (op_job->backend,
                  op_job,
                  op_job->handle,
                  op_job->data,
                  op_job->data_size);
  }

but I am not sure where the job params are created/set (incl. data and data_size)?
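For reference, the write direction has the same shape at the libsmbclient level: whatever size the caller hands to smbc_write is the chunk size on the wire. A minimal upload loop over the compat API might look like the following (a sketch, not gvfs' actual code path; paths, credentials and the auth_fn from the read sketch are placeholders, error handling trimmed):

  #include <fcntl.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <libsmbclient.h>

  /* auth_fn as in the read sketch above (placeholder credentials) */
  extern void auth_fn (const char *, const char *, char *, int,
                       char *, int, char *, int);

  static int
  upload (const char *local_path, const char *smb_url)
  {
    size_t blocksize = 8 * 1024 * 1024;   /* big chunks, same idea as reads */
    char *buf = malloc (blocksize);
    FILE *in = fopen (local_path, "rb");
    size_t n;
    int fd;

    if (in == NULL || smbc_init (auth_fn, 0) < 0)
      return -1;

    fd = smbc_open (smb_url, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
      return -1;

    /* gvfs' do_write() only ever sees the buffer sizes the client side
     * hands it; driving the loop ourselves lets us pick the chunk size. */
    while ((n = fread (buf, 1, blocksize, in)) > 0)
      if (smbc_write (fd, buf, n) < 0)
        break;

    smbc_close (fd);
    fclose (in);
    free (buf);
    return 0;
  }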
Thanks for the debugging...

(In reply to Oliver Schonrock from comment #5)
> ...
> I have noticed that with anything less than the 16MB block size, the
> "progress indicator" during the transfer literally freezes regularly for
> fractions of a second... buffer empty...?

That's because the progress is reported after each block, so a bigger block means a bigger "freeze"...

(In reply to Oliver Schonrock from comment #8)
> ...
> My patch is below. Same as #773826 with bigger numbers and some comments. As
> per those comments, what is the best way to expose this max-buffer size as a
> tunable to the end user, given that ~8MB is perhaps not reasonable for all?

Yes, this is definitely unreasonably large; this wouldn't work generally. But there is another way we can go... will write below.

(In reply to Oliver Schonrock from comment #11)
> sorry that last theory is incorrect. if you put:
>
> [global]
> client max protocol = SMB3
> client min protocol = SMB3
>
> it works.. i.e. all the client tools support SMB2/3 since samba 4.1:
> https://www.samba.org/samba/history/samba-4.1.0.html

Ok, it works, but does it have any effect (on the performance)?

(In reply to Oliver Schonrock from comment #12)
> sorry to be a pain: all the above is great and actionable, I think, for READ,
> but for WRITE it solves nothing (obviously):
>
> ...
>
> but I am not sure where the job params are created/set (incl. data and data_size)?

I suppose it comes from the following (see Bug 773823):
https://git.gnome.org/browse/glib/tree/gio/gfile.c#n2779

But the same problem is also here; the buffer can't be increased up to several MB... but again we can use another trick.

I think what we need here is an implementation of the pull/push methods (see Bug 592468). Pull/push methods have source and destination paths as arguments, so e.g. buffer sizes can be determined by the backend, not by the channel or the client side. Those methods are used for transfers between local and remote filesystems only (so they won't speed up transfers between two remote filesystems...). Some other backends already use those methods in order to speed up transfers.

I am adding this to my todo, however, patches are welcome if you are willing to help...
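For orientation, the pull entry in the GVfsBackendClass vtable is path-based, which is why the backend itself gets to choose buffer sizes; quoted approximately from daemon/gvfsbackend.h (check the tree for the exact prototype):

  /* Approximate shape of the pull vtable entry in GVfsBackendClass
   * (quoted from memory of daemon/gvfsbackend.h; check the tree for
   * the exact prototype): */
  void (*pull) (GVfsBackend           *backend,
                GVfsJobPull           *job,
                const char            *source,
                const char            *local_path,
                GFileCopyFlags         flags,
                gboolean               remove_source,
                GFileProgressCallback  progress_callback,
                gpointer               progress_callback_data);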
A basic push/pull implementation is really simple, but we still have to find some way to determine the right buffer size. I'm afraid we can't hardcode an 8MB buffer, for example. Have to look at the cifs source code...

Jeremy, don't you have any comment regarding this?
(In reply to Ondrej Holy from comment #14)
> A basic push/pull implementation is really simple,

sounds great, especially if that addresses both directions

> but we still have to find some way to determine the right buffer size.
> I'm afraid we can't hardcode an 8MB buffer, for example. Have to look at
> the cifs source code...

The only options I see are:

1. hardcode huge sizes (not acceptable)
2. provide tunables for min/max read/write buffer sizes via dconf or similar
3. write some kind of adaptive algorithm which detects that the "buffer empty" condition is preventing max throughput and increases the size on demand
4. try to figure out what the mount.cifs kernel module is doing (e.g. pre-fetching 2 64kB buffers, see above)
5. call mount.cifs and let it worry about it

The last one is a bit left-field, but it is what Steve French suggested when I emailed him about this bug:

On 24/12/16 21:07, Steve French wrote:
> I am curious why we don't simply have a pass through for the gnome/kde
> libs to call the kernel - or have the libsmbclient libs call kernel
> for the basic stuff that is better in kernel (open/read/write/close)
> and leave user space to do the other things that work better in
> userspace (listing shares, rpc management calls, change notify,
> witness protocol failover notification etc.)

Steve is the core maintainer of mount.cifs and a member of the samba project.

For me, Option:
- 2 is simplest and will get it done.
- 4 seems interesting, if the libsmbclient API actually offers suitable calls to even do that.
- 5 would be great, less reinventing the wheel, but needs serious collaboration.

Which do you fancy?
Steve likes cifsfs because he wrote it :-). However, using it would introduce a dependency in GNOME on the Linux kernel to supply SMB client support - it would break on FreeBSD, for example (their kernel support is different). Plus there are embedded Linux versions that don't compile in the cifsfs kernel support. It's not a viable cross-platform solution, I'm afraid.
Ok, I suspected something like that, so that leaves us with:

2. Tunables: the simple option and my favourite to get some solution out that makes a difference to users' lives, since this has been going on for years...

3. Adaptive algorithm...: this is vapourware?

4. Learn from mount.cifs and adapt the way gvfs calls libsmbclient (buffer pre-fetch etc), or, if the libsmbclient API is not sufficient for that, get (Steve?) to enhance the libsmbclient API so we can make smarter calls.

The obvious way to go seems to be:

Short term: Tunables
Medium/long term: make libsmbclient and gvfs "smarter"

Agree?
Created attachment 343147 [details] [review]
smb: Introduce pull method (prototype)

Currently, the pull operation provided by this patch blocks the backend during a transfer, which is not acceptable for real use. It should run on a separate thread instead (we need something similar to Bug 771022). But it can be used for testing...

The default buffer size is 1MB. The GVFS_SMB_BUFFER_SIZE environment variable can be used to tweak the buffer size, e.g.:

  GVFS_SMB_BUFFER_SIZE=2097152   # 2MB
  GVFS_SMB_BUFFER_SIZE=4194304   # 4MB
  GVFS_SMB_BUFFER_SIZE=8388608   # 8MB
  GVFS_SMB_BUFFER_SIZE=16777216  # 16MB
So, I've attached a patch which implements the pull method (i.e. downloading). However, I don't see any significant speedup: gvfs with/without pull and mount -t cifs are more-or-less equivalent as per my testing on localhost. Can you test it in your environment?
Thanks Ondrej. Results of testing below. Short answer.. it works!

# using VM ubuntu 16.10 as before, with git master gvfs as base

# control case: git master plus 773826.patch with 16MB buffer size, i.e. our best case from previous testing in comment 8
  rm -f /dev/shm/testfile; time jhbuild run gio copy -p 'smb://murrays/videos/testfile' /dev/shm;
  Transferred 650.1 MB out of 650.1 MB (92.9 MB/s)
  real    0m7.464s

# with clean git master, no patches.. slow as expected because channel buffer size @64k, not patched with 773826.patch
  rm -f /dev/shm/testfile; time jhbuild run gio copy -p 'smb://murrays/videos/testfile' /dev/shm;
  Transferred 650.1 MB out of 650.1 MB (8.6 MB/s)
  real    1m16.440s

# with git master plus ondrey-776339-18.patch. no ENV, i.e. default SMB_BUFFER_SIZE = 1MB, already faster
  rm -f /dev/shm/testfile; time jhbuild run gio copy -p 'smb://murrays/videos/testfile' /dev/shm;
  Transferred 650.1 MB out of 650.1 MB (25.0 MB/s)
  real    0m26.443s

# try to increase buffer size to 16MB
  export GVFS_SMB_BUFFER_SIZE=16777216
  jhbuild run ~/jhbuild/install/libexec/gvfsd -r
  rm -f /dev/shm/testfile; time jhbuild run gio copy -p 'smb://murrays/videos/testfile' /dev/shm;
  Transferred 650.1 MB out of 650.1 MB (92.9 MB/s)
  real    0m7.291s

# YUP! that works. Speed as good as the previous 773826.patch. 85MB/s is very close to wire speed given this is a VM.
Regarding the fact that you did not see improvement over mount.cifs or over unpatched git master, 2 theories:

1. something in your environment is not fast enough and therefore that becomes the bottleneck, rather than the libsmbclient->smbc_read buffer size

2. because you are testing localhost, your packet round-trip time is near zero, so asking for many small buffers is (nearly) as fast as asking for a few big ones.

Interestingly, my ping time to the server is 0.7ms, and at 100MByte/s a 64kB buffer would theoretically take 0.625ms to transfer, so we can see how round-trip time can quickly become significant. Over a typical wireless network, ping times are often 3 to 10ms, i.e. 4 to 15 times slower than my wired Gbit network...

Is this all about latency? Hence the mount.cifs 2-buffer pre-fetch? (A back-of-envelope model of this is sketched below.)

Given your success with pull, should we try push next?
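That arithmetic generalizes into a simple stop-and-wait model: with one request in flight, each block costs one round trip plus its serialization time on the wire. A back-of-envelope sketch (my own model, ignoring server and disk time; the 110MB/s wire rate is an assumption):

  /* Stop-and-wait model: one request in flight; each block costs one
   * RTT plus its serialization time on the wire. */
  #include <stdio.h>

  int
  main (void)
  {
    double rtt  = 0.0007;   /* 0.7 ms ping, as measured above */
    double wire = 110e6;    /* ~110 MB/s usable gigabit payload (assumed) */
    double blocks[] = { 64e3, 1e6, 4e6, 16e6 };

    for (int i = 0; i < 4; i++)
      {
        double t = rtt + blocks[i] / wire;   /* seconds per block */
        printf ("block %8.0f B -> %6.1f MB/s\n",
                blocks[i], blocks[i] / t / 1e6);
      }
    return 0;
  }
  /* Prints roughly: 64kB -> ~50 MB/s, 1MB -> ~100 MB/s; i.e. the small
   * default block alone can halve throughput at sub-millisecond RTTs,
   * which matches the ~44-46 MB/s measured earlier in this bug. */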
Realised my git wasn't "pulled" and saw that 773826 has actually been pushed. So I did a git pull and now gvfs doesn't compile for me:

  /home/oliver/jhbuild/checkout/gvfs/common/gvfsdnssdresolver.c: In function ‘g_vfs_dns_sd_resolver_resolve’:
  /home/oliver/jhbuild/checkout/gvfs/common/gvfsdnssdresolver.c:1166:7: error: label ‘out’ used but not defined
      goto out;

am I being dumb?
Sorry for the inconvenience, see commit 1805ea6.
Thanks. This is all looking quite promising. What's the plan now?

1. Is it realistic to use a similar approach for do_push (other than the limits on your time), to prove that it provides the same speedup?

2. Solve the "blocking the backend" issue for both pull and push? (I don't quite understand this issue.) From the linked discussion, this might not be that easy? Dependent on the old one-smb-context-per-thread prototype code from Lagerwall, which was segfaulting?

3. Find a way to use the ENV vars. It is not clear to me where these would be set. Is that distro dependent? I couldn't find how gvfsd is actually started in a real system. dbus-1?
(In reply to Oliver Schonrock from comment #24)
> Thanks. This is all looking quite promising. What's the plan now?
>
> 1. Is it realistic to use a similar approach for do_push (other than the
> limits on your time), to prove that it provides the same speedup?

I suppose so...

> 2. Solve the "blocking the backend" issue for both pull and push? (I don't
> quite understand this issue.) From the linked discussion, this might not be
> that easy? Dependent on the old one-smb-context-per-thread prototype code
> from Lagerwall, which was segfaulting?

I have to discuss this with Ross, but I did not see any segfaults with the patches I proposed on that bug, so it should be ok...

> 3. Find a way to use the ENV vars. It is not clear to me where these would
> be set. Is that distro dependent? I couldn't find how gvfsd is actually
> started in a real system. dbus-1?

Hmm, to be honest, I don't want to push the env-var-based solution. I want to propose something more clever...

I am finally able to reproduce the bug (by simulating network latency). I used wireshark and realized that with small buffers you have to wait for a response after each request, whereas with bigger buffers multiple requests are sent together without waiting for responses. So it behaves like a somewhat limited sliding-window algorithm with big buffers.

The best solution would be to implement functions to send/receive a whole file in libsmbclient (with some progress callback); maybe it would be good to file this in the samba tracker at least...

I am experimenting right now with an algorithm which determines buffer sizes based on the time of transmission... will attach later for testing if it works.
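For illustration, such a whole-file entry point in libsmbclient might look like the following. These are hypothetical names and signatures; nothing like this exists in libsmbclient today:

  #include <libsmbclient.h>   /* for SMBCCTX */

  /* Hypothetical whole-file API (sketch only; these functions do NOT
   * exist in libsmbclient). The library would own buffer sizing and
   * request pipelining, reporting progress via the callback. */
  typedef void (*smbc_progress_fn) (unsigned long long transferred,
                                    unsigned long long total,
                                    void *user_data);

  int smbc_fetch_file (SMBCCTX *ctx,
                       const char *smb_url,   /* remote source */
                       int local_fd,          /* local destination fd */
                       smbc_progress_fn progress,
                       void *user_data);

  int smbc_store_file (SMBCCTX *ctx,
                       int local_fd,          /* local source fd */
                       const char *smb_url,   /* remote destination */
                       smbc_progress_fn progress,
                       void *user_data);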
(In reply to Ondrej Holy from comment #25)
> I am finally able to reproduce the bug (by simulating network latency). I
> used wireshark and realized that with small buffers you have to wait for a
> response after each request, whereas with bigger buffers multiple requests
> are sent together without waiting for responses. So it behaves like a
> somewhat limited sliding-window algorithm with big buffers.

Yes, that tallies with what this guy over at kde kioslaves found, also by reverse engineering it with wireshark:
https://bugs.kde.org/show_bug.cgi?id=291835#c26

He found out the simple (but effective!) strategy that mount.cifs uses:

"cifs mounted: requests 2x 61440 bytes, waits for ONE answer, requests next 61kb."

> The best solution would be to implement functions to send/receive a whole
> file in libsmbclient (with some progress callback); maybe it would be good
> to file this in the samba tracker at least...

Yeah, it would be great to get the mount.cifs algorithms for this into libsmbclient. Just pass libsmbclient a stream or file pointer?

> I am experimenting right now with an algorithm which determines buffer sizes
> based on the time of transmission... will attach later for testing if it
> works.

That's the "vapourware" I described above here:
https://bugzilla.gnome.org/show_bug.cgi?id=776339#c17
Good luck with that..
Created attachment 343716 [details] [review]
smb: Add initial pull support

Copying is currently slow because it uses small buffer sizes. The small buffer sizes are ineffective because the client waits for a response after each packet (an additional reason might be the fact that samba uses non-standard buffer sizes). With big buffer sizes, multiple packets are sent at once without waiting for responses. Let's implement pull support with a bigger buffer in order to speed up copying.
Created attachment 343717 [details] [review]
smb: Add heuristic to determine buffer size

Commit fb79ac2 introduced pull with a bigger buffer, but the buffer is still too small to see a significant speedup. This patch adds a heuristic to determine the optimal buffer size. The premise is that a bigger buffer size means better transmission speed. However, big buffer sizes can't be used headlong, because progress has to be reported from the operation regularly. The heuristic tries to find the optimal buffer size while still being able to report progress regularly.
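The idea behind such a heuristic can be pictured as a feedback loop on per-chunk wall time. The sketch below is my own illustration of the approach, not the attached patch's code; the names and the target interval are made up:

  #include <glib.h>

  /* Sketch of a time-based buffer-size heuristic (illustration only,
   * not the attached patch): grow the buffer while a chunk transfers
   * faster than the progress-reporting interval, shrink it otherwise. */
  #define BUFFER_MIN   (64 * 1024)
  #define BUFFER_MAX   (16 * 1024 * 1024)
  #define TARGET_USEC  (G_USEC_PER_SEC / 2)   /* report roughly twice a second */

  static gsize
  next_buffer_size (gsize current, gint64 chunk_usec)
  {
    if (chunk_usec < TARGET_USEC / 2 && current < BUFFER_MAX)
      current *= 2;     /* chunk came back quickly: we can afford bigger */
    else if (chunk_usec > TARGET_USEC && current > BUFFER_MIN)
      current /= 2;     /* chunk took too long: progress would stall */

    return current;
  }

  /* Usage in a pull loop, bracketing each read with a monotonic clock:
   *   start = g_get_monotonic_time ();
   *   n = smbc_read (...);
   *   size = next_buffer_size (size, g_get_monotonic_time () - start);
   */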
Patches from Bug 771022 have to be applied before the patches from this bug.

I've revamped the patches a bit, but unfortunately I've realized that libsmbclient segfaults also with separate contexts (as was mentioned by Ross on the mailing lists) :-(
(In reply to Ondrej Holy from comment #28)
> Created attachment 343717 [details] [review]
> smb: Add heuristic to determine buffer size
>
> Commit fb79ac2 introduced pull with a bigger buffer, but the buffer is

You should include a placeholder commit hash, because I couldn't find anything related here.
(In reply to Bastien Nocera from comment #30)
> (In reply to Ondrej Holy from comment #28)
> > Created attachment 343717 [details] [review]
> > smb: Add heuristic to determine buffer size
> >
> > Commit fb79ac2 introduced pull with a bigger buffer, but the buffer is
>
> You should include a placeholder commit hash, because I couldn't find
> anything related here.

Yes, the commit message is wrong, sorry for that; it means attachment 343716 [details] [review]...
Just a note that Bug 771022 contains more info about the segfaults...
(In reply to Ondrej Holy from comment #29)
> Patches from Bug 771022 have to be applied before the patches from this bug.
>
> I've revamped the patches a bit, but unfortunately I've realized that
> libsmbclient segfaults also with separate contexts (as was mentioned by Ross
> on the mailing lists) :-(

So that gets this quite stuck for now? Is it that libsmbclient is just not thread-safe?
I am afraid this is a stopper for us currently. AFAIK libsmbclient should be thread-safe with separate contexts, but see:
https://bugzilla.samba.org/show_bug.cgi?id=11413

I have to test the attached patch to see whether it fixes the problems...
Yes, currently libsmbclient is not thread-safe, even with separate contexts. This is on our list of things to fix, but we're resource constrained on this.
-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance:
https://gitlab.gnome.org/GNOME/gvfs/issues/292