After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 604086 - Use splice(2) when doing local file copies
Use splice(2) when doing local file copies
Status: RESOLVED FIXED
Product: glib
Classification: Platform
Component: gio
unspecified
Other All
: Normal enhancement
: ---
Assigned To: Christian Kellner
gtkdev
Depends on:
Blocks:
 
 
Reported: 2009-12-08 15:40 UTC by Christian Kellner
Modified: 2010-02-15 12:27 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
GFileDescriptorBased interface (6.36 KB, patch)
2010-02-07 16:32 UTC, Christian Kellner
none Details | Review
Implement interface in GLocalFileInputStream (2.24 KB, patch)
2010-02-07 16:33 UTC, Christian Kellner
none Details | Review
Implement interface in GLocalFileOutputFile (2.60 KB, patch)
2010-02-07 16:34 UTC, Christian Kellner
none Details | Review
Add support for splice(2) (5.63 KB, patch)
2010-02-07 16:37 UTC, Christian Kellner
none Details | Review
GFileDescriptorBased interface (6.38 KB, patch)
2010-02-10 10:17 UTC, Christian Kellner
none Details | Review
Add support for splice(2) (5.94 KB, patch)
2010-02-10 10:17 UTC, Christian Kellner
none Details | Review
GFileDescriptorBased interface (6.42 KB, patch)
2010-02-10 10:46 UTC, Christian Kellner
none Details | Review
GFileDescriptorBased interface (6.38 KB, patch)
2010-02-12 12:40 UTC, Christian Kellner
none Details | Review
Implement interface in GLocalFileInputStream (2.31 KB, patch)
2010-02-12 12:41 UTC, Christian Kellner
none Details | Review
Implement interface in GLocalFileOutputFile (4.70 KB, patch)
2010-02-12 12:41 UTC, Christian Kellner
none Details | Review
Add support for splice(2) (7.46 KB, patch)
2010-02-12 12:44 UTC, Christian Kellner
none Details | Review
Add support for splice(2) (7.59 KB, patch)
2010-02-15 11:10 UTC, Christian Kellner
none Details | Review

Description Christian Kellner 2009-12-08 15:40:41 UTC
When using splice() for local file copies we can transfer data between file descriptors without copying the data from kernel- to user-space.
Comment 1 Christian Kellner 2009-12-08 15:42:11 UTC
This will happen in g_local_file_copy (), and I already started on this task.
Comment 2 Dan Winship 2009-12-08 16:15:05 UTC
I was thinking about this a little when working on gsocket and thinking about splice/sendfile and related APIs on other platforms. One idea I had was that we could add a GFileDescriptorBased interface, which would be implemented by G{Input,Output}StreamUnix, GLocalFile{Input,Output}Stream, GSocket{Input,Output}Stream, and anything else that's backed by a file descriptor, and then g_output_stream_splice() could check to see if the two streams you passed were splice(2)-compatible (or sendfile(2)-compatible, etc) and do that if appropriate.

And on the subject of "splice(2)-compatible", the splice man page says "one of the descriptors must refer to a pipe" which would seem to make it inappropriate for g_local_file_copy()? (Unless the man page is just wrong.)
Comment 3 Alexander Larsson 2009-12-10 09:46:40 UTC
Having an interface for that seems nice.

For the "must refer to a pipe" thing, i think you are misunderstanding how splice() works. You're not meant to splice directly from a to b and have the kernel do this all by itself. Instead it uses the traditional copy loop:

while (read(a, buffer) != eof) {
  while (more in buffer) 
   write (b, buffer)
}

However, instead of allocating buffer in userspace we allocate the buffer in kernel space by opening a pipe and using the pipe buffer as storage, then we read turns to splice into pipe and write to splice from pipe.
Comment 4 Christian Kellner 2009-12-11 13:04:18 UTC
After a short discussion with alex, we both agreed that having an interface for fd based objects is indeed the way to go, and I am on that. I'll post patches.
Comment 5 Christian Kellner 2010-02-07 16:32:34 UTC
Created attachment 153204 [details] [review]
GFileDescriptorBased interface
Comment 6 Christian Kellner 2010-02-07 16:33:39 UTC
Created attachment 153205 [details] [review]
Implement interface in GLocalFileInputStream
Comment 7 Christian Kellner 2010-02-07 16:34:07 UTC
Created attachment 153206 [details] [review]
Implement interface in GLocalFileOutputFile
Comment 8 Christian Kellner 2010-02-07 16:37:37 UTC
Created attachment 153207 [details] [review]
Add support for splice(2)
Comment 9 Christian Dywan 2010-02-10 09:35:36 UTC
(In reply to comment #8)
> Created an attachment (id=153207) [details] [review]
> Add support for splice(2)
>+  total_size = -1;
>+  /* avoid performance impact of querying total size when it's not needed */
>+  if (progress_callback)
>+    {
>+      struct stat sbuf;
>+
>+      if (fstat (fd_in, &sbuf) == 0)
>+        total_size = sbuf.st_size;
>+    }
>+
>+  if (total_size == -1)
>+    total_size = 0;

Small note: The "== -1" belongs inside the if().
Comment 10 Christian Dywan 2010-02-10 09:38:32 UTC
(In reply to comment #5)
> Created an attachment (id=153204) [details] [review]
> GFileDescriptorBased interface

+ * g_file_descriptor_based_get_handle:
+ * @fd_based: a #GFileDescriptorBased.
+ *
+ * Gets the underlying file descriptor.
+ *
+ * Returns: The file descriptor
+ **/
+int
+g_file_descriptor_based_get_handle (GFileDescriptorBased *fd_based)

There's a "Since: 2.24" missing.
Comment 11 Christian Kellner 2010-02-10 10:17:24 UTC
Created attachment 153397 [details] [review]
GFileDescriptorBased interface
Comment 12 Christian Kellner 2010-02-10 10:17:56 UTC
Created attachment 153398 [details] [review]
Add support for splice(2)
Comment 13 Christian Kellner 2010-02-10 10:46:58 UTC
Created attachment 153403 [details] [review]
GFileDescriptorBased interface

Add "Since: 2.24" to docs.
Comment 14 Dan Winship 2010-02-10 14:14:48 UTC
I think gfiledescriptorbased.[ch] should be in unix_sources/giounixinclude_HEADERS, to make it clear that it's only for real unix file descriptors, so we don't have to worry about exactly which kind of "file descriptor" it uses on Windows.

(and I'd call the getter "g_file_descriptor_based_get_fd", not "get_handle", because "handle" sounds like you're trying to be abstract, and we're not).
Comment 15 Alexander Larsson 2010-02-10 15:30:35 UTC
agree with dan

Also, splice_stream_with_progress() should be able to fail at runtime (if
splice is not supported on e.g. the kernel or a specific filesystem, or if
pipe() fails) and then fall back to copy_stream_with_progress.

Otherwise it looks good to me.
Comment 16 Christian Kellner 2010-02-11 11:51:18 UTC
Just for the record some very rough benchmarks using "time gvfs-copy":

Non-Splice:
0.02s user 2.98s system 81% cpu 3.678 total
0.02s user 2.78s system 22% cpu 12.430 total
0.03s user 2.66s system 74% cpu 3.605 total
0.02s user 2.50s system 33% cpu 7.434 total

Splice
0.00s user 0.94s system 73% cpu 1.296 total
0.00s user 1.55s system 85% cpu 1.814 total
0.00s user 0.99s system 73% cpu 1.347 total
0.01s user 0.91s system 12% cpu 7.238 total

Splice (copy & sync)
0.00s user 1.44s system 85% cpu 1.683 total
0.00s user 1.01s system 23% cpu 4.311 total
0.00s user 1.20s system 81% cpu 1.462 total
0.00s user 0.94s system 21% cpu 4.340 total

Splice (copy & sync & echo 3 > /proc/sys/vm/drop_caches)
0.00s user 1.92s system 68% cpu 2.802 total
0.00s user 1.23s system 27% cpu 4.512 total
0.00s user 1.25s system 76% cpu 1.621 total
0.01s user 1.16s system 23% cpu 5.046 total
Comment 17 Christian Kellner 2010-02-12 12:40:45 UTC
Created attachment 153624 [details] [review]
GFileDescriptorBased interface
Comment 18 Christian Kellner 2010-02-12 12:41:17 UTC
Created attachment 153625 [details] [review]
Implement interface in GLocalFileInputStream
Comment 19 Christian Kellner 2010-02-12 12:41:51 UTC
Created attachment 153626 [details] [review]
Implement interface in GLocalFileOutputFile
Comment 20 Christian Kellner 2010-02-12 12:44:16 UTC
Created attachment 153627 [details] [review]
Add support for splice(2)

Changes of the patches:

* Install GFileDescriptorBased header as unix specific header file. It still compiles on !unix platform as well since it is used internally.
* Also convert GLocalFileIOStream to use the interface instead of the private _get_fd call of GLocalFileOutputStream.
* Fallback support to splice_stream_with_progress
Comment 21 Alexander Larsson 2010-02-15 09:02:41 UTC
+#ifdef HAVE_SPLICE
+#include "gfiledescriptorbased.h"
+#endif

I don't think you have to condition this, as it always exists (its used for all local files, even on win32 after all). Not that it matters, just means less ugly ifdefs.

I don't see offset_in/out being changed ever, and the splice manpage says nothing about it being update to the new position. Is this just a problem with the docs?

Also, i think do_splice should handle EINTR
Comment 22 Christian Kellner 2010-02-15 11:10:28 UTC
Created attachment 153817 [details] [review]
Add support for splice(2)

(In reply to comment #21)
> +#ifdef HAVE_SPLICE
> +#include "gfiledescriptorbased.h"
> +#endif
> I don't think you have to condition this, as it always exists (its used for all
> local files, even on win32 after all). Not that it matters, just means less
> ugly ifdefs.
Yeah, I totally agree. That is some leftover from my attempts to compile *everything* only on unix systems. Fixed.

> I don't see offset_in/out being changed ever, and the splice manpage says
> nothing about it being update to the new position. Is this just a problem with
> the docs?
The docs are unclear here. The offsets are updated by splice(). I had the other way around in the first place and I got some funny results ;-)
 
> Also, i think do_splice should handle EINTR
Doh! Yeah of course.
Comment 23 Alexander Larsson 2010-02-15 12:13:16 UTC
looks good to commit
Comment 24 Christian Kellner 2010-02-15 12:27:57 UTC
Pushed to master. Thanks everybody for the review.