After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 690525 - g_file_replace_contents_async doesn't copy its @contents argument
g_file_replace_contents_async doesn't copy its @contents argument
Status: RESOLVED FIXED
Product: glib
Classification: Platform
Component: gio
unspecified
Other Linux
: Normal major
: ---
Assigned To: gtkdev
gtkdev
: 730067 (view as bug list)
Depends on:
Blocks:
 
 
Reported: 2012-12-19 20:49 UTC by Simonas Kazlauskas
Modified: 2014-08-24 09:28 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Testcase (2.16 KB, text/x-python)
2012-12-19 20:49 UTC, Simonas Kazlauskas
  Details
Tests for pygobject (18.59 KB, patch)
2013-01-08 21:29 UTC, Simonas Kazlauskas
committed Details | Review
GMainContext: add qdata (5.32 KB, patch)
2013-09-25 13:19 UTC, Allison Karlitskaya (desrt)
none Details | Review
GFile: add GBytes version of _replace_contents_async() (7.27 KB, patch)
2013-12-02 18:02 UTC, Xavier Claessens
reviewed Details | Review
GFile: add GBytes version of _replace_contents_async() (6.40 KB, patch)
2013-12-02 19:45 UTC, Xavier Claessens
accepted-commit_now Details | Review
Document clearly async functions not copying its args (1.86 KB, patch)
2013-12-02 19:45 UTC, Xavier Claessens
accepted-commit_now Details | Review

Description Simonas Kazlauskas 2012-12-19 20:49:39 UTC
Created attachment 231934 [details]
Testcase

Calling g_file_replace_contents_async seems to write incorrect data to the file.

For data included in file it seems to write at least three different values, alternating every several runs. It writes the correct ammount of data, but data is malformed.

This behaviour is reproducible both on Python 2.7 and 3.3.
Comment 1 Simonas Kazlauskas 2012-12-21 19:35:31 UTC
Reproducible on master(53bc12a87da8), and PyGI 3.6.x.

Quoting git bisect: “c7aa0e79dfb4c1092c51ae1464b8414083b4f3fc is the first bad commit”.
Comment 2 Simonas Kazlauskas 2012-12-21 19:50:12 UTC
Looks more like a problem of bad interaction between introspection and and library rules. As replace_contents_async has transfer-ownership="none" and Gio doesn't ref content array it gets.

Now I'm a bit confused about this transfer-ownership business (I guess none means that you're free it any time) so I will not try to decide which project is to blame.
Comment 3 Simonas Kazlauskas 2013-01-08 18:27:59 UTC
Forgot to mention, Comment 2 should be ignored as it is utter nonsense.
Comment 4 Simonas Kazlauskas 2013-01-08 21:29:41 UTC
Created attachment 233020 [details] [review]
Tests for pygobject
Comment 5 Simon Feltman 2013-01-10 09:19:28 UTC
(In reply to comment #2)
> Looks more like a problem of bad interaction between introspection and and
> library rules. As replace_contents_async has transfer-ownership="none" and Gio
> doesn't ref content array it gets.
> 
> Now I'm a bit confused about this transfer-ownership business (I guess none
> means that you're free it any time) so I will not try to decide which project
> is to blame.

In this case I think g_file_replace_contents_async should be making a copy of the input data. Especially in an async situation. That or a binding friendly version needs to be created (something like g_file_replace_contents_async_with_copy) and use a "Rename to" annotation.
Comment 6 Martin Pitt 2013-01-10 13:02:23 UTC
Comment on attachment 233020 [details] [review]
Tests for pygobject 

Thanks for the tests!

I committed them with a few PEP-8 cleanups, but I disabled the test as it currently breaks the whole testsuite:

test_replace_contents_async (test_gio.TestGFile) ... 
(runtests.py:6679): GLib-GIO-CRITICAL **: g_task_propagate_boolean: assertion `task->result_set == TRUE' failed
Comment 7 Simon McVittie 2013-09-17 11:37:25 UTC
(In reply to comment #5)
> In this case I think g_file_replace_contents_async should be making a copy of
> the input data. Especially in an async situation. That or a binding friendly
> version needs to be created (something like
> g_file_replace_contents_async_with_copy) and use a "Rename to" annotation.

I ran into this in telepathy-glib C code: in <https://bugs.freedesktop.org/show_bug.cgi?id=63402> I was surprised that Chandni had to copy the data, write the copy to the file and keep it around for the duration of the async operation.

I think it should copy the data, or perhaps have a version that takes a GBytes.

Workaround: use g_file_replace_async(), write to the returned output stream, and explicitly close the output stream.
Comment 8 Xavier Claessens 2013-09-24 14:11:56 UTC
I think it is important to avoid data copy when possible, we don't know how big is the data going to be written. So I think we should either have a _take()  and/or a _bytes() version, but not something that g_memdup() the whole thing. 

I think we have 2 possibilities for g_file_replace_contents_async():
1) Don't change it but document clearly that it does not copy the data.
2) Make it g_membup() the data and free it when done. And make it clear in the doc.

We could also deprecate it in favor of _take() and/or _bytes() version.


In our use case (telepathy-glib) we want to write data coming from DBus. The data is given as GArray so we could do zero-copy...

bytes = g_bytes_new_with_free_func(array->data, array->len, g_array_unref, g_array_ref (array));

... if only dbus-glib wasn't using g_array_free()...

With GDBus we could just do g_variant_get_data_as_bytes() to do zero-copy.
Comment 9 Allison Karlitskaya (desrt) 2013-09-24 15:07:02 UTC
Compare with g_output_stream_write_async().  This function (even in its default implementation) requires that the passed-in buffer remains valid until the end of the async call.

Since we have prior art here, I think this isn't a bug.  We definitely could use better documentation on both of these functions, though, and a GBytes version wouldn't hurt.

For binding, we have been talking about adding an 'asynccall' scope for callbacks.  Maybe a similar concept could be useful for input arguments as well...
Comment 10 Allison Karlitskaya (desrt) 2013-09-25 13:19:56 UTC
Created attachment 255684 [details] [review]
GMainContext: add qdata
Comment 11 Allison Karlitskaya (desrt) 2013-09-25 14:27:40 UTC
Comment on attachment 255684 [details] [review]
GMainContext: add qdata

Uh. Misfire, obviously.
Comment 12 Xavier Claessens 2013-12-02 18:02:07 UTC
Created attachment 263319 [details] [review]
GFile: add GBytes version of _replace_contents_async()
Comment 13 Xavier Claessens 2013-12-02 18:05:58 UTC
I fell in the trap again, so I decided to make that patch. This allows doing:

bytes = g_bytes_new_with_free_func (data, size, g_free, data);
g_file_replace_contents_bytes_async(file, bytes, ...);
g_bytes_unref (bytes);

Like that we don't have to take care of freeing the data in the callback, and we can even use it without a callback if we don't care about the result.
Comment 14 Xavier Claessens 2013-12-02 18:07:47 UTC
g_bytes_new_take() even.
Comment 15 Colin Walters 2013-12-02 18:24:12 UTC
Review of attachment 263319 [details] [review]:

This bit me too in hotssh.

Also, can you add a test?

::: gio/gfile.c
@@ +7315,3 @@
+ * Note that no copy of @content will be made, so it must stay valid until
+ * @callback is called. See g_file_replace_contents_bytes_async() for a #GBytes
+ * version that allows forgetting about the contents immediately.

...that will automatically hold a reference to the contents (without copying) for the duration of the call.

?

This might be an evil enough trap that it could go under <warning>

@@ +7347,3 @@
+ * @user_data: the data to pass to callback function
+ *
+ * Same as g_file_replace_contents_async() but taking a #GBytes input instead.

"but takes"

::: gio/gfile.h
@@ +1220,3 @@
 					      GCancellable           *cancellable,
 					      GError                **error);
+GLIB_AVAILABLE_IN_2_40

Er, no, this needs to stay _IN_ALL.

@@ +1230,3 @@
 					      GAsyncReadyCallback     callback,
 					      gpointer                user_data);
 GLIB_AVAILABLE_IN_ALL

And *these* should be _IN_2_40
Comment 16 Xavier Claessens 2013-12-02 19:45:47 UTC
Created attachment 263331 [details] [review]
GFile: add GBytes version of _replace_contents_async()
Comment 17 Xavier Claessens 2013-12-02 19:45:50 UTC
Created attachment 263332 [details] [review]
Document clearly async functions not copying its args

Usually async methods copy/ref its arguments so caller can
forget about them. g_file_replace_contents_async() and
g_output_stream_write_async() are exceptions.
Comment 18 Xavier Claessens 2013-12-02 19:48:13 UTC
(In reply to comment #15)
> Review of attachment 263319 [details] [review]:
> 
> Also, can you add a test?

g_file_replace_contents_async() is implemented using g_file_replace_contents_bytes_async() so it's already tested for free.

> ::: gio/gfile.c
> @@ +7315,3 @@
> + * Note that no copy of @content will be made, so it must stay valid until
> + * @callback is called. See g_file_replace_contents_bytes_async() for a
> #GBytes
> + * version that allows forgetting about the contents immediately.
> 
> ...that will automatically hold a reference to the contents (without copying)
> for the duration of the call.

Sounds better, changed.

> This might be an evil enough trap that it could go under <warning>

Did that.

> @@ +7347,3 @@
> + * @user_data: the data to pass to callback function
> + *
> + * Same as g_file_replace_contents_async() but taking a #GBytes input instead.
> 
> "but takes"

fixed

> ::: gio/gfile.h
> @@ +1220,3 @@
>                            GCancellable           *cancellable,
>                            GError                **error);
> +GLIB_AVAILABLE_IN_2_40
> 
> Er, no, this needs to stay _IN_ALL.
> 
> @@ +1230,3 @@
>                            GAsyncReadyCallback     callback,
>                            gpointer                user_data);
>  GLIB_AVAILABLE_IN_ALL
> 
> And *these* should be _IN_2_40

right, fixed.

Note that my patch does not add g_file_replace_contents_bytes_finish() because g_file_replace_contents_finish() can be used for it. Is that a problem? Not sure what's the policy here.
Comment 19 Colin Walters 2013-12-02 19:48:17 UTC
Review of attachment 263332 [details] [review]:

Excellent!
Comment 20 Colin Walters 2013-12-02 19:50:35 UTC
Review of attachment 263331 [details] [review]:

Still no test =(

But otherwise looks fine.
Comment 21 Colin Walters 2013-12-02 19:53:19 UTC
(In reply to comment #18)
> 
> 
> g_file_replace_contents_async() is implemented using
> g_file_replace_contents_bytes_async() so it's already tested for free.

Yeah...but that's not the same thing exactly.  Code coverage tools want each one to be called.

> Note that my patch does not add g_file_replace_contents_bytes_finish() because
> g_file_replace_contents_finish() can be used for it. Is that a problem? Not
> sure what's the policy here.

We have other cases like that (g_data_input_stream_read_line_utf8() for example), I think it's a bit unfortunate in that vala has to guess at the name of the _finish function, but that's OK by me.
Comment 22 Colin Walters 2013-12-02 19:55:46 UTC
(In reply to comment #21)
> (In reply to comment #18)
> > 
> > 
> > g_file_replace_contents_async() is implemented using
> > g_file_replace_contents_bytes_async() so it's already tested for free.
> 
> Yeah...but that's not the same thing exactly.  Code coverage tools want each
> one to be called.

Sorry, I was thinking about this backwards...you're right.  (But there's still something to be said for a test for each public API entry point)
Comment 23 Xavier Claessens 2013-12-02 21:06:29 UTC
Thanks, pushed both commits.

All entry points needs to be tested explicitly, really? I think coverage tools can detect when we test A and A calls into B, no?
Comment 24 Matthias Clasen 2013-12-03 00:06:33 UTC
Coverage tools don't care, but maintainers do. Having each public api tested directly is certainly a goal that we have. For instance, you mentioned a particular point about being able to unref the GBytes before the async is finished - that is exactly the kind of api guarantee that I want to see a test for.
Comment 25 Simon Feltman 2014-08-24 05:24:03 UTC
*** Bug 730067 has been marked as a duplicate of this bug. ***