GNOME Bugzilla – Bug 712390
consider making g_{output,input}_stream_{write,read}_bytes[_async] process all bytes or bust
Last modified: 2013-11-19 15:01:30 UTC
It's pretty awkward to deal with partial reads and writes using the the GBytes variants of GInputStream and GOutputStream. It would be good if they could handle the partial loop themselves (or defer to read_all/write_all) rather than punt to the user (and come with an associated, documented gaurantee). Very few of the current callers of these apis properly handle the loop. Of that limited set, the change should be fairly backward compatible. One hiccup is bug 679662 comment 5. Namely, these apis are primarly useful for language bindings, and language bindings sometimes turn errors into exceptions, which means we can't reliably return an error and the number of bytes processed at the same time. Still, maybe worth doing. IRC Log: <halfline> Jasper: the other option is you could be symetrical with how you're doing the writes...e.g. g_file_read / g_input_stream_read_bytes <Jasper> I suppose, yeah. <Jasper> halfline, don't you have to call read_bytes in a loop, though? <Jasper> Like, isn't that the reason we have load_contents? <halfline> true, though that's true with write_bytes also <Jasper> nothing that I see that uses write_bytes loops <halfline> \o/ API trap <Jasper> halfline, like, gnome-continuous doesn't do it, and it has all the uptime in the world <halfline> Jasper: i don't quite follow <halfline> are you saying you don't consider it a bug that the programs don't loop? <Jasper> halfline, yes <Jasper> because a program that's running 24/7 doesn't loop, and it hasn't seen any bugginess as a result <Jasper> maybe it's a linux/posix thing where posix says it might be possible that you have to loop, but linux doesn't ever do that <halfline> filesystem specific behavior <halfline> Jasper: so it can happen on certain filesystems <Jasper> ah <halfline> but it can also happen if for instance a signal comes in after some bytes were written <halfline> but before they all were <halfline> if no bytes are written EINTR will get returned <Jasper> and gio doesn't loop for me in that case? <halfline> if half are written then the number written will get returned <halfline> how would it know that case? <Jasper> it sees errno=EINTR? <halfline> errno is not well defined on success <halfline> (with limited, documented exceptions) <Jasper> well, it knows when it failed, because I passed it a GBytes, which has a size <halfline> yea it's explicitly documented as being able to return short <Jasper> why can't we make write_bytes loop for us? <halfline> well there is a write_all <Jasper> just make write_bytes use that then <halfline> what about it's async cousin ? <halfline> *its <Jasper> what about it? <halfline> do we fix it to have the new semantics ? <Jasper> why not <halfline> well the api isn't really written with that in mind, given it returns number of bytes written, but changing it might be okay if we document why the api is weird for its behavior <halfline> there is the problem that technically you need to poll() for writability in your loop <halfline> especially if the fd is nonblocking <halfline> but i guess write_all as that problem too <Jasper> halfline, passing that responsibility onto the client isn't better <Jasper> almost nobody is looping, so almost nobody is going to add an fd watch on an internal fd <halfline> worth talking to danw about <Jasper> I suppose... <danw> it's intentional that it can do a short write, and there's a very good reason for it, which i don't remember at the moment, but it's probably in the original bugzilla bug <halfline> https://bugzilla.gnome.org/show_bug.cgi?id=671139 <Services> Bug 671139: normal, Normal, ---, gtkdev, RESOLVED FIXED, need (transfer async) for io stream buffers <halfline> humm, don' t see anything in the bug <danw> yeah... or either of the dups. <danw> anyway, you can't just change the existing API to promise it does a write_all, because then you have to deal with the case where the first write succeeds and the second fails. (write_all returns an nwrote that is separate from the success/failure indication) <halfline> danw: not following <danw> aha! https://bugzilla.gnome.org/show_bug.cgi?id=679662#c5 <Services> Bug 679662: enhancement, Normal, ---, gtkdev, UNCONFIRMED, Simple way to read/write as much as possible from/to GBytes <halfline> ahh <danw> though in many cases if you get a failure you're just going to throw the already-written data away anyway, so it doesn't matter if you don't know how many bytes it was... <Jasper> danw, right <halfline> unless the error was EINTR <halfline> but i guess it could handle that error internally <Jasper> You should never see EINTR <Jasper> danw, I don't see a need to know how many bytes you wrote before failing. That sort of case should probably handled by a progress callback anyway <danw> yeah, pretty sure gio never exposes EINTR <danw> a grep of a jhbuild checkout suggests that roughly 0% of g_output_stream_write_all() users look at bytes_written on failure
<Jasper> danw, if the first write succeeds and the second one fails, then n_written is first_written_bytes + second_written_bytes, no? <danw> no, it's just first_written_bytes <Jasper> halfline, seems the plymouth fix didn't fix it <halfline> no surprise i guess <Jasper> danw, oh, I thought you could do partial writes <Jasper> like, if the second one write got halfway but failed, then it would be first_written_bytes + second_written_bytes <Jasper> danw, you mean write(2) ? <danw> or g_io_stream_write() <halfline> will change to Since: 2.40 <Jasper> danw, I thought that write could be partial, and that's why you had to loop <halfline> danw: that's not true, you can have short write()'s in theory too <Jasper> Like, that's what we're discussing. <danw> Jasper: er, sorry, yes, it can be partial. but it can't both write data and return an error <danw> so, if the second write fails, second_written_bytes is 0, by definition <Jasper> Ah. <desrt> are we talking about breaking API? <desrt> my favourite topic!! <danw> no <Jasper> desrt, how would it break? <danw> silently improving <halfline> desrt: should be pretty backward compatible <Jasper> danw, wait, so what's returned on error? 0? <desrt> you want to modify _write_bytes() to loop? <danw> well... breaking in a case that it seems no one cares about <danw> Jasper: -1 <Jasper> Ah. <desrt> this is a compatibility problem <halfline> explain <desrt> the reason that all of our read/write calls do partial results is the same reason as for POSIX <desrt> when something polls as read for read/write then you are only guaranteed to have a single read/write before you block <danw> desrt: "<danw> a grep of a jhbuild checkout suggests that roughly 0% of g_output_stream_write_all() users look at bytes_written on failure" <desrt> ya. i saw that. <desrt> still technically a break <danw> yes <halfline> still waiting for the explanation ! <danw> anyway, we can just add g_output_stream_write_bytes_all() <desrt> if it hits nobody, then i'm happy enough to fix it <Jasper> g_output_stream_write_* is already blocking <desrt> or rather, i'm happy enough for someone else to fix it :) <danw> halfline: it's a break because *in theory*, someone might want to differentiate "failed immediately" and "failed after writing some data" <desrt> Jasper: it's not blocking if you polled first. that's the point. <danw> desrt: that's explicitly declared to not work actually. <desrt> poll() returning a particular condition as being ready means that you can do that operation -- ONCE <danw> g_output_stream_write() can *always* block, even if the stream just polled as writable <Jasper> g_output_stream_write() is a blocking call, no matter what. <desrt> danw: uh? <Jasper> If you polled first, who cares. That's not what our API talks aobut. <desrt> sounds like a bug to me... <desrt> Jasper: i definitely care. this is news to me. <Jasper> Do we poll for writability anywhere combined with GOutputStream? <danw> desrt: it's a race condition. maybe it was writable when you called poll, but something changed to make it not writable before you called write <danw> desrt: and WinSock does this intentionally <desrt> danw: how is that possible? <desrt> ....had to be windows <danw> desrt: the moon got further away, so now you can't write to it right away. <desrt> danw: this is clearly a windows bug <desrt> POSIX is really really clear on this point and it's damn important that it work this way <danw> gio isn't POSIX <Jasper> I don't think we want GOutputStream to be tied to POSIX semantics <danw> nor is WinSock <desrt> POSIX semantics here are _sane_ <danw> yes, but we can't emulate them in this case <desrt> i don't even know why we have pollability on our streams if the result is meaningless <danw> anyway, g_pollable_output_stream_write_nonblocking() is guaranteed to *not* block in that case <desrt> so why do we have _write() at all? <desrt> and not always _write_all()? <Jasper> Mistakes were made. <danw> because we had them before we had pollable streams <danw> and because there's still the problem of distinguishing "failed immediately" from "wrote some and then failed" <desrt> so let's modify _write() to always write all <desrt> and deprecate _write_all() <desrt> same argument, right? <halfline> not exactly, it becomes more awkward with the GBytes variants <danw> yes-ish, except that everyone expects _write() to do partial writes, and everyone seems to expect _write_bytes() to NOT do partial writes <halfline> because you have to allocate a new GByte structure if you're going to do the loop from the outside <danw> anyway, we can just add g_output_stream_write_bytes_all() <desrt> danw: i also notice that the default implementation of g_pollable_output_stream_write_nonblocking() makes the same assumption that i do <desrt> ie: is racy <danw> desrt: sure, because that works for many stream types <danw> and if it doesn't, you override it <desrt> the docs should probably be more clear on this point... <desrt> (man... i had no idea this was so evil) <desrt> output_stream_write should probably also mention that it can block, even if the stream just polled writable * desrt is sure that all kinds of people have made this mistake.... <desrt> almost seems that we should deprecate _write() <danw> are "all kinds of people" using the pollable APIs? I thought most people were either using _write() without polling, and expecting it to block, or were using _write_async() <desrt> danw: anyone who implements a server based on poll() is likely to be using write() after poll() <desrt> unless they paid very close attention <danw> are you talking about poll(2) or g_output_stream_is_writable()? <desrt> i'm talking about _create_source() <desrt> either on the socket or stream <desrt> g_pollable_output_stream_create_source() does indeed mention that it is potentially unreliable <danw> as does g_socket_create_source(). or at least, somewhere in the GSocket docs does <desrt> so i just made this mistake, i think.... <desrt> https://git.gnome.org/browse/glib/tree/gio/gmarkupreader.c?h=wip/new-parser#n231 <danw> a grep of a slightly-out-of-date jhbuild checkout shows that only glib, glib-networking, and libsoup are using the pollable APIs <desrt> well <desrt> it's a moot point anyway, since it's done * desrt is just very surprised <desrt> about the bytes: if nobody is using it, maybe we should just break it? <desrt> i'm guessing not a lot of people care about the partial number of bytes written if we have an error case
anyway, I think the fix here is to implement write_bytes_all(), which is bug 679662
*** This bug has been marked as a duplicate of bug 679662 ***