Bug 637845 - Streams for multipart upload

Status: RESOLVED OBSOLETE
Product: libsoup
Classification: Core
Component: HTTP Transport
Version: unspecified
Hardware/OS: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: libsoup-maint@gnome.bugs
QA Contact: libsoup-maint@gnome.bugs
Depends on:
Blocks:
Reported: 2010-12-23 01:20 UTC by Jim Nelson
Modified: 2018-09-21 16:07 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Jim Nelson 2010-12-23 01:20:00 UTC
When doing a multipart upload, the current implementation requires that all the buffers be in memory before starting the transaction.  In the case of uploading large files (e.g. video or audio), that can be onerous.

Dan Winship suggested using mmap I/O, which is a fair solution, but a more thorough solution would be to allow streaming each section of the multipart payload.  This could be done many ways, but what I'm envisioning is passing a GInputStream rather than a SoupBuffer; that way, the source of the data could be just about anything.
Comment 1 David Woodhouse 2011-03-21 09:56:23 UTC
We *don't* seem to require all buffers in memory before starting the transaction.

If I set the Transfer-Encoding of the request buffer to chunked, I can then call soup_message_body_append() from the wrote-headers and wrote-chunk signal handlers, and it all works fine. I have to call soup_message_body_complete() when I'm done, of course.

The problem is that although we don't have to have it all in memory in advance, libsoup does still accumulate it all into memory, even when asked not to. To be fair, this is clearly documented, but it's a pain: the whole point of streaming, in my case, is that the request body can be *huge* (with Exchange we submit the whole of a MIME message to be sent, including all the attachments, base64-encoded in an XML node). We don't *want* to have it all in memory.

The reason given in the documentation for this behaviour is that the request might be needed again for resending if we needed to authenticate. I don't quite buy that argument: if I gave you the data in my wrote-headers and wrote-chunk signal handlers the *first* time, I can damn well do so again the *second* time too, as long as you call my "restarted" signal handler to let me know I have to start again from the beginning.

I'm about to try a horrid workaround in my wrote-chunk signal handler, which will call soup_message_body_get_chunk() to get the chunk which was just written, then call soup_message_body_wrote_chunk() for *myself* on that chunk.

Dan, any better suggestions?
Comment 2 David Woodhouse 2011-04-26 12:00:40 UTC
http://david.woodhou.se/soup-client-request-streaming.patch works for me.

For libsoup 2.32 I need the /* OH GOD SHOOT ME NOW */ version of the code in my
restarted signal handler, but we fixed it in HEAD with commit 608b14e7, so
just calling soup_message_body_truncate() will suffice.

Do not try this at home, kids. This is not a supported API of libsoup.
Comment 3 Dan Winship 2015-02-10 11:58:32 UTC
[mass-moving all "UNCONFIRMED" libsoup bugs to "NEW" after disabling the "UNCONFIRMED" status for this product now that bugzilla.gnome.org allows that. bugspam-libsoup-20150210]
Comment 4 GNOME Infrastructure Team 2018-09-21 16:07:14 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/libsoup/issues/36.