GNOME Bugzilla – Bug 513810
Buffer management features for increased streaming client performance
Last modified: 2008-02-07 02:35:09 UTC
I'd like to be able to (1) control the maximum size of the chunks and (2) provide pre-allocated buffers to eliminate copying data.
From: Dan Winship <danw@gnome.org> Wouter Cloetens wrote: > One thing I'm still missing is application buffer management. I'd like > to be able to (1) control the maximum size of the chunks and (2) provide > pre-allocated buffers to eliminate copying data. Another possibility would be to have a "gst_buffer_set_free_func()" or something, so that you could have the GstBuffer share the memory with the SoupBuffer directly, and then free the SoupBuffer when the GstBuffer is freed. Benjamin Otte is running into the same problem in swfdec (except worse, because in some cases he ends up streaming from SoupBuffers to GstBuffers via SwfdecBuffers). We were talking on #gnome-hackers the other day about the possibility of having a refcounted buffer type directly in glib. Anyway, I'm still trying to figure out exactly how an API like this would work... currently the code isn't really written with that possibility in mind, although the addition of GCancellable args to the SoupSocket calls will make it easier I think... I'm assuming it would probably involve a new SoupSession subclass with some methods sort of like neon's ne_begin_request/ne_read_response_block, but if you have different ideas of what you want the API to look like, chime in (or write a "this is the patch I'd like to be able to apply to souphttpsrc" patch. :)
From: Wouter Cloetens <wouter@mind.be> On Tue, Jan 29, 2008 at 08:57:11AM -0500, Dan Winship wrote: > Wouter Cloetens wrote: > > One thing I'm still missing is application buffer management. I'd like > > to be able to (1) control the maximum size of the chunks and (2) provide > > pre-allocated buffers to eliminate copying data. (1) RESPONSE_BLOCK_SIZE in soup-message-io.c is the first issue. I would make this dynamic by adding a buffer-size property, to end up in SoupMessageIOData. Not for read_metadata() perhaps, but certainly for read_body_chunk(). (2) > Another possibility would be to have a "gst_buffer_set_free_func()" or > something, so that you could have the GstBuffer share the memory with > the SoupBuffer directly, and then free the SoupBuffer when the GstBuffer > is freed. You'll have to dynamically allocate it then, instead of allocating it on the stack. GstBuffer allows me to provide a pointer to the buffer data, the length of the data, and also a pointer to the allocated memory. When the GstBuffer is no longer needed, g_free() is called on that last pointer. That's a neat feature, because it allows you to specify that the useable data starts at a certain offset in the allocated buffer, avoiding another copy. A simple alternative to the SOUP_MESSAGE_OVERWRITE_CHUNKS flag, e.g. SOUP_MESSAGE_MALLOC_CHUNKS, could specify the new behaviour of malloc'ing the buffer and letting the got-chunk signal callback worry about freeing it. That may not correspond nicely to the SoupBuffer logic, but AFAICT, none of its functionality is needed in this particular instance. > Benjamin Otte is running into the same problem in swfdec (except worse, > because in some cases he ends up streaming from SoupBuffers to > GstBuffers via SwfdecBuffers). We were talking on #gnome-hackers the > other day about the possibility of having a refcounted buffer type > directly in glib. Oh that would be cool. > Anyway, I'm still trying to figure out exactly how an API like this > would work... currently the code isn't really written with that > possibility in mind, The suggestion above seems straightforward. API impact is also limited to a new property and a new flag, neither of which break backward compatibility. > although the addition of GCancellable args to the > SoupSocket calls will make it easier I think... > > I'm assuming it would probably involve a new SoupSession subclass with > some methods sort of like neon's > ne_begin_request/ne_read_response_block, but if you have different ideas > of what you want the API to look like, chime in (or write a "this is the > patch I'd like to be able to apply to souphttpsrc" patch. :) How does my simple suggestion sound? I'm willing to brew some code.
Created attachment 104231 [details] [review] Buffer management. This just tackles issue (2) for now, and I have not yet tested it, but I thought I'd post this to get some feedback.
Created attachment 104449 [details] [review] Buffer management. This patch fixes a bug in the previous patch; the buffers were still freed after the got-chunk callback was invoked. This patch also adds the response block size. Tested with a patched GStreamer souphttpsrc.
Created attachment 104595 [details] [review] Buffer management. This is danw's patch on the mailing list, with the following differences: - soup_buffer_new_external() was renamed to soup_buffer_new_custom(), - destroy_data was renamed to custom_data, - a getter for custom_data, soup_buffer_get_custom_data(), was added. This has been tested with a modified GStreamer souphttpsrc.
Created attachment 104596 [details] [review] GStreamer souphttpsrc patch using custom libsoup buffer management. An example of the use. A GstBuffer is allocated by a downstream element in the pipeline and used as custom_data in conjunction with the separately allocated (or hardware memory mapped) data buffer. The GstBuffer is reffed at allocation time, and unreffed in the destroy notification callback. When all downstream GStreamer elements are done with the GstBuffer and its data, the refcount goes to zero and both objects are freed. The allocated buffer size is the minimum of the remaining HTTP content length and the element's user-provided "blocksize" property.
> + length = MIN (basesrc->blocksize, max_len); Note that the docs say that max_len may be 0, meaning "I don't know how much data is coming" (eg, when reading a terminated-by-eof response body). So this should be: if (max_len) length = MIN (baserc->blocksize, max_len); else length = baserc->blocksize; (I think.) > + gst_buffer_ref (gstbuf); Don't add an extra ref from the chunk allocator; you want the SoupBuffer to be holding the only ref to the GstBuffer when the allocator returns, so that if soup_socket_read() returns an error, the GstBuffer will get destroyed along with the SoupBuffer. Wait until the got_chunk callback, when you set *src->outbuf, to add the extra ref. I ended up going with "soup_buffer_new_with_owner" and "soup_buffer_get_owner" btw, but other than the names, it's basically like in your last revision of the patch. And if this works for you, then I'll commit it.
Created attachment 104611 [details] [review] GStreamer souphttpsrc patch using custom libsoup buffer management. Thanks for catching those. I've documented lifecycle management in the source, because it is getting complicated enough to be bound to confuse anyone who wants to maintain it.
committed, with the naming difference I mentioned before ("soup_buffer_new_with_owner" and "soup_buffer_get_owner"), and fixing a dumb bug in the soup-message-io.c changes. libsoup trunk will become 2.3.2 on Monday night for GNOME 2.21.91, so after then you can depend on libsoup-2.4 >= 2.3.2 in configure to get these changes.
(oops, closing)