GNOME Bugzilla – Bug 747959
buffer: minimize memory copy in _memory_add()
Last modified: 2018-11-03 12:27:00 UTC
If all the slots for the memory blocks are full, instead of merging all block to single memory block, merge only best consecutive blocks to make a room for new memory block.
Created attachment 301689 [details] [review] buffer: merging best consecutive blocks in _memory_add() Finding and merging best consecutive blocks for merging to reduce memory copy operations.
The problem I see here is that it will cause a performance decrease if you add more memories later after you merged two. Then you will have to merge another two (which previously might've been three). And then another. And then ... I think the better solution here is to create an extended GstMemory array that is used once we get to more than 16 memories. Instead of using the statically allocated array we could use a GPtrArray once then 17th memory is added, and maybe for simplicity copy over all pointers from the statically allocated one and stop using it completely.
Created attachment 301775 [details] [review] buffer: Dynamic memory allocation for GstMemory* array Dynamic memory allocation for the GstMemory* array which can be resize in _memory_add at the time of buffer overflow.
Comment on attachment 301775 [details] [review] buffer: Dynamic memory allocation for GstMemory* array >+ buffer->mem = g_slice_alloc (sizeof (GstMemory *) * GST_BUFFER_MEM_MAX); >+ buffer->max_len = GST_BUFFER_MEM_MAX; I think we still want the static array, so that we can avoid the extra alloc in the normal case. Have a look at GstBufferList.
Yes, definitely. The new array should only be used once >16 memories are used, otherwise we use the old array and don't have additional unnecessary allocations.
Review of attachment 301775 [details] [review]: ::: gst/gstbuffer.c @@ +151,3 @@ guint len; + GstMemory **mem; + guint max_len; Also I really meant a GPtrArray (that is allocated for >16 memories) so that you don't have to do the bookkeeping and reallocations yourself :)
I'm not convinced having an array of hundreds of GstMemory is a good idea. The cost of creating and destroying all of those memories may exceed the benefit of the zero-copy. I think this is why Wim limited it to relatively small value in the first place. Possibly, you may want to benchmark your patch against just allocation a large buffer and doing memcpy() into it?
Well, you're right of course, only that if you get to that point you have already incurred the cost of creating that memory and you will have to incur the cost of destroying it either way (now or later), so if what you're saying is "we should ideally never end up in such a situation" then I agree. I also think we would still want some hard-limit at which point we force-merge. Whether it makes more sense to merge all into one or not also depends on the use later (if it's going to be mapped as a whole forcing a merge anyway, as 99% of code does, then that's more efficient overall). Clearly only bad options ;)
Created attachment 301968 [details] [review] buffer: Dynamic memory allocation for GstMemory* array in overflow Allocation of GstMemory* array with extended size in case of overflow. By default it will use static array. On overflow it will allocate new one with extended size.
What's your use case for this btw?
I have no specific use case in mind for above patch. I just thought that performance can be improved in buffer management here by avoiding the memory copy in case of overflow. Its better to use the variable length array for memory blocks instead of fix sized array as provided in buffer list. Even though the memory is force merged in the case of getting mapped info for write/read mode, it can be avoided by iterating every memory block of buffer while reading.
It's not clear if that actually improves any performance, and arguably the code where this happens should be fixed at a higher level. Would be good to get some somewhat realistic test cases (that actually make sense!) for this and check what difference it makes.
Without a compelling use case for this I'm tempted to say this is not something that actually requires fixing in GstBuffer. If the number of memories overflows, something should probably be done differently somewhere higher level. Perhaps you could make a patch to remove the FIXME/TODO instead? :) Only use case that comes to mind is e.g. an RTP depayloader where the output buffer is made up of a large number of input payloads (and no parsing/copying is needed). I believe GstAdapter would handle the final memory merging/packing in this case though.
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/gstreamer/gstreamer/issues/107.