GNOME Bugzilla – Bug 705678
sysmem allocator copies too much
Last modified: 2013-09-05 05:01:25 UTC
Not sure whether this is a bug in the sysmem allocator, or in elements. In short, v4l2src ! rtpjpegpay consumes 100% cpu, almost all in memcpy. v4l2src emits (JPEG) buffers of about 60KB each with the NO_SHARE memory flag, and rtpjpegpay tries to cut each buffer into 42 or so sub-buffers. Each sub-buffer copies the memory using gst_buffer_copy_region, but it turns out that the sysmem allocator handles that by allocating a new maxsize'd 600KB memory region (the maxsize of the full original buffer), and copying the full 600KB into it, then returning a new GstMemory pointing to the appropriate offset in that memory. Of course, this means it copies around 42 * 600KB per frame, instead of ~ 60KB or total data. Modifying rtpjpegpay to do a single gst_memory_copy_region (... 0, size) at the start to get a share-able buffer reduces CPU consumption for the pipeline to < 1%, but it's wasteful to copy the buffer in the general case. I think the correct fix is to make the sysmem allocator respect the requested offset/size parameters, even though it means returning a block that can't be resized back to the same maxsize as the original GstMemory. It should probably preserve the original alignment though.
Created attachment 251455 [details] [review] 0001-sysmem-Only-copy-the-requested-part-of-memory-instea.patch Yes, for sysmem it always copies the complete source memory for some reason. This should fix it
Doesn't break the core unit tests at least, but please tell me if it fixes things for you :)
Review of attachment 251455 [details] [review]: Yes, it fixes my problem with gstrtpjpegpay. The only part that gives me pause is creating a new block and copying in at offset 0. In cases where alignment is important, the output memory region may not have the correct data alignment if the offset isn't a multiple of the alignment.
Review of attachment 251455 [details] [review]: Actually, it's not the only place where the offset is allowed to not be aligned (fallback_mem_copy does it too), so I guess it doesn't matter.
_sysmem_new_block() is making sure that the alignment of the new memory is correct. commit 5d26d67ba7d8ec0565e81d2a22abb7b6ead2fed6 Author: Sebastian Dröge <slomo@circular-chaos.org> Date: Tue Aug 13 13:06:50 2013 +0200 sysmem: Only copy the requested part of memory instead of the complete source memory https://bugzilla.gnome.org/show_bug.cgi?id=705678
Yes, the new memory block will be aligned. The data within it may not have the same alignment after the copy - for example, copy from a buffer containing uint32's with an offset of 3 - the new memory won't have aligned uint32s where the old buffer did.
Cherry-picked into 1.0 branch. I believe the alignment issue is a concern, but pre-existing and unrelated to this patch (?).