After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 705678 - sysmem allocator copies too much
sysmem allocator copies too much
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gstreamer (core)
git master
Other Linux
: Normal normal
: 1.0.10
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on: 707534
Blocks:
 
 
Reported: 2013-08-08 14:31 UTC by Jan Schmidt
Modified: 2013-09-05 05:01 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
0001-sysmem-Only-copy-the-requested-part-of-memory-instea.patch (1.23 KB, patch)
2013-08-13 11:08 UTC, Sebastian Dröge (slomo)
committed Details | Review

Description Jan Schmidt 2013-08-08 14:31:14 UTC
Not sure whether this is a bug in the sysmem allocator, or in elements. In short, v4l2src ! rtpjpegpay consumes 100% cpu, almost all in memcpy.

v4l2src emits (JPEG) buffers of about 60KB each with the NO_SHARE memory flag, and rtpjpegpay tries to cut each buffer into 42 or so sub-buffers. Each sub-buffer copies the memory using gst_buffer_copy_region, but it turns out that the sysmem allocator handles that by allocating a new maxsize'd 600KB memory region (the maxsize of the full original buffer), and copying the full 600KB into it, then returning a new GstMemory pointing to the appropriate offset in that memory.

Of course, this means it copies around 42 * 600KB per frame, instead of ~ 60KB or total data.

Modifying rtpjpegpay to do a single gst_memory_copy_region (... 0, size) at the start to get a share-able buffer reduces CPU consumption for the pipeline to < 1%, but it's wasteful to copy the buffer in the general case.

I think the correct fix is to make the sysmem allocator respect the requested offset/size parameters, even though it means returning a block that can't be resized back to the same maxsize as the original GstMemory. It should probably preserve the original alignment though.
Comment 1 Sebastian Dröge (slomo) 2013-08-13 11:08:19 UTC
Created attachment 251455 [details] [review]
0001-sysmem-Only-copy-the-requested-part-of-memory-instea.patch

Yes, for sysmem it always copies the complete source memory for some reason. This should fix it
Comment 2 Sebastian Dröge (slomo) 2013-08-13 11:37:09 UTC
Doesn't break the core unit tests at least, but please tell me if it fixes things for you :)
Comment 3 Jan Schmidt 2013-08-13 13:24:01 UTC
Review of attachment 251455 [details] [review]:

Yes, it fixes my problem with gstrtpjpegpay. The only part that gives me pause is creating a new block and copying in at offset 0. In cases where alignment is important, the output memory region may not have the correct data alignment if the offset isn't a multiple of the alignment.
Comment 4 Jan Schmidt 2013-08-13 13:33:03 UTC
Review of attachment 251455 [details] [review]:

Actually, it's not the only place where the offset is allowed to not be aligned (fallback_mem_copy does it too), so I guess it doesn't matter.
Comment 5 Sebastian Dröge (slomo) 2013-08-13 13:49:05 UTC
_sysmem_new_block() is making sure that the alignment of the new memory is correct.

commit 5d26d67ba7d8ec0565e81d2a22abb7b6ead2fed6
Author: Sebastian Dröge <slomo@circular-chaos.org>
Date:   Tue Aug 13 13:06:50 2013 +0200

    sysmem: Only copy the requested part of memory instead of the complete source memory
    
    https://bugzilla.gnome.org/show_bug.cgi?id=705678
Comment 6 Jan Schmidt 2013-08-13 13:50:55 UTC
Yes, the new memory block will be aligned. The data within it may not have the same alignment after the copy - for example, copy from a buffer containing uint32's with an offset of 3 - the new memory won't have aligned uint32s where the old buffer did.
Comment 7 Tim-Philipp Müller 2013-08-27 11:21:25 UTC
Cherry-picked into 1.0 branch.

I believe the alignment issue is a concern, but pre-existing and unrelated to this patch (?).