GNOME Bugzilla – Bug 754189
payloaders: memory performance
Last modified: 2018-11-03 11:40:53 UTC
Hello,

I have profiled some payloaders and found that a significant share of CPU time is spent on memory management:

1 - allocating and freeing RTP buffers
2 - mapping/unmapping RTP buffers

Possible solutions to improve efficiency:

1 - Could we use some kind of buffer pool instead of creating and freeing buffers again and again?
2 - Is there a way to make mapping/unmapping of RTP buffers cheaper?

---- PROFILE REPORT ----

VP8 payloader:

gst-launch-1.5 videotestsrc num-buffers=100 ! vp8enc ! rtpvp8pay ! fakesink enable-last-sample=false

Percentages of usage (with gst_rtp_base_payload_chain as root):

100 - gst_rtp_base_payload_chain (ir per call: 15283)
98.67 - gst_rtp_vp8_pay_handle_buffer
7.29 - gst_buffer_copy_region
4.18 - gst_buffer_append
29.67 - gst_rtp_vp8_create_header_buffer
17.94 - gst_rtp_buffer_new_allocate
14.57 - gst_rtp_buffer_allocate_data
3.16 - gst_buffer_new
35.24 - gst_rtp_base_payload_push_list
8.96 - gst_rtp_base_payload_prepare_push
6.65 - set_headers
4.24 - gst_rtp_buffer_map
1.42 - gst_rtp_buffer_unmap
22.41 - gst_base_sink_chain_list
12.38 - gst_mini_object_unref
7.13 - _gst_buffer_list_free
3.94 - _gst_buffer_free

OPUS payloader:

gst-launch-1.5 audiotestsrc num-buffers=1000 ! opusenc ! rtpopuspay ! fakesink enable-last-sample=false

Percentages of usage (with gst_rtp_base_payload_chain as root):

100 - gst_rtp_base_payload_chain
96.62 - gst_rtp_opus_pay_handle_buffer
8.38 - gst_buffer_append
22.56 - gst_rtp_buffer_new_allocate
16.83 - gst_rtp_buffer_allocate_data
5.31 - gst_buffer_new
63.42 - gst_rtp_base_payload_push
16.40 - gst_rtp_base_payload_prepare_push
12.98 - set_headers
8.04 - gst_rtp_buffer_map
2.92 - gst_rtp_buffer_unmap
37.62 - gst_base_sink_chain
20.98 - gst_mini_object_unref
18.97 - _gst_buffer_free
I remember a few patches that made sure the depayloaders only map the RTP buffer once. The same problem might be affecting the payloaders too. Patches are more than welcome.

About the allocation: if the allocated size is dynamic, there is very little you can optimize (it's a case-by-case thing). Some depayloaders allocate a bigger buffer and copy into it (which can be the fastest way, or not), and some allocate small chunks and append/prepend them. Any patches are welcome of course; it's nice that you are looking at this.
You should be able to pool the buffers that carry the GstMemory for the headers, with a bit of care: drop every GstMemory except the first one, check that its size is right and that it is still writable, and then you can drop the MEMORY_TAG flags on the buffers. The other problem is that gst_rtp_buffer_map() does validation; unmap should be almost free.
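The recycling steps described above could look roughly like the following. This is only a standalone model in plain C and does not use the GStreamer API: Chunk stands in for a GstMemory, PacketBuffer for a GstBuffer, memory_changed for the flag called MEMORY_TAG above, and every type, function and constant name here (HeaderPool, header_pool_release, POOL_SIZE, ...) is hypothetical. Payload chunks are assumed to be borrowed, as with the "subbuffers" many payloaders create.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_CHUNKS 4 /* hypothetical limit on chunks per buffer */
#define POOL_SIZE 8  /* hypothetical number of cached header buffers */

/* One memory chunk; a stand-in for a GstMemory. */
typedef struct {
    unsigned char *data;
    size_t size;
    bool writable; /* false while someone else may still use the chunk */
} Chunk;

/* Chunk 0 is the owned RTP header; later chunks are borrowed payload. */
typedef struct {
    Chunk chunks[MAX_CHUNKS];
    size_t n_chunks;
    bool memory_changed; /* stand-in for the MEMORY_TAG flag */
} PacketBuffer;

typedef struct {
    PacketBuffer *free_list[POOL_SIZE];
    size_t n_free;
    size_t header_size;
} HeaderPool;

static void header_pool_init(HeaderPool *pool, size_t header_size) {
    pool->n_free = 0;
    pool->header_size = header_size;
}

/* Reuse a cached buffer if there is one, otherwise allocate a new one. */
static PacketBuffer *header_pool_acquire(HeaderPool *pool) {
    if (pool->n_free > 0)
        return pool->free_list[--pool->n_free];

    PacketBuffer *buf = calloc(1, sizeof *buf);
    if (buf == NULL)
        return NULL;
    buf->chunks[0].data = malloc(pool->header_size);
    if (buf->chunks[0].data == NULL) {
        free(buf);
        return NULL;
    }
    buf->chunks[0].size = pool->header_size;
    buf->chunks[0].writable = true;
    buf->n_chunks = 1;
    return buf;
}

/* Attach a borrowed payload chunk after the header, without copying. */
static bool packet_buffer_append(PacketBuffer *buf, unsigned char *data,
                                 size_t size) {
    if (buf->n_chunks >= MAX_CHUNKS)
        return false;
    Chunk *c = &buf->chunks[buf->n_chunks++];
    c->data = data;
    c->size = size;
    c->writable = false;
    buf->memory_changed = true;
    return true;
}

static void packet_buffer_free(PacketBuffer *buf) {
    free(buf->chunks[0].data); /* payload chunks are borrowed, not freed */
    free(buf);
}

/* Returns true if the buffer was cached for reuse, false if it was freed. */
static bool header_pool_release(HeaderPool *pool, PacketBuffer *buf) {
    assert(buf != NULL && buf->n_chunks >= 1);

    buf->n_chunks = 1; /* drop every chunk except the header */

    /* Only recycle if the header still has the right size and is writable. */
    bool reusable = buf->chunks[0].size == pool->header_size &&
                    buf->chunks[0].writable && pool->n_free < POOL_SIZE;
    if (!reusable) {
        packet_buffer_free(buf);
        return false;
    }
    buf->memory_changed = false; /* clear the flag before reuse */
    pool->free_list[pool->n_free++] = buf;
    return true;
}

/* Free everything still cached in the pool. */
static void header_pool_clear(HeaderPool *pool) {
    while (pool->n_free > 0)
        packet_buffer_free(pool->free_list[--pool->n_free]);
}
```

A payloader built along these lines would acquire one header buffer per packet, append the payload chunk, push it, and release it once downstream is done with it. Whether this beats plain allocation in a real pipeline would still have to be measured.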
For the map, because validation in a payloader is useless, we could add a GST_RTP_MAP_CHECK_NOTHING flag (inspired by the naming of the pad check flags used to link without checks), passed as an extension of the GstMapFlags flag set.
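To illustrate the idea, here is a minimal sketch of a map function that skips validation on request. It is a standalone model in plain C, not the GStreamer implementation: RtpView, RtpMapFlags, rtp_view_map and RTP_MAP_CHECK_NOTHING are hypothetical names, the last one merely echoing the flag proposed above, and the checks shown (minimum length, version, CSRC list length) are only a subset of what a real validator would do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RTP_FIXED_HEADER_LEN 12 /* fixed RTP header size in bytes */
#define RTP_VERSION 2

/* Hypothetical flag set; models extending the map flags with a
 * "skip all checks" bit. */
typedef enum {
    RTP_MAP_DEFAULT = 0,
    RTP_MAP_CHECK_NOTHING = 1 << 0
} RtpMapFlags;

/* Hypothetical view of a mapped RTP packet. */
typedef struct {
    const uint8_t *data;
    size_t size;
    size_t header_len; /* fixed header plus CSRC list */
} RtpView;

/* Map a packet. With RTP_MAP_CHECK_NOTHING the caller promises that the
 * header is well formed (for example, a payloader that just wrote it),
 * so every validation step is skipped. */
static bool rtp_view_map(RtpView *view, const uint8_t *data, size_t size,
                         RtpMapFlags flags) {
    assert(view != NULL && data != NULL);

    bool trusted = (flags & RTP_MAP_CHECK_NOTHING) != 0;

    /* The packet must hold at least the fixed header. */
    if (!trusted && size < RTP_FIXED_HEADER_LEN)
        return false;

    /* The version lives in the top two bits of the first byte. */
    if (!trusted && (data[0] >> 6) != RTP_VERSION)
        return false;

    /* The low four bits of the first byte give the CSRC count. */
    size_t header_len = RTP_FIXED_HEADER_LEN + 4 * (size_t)(data[0] & 0x0f);
    if (!trusted && size < header_len)
        return false;

    view->data = data;
    view->size = size;
    view->header_len = header_len;
    return true;
}
```

The trusted path still has to read the CSRC count to locate the payload, but it performs no comparisons that can fail, which is the part a payloader does not need.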
Thank you all for the proposals ;).

@nicilas, in the case of video, several RTP buffers are usually generated per frame, so most of them have the size of the "mtu" property. In the case of audio, we could allocate bigger buffers to ensure that the RTP packets fit into them.

@olivier,
- Is there some element that already implements this?
- If not, where can I find some references or examples on using pools in the way you describe?
It probably makes sense to test whether using an mtu-sized buffer pool improves anything in the payloaders. The main problem here is that you then actually need to *copy* the payload instead of creating "subbuffers" as many payloaders do now. Or you could create a buffer pool of RTP-header-sized buffers and only attach the payloader "subbuffers" to them. This needs careful testing to see how well it works, and whether it actually improves things in general or makes them worse in other situations.

For the mapping, a GST_RTP_MAP_CHECK_NOTHING definitely makes sense in the payloaders. That probably improves performance a bit without adding any risk of making things worse.
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/gstreamer/gst-plugins-base/issues/218.