Bug 754189 - payloaders: memory performance
Status: RESOLVED OBSOLETE
Product: GStreamer
Classification: Platform
Component: gst-plugins-base
Version: 1.5.2
Hardware: Other Linux
Importance: Normal enhancement
Target Milestone: git master
Assigned To: GStreamer Maintainers
QA Contact: GStreamer Maintainers
Reported: 2015-08-27 16:55 UTC by Miguel París Díaz
Modified: 2018-11-03 11:40 UTC

Description Miguel París Díaz 2015-08-27 16:55:41 UTC
Hello,
I have profiled some payloaders and found that a significant share of CPU time is spent on memory management:
 1 - allocating and freeing RTP buffers
 2 - mapping/unmapping  RTP buffers

Possible solutions to improve efficiency:
 1 - Could we use some kind of buffer pool instead of allocating and freeing buffers again and again?
 2 - Is there a way to make mapping/unmapping of RTP buffers cheaper?


---- PROFILE REPORT ----

VP8 payloader:
gst-launch-1.5 videotestsrc num-buffers=100 ! vp8enc ! rtpvp8pay ! fakesink enable-last-sample=false

Percentages of usage (with gst_rtp_base_payload_chain as root):
100 - gst_rtp_base_payload_chain (ir per call: 15283)
  98.67 - gst_rtp_vp8_pay_handle_buffer
    7.29 - gst_buffer_copy_region
    4.18 - gst_buffer_append
    29.67 - gst_rtp_vp8_create_header_buffer
      17.94 - gst_rtp_buffer_new_allocate
        14.57 - gst_rtp_buffer_allocate_data
        3.16 - gst_buffer_new
    35.24 - gst_rtp_base_payload_push_list
      8.96 - gst_rtp_base_payload_prepare_push
        6.65 - set_headers
          4.24 - gst_rtp_buffer_map
          1.42 - gst_rtp_buffer_unmap
      22.41 - gst_base_sink_chain_list
        12.38 - gst_mini_object_unref
          7.13 - _gst_buffer_list_free
          3.94 - _gst_buffer_free

OPUS payloader
gst-launch-1.5 audiotestsrc num-buffers=1000 ! opusenc ! rtpopuspay ! fakesink enable-last-sample=false

Percentages of usage (with gst_rtp_base_payload_chain as root):
100 - gst_rtp_base_payload_chain
  96.62 - gst_rtp_opus_pay_handle_buffer
    8.38 - gst_buffer_append
    22.56 - gst_rtp_buffer_new_allocate
      16.83 - gst_rtp_buffer_allocate_data
      5.31 - gst_buffer_new
    63.42 - gst_rtp_base_payload_push
      16.40 - gst_rtp_base_payload_prepare_push
        12.98 - set_headers
          8.04 - gst_rtp_buffer_map
          2.92 - gst_rtp_buffer_unmap
      37.62 - gst_base_sink_chain
        20.98 - gst_mini_object_unref
          18.97 - _gst_buffer_free
Comment 1 Nicolas Dufresne (ndufresne) 2015-08-27 18:17:44 UTC
I remember a few patches that ensure the depayloaders only map the RTP buffer once. The same issue might be affecting the payloaders too. Patches are more than welcome.

About the allocation: if the allocated size is dynamic, there is very little you can optimize (it's a case-by-case thing). Some depayloaders allocate a bigger buffer and copy into it (which may or may not be the fastest way), and some allocate small chunks and append/prepend them. Any patches are welcome of course; it's nice that you are looking at this.
Comment 2 Olivier Crête 2015-08-27 20:36:33 UTC
You should be able to pool the buffers together with the GstMemory for the headers with a bit of care: drop every GstMemory except the first one, check that its size is right and that it is still writable, and then you can clear the MEMORY_TAG flag (GST_BUFFER_FLAG_TAG_MEMORY) on the buffers.

The other problem is that gst_rtp_buffer_map() does validation; unmap should be almost free.
Comment 3 Nicolas Dufresne (ndufresne) 2015-08-27 20:48:06 UTC
For the map, because validation in a payloader is useless, we could add a GST_RTP_MAP_CHECK_NOTHING flag (named after the pad link check flags that allow linking without checks), passed as an extension of the GstMapFlags flag set.
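
The idea of a map call that can skip validation can be sketched in plain C, independent of GStreamer. This is only an illustration under stated assumptions: `RtpView`, `rtp_view_map` and the two flag names are hypothetical and do not exist in the GStreamer API. The checks shown (minimum size, version 2, room for the CSRC list) follow the RTP fixed header layout and are not a copy of what gst_rtp_buffer_map() actually validates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RTP_FIXED_HEADER_LEN 12

/* Hypothetical flags for this sketch; not part of the GStreamer API. */
enum {
  RTP_MAP_CHECK_DEFAULT = 0,
  RTP_MAP_CHECK_NOTHING = 1 << 0
};

/* Hypothetical lightweight view onto an RTP packet. */
typedef struct {
  const uint8_t *data;  /* start of the packet */
  size_t size;          /* total packet size in bytes */
  size_t header_len;    /* fixed header plus CSRC list */
} RtpView;

/* Returns 1 on success, 0 if validation fails.
 * With RTP_MAP_CHECK_NOTHING the caller promises that the packet is
 * well formed (for example because it has just built it) and at least
 * 12 bytes long, so every check is skipped. */
static int
rtp_view_map (const uint8_t *data, size_t size, unsigned flags, RtpView *view)
{
  assert (data != NULL && view != NULL);

  if (!(flags & RTP_MAP_CHECK_NOTHING)) {
    if (size < RTP_FIXED_HEADER_LEN)
      return 0;                       /* too short for a fixed header */
    if ((data[0] >> 6) != 2)
      return 0;                       /* version field must be 2 */
    if (size < RTP_FIXED_HEADER_LEN + 4 * (size_t) (data[0] & 0x0f))
      return 0;                       /* CSRC list does not fit */
  }

  view->data = data;
  view->size = size;
  view->header_len = RTP_FIXED_HEADER_LEN + 4 * (size_t) (data[0] & 0x0f);
  return 1;
}
```

A payloader that has just written the header itself would pass the skip flag, while a depayloader handling data from the network would keep the default checks.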
Comment 4 Miguel París Díaz 2015-08-29 11:31:21 UTC
Thank you all for the proposals ;).

@nicolas,
In the case of video, several RTP buffers are usually generated per frame, so most of them have the size of the "mtu" property.
In the case of audio, we could allocate bigger buffers to ensure that the RTP packets fit into them.

@olivier,
  - Is there some element that already implements this?
  - If not, where can I find some references or examples that use pools in the way you describe?
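
The remark above that most video packets end up mtu-sized follows from simple arithmetic, sketched here in plain C. The assumptions are a 12-byte RTP header with no CSRC entries or extensions and no codec-specific payload descriptor (a real VP8 payloader adds one, which is ignored here). `FragmentPlan` and `plan_fragments` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define RTP_HEADER_LEN 12

/* Hypothetical summary of how one encoded frame splits into packets. */
typedef struct {
  size_t packets;       /* total packets for the frame */
  size_t full_packets;  /* packets that are exactly mtu bytes long */
  size_t last_size;     /* size in bytes of the final packet */
} FragmentPlan;

static FragmentPlan
plan_fragments (size_t frame_size, size_t mtu)
{
  FragmentPlan plan = { 0, 0, 0 };
  size_t max_payload, rest;

  assert (mtu > RTP_HEADER_LEN);
  if (frame_size == 0)
    return plan;

  max_payload = mtu - RTP_HEADER_LEN;
  /* Round up: a partial final chunk still needs its own packet. */
  plan.packets = (frame_size + max_payload - 1) / max_payload;
  rest = frame_size % max_payload;
  plan.full_packets = (rest == 0) ? plan.packets : plan.packets - 1;
  plan.last_size = RTP_HEADER_LEN + ((rest == 0) ? max_payload : rest);
  return plan;
}
```

For a 10000-byte frame and an mtu of 1400, this gives 8 packets, of which 7 are exactly 1400 bytes and the last one is 296 bytes, so a pool of mtu-sized buffers would match all but the final packet of each frame.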
Comment 5 Sebastian Dröge (slomo) 2015-08-31 08:06:13 UTC
It probably makes sense to test whether an mtu-sized buffer pool improves anything in the payloaders. The main problem here is that you then actually need to *copy* the payload instead of creating "subbuffers" as many payloaders do now. Or you could create a buffer pool of RTP-header-sized buffers and only attach the payload "subbuffers" to them.

This needs careful testing to see how well it works and if it actually improves something in general or makes things worse in other situations.


For the mapping, a GST_RTP_MAP_CHECK_NOTHING definitely makes sense in the payloaders. That probably improves performance a bit without adding any risk of making things worse.
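
The second pool variant mentioned above (pooled header-sized buffers with the payload attached by reference rather than copied) can be sketched in plain C. This is a conceptual illustration only: it does not use GstBufferPool or GstMemory, and `HeaderPool`, `HeaderBuf`, `Packet` and the functions below are hypothetical names. It also assumes single-threaded use and a fixed pool size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HEADER_LEN 12
#define POOL_SIZE 4

/* One reusable header buffer; 'next' is only used while it sits in the pool. */
typedef struct HeaderBuf {
  uint8_t bytes[HEADER_LEN];
  struct HeaderBuf *next;
} HeaderBuf;

/* Fixed-size pool backed by a LIFO free list. */
typedef struct {
  HeaderBuf slots[POOL_SIZE];
  HeaderBuf *free_list;
} HeaderPool;

/* A packet is a pooled header plus a borrowed slice of the encoded frame. */
typedef struct {
  HeaderBuf *header;       /* owned, must go back to the pool */
  const uint8_t *payload;  /* points into the frame, never copied */
  size_t payload_len;
} Packet;

static void
header_pool_init (HeaderPool *pool)
{
  size_t i;

  pool->free_list = NULL;
  for (i = 0; i < POOL_SIZE; i++) {
    pool->slots[i].next = pool->free_list;
    pool->free_list = &pool->slots[i];
  }
}

/* Returns NULL when the pool is exhausted; a caller could then fall back
 * to a normal allocation. */
static HeaderBuf *
header_pool_acquire (HeaderPool *pool)
{
  HeaderBuf *buf = pool->free_list;

  if (buf != NULL)
    pool->free_list = buf->next;
  return buf;
}

static void
header_pool_release (HeaderPool *pool, HeaderBuf *buf)
{
  assert (buf != NULL);
  buf->next = pool->free_list;
  pool->free_list = buf;
}

/* Builds a packet for frame[offset .. offset + len) without copying it.
 * Returns 1 on success, 0 if no header buffer is available. */
static int
packet_init (Packet *pkt, HeaderPool *pool, const uint8_t *frame,
    size_t offset, size_t len)
{
  pkt->header = header_pool_acquire (pool);
  if (pkt->header == NULL)
    return 0;

  memset (pkt->header->bytes, 0, HEADER_LEN);
  pkt->header->bytes[0] = 0x80;  /* RTP version 2, no padding, no CSRC */
  pkt->payload = frame + offset;
  pkt->payload_len = len;
  return 1;
}
```

Once a packet has been consumed downstream, handing its header back with header_pool_release() makes the same storage available to the next packet, so no allocation happens in steady state. Whether this is a net win in a real pipeline still needs the careful testing asked for above.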
Comment 6 GStreamer system administrator 2018-11-03 11:40:53 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/gstreamer/gst-plugins-base/issues/218.