GNOME Bugzilla – Bug 792825
rtph264pay: add support for STAP-A bundling
Last modified: 2018-11-03 15:25:41 UTC
Created attachment 367317 [details] [review] Patch implementing STAP-A for the RTP H.264 payloader RFC 6184 (https://tools.ietf.org/html/rfc6184#section-5.7) describes how to aggregate (bundle) small RTP packets (typically SPS/PPS); this can improve bandwidth in some cases. The rtph264depay already supports depayloading such bundled packets, but so far the payloader does not rtph264pay does not; attached patch attempts to change that. Some notes: - By default, no aggregation is done, but there's a property "do-aggregate" now to enable this - I've attempted to change the existing payloader as little as possible Patch is against git HEAD.
Nice!
Review of attachment 367317 [details] [review]: I really like the idea of this patchset and I wanted to have this for years! ::: gst/rtp/gstrtph264pay.c @@ +71,3 @@ #define DEFAULT_SPROP_PARAMETER_SETS NULL #define DEFAULT_CONFIG_INTERVAL 0 +#define DEFAULT_DO_AGGREGATE FALSE I think this should default to TRUE as I think it's almost always the best behaviour. My only worry is that it will break interop with lesser RTP stacks. @@ -990,0 +1039,53 @@ + +static void +gst_rtp_h264_reset_bundle (GstRtpH264Pay * rtph264pay) ... 50 more ... Don't leave code between "#if 0", you can use GST_CAT_MEMDUMP() or if (gst_debug_category_get_threshold (GST_CAT_DEFAULT) > GST_LEVLE_MEMDUMP) { and then print top the memdump level using the macros).
Created attachment 367569 [details] [review] Patch implementing STAP-A for the RTP H.264 payloader Thanks for the quick review! Here's an updated version of the patch, with the debug code removed and "do-aggregate" enabled by default. It's a bit hard to estimate how risky it is to enable by default; I haven't seen any problems, but who knows what's out there...
Is anything else needed for this to get merged? Thanks!
Not a requirement, but It Would Be Nice (tm) if there was a unit test addition to go along with it - just something simple that runs through the new code path, so that it gets valgrinded etc.
It seems tests/check/elements/rtp-payloading.c is already testing this (since bundling is on by default). Haven't figured out how to run with valgrind though; it seems that rtp-payloading.c is excluded ('VALGRIND_TESTS_DISABLE')
Right, fair enough :) $ cd tests/check/ $ make elements/rtp-payloading.check (run first to create registry, valgrind barfs on the forking iirc) $ make elements/rtp-payloading.valgrind Not sure why it's disabled. I get a caps leak, at least in the 1.14 branch (without this patch), but otherwise it seems clean.
With master (and with the patch) I see two cases of 'definitely lost' (see below), but they are in the _de_payloader, rather than the payloader. Running suite(s): rtp_data_test ==11909== 279 (72 direct, 207 indirect) bytes in 1 blocks are definitely lost in loss record 2,111 of 2,156 ==11909== at 0x4C2DBAB: malloc (vg_replace_malloc.c:299) ==11909== by 0x6B9A2B5: g_malloc (gmem.c:99) ==11909== by 0x6BB1E46: g_slice_alloc (gslice.c:1025) ==11909== by 0x5C5CD3A: gst_caps_new_empty (gstcaps.c:253) ==11909== by 0x5C5CF18: gst_caps_new_simple (gstcaps.c:326) ==11909== by 0x84DFD8D: gst_rtp_h264_set_src_caps (gstrtph264depay.c:324) ==11909== by 0x84E1873: gst_rtp_h264_depay_setcaps (gstrtph264depay.c:738) ==11909== by 0x875E83C: gst_rtp_base_depayload_setcaps (gstrtpbasedepayload.c:289) ==11909== by 0x875E83C: gst_rtp_base_depayload_handle_event (gstrtpbasedepayload.c:571) ==11909== by 0x5C8C136: gst_pad_send_event_unchecked (gstpad.c:5756) ==11909== by 0x5C8C62F: gst_pad_push_event_unchecked (gstpad.c:5409) ==11909== by 0x5C8CAD5: push_sticky (gstpad.c:3932) ==11909== by 0x5C8A6E7: events_foreach (gstpad.c:612) ==11909== by 0x5C959E6: check_sticky (gstpad.c:3992) ==11909== by 0x5C959E6: gst_pad_push_event (gstpad.c:5543) ==11909== by 0x875CD52: gst_pad_set_caps (gstcompat.h:59) ==11909== by 0x875CD52: gst_rtp_base_payload_negotiate (gstrtpbasepayload.c:1057) ==11909== by 0x875D427: gst_rtp_base_payload_chain (gstrtpbasepayload.c:627) ==11909== by 0x5C8E031: gst_pad_chain_data_unchecked (gstpad.c:4334) ==11909== by 0x5C8E031: gst_pad_push_data (gstpad.c:4590) ==11909== by 0x5C94A01: gst_pad_push (gstpad.c:4709) ==11909== by 0x5C7B45A: gst_proxy_pad_chain_default (gstghostpad.c:127) ==11909== by 0x5C8E031: gst_pad_chain_data_unchecked (gstpad.c:4334) ==11909== by 0x5C8E031: gst_pad_push_data (gstpad.c:4590) ==11909== by 0x5C94A01: gst_pad_push (gstpad.c:4709) ==11909== ==12298== 9,488 (816 direct, 8,672 indirect) bytes in 3 blocks are definitely lost in loss record 2,653 of 2,658 ==12298== at 0x4C2DBAB: malloc (vg_replace_malloc.c:299) ==12298== by 0x6B9A2B5: g_malloc (gmem.c:99) ==12298== by 0x6BB1E46: g_slice_alloc (gslice.c:1025) ==12298== by 0x5C53A4A: gst_buffer_new (gstbuffer.c:801) ==12298== by 0x5C557B4: gst_buffer_copy_with_flags (gstbuffer.c:662) ==12298== by 0x5C88477: gst_mini_object_make_writable (gstminiobject.c:315) ==12298== by 0x9049288: set_headers (gstrtpbasedepayload.c:716) ==12298== by 0x9049C6E: gst_rtp_base_depayload_prepare_push (gstrtpbasedepayload.c:754) ==12298== by 0x904A218: gst_rtp_base_depayload_push (gstrtpbasedepayload.c:785) ==12298== by 0x8DF4C35: gst_rtp_vorbis_depay_switch_codebook (gstrtpvorbisdepay.c:436) ==12298== by 0x8DF4C35: gst_rtp_vorbis_depay_process (gstrtpvorbisdepay.c:513) ==12298== by 0x904A624: gst_rtp_base_depayload_handle_buffer.isra.3 (gstrtpbasedepayload.c:429) ==12298== by 0x5C8E031: gst_pad_chain_data_unchecked (gstpad.c:4334) ==12298== by 0x5C8E031: gst_pad_push_data (gstpad.c:4590) ==12298== by 0x5C94A01: gst_pad_push (gstpad.c:4709) ==12298== by 0x8DF58EF: gst_rtp_vorbis_pay_flush_packet (gstrtpvorbispay.c:341) ==12298== by 0x8DF5E33: gst_rtp_vorbis_pay_payload_buffer (gstrtpvorbispay.c:635) ==12298== by 0x8DF6FA9: gst_rtp_vorbis_pay_handle_buffer (gstrtpvorbispay.c:876) ==12298== by 0x5C8E031: gst_pad_chain_data_unchecked (gstpad.c:4334) ==12298== by 0x5C8E031: gst_pad_push_data (gstpad.c:4590) ==12298== by 0x5C94A01: gst_pad_push (gstpad.c:4709) ==12298== by 0x4E73B2B: gst_audio_encoder_finish_frame (gstaudioencoder.c:991) ==12298== by 0x84BE335: gst_vorbis_enc_output_buffers (gstvorbisenc.c:992) ==12298== 100%: Checks: 49, Failures: 0, Errors: 0 Check suite rtp_payloading ran in 79.662s (tests failed: 0)
Bundling is required when sending a H.264 RTP stream through Janus to Chrome on MacOS or Android, as otherwise decoding will not start. Unfortunately, the simple SPS/PPS bundling here is not sufficient. I had already written my own much more aggressive bundling. I've merged the two implementations and will post patches shortly. Our stream now looks identical to the one sent by FFmpeg 4.0, except we do proper nal_ref_idc propagation. :-)
Created attachment 372863 [details] [review] rtph264pay: Cleanup in preparation of STAP-A bundling - Code refactoring. - Fix off-by-one error in MTU fitting. - Simplify the fragmentation loop.
Created attachment 372864 [details] [review] rtph264pay: Support STAP-A bundling Add a new property "do-aggregate"* to the H.264 RTP payloader which enables STAP-A aggregation as per [RFC-6184][1]. With aggregation enabled, packets are bundled instead of sent immediately, up until the MTU size. Bundles also end at access unit boundaries or when packets have to be fragmented. *: The property-name is kept generic since it might apply more widely, e.g. STAP-B or MTAP. [1]: https://tools.ietf.org/html/rfc6184#section-5.7
Created attachment 372885 [details] [review] rtph264pay: Support STAP-A bundling Send the queued bundle in more cases, like EOS and discont. Fixes bundling for byte-stream input.
Review of attachment 372885 [details] [review]: ::: gst/rtp/gstrtph264pay.c @@ +1088,3 @@ + GST_WRITE_UINT16_BE (size_header, gst_buffer_get_size (buf)); + gst_buffer_append_memory (outbuf, + gst_memory_new_wrapped (0, size_header, 2, 0, 2, size_header, g_free)); Just use gst_memory_new_allocate() instead, you'll have one allocation instead of 2. @@ +1127,3 @@ + if (ret != GST_FLOW_OK) + goto out; + } You may also want to send the bundle if the timestamp changes. The AUD is only present in byte-stream mode, but generally not in AVC1/3 modes.
Created attachment 372942 [details] [review] rtph264pay: Cleanup in preparation of STAP-A bundling More changes: - Split sending function into prechecks, single packet and fragmentation. Bundling is called after SPS/PPS injection while bundling only calls fragmentation or single packet sending. This unbreaks SPS/PPS injection in the presence of bundling. - Proper discont/delta-unit handling when injecting SPS/PPS. This should also work better with the checks in the bundling.
Created attachment 372943 [details] [review] rtph264pay: Support STAP-A bundling Reworked the code that handles sending the bundle before the current NALU. rtph264pay->bundle_size is now the total bundle size, including header. Use only one allocation for the size header memory. Only call the fragmentation code when we have a packet that can't be bundled.
Review please?
I only noticed this patch last week. Though it's in my to-do.
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/gstreamer/gst-plugins-good/issues/434.