GNOME Bugzilla – Bug 722370
rtprtxsend: push rtx buffers from a different thread to avoid long retransmission delays
Last modified: 2014-01-21 14:02:05 UTC
This is a follow-up on https://bugzilla.gnome.org/show_bug.cgi?id=711084 for one of the patches that were not commited. The problem essentially is that currently rtprtxsend pushes retransmission buffers from its chain() function, which requires chain() to be called, i.e. there need to be some "master stream" buffers ready to be sent in order to send rtx buffers for older requests. This introduces a retransmission delay that can be critical for stream reconstruction on the receiver end and is unnecessary. This patch attempts to fix this by having a separate thread that pushes retransmission buffers as soon as they are requested. On this attempt, the new thread only pushes retransmission buffers, as Wim mentioned it should in https://bugzilla.gnome.org/show_bug.cgi?id=711084#c68. Also, some unit tests that were fixed in order to have a deterministic output and not fail at random due to the new thread. Please note that there are still two unit tests failing at random: rtprtx::test_drop_multiple_sender and rtprtx::test_drop_one_sender. These unit tests are quite complex and weird, but despite having analyzed them fully, I am unable to find a reason for their failure. It looks like some rtx buffers are lost between rtprtxsend and rtprtxreceive in their pipelines, sometimes. At this point, I'm all for removing them, since they do not offer much in my opinion. The functionality is more or less also tested in the rest of the tests, which work fine. The code can be found here: http://cgit.collabora.com/git/user/gkiagia/gst-plugins-good.git/log/?h=rtxsend-new-thread
(In reply to comment #0) > Please note that there are still two unit tests failing at random: > rtprtx::test_drop_multiple_sender and rtprtx::test_drop_one_sender. These unit > tests are quite complex and weird, but despite having analyzed them fully, I am > unable to find a reason for their failure. It looks like some rtx buffers are > lost between rtprtxsend and rtprtxreceive in their pipelines, sometimes. At > this point, I'm all for removing them, since they do not offer much in my > opinion. The functionality is more or less also tested in the rest of the > tests, which work fine. Please don't remove the tests, the reason why they fail in this case is because they reveal a bug in the elements. The EOS event jumps over the RTX packets and causes the pipeline to stop early and racily. The rtx sender should probably place the EOS event in the queue and only push it forward when the queue is drained.
(In reply to comment #1) > Please don't remove the tests, the reason why they fail in this case is because > they reveal a bug in the elements. The EOS event jumps over the RTX packets and > causes the pipeline to stop early and racily. The rtx sender should probably > place the EOS event in the queue and only push it forward when the queue is > drained. Thanks for the hint. I had actually thought of something going wrong with the EOS event, so I tried to drop all EOS events from the pad probe that is installed right after the queue element and manually stop the pipeline a few seconds after the last expected EOS event was received (there are 4 sources, so 4 EOS events), but that didn't really help. But anyway, I tried to implement your suggestion and it seems that if I send the EOS event from the srcpad task *and* flush the queue afterwards so that no more rtx buffers can be sent, then it works. The patch is in the same branch.
commit 1a300eb50967e88ad73ffeeb44a84e268d4705b8 Author: George Kiagiadakis <george.kiagiadakis@collabora.com> Date: Tue Jan 21 13:42:38 2014 +0100 rtprtxsend: ensure that no rtx buffers are sent after EOS To do that, enqueue the EOS event to be sent from the srcpad task thread and flush the queue right afterwards, so that no more rtx buffers can be sent, even if there are more requests coming in. Fixes https://bugzilla.gnome.org/show_bug.cgi?id=722370 commit 133913a11ae35656797a7101ad0e01d484bea46b Author: George Kiagiadakis <george.kiagiadakis@collabora.com> Date: Wed Jan 15 09:46:14 2014 +0100 rtprtxsend: run a new GstTask on the src pad The reason behind this is to minimize the retransmission delay. Previously, when a NACK was received, rtprtxsend would put a retransmission packet in a queue and it would send it from chain(), i.e. only after a new buffer would arrive. This unfortunately was causing big delays, in the order of 60-100 ms, which can be critical for the receiver side. By having a separate GstTask for pushing buffers out of rtxsend, we can push buffers out right after receiving the event, without waiting for chain() to get called.
pushed the whole branch