GNOME Bugzilla – Bug 765933
rtpjitterbuffer: Fix stall when receiving already lost packet
Last modified: 2016-05-11 06:45:36 UTC
Created attachment 327210 [details] [review] test and fix When a packet arrives that has already been considered lost as part of a large gap the "lost timer" for this will be cancelled. If the remaining packets of this large gap never arrives, there will be missing entries in the queue and the loop function will keep waiting for these packets to arrive and never push another packet, effectively stalling the pipeline. The proposed fix consideres parts of a large gap definitely lost (since they are calculated from latency) and ignores the late arrivals. In practice the issue is rare since large gaps are sceduled immediately, and for the stall to happen the late arrival needs to be processed before this times out.
Review of attachment 327210 [details] [review]: Makes sense to me ::: gst/rtpmanager/gstrtpjitterbuffer.c @@ +1996,3 @@ + + if (test->num > 1 && test->type == TIMER_TYPE_LOST && + seqnum >= test->seqnum && seqnum < (test->seqnum + test->num)) { Shouldn't this take extended seqnums or somehow care for wraparounds?
Indeed. Will write an additional test and update the patch.
Created attachment 327228 [details] [review] Updated test and fix Updated the patch to properly handle seqnum wraparound. However, it was by chance working with the previous patch as well since it turns out the stall only happens when the first packet of a large gap arrives late, and for the first packet the wraparound is not an issue. For the other packets the lost event will actually be sent, and the late packets will be considered old when the loop function handles them and thus dropped. Anyway, the updated patch includes tests with wraparound and handles this case properly (and not by chance) in the jitterbuffer.
Comment on attachment 327228 [details] [review] Updated test and fix Patch looks good but a) the test part does not apply (we don't have a test_latency_changed_event test in master yet) and b) it times out here :) 72%: Checks: 22, Failures: 0, Errors: 6 elements/rtpjitterbuffer.c:1445:E:general:test_considered_lost_packet_in_large_gap_arrives:0: (after this point) Test timeout expired elements/rtpjitterbuffer.c:1445:E:general:test_considered_lost_packet_in_large_gap_arrives:1: (after this point) Test timeout expired [...]
(In reply to Sebastian Dröge (slomo) from comment #4) > Comment on attachment 327228 [details] [review] [review] > Updated test and fix > > Patch looks good but a) the test part does not apply (we don't have a > test_latency_changed_event test in master yet) and b) it times out here :) > > 72%: Checks: 22, Failures: 0, Errors: 6 > elements/rtpjitterbuffer.c:1445:E:general: > test_considered_lost_packet_in_large_gap_arrives:0: (after this point) Test > timeout expired > elements/rtpjitterbuffer.c:1445:E:general: > test_considered_lost_packet_in_large_gap_arrives:1: (after this point) Test > timeout expired > [...] I will rebase on latest master and investigate. (we are a bit out of sync ATM)
Created attachment 327364 [details] [review] rebased and updated test and fix Also fixed up a few spelling-errors
Comment on attachment 327364 [details] [review] rebased and updated test and fix Applies and does not timeout anymore but fails: Running suite(s): rtpjitterbuffer 81%: Checks: 22, Failures: 4, Errors: 0 elements/rtpjitterbuffer.c:482:F:general:test_considered_lost_packet_in_large_gap_arrives:2: 'expected_seqnum' (65536) is not equal to 'seqnum' (0) elements/rtpjitterbuffer.c:482:F:general:test_considered_lost_packet_in_large_gap_arrives:3: 'expected_seqnum' (65536) is not equal to 'seqnum' (0) elements/rtpjitterbuffer.c:482:F:general:test_considered_lost_packet_in_large_gap_arrives:4: 'expected_seqnum' (65537) is not equal to 'seqnum' (1) elements/rtpjitterbuffer.c:482:F:general:test_considered_lost_packet_in_large_gap_arrives:5: 'expected_seqnum' (65537) is not equal to 'seqnum' (1)
Created attachment 327369 [details] [review] updated (again) test and fix
commit 8f7962e1c3178bb6f84c7d9d06e3172ae7dfd273 Author: Havard Graff <havard.graff@gmail.com> Date: Tue May 3 11:45:01 2016 +0200 rtpjitterbuffer: Fix stall when receiving already lost packet When a packet arrives that has already been considered lost as part of a large gap the "lost timer" for this will be cancelled. If the remaining packets of this large gap never arrives, there will be missing entries in the queue and the loop function will keep waiting for these packets to arrive and never push another packet, effectively stalling the pipeline. The proposed fix conciders parts of a large gap definitely lost (since they are calculated from latency) and ignores the late arrivals. In practice the issue is rare since large gaps are scheduled immediately, and for the stall to happen the late arrival needs to be processed before this times out. https://bugzilla.gnome.org/show_bug.cgi?id=765933