GNOME Bugzilla – Bug 785733
rtpbin: ntp-sync does no sanity checking for clock sync
Last modified: 2017-09-15 10:34:33 UTC
The ntp-sync parameter of rtpbin synchronizes received streams based on the assumption that sender and receiver clocks are in sync. If they are not, an arbitrarily large time offset can be applied to each received buffer. Since NTP can take quite some time before all clocks are in sync, this leads to corrupt timestamps before all clocks have 'settled' in. A sanity check of the calculated time offset is thus needed. If to large it would be better to simply skip the offset. Furthermore, currently a positive time offset might be applied. But if a positive offset is calculated it must mean that the clocks are not in sync. A buffer will obviously always arrive 'late' at the receiver, so to get its creation time we need to remove, not add time. A timestamp offset sanity check could be if (ts_offset < 0 && > -SOME_LIMIT) apply offset else skip offset When synchronizing streams without ntp the rtpbin currently does a sanity check and it will not apply an offset larger than 3 sec.
The 3s limit was intentionally disabled here: commit 796c1d80292fb049b6b92c086a5b01dc0c2de6dc Author: Wim Taymans <wim.taymans@collabora.co.uk> Date: Wed Oct 17 12:24:22 2012 +0200 rtpbin: disable check for ntp-sync Disable the check for the ntp-sync method. It is expected that a rather larger offset needs to be applied with this method. Maybe there should be a configurable limit instead? I'm currently following the code to understand a bit more why / what / when these things will happen and what would make sense.
The limit was removed in relation to the "rtcp-sync-send-time" property IIRC. The offset can be rather large depending on latency, and whether the times in the RTCP packets are based on sending or capture time (of the RTP packets). But I'll follow up with more later.
Note, when I say "rendered" below this ignores the latency configured on the pipeline. It only considers the running time of the packets at the receiver, in relation to its clock time. Correct me if I'm wrong, but basically the ts-offset here is the offset we have to apply between the correct synchronization we now know via RTCP (this RTP time was supposed to be rendered at this NTP time) and what we did before the first RTCP packet. Before the first RTCP packet we would render it as soon as possible, so basically at the time it arrived. The ts-offset is always related to the very first packet that was rendered, as that defines what our running-time/clock-time/base-time relationship is, and the RTCP packets would tell us how to correct for the error we introduced there. For negative offsets that means that we rendered the first packet later than we should have, which seems like how it is expected to be. Simply said, the first packet is usually rendered $network_latency later than it was sent from the sender, so we have to subtract at least the network latency (which might be higher than 3s, especially if there are further processing steps or if the RTP/NTP timestamp relationship in the RTCP packets is based on the capture time instead of the sending time). Of course the receiver would have to have enough latency configured in the pipeline to be able to handle this. For positive offsets that would mean that we rendered the first packet earlier than it should've been. And I think you're right here, this would mean that we rendered a packet before it was sent (if RTCP timestamps are based on sending time) or even captured :) So ignoring positive offsets seems reasonable, ignoring big negative offsets not so much. What I think should happen here is that a) positive offsets are ignored for ntp-sync as you suggest, and b) negative offsets are ignored if they're higher than the value set via a new property (which defaults to "infinity" for now I guess). What do you think?
I agree, setting this value via a property would be the best solution since latency is different from case to case. I will provide a patch with this new property with infinity as default. Before I do it might be good to agree on the name though since it will be used in many places :) How about, max-ntp-sync-ts-offset ?
Hm but ideally that value would also be used for the non-ntp-sync case. But that's tricky to handle then, it would have to have different default values based on the other property (i.e. you need to remember if it was ever set to something). Name seems good otherwise
Ok, so it should look like this then? 1) new property max-ts-offset 2) if !ntp-sync default max-ts-offset = 3s else default max-ts-offset = 0 (infinity) 3) if max-ts-offset 'has been set by user' max-ts-offset = user-value 4) never allow ts_offset > 0
1-3) yes, 4) not sure. Can a positive ts-offset happen/make sense for the non-ntp-sync case? How does it align the streams? IIRC it takes their offsets to the latest stream and then adds a *positive* offset to all others (making them as late as the latest stream), or do I misremember?
You are correct, it even says so in one of the comments of the code. So (4) only applies for the ntp sync case then. I think I have enough now to make a patch for you to look at.
Great, thanks :)
Created attachment 359582 [details] [review] Proposed patch according to solution in comment #6-8
Review of attachment 359582 [details] [review]: Please explain the sanity checking in the commit message already (which cases are handled, and why are those invalid?) Also this changes behaviour in rtpbin. We would now always have the 3s limit, even if ntp-sync is enabled. Not when using rtspsrc as it works around that, but when using only rtpbin. ::: gst/rtpmanager/gstrtpbin.c @@ +1593,3 @@ + stream_set_ts_offset (bin, ostream, ts_offset, bin->max_ts_offset, + 4 * GST_MSECOND, TRUE); Where do these 4ms come from? Maybe make them a #define at the top
I only implemented the ntp-sync default behaviour in rtspsrc since I thought it was the only user of this rtpbin feature. Are there other rtpbin/ntp-sync users? If there are I could add the same logic in rtpbin as in rtspsrc and then only set max-ts-offset from rtspsrc if it differs from the value in rtpbin (since it has side affects) The 4 ms comes from the original code, and was only applied to the non ntp-sync case. Why, I don't know but I kept it to not change the behaviour. I can add a define for it.
(In reply to Patrick Radizi from comment #12) > I only implemented the ntp-sync default behaviour in rtspsrc since I thought > it was the only user of this rtpbin feature. Are there other rtpbin/ntp-sync > users? Every application using rtpbin directly and setting that property. At least for myself I know that I did this a few times in the past. > If there are I could add the same logic in rtpbin as in rtspsrc and then > only set max-ts-offset from rtspsrc if it differs from the value in rtpbin > (since it has side affects) Ack > The 4 ms comes from the original code, and was only applied to the non > ntp-sync case. Why, I don't know but I kept it to not change the behaviour. > I can add a define for it. Yeah, just add a #define for that and put the comment (the stuff about NTP/RTP timestamp rounding errors) next to the #define :)
Created attachment 359830 [details] [review] max-ts-offset patch Updated patch according to comment #13
commit 3de02445320fcf880bc57f93e2abc34e4b28ea58 (HEAD -> master) Author: Patrick Radizi <patrickr@axis.com> Date: Thu Sep 14 13:00:56 2017 +0200 rtpbin: add option for sanity checking timestamp offset Timestamp offsets needs to be checked to detect unrealistic values caused for example by NTP clocks not in sync. The new parameter max-ts-offset lets the user decide an upper offset limit. There are two different cases for checking the offset based on if ntp-sync is used or not: 1) ntp-sync enabled Only negative offsest are allowed since a positive offset would mean that the sender and receiver clocks are not in sync. Default vaule of max-ts-offset = 0 (disabled) 2) ntp-sync disabled Both positive and negative offsets are allowed. Default vaule of max-ts-offset = 3000000000 The reason for different default values is to be backwards compatible. https://bugzilla.gnome.org/show_bug.cgi?id=785733