GNOME Bugzilla – Bug 640859
basesink incorrectly categorizes timestamp jitter as drift
Last modified: 2014-12-23 11:28:10 UTC
Created attachment 179546 [details] [review] baseaudiosink: increase drift-tolerance to 100ms We (Nokia) found a mka vorbis clip with a drift of up to 90ms. All tested players (VLC, MPlayer, ffplay, Winamp, WMP, Android 2.3) play this clip just fine... Perhaps we should too. The attached patch should fix the issue.
Default is 40ms and thats a lot of 'drift' already (feel free to push the comment fix already). Wonder if we would only 'fix' that clip by chance with such a change. Please all add the clip here (and/or the tiemstamp log).
Comment on attachment 179546 [details] [review] baseaudiosink: increase drift-tolerance to 100ms >Subject: [PATCH] baseaudiosink: increase drift-tolerance to 100ms > > ... >Note that this value was 500ms previously. Hrm?
(In reply to comment #2) > (From update of attachment 179546 [details] [review]) > >Subject: [PATCH] baseaudiosink: increase drift-tolerance to 100ms > > > > ... > >Note that this value was 500ms previously. > > Hrm? http://cgit.freedesktop.org/gstreamer/gst-plugins-base/commit/?id=d8942e2 - /* we tollerate half a second diff before we start resyncing. This - * should be enough to compensate for various rounding errors in the timestamp - * and sample offset position. We always resync if we got a discont anyway and - * non-discont should be aligned by definition. */ - if (G_LIKELY (diff < ringbuf->spec.rate / DIFF_TOLERANCE)) { Note that DIFF_TOLERANCE is 2, and it was never removed. Anyway, you have increased it in the past: http://cgit.freedesktop.org/gstreamer/gst-plugins-base/commit/?id=848a7f2
(In reply to comment #1) > Default is 40ms and thats a lot of 'drift' already (feel free to push the > comment fix already). Wonder if we would only 'fix' that clip by chance with > such a change. Please all add the clip here (and/or the tiemstamp log). So, if virtually every player out there plays this clip (at least all the ones I've tried, and I've tried a lot), that means they all are wrong? GStreamer is the only one right? I don't think many people out there would agree with that.
Created attachment 179630 [details] Timestamp log Log showing the stored timestamp, the expected one, and the drift.
Could you provide the output of: gst-launch-0.10 -v filesrc location=... ! matroskademux ! identity name=beforedec ! vorbisdec ! fakesink Better yet if you can provide the clip. The problem most likely comes from elsewhere, this is just the visible part of the issue.
(In reply to comment #6) > Could you provide the output of: > gst-launch-0.10 -v filesrc location=... ! matroskademux ! identity > name=beforedec ! vorbisdec ! fakesink > > Better yet if you can provide the clip. > > The problem most likely comes from elsewhere, this is just the visible part of > the issue. From where? matroskademux? Anyway, here it is, I hope there are no issues with me posting it: http://people.freedesktop.org/~felipec/test.mka
(In reply to comment #4) > So, if virtually every player out there plays this clip (at least all the ones > I've tried, and I've tried a lot), that means they all are wrong? GStreamer is > the only one right? I don't think many people out there would agree with that. Have you considered the possibility that it's not about being right or wrong, and that both in fact may be "right", albeit approaching matters with a different focus, use-cases etc, and that in GStreamer's case it is at least configurable. In particular, this clip (or any one with broken [*] input timestamps), along with the fact that tolerance may have been 500ms at one point in time only suggests that one (= application) might consider increasing the tolerance to 100ms or 500ms or for that matter "infinity" if it really wants to support playback of broken _audio-only_ clips. Other considerations and available facts, such as all other non-broken stuff playing just fine with 40ms (= default) and a/v lip-sync sensitive to (afaik) 100ms variations indicate the present default is fine. In summary, the default, which by definition should Just Work (safely) for majority (-> not necessarily all) of use-cases and inputs is best to stay as it is, but it could be considered to have playbin2 tweak the drift-tolerance in some circumstances (e.g. single audio stream playback) (rather than relaying this to application using it, which can already configure so at present if desired). [*] definition of broken: providing an imperfect stream of timestamps, with imperfection in order of magnitude way beyond (single sample) rounding errors.
(In reply to comment #8) > (In reply to comment #4) > > So, if virtually every player out there plays this clip (at least all the ones > > I've tried, and I've tried a lot), that means they all are wrong? GStreamer is > > the only one right? I don't think many people out there would agree with that. > > Have you considered the possibility that it's not about being right or wrong, > and that both in fact may be "right", albeit approaching matters with a > different focus, use-cases etc, and that in GStreamer's case it is at least > configurable. Not really. There is such thing as right and wrong, and if you play this clip with Totem (or any other GStreamer player) it sounds wrong. So, somebody has to be wrong, either GStreamer, Totem, or the clip. > In particular, this clip (or any one with broken [*] input timestamps), along > with the fact that tolerance may have been 500ms at one point in time only > suggests that one (= application) might consider increasing the tolerance to > 100ms or 500ms or for that matter "infinity" if it really wants to support > playback of broken _audio-only_ clips. > > Other considerations and available facts, such as all other non-broken stuff > playing just fine with 40ms (= default) and a/v lip-sync sensitive to (afaik) > 100ms variations indicate the present default is fine. > > In summary, the default, which by definition should Just Work (safely) for > majority (-> not necessarily all) of use-cases and inputs is best to stay as it > is, but it could be considered to have playbin2 tweak the drift-tolerance in > some circumstances (e.g. single audio stream playback) (rather than relaying > this to application using it, which can already configure so at present if > desired). > > > [*] definition of broken: providing an imperfect stream of timestamps, with > imperfection in order of magnitude way beyond (single sample) rounding errors. Certainly 40ms is way beyond rounding errors, isn't it? So why 40ms? Why not 20ms, or 10ms? Because 40ms is needed for some clips, well, it turns out 100ms is needed for some clips too. So, what is "wrong" or unsafe with 100ms? Would it have any negative impact? Is anybody relying on this value being low? Well, it was 500ms at some point in time, so we know the world is not going to end. In fact, it has changed from 500ms to 20ms, to 40ms, with little explanation, and even mismatching comments. So, my bet is that nobody really cares about this value. Now, is the clip broken? Well, how can it be if every player out there plays it. You claim it's broken because the timestamps drift beyond rounding errors, but GStreamer is already playing broken clips (>10ms drift). So, this definition of broken doesn't seem to matter for people outside GStreamer, and it's not very practical in GStreamer either, as the drift tolerance has changed over the years, never matching this idealistic definition. So no, the clip is not broken for any practical purposes (maybe academic). Are the applications broken? Well, do you seriously expect all applications to change the drift tolerance in order to play these kinds of clips? (including gst-launch) I would say applications would find that preposterous. They would like GStreamer to just play the clip, just like every other framework. I think you should consider the possibility that the drift-tolerance is just too low, and that increasing it would not cause any harm (otherwise why was it 500ms before).
Created attachment 179642 [details] [review] vorbisdec: Don't use incoming timestamp if it's before the expected one Instead use the expected one.
(In reply to comment #10) > Created an attachment (id=179642) [details] [review] > vorbisdec: Don't use incoming timestamp if it's before the expected one > > Instead use the expected one. So, basically correct any negative drift. This of course means that now any negative drift would not cause resync; -500ms, -2000ms, etc. That is way more intrusive, but as I don't see the point in using the stored timestamp at all, that's ok for me. Also this needs to be done by all decoders. I guess this is not "UNCONFIRMED" any more.
Created attachment 179648 [details] [review] baseaudiosink: align when incoming ts is before expected one This does the same as Edwards's patch, but for all decoders.
I feel the above approach (variations) are problematic. Fiddling in the/a decoder has us (d)evolving from parameter tuning to adding stuff to a (all?) decoders. In particular, this code will have decoders impose their view on what is a correct timestamp and doing so independently in separate parts of the pipeline (e.g. consider audio and video case). Moreover, as it is more intrusive (as mentioned) a single glitch will have a decoder not giving a toss about demuxer timestamp and carry on on its own, while another decoder is possibly doing the same, though (ts wise) differently. All in all, no telling what it will do to a/v sync (as it removes information very much upstream, rather than giving downstream (sink) a chance to evaluate real info (e.g. demuxer ts) versus "natural next"). Doing similar stuff in baseaudiosink alleviates some of the above, but also plainly shows this is a nasty hack as it introduces inherent asymmetry. That is, if incoming ts is earlier than expected natural, we totally disregard it (= basically infinite drift-tolerance on this side), and otherwise we consider threshold. So what to do next when a clip comes by whose timestamps are way too far apart (such are available at hand). The same approach should then introduce infinite tolerance on the other side, and we are back where we started (should stay), i.e. tweaking with the tolerance parameter. Let's take a deep breath, shall we ...
(In reply to comment #13) > I feel the above approach (variations) are problematic. > > Fiddling in the/a decoder has us (d)evolving from parameter tuning to adding > stuff to a (all?) decoders. In particular, this code will have decoders impose > their view on what is a correct timestamp and doing so independently in > separate parts of the pipeline (e.g. consider audio and video case). Moreover, > as it is more intrusive (as mentioned) a single glitch will have a decoder not > giving a toss about demuxer timestamp and carry on on its own, while another > decoder is possibly doing the same, though (ts wise) differently. All in all, > no telling what it will do to a/v sync (as it removes information very much > upstream, rather than giving downstream (sink) a chance to evaluate real info > (e.g. demuxer ts) versus "natural next"). > > Doing similar stuff in baseaudiosink alleviates some of the above, but also > plainly shows this is a nasty hack as it introduces inherent asymmetry. That > is, if incoming ts is earlier than expected natural, we totally disregard it (= > basically infinite drift-tolerance on this side), and otherwise we consider > threshold. I agree. > So what to do next when a clip comes by whose timestamps are way too far apart > (such are available at hand). The same approach should then introduce infinite > tolerance on the other side, and we are back where we started (should stay), > i.e. tweaking with the tolerance parameter. > > Let's take a deep breath, shall we ... The issue is very simple; when shall a resync happen? The assumption so far has been that we do want resyncs to happen, but do we? If so, when? Presumably, when the audio is drifting too much. Personally, I find a drift of even 200ms barely noticeable. So, an a/v sync drift of 100ms might be preferable than an audio skip. Or maybe not _if_ there's only *one* skip. I tested a constant drift of 100ms, and that works fine, as only one resync would happen. The problem is when going in and out of the 40ms threshold; a sequence of drift 0, 40, 0, 40... Would cause the audio to be totally crappy. Strictly speaking there is no "drift". 0, 10, 20, 30, 40, 50... _That_ would be drift. I'm sending a patch that does differentiate drift from random timestamp variation.
Created attachment 179691 [details] [review] baseaudiosink: don't be hasty about discont detection Suppose a there's a time diff of 40, 0, 40, 0. The default drift tolerance of 40ms, would make this stream unplayable as the discontinuity would be detected on each even buffer. Strictly speaking, that's not drift. If instead we wait a certain amount of time (one second) before deciding that a drift has happened, and therefore a discontinuity, that timestamp sequence would play fine.
BTW. Now the discont-tolerance doesn't matter, as something even as low as 10ms works both for this clip, and real drifting streams.
Out of curiosity, does your problem file have real drift or is it broken timestamps ? If it's real drift, it may make sense to have the sink play it out at the rate the time timestamps say instead of the nominal rate (so lets say the timestamp really say 44050 instead of 44100, then we can tell pulseaudio 44050, it has to resample anyway, so it doesn't cost us anything).
Comment on attachment 179642 [details] [review] vorbisdec: Don't use incoming timestamp if it's before the expected one Ok, this was crack indeed. I apologize for even proposing it in the first place, should have thought more about it. That file is *broken*. If ever we want to put an acceptable fix anywhere, it should be in the matroskademuxer. * Detecting for example that the time provided for the lace has increased by too little (1-4ms) compared to the previous one for the nature of the stream (smallest compressed codec packets should be at least 10ms). * Looking over the next/previous 5/10 packets to determine that the lace time is incorrect. And whenever we detect an incorrect lace_time in the demuxer for the outgoing buffer we set its timestamp to CLOCK_TIME_NONE, which will either: * Make decoders timestamp it to the expected next timestamp based on the number of previously decoded samples * Have baseaudiosink do that for us. Fixing it there instead of breaking a behaviour that works fine for 99.9999999% of the files/use-cases is the proper way to go in the end.
And to answer Olivier, that file is busted indeed. Over chunks of 3-5 packets it only increases the timestamp by 1-3ms ... and then syncs to the expected timestamp (give or take a millisecond). Once you: * Only set timestamps on valid buffers (ones where the difference with the previous one is greater than a certain margin (1-4ms) * Set CLOCK_TIME_NONE on the invalid buffers (ones where the difference with the previous one is less than a certain margin (1-4ms) Then you end up with a resulting stream that jitters by less than a millisecond. I have no idea how that file was created but it was definitely created by a broken software, ergo => broken file.
we might also want to set the discont flag in the demuxer for those "re-synced" outgoing buffers.
(In reply to comment #18) > Fixing it there instead of breaking a behaviour that works fine for 99.9999999% > of the files/use-cases is the proper way to go in the end. How is any of my patches (attachment #179546 [details] or attachment #179691 [details]) breaking any behavior? Can you mention a use case and a way to reproduce it? Specially for attachment #179691 [details], if you have timestamps with a diff of 0, 40, 0, 40... or 40, 80, 40, 80... What would you expect baseaudiosink to do ideally? Resynchronize every buffer? How is that helping anyone?
Review of attachment 179648 [details] [review]: Since Edward rejects his patch too.
(In reply to comment #17) > Out of curiosity, does your problem file have real drift or is it broken > timestamps ? If it's real drift, it may make sense to have the sink play it out > at the rate the time timestamps say instead of the nominal rate (so lets say > the timestamp really say 44050 instead of 44100, then we can tell pulseaudio > 44050, it has to resample anyway, so it doesn't cost us anything). See the log (attachment #179630 [details]). The file plays fine everywhere with the defined rate, the problem is the timestamps which vary too much from the expected ones.
And here's another argument. Pick any mp3 file, convert it with GStreamer to matroska vorbis or flac, and try this: gst-launch-0.10 filesrc location=test.mka ! matroskademux ! flacdec ! pulsesink drift-tolerance=100 You would hear glitches all the time, because basesink would do resynchronization every buffer and apparently GStreamer timestamps have a jitter of around 100us, while FFmpeg has 10ms. Note that this is *not* drift, it's jitter, but basesink is wrongly detecting it as drift, messing up all the playback. With my patch now the playback would not be affected by jitter, you can even set a drift-tolerance of 1, and clips with any jitter would play fine. And streams with drift would also play fine, the re-sync would only be slightly delayed but it would also be detected.
(In reply to comment #24) > And here's another argument. Pick any mp3 file, convert it with GStreamer to > matroska vorbis or flac, and try this: > > gst-launch-0.10 filesrc location=test.mka ! matroskademux ! flacdec ! pulsesink > drift-tolerance=100 > > You would hear glitches all the time, because basesink would do > resynchronization every buffer and apparently GStreamer timestamps have a > jitter of around 100us, while FFmpeg has 10ms. > Congratulations, you've just proven that the default 40ms is a better choice than 100microseconds. I applause your logic. Wait, you had a point ? The moment you'll understand that drift-tolerance is called as such because it's the threshold at which we consider timestamp inconsistencies as *drift* and not as *jitter* you might be able to [censored] the [PG rated]. Both Olivier and Mark have proven their understanding of this problem in their comments, but you insist in childing around. P.S. Now please, as a courtesy to those actually trying to figure out an acceptable way to counteract this minor odd issue, just stop spamming these bug reports repeating yourself 500 times in as many different ways of you disliking how GStreamer works despite the fact that it handles A/V sync for 99.99999999% of the files/use-cases out there in a much better way than any of the tools you've mentioned. WE GOT THE POINT, WE UNDERSTAND THE ISSUE AT HAND ! It just doesn't help, makes up grumpy, and makes us not want to help you. Unlike kids, we prefer to ponder the issue at hand to try to find an *acceptable* fix which won't introduce regressions (and yes, we're thinking about it, the answer isn't in screwing up audio sync for everyone else but rather in detecting the busted files at the demuxer level). Would you (or your employer) prefer a rapid fix that fixes this marginal braindead file playback while screwing up all the other use-cases ? Good night (or morning).
(In reply to comment #25) > (In reply to comment #24) > > You would hear glitches all the time, because basesink would do > > resynchronization every buffer and apparently GStreamer timestamps have a > > jitter of around 100us, while FFmpeg has 10ms. > > Congratulations, you've just proven that the default 40ms is a better choice > than 100microseconds. I applause your logic. Wait, you had a point ? Wrong. 40ms is not _better_ than 100us, or 10ms. Compared to 10ms, 40ms should give you 4 times less resyncs, but each resync would be 4 times bigger. That's not what is happening. 10ms gives you total crap on some clips. I would gladly post a clip encoded with FFmpeg that shows that. > The moment you'll understand that drift-tolerance is called as such because > it's the threshold at which we consider timestamp inconsistencies as *drift* > and not as *jitter* you might be able to [censored] the [PG rated]. Both > Olivier and Mark have proven their understanding of this problem in their > comments, but you insist in childing around. Olivier didn't say anything, just asked questions, and Mark hasn't even commented anything after my last patch. Maybe you have some ad hominem thing going on here. Here is the proof that the issue has nothing to do with demuxers. I'm running a simple audiotestsrc ! dummyfilter ! pulsesink pipeline showing all the possible scenarios (I'll attach the source code). I am simply printing the calculated diff from expected and incoming timestamps, and when a discont happens. This also shows that my patch doesn't affect any other scenario negatively. == Currently == 1) No drift, low jitter (drift=0 jitter=100) diff=5 diff=0 diff=5 diff=0 Ok 2) High drift, low jitter (drift=20000 jitter=100) diff=877 diff=1764 discont diff=877 diff=1764 discont Ok (discont happening at the expected times) 3) No drift, high jitter (drift=0 jitter=40000) diff=1765 discont diff=1764 discont diff=1764 discont diff=1764 discont Total crap 4) High drift, high jitter (drift=20000 jitter=40000) diff=883 diff=1763 diff=881 diff=3527 discont Ok (the drift is hidden a bit, but catched eventually) (different values would make this fail as 2)) == Patch == 1) No drift, low jitter (drift=0 jitter=100) diff=5 diff=0 diff=5 diff=0 Ok (same) 2) High drift, low jitter (drift=20000 jitter=100) diff=877 diff=1764 diff=2641 discont diff=887 Ok (discont happens a bit later) 3) No drift, high jitter (drift=0 jitter=40000) diff=1765 diff=1 diff=1765 diff=1 Ok (Not crap any more) 4) High drift, high jitter (drift=20000 jitter=40000) diff=883 diff=1763 diff=881 diff=3527 diff=2645 discont Ok (the drift is again delated a bit) (6 buffers just to show that) (different values would not affect negatively) You keep saying that this somehow breaks the current behavior but you don't say how. These are the only 4 possible scenarios, and I've done all I can to reproduce them. It would be productive of you to at least mention a scenario that might get broken. Now, you can play around with this scenario and see for yourself. If basesink was behaving as expected I would assume that these two pipelines would produce very different results, but as a matter of fact, one works perfectly, while the other is total crap: gst-launch audiotestsrc ! dummyfilter drift=0 jitter=39999 ! pulsesink gst-launch audiotestsrc ! dummyfilter drift=0 jitter=40000 ! pulsesink And as I mentioned, my patch makes even extreme cases work (100ms jitter, 100us drift-tolerance): gst-launch audiotestsrc ! dummyfilter drift=0 jitter=100000 ! pulsesink drift-tolerance=100
Created attachment 179759 [details] dummyfilter Test filter to easily reproduce timestamp jitter and drift (just type 'make').
Looks like that breath was not deep enough or the air not quite fresh ... Anyway, here goes nothing ... There may be some merit in the "not being hasty" approach/patch, if only as its approach compares to the current one similarly to how http://gentrans.sourceforge.net/docs/head/gst-entrans-plugins/html/gst-entrans-plugins-stamp.html compares to stock videorate ;) However, some other thoughts: * after quite some comments given above on the "arbitrariness" of drift-tolerance, another arbitrary parameter has now been sneaked in; a non-configurable 1s (whereas e.g. stamp's sync-margin and sync-interval are configurable, as is basesink's drift-tolerance) * implementation problems (?): e.g. note the asymmetry in time - sink->priv->discont_time >= GST_SECOND so what if time is that busted that it is not even advancing enough? A better measure might be the amount of samples that have "passed", and maybe the current playback rate has to be kept in mind somewhere as well. In summary, I would not mind such an approach, provided safe implementation, and any new (few?) parameters are configurable, and preferably have defaults that maintain current behaviour [*] Also, btw, it is not quite hard to come up with broken timestamps configurations that are not handled by current or patched approach, other than maybe setting some parameters to "infinity" which pretty comes down to operating in sync=false (so it's all at least partially academic anyway). [*] even though some might consider the current one needs "fixing". However, the above has precisely illustrated that render behaviour can and will very well change, and while one might imagine it improves or has no ill effect in some cases (e.g. a/v playback sync), it may or may not have a practical impact in other cases (e.g. live voip). That practical impact also has a subjective aspect similar to sensitivity threshold for a/v sync etc, and as such feel any messing with it should err on the side of caution. Any application is free to configure other values and/or to run with patched up alternative defaults.
(In reply to comment #28) > Looks like that breath was not deep enough or the air not quite fresh ... > Anyway, here goes nothing ... > > There may be some merit in the "not being hasty" approach/patch, if only as its > approach compares to the current one similarly to how > http://gentrans.sourceforge.net/docs/head/gst-entrans-plugins/html/gst-entrans-plugins-stamp.html > compares to stock videorate ;) > > However, some other thoughts: > * after quite some comments given above on the "arbitrariness" of > drift-tolerance, another arbitrary parameter has now been sneaked in; a > non-configurable 1s (whereas e.g. stamp's sync-margin and sync-interval are > configurable, as is basesink's drift-tolerance) The problem with drift-tolerance is that we cannot really choose any value, if the value is below certain threshold, crap is ensured. Therefore it's understandable that it was made a parameter. However, would we really need to change this "drift-wait" (1s value)? Perhaps not if it depends on an already configurable value, like "buffer-time". IMO ideally both "drift-tolerance" and "drift-wait" should disappear from the user's view, and the stream would Just Work in any situations. > * implementation problems (?): e.g. note the asymmetry in > time - sink->priv->discont_time >= GST_SECOND > so what if time is that busted that it is not even advancing enough? A better > measure might be the amount of samples that have "passed", and maybe the > current playback rate has to be kept in mind somewhere as well. That's true. I would prefer a value that matches some user-visible elapsed time. > In summary, I would not mind such an approach, provided safe implementation, > and any new (few?) parameters are configurable, and preferably have defaults > that maintain current behaviour [*] I disagree. Perhaps the first patch would make sense to keep with the current behavior "drift-wait" = 0, but there's no point in keeping it at a value that we already know is broken. > Also, btw, it is not quite hard to come up with broken timestamps > configurations that are not handled by current or patched approach, other than > maybe setting some parameters to "infinity" which pretty comes down to > operating in sync=false (so it's all at least partially academic anyway). Well, what would those be? Patches welcome for my dummyfilter :) > [*] even though some might consider the current one needs "fixing". > However, the above has precisely illustrated that render behaviour can and will > very well change, and while one might imagine it improves or has no ill effect > in some cases (e.g. a/v playback sync), it may or may not have a practical > impact in other cases (e.g. live voip). That practical impact also has a > subjective aspect similar to sensitivity threshold for a/v sync etc, and as > such feel any messing with it should err on the side of caution. Any > application is free to configure other values and/or to run with patched up > alternative defaults. I disagree. Suppose a worst case scenario of 40ms drift per second, that is so bad, that you would hear a glitch of at least 40ms each second. Very bad, but it works (there could be worst scenarios, but I'm not considering them for this demonstration purpose, although I could if needed, the conclusion is the same). My patch would affect negatively by delaying the resync one second, and thus the glitch would be 80ms each 2 seconds. This could be reduced significantly if we use the buffer-time (200ms) instead, so the resync glitch would be 48ms each 1.2s. Now, if your objective is to never exceed the current 40ms drift-tolerance, we can set the drift-tolerance to 20ms and drift-wait to 500ms, and we would never exceed that. max_resync_time = drift-tolerance + drift-rate * drift-wait max_resync_interval = drift-tolerance / drift-rate + drift-wait Or even better, we can calculate this dynamically, so that max_resync_interval never exceeds the drift-tolerance, but in that case, we would need to dynamically calculate the average drift-rate.
(In reply to comment #29) > (In reply to comment #28) > > Also, btw, it is not quite hard to come up with broken timestamps > > configurations that are not handled by current or patched approach, other than > > maybe setting some parameters to "infinity" which pretty comes down to > > operating in sync=false (so it's all at least partially academic anyway). > > Well, what would those be? Patches welcome for my dummyfilter :) For the record/bonus points/posterity, consider a clip (mkv/wav/whatever) whose chunk timestamps are N times what they should "normally" be (for each N, then there exists some drift-tolerance/drift-wait that handles up to a point, only infinity == sync=false handles totally glitch free).
(In reply to comment #30) > (In reply to comment #29) > > (In reply to comment #28) > > > Also, btw, it is not quite hard to come up with broken timestamps > > > configurations that are not handled by current or patched approach, other than > > > maybe setting some parameters to "infinity" which pretty comes down to > > > operating in sync=false (so it's all at least partially academic anyway). > > > > Well, what would those be? Patches welcome for my dummyfilter :) > > For the record/bonus points/posterity, consider a clip (mkv/wav/whatever) whose > chunk timestamps are N times what they should "normally" be (for each N, then > there exists some drift-tolerance/drift-wait that handles up to a point, only > infinity == sync=false handles totally glitch free). Ah, that is true. However, that's not related to the issue at hand. In fact, just to be clear on what is the issue, I'm updating the summary.
We (Tandberg/Cisco) recently changed this value from 125ms to 225ms based on simulations of real-life data from users experiencing glitches in their audio. However, our use case is strictly live streaming which is much more timestamp-jitter-prone. As long as this value is configurable, I find the discussion of a default-value a bit meaningless? What I don´t like is the fact that drift_tolerance is one variable used for two completely different things. The relationship between a master-clock (driver AD/DA) and the element-clock has very little to do with the threshold for sacrificing time-accurate placement in the ringbuffer to allow alignment. But that´s a different bug I guess... :)
(In reply to comment #32) > We (Tandberg/Cisco) recently changed this value from 125ms to 225ms based on > simulations of real-life data from users experiencing glitches in their audio. Did you try my patch in attachment #179691 [details]? > However, our use case is strictly live streaming which is much more > timestamp-jitter-prone. Yeah, but jitter should not cause glitches based on drift-tolerance; that's for drift. Which is precisely my point. > As long as this value is configurable, I find the > discussion of a default-value a bit meaningless? The point is that you shouldn't need to configure it based the jitter. In fact, no configuration value should cause such horrible glitches as they happen right now. My patch improves that.
(In reply to comment #33) > (In reply to comment #32) > > We (Tandberg/Cisco) recently changed this value from 125ms to 225ms based on > > simulations of real-life data from users experiencing glitches in their audio. > > Did you try my patch in attachment #179691 [details]? > I am very happy to try out your patch, but could you make it apply against current git? I also think there is a bit of a syntax-mixup here. As I pointed out, the drift-tolerance variable is actually (IMO) used for two different things. One is the clock-slaving, where I mean it belongs, and the other is for a "aligment-threshold" when trying to figure out whether to align the current buffer with the previous one. Now, for the sake of argument, lets say that we make another configurable variable and name it "alignment-threshold", and it will be the maximum amount of time you will displace a buffer from its rightful timestamp-location in the ringbuffer in order to align it with the previous buffer. Now you could see how this variable is very useful for us. We don´t want to drift out of lipsync too much, and if we allow the "alignment-threshold" to be very big, we could potentially be accepting to play out buffers that are either too early or too late compared to video. However, since we deal with jittery networks, the timestamps on the buffers are sometimes very differently inter-spaced, and if we always placed the buffers according to their timestamps, we would have glitches-galore. Now I believe the playback-case is similar. You don´t want to allow yourself to drift too far from your timestamp (sync with video), but you certainly don´t want glitches in the sound. If I read your patch correctly, you are allowing for the timestamp to do a sort of a "back-and-forth" jump, without resynching on both accounts. This I think is a good idea. We often see one positive and one negative resync shortly after eachother in the logs. Starvation or bursting are typical causes for this. I have written a baseaudiosink testsuite where I feed it with actual real-life "problematic" data. (people who have had problems with audio, and logged it for me) If you clean up your patch, I could try to run it against the scenarios we have, and see if there is any improvements. Also, could you consider making the "jitter-window" configurable with getters and setters, as well as maybe creating this second "alignment-threshold" variable so that the confusion drift/jitter stops... :)
Created attachment 187024 [details] [review] baseaudiosink: don't be hasty about drift detection Updated patch on top of latest git.
Created attachment 187025 [details] [review] baseaudiosink: add drift-wait property Extra patch to make the amount of wait configurable.
Created attachment 187027 [details] [review] baseaudiosink: add drift-wait property Second try, with better documentation, and ability to set drift-wait to 0 in order to enable the old behavior.
(In reply to comment #34) > (In reply to comment #33) > > (In reply to comment #32) > > > We (Tandberg/Cisco) recently changed this value from 125ms to 225ms based on > > > simulations of real-life data from users experiencing glitches in their audio. > > > > Did you try my patch in attachment #179691 [details] [details]? > > I am very happy to try out your patch, but could you make it apply against > current git? Done. > I also think there is a bit of a syntax-mixup here. As I pointed out, the > drift-tolerance variable is actually (IMO) used for two different things. One > is the clock-slaving, where I mean it belongs, and the other is for a > "aligment-threshold" when trying to figure out whether to align the current > buffer with the previous one. True. > Now, for the sake of argument, lets say that we make another configurable > variable and name it "alignment-threshold", and it will be the maximum amount > of time you will displace a buffer from its rightful timestamp-location in the > ringbuffer in order to align it with the previous buffer. > > Now you could see how this variable is very useful for us. We don´t want to > drift out of lipsync too much, and if we allow the "alignment-threshold" to be > very big, we could potentially be accepting to play out buffers that are either > too early or too late compared to video. Yeap. > However, since we deal with jittery networks, the timestamps on the buffers are > sometimes very differently inter-spaced, and if we always placed the buffers > according to their timestamps, we would have glitches-galore. Not necessarily, it depends how bit the jitter is. If it's small, it might not be noticeable. > Now I believe the playback-case is similar. You don´t want to allow yourself to > drift too far from your timestamp (sync with video), but you certainly don´t > want glitches in the sound. > > If I read your patch correctly, you are allowing for the timestamp to do a sort > of a "back-and-forth" jump, without resynching on both accounts. This I think > is a good idea. We often see one positive and one negative resync shortly after > eachother in the logs. Starvation or bursting are typical causes for this. Exactly. > I have written a baseaudiosink testsuite where I feed it with actual real-life > "problematic" data. (people who have had problems with audio, and logged it for > me) If you clean up your patch, I could try to run it against the scenarios we > have, and see if there is any improvements. That would be great :) > Also, could you consider making the "jitter-window" configurable with getters > and setters, as well as maybe creating this second "alignment-threshold" > variable so that the confusion drift/jitter stops... :) I have made a property called drift-wait so that this amount of time can be configured, but I think renaming to "alignment-threshold" should be on a separate bug report, and I'm not sure people would be in favor of braking ABI in that way.
I have not yet tried out your patch, but I will soon. Coincidentally (?) we just got in a bugreport of someone getting +500ms diffs, and hence exceeding our resync-threshold, so as soon as I get my hands on that data, I will have an excellent failing bug that your patch might fix :) > > However, since we deal with jittery networks, the timestamps on the buffers are > > sometimes very differently inter-spaced, and if we always placed the buffers > > according to their timestamps, we would have glitches-galore. > > Not necessarily, it depends how bit the jitter is. If it's small, it might not > be noticeable. > Actually, only one sample of discontinuity can easily be audible. So unless the jitter is less than 20 microseconds (for 48KHz) it is audible. And the jitter is almost always bigger than that in VOIP due to RTP/RTCP time-calculations. > ... but I think renaming to "alignment-threshold" should be on a > separate bug report, and I'm not sure people would be in favor of braking ABI > in that way. Well, it used to be hard-coded, and then it just piggy-backed on another variable, so I think it getting its own would be welcomed. Also, is there any reason you have two patches? Should they not be merged?
I thought I would share the sort of behavior we see in logs of people with sound-problems at the moment: 2011-05-02 09:53:35,688 WARNING PID 4424 TID 6504 GStreamer audio-sink [gstbaseaudiosink.c(1404), gst_base_audio_sink_get_alignment()]-> Unexpected discontinuity in audio timestamps of +0:00:00.483520833, resyncing 2011-05-02 09:53:35,825 WARNING PID 4424 TID 6504 GStreamer audio-sink [gstbaseaudiosink.c(1404), gst_base_audio_sink_get_alignment()]-> Unexpected discontinuity in audio timestamps of -0:00:00.227854166, resyncing 2011-05-02 09:53:36,001 WARNING PID 4424 TID 6504 GStreamer audio-sink [gstbaseaudiosink.c(1404), gst_base_audio_sink_get_alignment()]-> Unexpected discontinuity in audio timestamps of -0:00:00.226895833, resyncing --- 2011-05-02 09:54:03,757 WARNING PID 4424 TID 6504 GStreamer audio-sink [gstbaseaudiosink.c(1404), gst_base_audio_sink_get_alignment()]-> Unexpected discontinuity in audio timestamps of +0:00:00.531645833, resyncing 2011-05-02 09:54:03,889 WARNING PID 4424 TID 6504 GStreamer audio-sink [gstbaseaudiosink.c(1404), gst_base_audio_sink_get_alignment()]-> Unexpected discontinuity in audio timestamps of -0:00:00.233062500, resyncing 2011-05-02 09:54:03,988 WARNING PID 4424 TID 6504 GStreamer audio-sink [gstbaseaudiosink.c(1404), gst_base_audio_sink_get_alignment()]-> Unexpected discontinuity in audio timestamps of -0:00:00.230000000, resyncing As you can see, it does one violent jump forwards, followed by two smaller jumps back, all within a timeframe of a few hundred milliseconds. Now if there was a window operating here, the sum of these jumps would fall under the alignment-threshold, and we would have glitch-free playback. I think this is a great feature for BaseAudioSink to have, and would be very useful for all use of GStreamer involving networks.
(In reply to comment #40) > I thought I would share the sort of behavior we see in logs of people with > sound-problems at the moment: [...] > As you can see, it does one violent jump forwards, followed by two smaller > jumps back, all within a timeframe of a few hundred milliseconds. Now if there > was a window operating here, the sum of these jumps would fall under the > alignment-threshold, and we would have glitch-free playback. > > I think this is a great feature for BaseAudioSink to have, and would be very > useful for all use of GStreamer involving networks. My thoughts exactly. I cringe a little every time I hear a glitchy call and see that it is caused by the typical +, -, +, -, ... adjustment pattern :(
I took the liberty to open a new bug, since this one is getting a bit big and have mutated quite a bit from the original. https://bugzilla.gnome.org/show_bug.cgi?id=650724
(In reply to comment #42) > I took the liberty to open a new bug, since this one is getting a bit big and > have mutated quite a bit from the original. > > https://bugzilla.gnome.org/show_bug.cgi?id=650724 I don't see the point of opening another bug, it's the same issue, and the fix is basically attachment #187024 [details]. The only difference is that you are doing other changes in your patch, mostly splitting drift-tolerance into alignment-threshold, which is unrelated to the issue at hand and I think should be tracked in a separate bug report.
Created attachment 188296 [details] [review] baseaudiosink: delay the resyncing of timestamp vs ringbuffertime v2: use GstClockTime instead, plus a few improvements in the comments by Havard Graff, and calculate instead of passing the time
Created attachment 188297 [details] [review] baseaudiosink: make discont-wait configurable v2: rename to discont-wait, better documentation by Havard Graff, add setter and getter, and shuffle a bit the code.
So? Would any of the "core GStreamer" devs care to comment? I think it's pretty clear this is a real issue, and the fix is good.
Yes, please commit, I run into this issue all the time trying to record audio from internet radio stations and IP cameras :(
Hellooo?
I have some trouble with seamless loops in buzztard. One noticable thing is: baseaudiosink gstbaseaudiosink.c:1374:gst_base_audio_sink_get_alignment:<player> Unexpected discontinuity in audio timestamps of +0:00:00.120000000, resyncing With the patch applied, I don't get the warning anymore, but still get a small break. Have not yet experienced any bad side effects from the patch either.
Commited now, thanks! (In reply to comment #49) > I have some trouble with seamless loops in buzztard. One noticable thing is: > > baseaudiosink > gstbaseaudiosink.c:1374:gst_base_audio_sink_get_alignment:<player> Unexpected > discontinuity in audio timestamps of +0:00:00.120000000, resyncing > > With the patch applied, I don't get the warning anymore, but still get a small > break. Have not yet experienced any bad side effects from the patch either. I would expect those warnings to be from a missing DISCONT on a buffer after looping..
*** Bug 650724 has been marked as a duplicate of this bug. ***
Unfortunately this commit broke playback when transitioning from Ogg to FLAC. Reverting the commit fixes it. Please see bug #680252.