GNOME Bugzilla – Bug 797125
[soundtouch] pitch breaks lip sync in Chrome
Last modified: 2018-11-03 14:31:34 UTC
Created attachment 373606 [details]
Good and bad video recv delays (with scaletempo and pitch, respectively)

It seems that the 'pitch' filter (in the soundtouch plugin, gst-plugins-bad) introduces some kind of wrong timestamp, clock skew, bad sequence, a combination of those, or maybe something different (but probably related).

I'm seeing an accumulating delay in Chrome between video and audio in a WebRTC call (_not_ using GStreamer's new WebRTC element, yet) where the source is sending a video+audio stream, and the audio is filtered with the pitch element. Chrome is unable to perform the lip sync successfully, and for some reason deduces that the audio is lagging behind the video (which it actually is not), so it keeps delaying the video until the delay reaches Chrome's maximum, 10 seconds.

The net effect of this issue is practically the same as what happened in this Chrome bug:
https://bugs.chromium.org/p/webrtc/issues/detail?id=5456
(just check the screenshots) with 'googCurrentDelayMs' and 'googMinPlayoutDelayMs' growing linearly. At that time the cause was Chrome wrongly using the webcam's timestamp, which had a different clock rate than the system's timestamp.

But in this case I don't think it's a Chrome bug; I have verified that this is caused by GStreamer's 'pitch' filter, by feeding this simple test pipeline into my custom WebRTC source element:

    ... -> (raw audio) -> audioconvert -> audioresample -> pitch -> audioconvert -> audioresample -> WebRTC

(Probably most, if not all, of those audioconvert/audioresample elements are not needed; I added them just to be on the safe side.)

This generates the mentioned delay in the video presentation handled by Chrome. However, none of this happens if the 'pitch' element is removed and any other element is used instead, e.g. a 'scaletempo' element:

    ... -> (raw audio) -> audioconvert -> audioresample -> scaletempo -> audioconvert -> audioresample -> WebRTC

This produces a normal lip sync result in Chrome. The delay (latency) stays at around 100 to 150 ms.

I tried reading Google's WebRTC code to find out exactly which value is to blame:
https://cs.chromium.org/chromium/src/third_party/webrtc/video/stream_synchronization.cc
but working out the correct function chain is difficult, and I'm still not sure exactly *what* is confusing Chrome into wrongly assuming that the audio is behind the video, when it's not.

I have cherry-picked and applied all commits that touched the file './ext/soundtouch/gstpitch.cc' into a custom-built version of gst-plugins-bad (based on GStreamer 1.8), but the issue persists, so I don't think it's a matter of trying the latest code (after a long time without changes, the pitch filter received some patches in June 2018, so I wanted to test whether those helped...).

I can provide any needed info (Chrome's webrtc-internals dump file, where the attached graphs are taken from; Chrome's diagnostic event and log captures; Wireshark capture; etc.)

I took the attached graphs myself, to show the effect of a good (as expected) behavior and a bad one.

I have been analyzing two issues with the pitch plugin; one is this, and the other is reported here and is possibly related:
https://bugzilla.gnome.org/show_bug.cgi?id=797124
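
One way to test the wrong-timestamp hypothesis from this report would be to compare each audio buffer's timestamp against the timestamp implied by the number of samples seen so far. The following is only a minimal sketch under stated assumptions: the function name and the (pts_ns, n_samples) input format are hypothetical, and collecting those pairs from the output of 'pitch' (for example with a pad probe) is left out.

```python
from typing import Iterable, List, Tuple

NSEC_PER_SEC = 1_000_000_000


def timestamp_drift_ns(buffers: Iterable[Tuple[int, int]], sample_rate: int) -> List[int]:
    """Return, for each buffer, its timestamp minus the sample-derived timestamp.

    `buffers` is a sequence of (pts_ns, n_samples) pairs in stream order
    (hypothetical format, assumed to be captured elsewhere).
    A drift that keeps growing means the timestamps advance at a different
    rate than the audio data itself.
    """
    drifts: List[int] = []
    first_pts = None
    samples_seen = 0
    for pts_ns, n_samples in buffers:
        if first_pts is None:
            first_pts = pts_ns
        # Timestamp this buffer should carry if timestamps followed the sample count.
        expected_ns = first_pts + samples_seen * NSEC_PER_SEC // sample_rate
        drifts.append(pts_ns - expected_ns)
        samples_seen += n_samples
    return drifts
```

With 480-sample buffers at 48000 Hz (10 ms each), timestamps spaced exactly 10 ms apart give zero drift everywhere, while timestamps spaced 10.1 ms apart give a drift that grows by 0.1 ms per buffer.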
Some notes I took about the variables shown in the graphs:

- googMinPlayoutDelayMs is used, together with the detected jitter, to calculate the googTargetDelayMs. The Target value must be between the Min and Max (Max is not shown in the graphs). The Min is constantly growing when the pitch element is used.

- googTargetDelayMs is used to slowly move the googCurrentDelayMs, which seems to be the actual delay applied by Chrome.

- googCurrentDelayMs is used here:
  https://cs.chromium.org/chromium/src/third_party/webrtc/modules/video_coding/timing.cc?rcl=711a31aead9007e42dd73c302c8ec40f9e931619&l=188
  to calculate the actual Rendering Time (i.e. Presentation Time) for each video frame.
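
The relationship between these three values can be illustrated with a toy model. This is a sketch of the notes above, not Chrome's actual code: the function names, the way the jitter and the Min are combined (taking the larger of the two), and the 10 ms step size are assumptions made for illustration; only the 10-second maximum comes from the report.

```python
MAX_DELAY_MS = 10_000.0  # Chrome's maximum delay, as stated in the report


def target_delay_ms(min_playout_ms: float, jitter_ms: float,
                    max_ms: float = MAX_DELAY_MS) -> float:
    """Toy target delay: at least the Min, never more than the Max.

    Assumption: Min and jitter are combined by taking the larger value.
    """
    return min(max(min_playout_ms, jitter_ms), max_ms)


def step_current_delay_ms(current_ms: float, target_ms: float,
                          max_step_ms: float = 10.0) -> float:
    """Move the current delay toward the target by at most max_step_ms.

    Assumption: the step limit is a made-up value that stands in for
    "slowly move" in the notes.
    """
    delta = target_ms - current_ms
    delta = max(-max_step_ms, min(max_step_ms, delta))
    return current_ms + delta


def simulate(min_playout_series, jitter_ms: float = 120.0,
             start_ms: float = 120.0):
    """Return the current delay after each update of the Min value."""
    current = start_ms
    history = []
    for min_playout in min_playout_series:
        current = step_current_delay_ms(current, target_delay_ms(min_playout, jitter_ms))
        history.append(current)
    return history
```

In this model a constant Min leaves the current delay near the jitter value, while a Min that grows on every update drags the current delay upward until it saturates at the maximum, which is the shape seen in the "bad" graph.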
Looking at the code, and testing locally, pitch updates the latency quite often in the first couple of seconds. The formulas seem strange, and they always result in 0 ms being returned. Using the latency tracer, it would seem to introduce between 0.5 and 0.7 ms, which can of course be rounded to zero. But the storm of latency events does cause audio breakage. I wonder if this isn't linked to your issue.
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/issues/785.