GNOME Bugzilla – Bug 783842
[REGRESSION] encodebin: Transcoding of several media streams (to h264 in mp4) fail
Last modified: 2018-02-10 13:27:11 UTC
Since: > commit 25e0b77bbe596b8f53497d42f3fad4e5870a6d87 (origin/master, origin/HEAD) > Author: Arun Raghavan <arun@arunraghavan.net> > Date: Thu Jun 8 12:35:23 2017 +0530 > > encodebin: Don't try rate adjustment before the first buffer > > With both audiorate and videorate, it seems more sensible to apply rate > adjustments after the first buffer appears. For example, with v4l2src, > there is often a small delay before the first video buffer turns up, and > this can cause a stuttery start because of videorate trying to ensure a > perfect stream. Two validate tests started to fail: * validate.file.transcode.to_mp3_and_h264_in_mp4.fragmented_nonseekable_sink_mp4 * validate.file.transcode.to_mp3_and_h264_in_mp4.samples_multimedia_cx_asf_wmv_elephant_asf I started to investigate validate.file.transcode.to_mp3_and_h264_in_mp4.fragmented_nonseekable_sink_mp4, which can simply reproduced doing: $ gst-validate-1.0 uridecodebin uri=file://$HOME/gst-validate/gst-integration-testsuites/medias/defaults/mp4/fragmented_nonseekable_sink.mp4 ! videorate skip-to-first=true ! x264enc ! h264parse ! qtmux ! filesink location=$HOME/test.mp4 Basically qtdemux fails because an input buffer has an invalid timestamp (the one right after h264parse receives it SEGMENT). The pts outputed by x264enc looks correct and h264parse sets it back to NONE. It happens because for some reason baseparse sets the frame->buffer->pts = NONE at some point. I have not gone further in the investigation yet, will try to go back to it later (and want to put up my findings here to not forget :)).
So I had an extra look but am not sure what should happen concretly. Here is what happens: - x264enc outputs 2 frames with the same PTS (1000:00:01.933333333) (First frame has a size of 6324 bytes, second is really small, 41bytes) - baseparse sets its next_pts to CLOCK_TIME_NONE (as for any frame) - baseparse ignores input PTS because the PTS is the same here: https://phabricator.freedesktop.org/diffusion/GST/browse/master/libs/gst/base/gstbaseparse.c;d99c9d9944fd6dfa3470c25b8fde21ecc563cd5d$3191 I have to say I do not get that part of the code (ie. why is it ignoring timestamp if they are equal to the previous one? Attaching here a simple patch to change that behaviour, fixing that issue.
Created attachment 353917 [details] [review] baseparse: Don't set PTS to NONE if receiving 2 concecutive buffer with same timestamp.
And...it also broke `validate.file.transcode.to_mp3_and_h264_in_mp4.samples_multimedia_cx_asf_wmv_elephant_asf`: https://ci.gstreamer.net/job/GStreamer-master-meson-validate/lastCompletedBuild/testReport/validate.file.transcode/to_mp3_and_h264_in_mp4/samples_multimedia_cx_asf_wmv_elephant_asf/history/ Problem is that inspecting the resulting file (which is done with gst-validate-media-check) reports "layer=1" instead of the expected one (layer=3) I have not yet checked what is going on there.
Review of attachment 353917 [details] [review]: Make sense to me. Any objection ?
Review of attachment 353917 [details] [review]: Looking at the code it seems like someone went to a lot of trouble to not do exactly this - did anyone investigate what the reason for that is?
(In reply to Tim-Philipp Müller from comment #5) > Review of attachment 353917 [details] [review] [review]: > > Looking at the code it seems like someone went to a lot of trouble to not do > exactly this - did anyone investigate what the reason for that is? If by investigate you mean bisect that exact line, I did, but nothing conclusive, it seems to appear in: commit a40b993bc347ceb99e831a2c91f44f6887eebbb6 Author: Mark Nauwelaerts <mark.nauwelaerts@collabora.co.uk> Date: Tue Feb 14 19:33:50 2012 +0100 baseparse: remove dead code and superfluous loop level which is does not give any useful information :-) Apart from that, as previously stated, I do not really understand that code and was showing that patch rather in order to get input about it than getting it in. tbh, I think we might probably want to revert that commit in encodebin if no one is making any case for keeping it.
the real question is ... why is x264enc sending a frame with the same PTS as the previous one ? That frame, although very small, is a valid frame (contains AU and a slice).
Also: x264enc :0::<x264enc0@0x9e9610> invalid DTS: PTS is less than DTS
(In reply to Edward Hervey from comment #7) > the real question is ... why is x264enc sending a frame with the same PTS as > the previous one ? That frame, although very small, is a valid frame > (contains AU and a slice). It is libx264 doing, it does not contain a full frame afaics, might be the end of the previous one? not sure
Did you already look at the input to x264enc? (Sorry if I missed that)
(In reply to Thibault Saunier from comment #3) > Problem is that inspecting the resulting file (which is done with > gst-validate-media-check) reports "layer=1" instead of the expected one > (layer=3) > > I have not yet checked what is going on there. This one seems to be another issue. mediainfo reports it's indeed layer 3. The problem seems to be because of an initial gap in the file. mp3parse doesn't have any data, so it tries to negotiate to something... which ends up being layer 1 :)
I opened another bug #785027 which points out the root cause of the DTS/PTS issues.
The remaining transcoding issue is somewhere in mpegaudioparse OR lamemp3enc OR qtmux OR qtdemux OR the profile itself. gst-launch-1.0 -v audiotestsrc num-buffers=10 ! "audio/x-raw, layout=(string)interleaved, rate=(int)8000, format=(string)S16LE, channels=(int)1" ! lamemp3enc ! mpegaudioparse ! qtmux ! filesink location=test.mp4 ... the raw audio caps arrive in lamemp3enc ... /GstPipeline:pipeline0/GstLameMP3Enc:lamemp3enc0.GstPad:sink: caps = audio/x-raw, format=(string)S16LE, layout=(string)interleaved, rate=(int)8000, channels=(int)1 ... lamemp3enc says it will output mpegaudioversion=2 ... /GstPipeline:pipeline0/GstLameMP3Enc:lamemp3enc0.GstPad:src: caps = audio/mpeg, mpegversion=(int)1, mpegaudioversion=(int)2, layer=(int)3, channels=(int)1, rate=(int)8000 ... But mpegaudioparse says it's mpegaudioversion=3 ... /GstPipeline:pipeline0/GstMpegAudioParse:mpegaudioparse0.GstPad:src: caps = audio/mpeg, mpegversion=(int)1, mpegaudioversion=(int)3, layer=(int)3, rate=(int)8000, channels=(int)1, parsed=(boolean)true But wait ... it gets better ! gst-launch-1.0 -v filesrc location=test.mp4 ! qtdemux ! mpegaudioparse ! fakesink ... qtdemux says it's mpegaudioversion=1 ... /GstPipeline:pipeline0/GstMpegAudioParse:mpegaudioparse0.GstPad:sink: caps = audio/mpeg, layer=(int)3, mpegversion=(int)1, rate=(int)8000, channels=(int)1 ... which mpegaudioparse then says it's mpegaudioversion=3 :D ... /GstPipeline:pipeline0/GstMpegAudioParse:mpegaudioparse0.GstPad:src: caps = audio/mpeg, mpegversion=(int)1, mpegaudioversion=(int)3, layer=(int)3, rate=(int)8000, channels=(int)1, parsed=(boolean)true
FWIW, mediainfo is reporting that that file has Version 2.5 MPEG Audio (i.e. mpegaudioversion=3). So: * lame is returning nonsense * qtmux is storing nonsense *OR* qtdemux is extracting nonsense (but less likely)
Sadly the core issue is indeed the gap (causing discoverer to preroll before the final correct caps from mpegaudioparse appear).
And the issue is deeper. The resulting file can't even be played with decodebin2/playbin2 because of the following: * caps without mpegaudioversion arrive in mpegaudioparse * gap event arrives in mpegaudioparse * mpegaudioparse negotiates to something, which ends up having mpegaudioversion=1 * decodebin finds and a compatible decoder, in this case avdec_mp1float * mpegaudioparse eventually gets data and tries to negotiate to mpegaudioversion=3 * avdec_mp1float is not compatible with those caps * => NEGOTIATION FAILURE Note that the problem doesn't happen with decodebin3/playbin3 because it can reconfigure itself dynamically. Do we really want to have parsers output potentially-incompatible formats on GAP events ? Shouldn't subclasses be able to tell the base class that it should hold on negotiation until it has some fields ?
The transcoding issues should now have been fixed since this commit: commit 2e45926a96ec5298c6ef29bf912e5e6a06dc3e0e Author: Edward Hervey <edward@centricular.com> Date: Wed Dec 13 11:35:37 2017 +0100 qtdemux: Don't push GAP event if first buffer is within 1s If we saw empty segments, we previously unconditionally pushed a GAP event downstream regardless of the duration of that empty segment. In order to avoid issues with initial negotiation of downstream elements (which would negotiate to something before receiving any data due to that initial GAP event), check if there's at least a second of difference (like we do for other GAP-related checks in qtdemux) before deciding to push a GAP event downstream.
Validate hasn't been complaining about those issues for some time. Closing.