After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 679443 - [0.11] x264enc produces wrong timestamps
[0.11] x264enc produces wrong timestamps
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gst-plugins-ugly
0.11.x
Other All
: Normal normal
: 0.11.x
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2012-07-05 13:17 UTC by Matej Knopp
Modified: 2013-05-19 18:59 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Patch (1.18 KB, patch)
2012-07-05 13:17 UTC, Matej Knopp
none Details | Review

Description Matej Knopp 2012-07-05 13:17:15 UTC
Created attachment 218080 [details] [review]
Patch

x264 sometimes outputs keyframe DTS lower than PTS and DTS of next frame equal to keyframe PTS. Videoencoder changes the keyframe DTS to PTS, which results in non-monotonous DTS causing decoding problems.

Also x264 outputs following timestamps
Frame1:   PTS 0, DTS -66666666
Frame2:   PTS: 20000000, DTS: 0

Videoencoder turns the first frame to PTS:0, DTS:0 so there are two frames with same DTS resulting in muxer warning
Comment 1 Matej Knopp 2012-07-05 13:19:21 UTC
The first paragraph is little bit convoluted, example of timestamps generated by x264

Keyframe:   PTS: 10000, DTS: 9000
Next frame: PTS 11000, DTS 10000

encoder turns this into

Keyframe:   PTS: 10000, DTS: 10000
Next frame: PTS 11000, DTS 10000
Comment 2 Matej Knopp 2012-07-05 17:01:17 UTC
Actually, this is not entirely correct. X264 outputs negative DTS when b-frames are enabled, gstreamer is currently not able to handle negative DTS.
Comment 3 Mark Nauwelaerts 2012-09-07 15:45:15 UTC
IMO, there is no reason to hack a particular codec lib's DTS/PTS interpretation into the base classes.  Rather, one can accept that (in gstreamer sense/specs) PTS and DTS as set on buffers should be equal for keyframes (as defined by the baseclass and also long since assumed/established in e.g. gst-ffmpeg).

In that sense, this is a x264enc bug and following commit fixes things to have it come up with proper timestamps which will play along nicely elsewhere (e.g. not suprise muxers etc):

-ugly:
commit 839e75d5d145d486a5c114f8f9f7ea362d12866f
Author: Mark Nauwelaerts <mark.nauwelaerts@collabora.co.uk>
Date:   Fri Sep 7 17:38:18 2012 +0200

    x264enc: handle possibly negative DTS provided by codec
    
    ... by arranging for an offset such that DTS == PTS for keyframes,
    which is expected elsewhere to go along with semantics of PTS and DTS.
    
    Fixes https://bugzilla.gnome.org/show_bug.cgi?id=679443

Also, for good measure, it's not very 'polite' to overwrite DTS if already set by baseclass (and then only for keyframe), so let's not do that in baseclass, but warn at least if they are then not equal as expected:

-base:
commit d4c1b160ef2efa037eaed203e3d73ba5a86e3cd2
Author: Mark Nauwelaerts <mark.nauwelaerts@collabora.co.uk>
Date:   Fri Sep 7 17:41:27 2012 +0200

    videoencoder: only set invalid DTS equal to PTS for keyframe
    
    Also add a bit more debug.
    
    See also https://bugzilla.gnome.org/show_bug.cgi?id=679443
Comment 4 Matej Knopp 2013-04-23 22:31:28 UTC
I don't think that everyone agrees that PTS = DTS on keyframes. Your patch made it impossible to mux proper MPEG TS (where you need the exact DTS values from encoder otherwise the decoder buffer can underflow). 

About negative DTS:
https://bugzilla.gnome.org/show_bug.cgi?id=679466
Comment 5 Matej Knopp 2013-04-23 22:33:33 UTC
If you really insist on current DTS semantics then perhaps there should be a way to restore original values (i.e. sticky event).
Comment 6 Matej Knopp 2013-04-23 22:41:28 UTC
for reference:
ds: I am strongly against any method that does not put the correct DTS value in the DTS field.
matej_k: ds, that's happening now
matej_k: depends on how you define correct DTS value I guess
ds: The MP4 method is not correct.
ds: The correct dts is the time when the decoder should start decoding a frame.  Decoding is assumed to be instantaneous, so you can have dts==pts for some frames
ds: but never dts > pts.
ds: it has to be exactly right to mux correct MPEG-TS streams, otherwise hardware decoders will fail
Comment 7 Mark Nauwelaerts 2013-05-19 16:41:18 UTC
It may very well (unfortunately) have introduced other problems/bug/drawbacks, but "at least" it is consistent with what videoencoder was/is doing/expecting, which was the intention here.  But feel free to revert (or have it reverted), though afaics that will also not fix the original problem reported here (including videoencoder behaviour/assumptions).

There are indeed problems wrt PTS/DTS timestamps, as in the referenced report (and in a -devel post), such as negative timestamps but IMO also what to do in case of a missing/unknown PTS and/or DTS, and all sorts of interpolation that it entails (a major pain in e.g. baseparse and probably others).  In that respect, we seem to have shifted some fantastic DTS/PTS guesswork coding in e.g. qtmux to other places (to deal with missing DTS or PTS, e.g. baseparse, videoencoder, -decoder).  Hopefully, it is at least a bit more reusable in these latter places now.

In fact, fwiw, the (I feel) still messy PTS/DTS situation in 1.0 makes me a bit worried about how reliable some ts is at any point (in pipeline).  Also, btw, note that no longer assuming PTS == DTS is probably going to make things even harder for some PTS/DTS guesstimating heuristics around.

Suffice to say, any particular spec/agreement probably works, if at least care is taken so it all "Just Works" in videoencoder-/decoder and baseparse (including guesstimating etc).
Comment 8 Matej Knopp 2013-05-19 18:59:44 UTC
Yeah, guesstimating DTS is tricky. Even if you infer it, it still depends on codec to make it at least approximately correct (i.e. you need to know if you have b frames or b-pyramid so that you shift DTS x frames), so maybe the functionality should be moved to codec parsers?

So there are 2 major problems right now if I understand it correctly.

1) It's impossible to represent negative DTS (needed when PTS start from 0). Also in turn negative stream/running time.

2) Some code assumes that PTS = DTS for keyframes. Which classes do this exactly? Is it just videoencoder? Videodecoder? Baseparser? 

I think in ideal world the encoder should be responsible for producing the proper DTS there shouldn't be need for trying to guess what it should be. Same goes for demuxer, except it gets tricky with matroska and avi. I still don't get why there are no decoding timestamps in matroska.