GNOME Bugzilla – Bug 740575
Fixing DTS in GStreamer
Last modified: 2015-10-21 07:23:44 UTC
DTS handling in GStreamer is currently messy. The semantics are not clear, different elements expect and assume different things. We are been getting bitten by this over and over and it might be time to do something about it. This are just some thoughts and maybe a half-assed proposal, but any feedback is very welcome. 1. Semantics It should always be assumed that DTS <= PTS. It doesn't make sense any other way. Unless I misunderstand time, it's not possible to display buffer first and then decode it. (This is actually true even for DTS coming from MP4 container, I'll get to that later). Every element that produces DTS must produce DTS <= PTS; Every element that consumes DTS should expect DTS <= PTS. DTS may come before segment.start. This happen for streams with B-Frames where first sample PTS = segment.start; This is a perfectly valid scenario and it needs to be handled correctly. DTS before segment start should not result in buffer getting clipped. It should also not be discarded when converting to stream/running time (see muxers). Negative DTS are a real thing. Acting like they don't exist will not make it so. Unfortunately given that GST_CLOCK_TIME is signed, there is no easy way to represent such thing in GStreamer. One solution would be to offset both DTS and PTS and add the offset to stream start and stop, so that running and stream time is unaffected. This also preserves the DTS/PTS delta, which can be important for certain decoders. Q: Does anyone see problem with his? Is there a caveat that I'm missing? 2. Necessary changes to GStreamer 2.1 Video Decoders Decoders should be mostly (if not at all) unaffected. 2.2 Video Encoders When underlying encoder produces negative DTS, we can't simply pass these downstream. Both PTS and DTS will need to be adjusted so that DTS starts from 0. The adjustment also needs to be applied to segment.start/stop, so that stream and running remain the same, otherwise there would be A/V sync issues. The end result should be that PTS/DTS delta is preserved for every sample. First few DTS will be before segment start, but as mentioned before, that needs to be considered a valid thing. 2.3 Muxers Muxers need PTS and DTS converted to running time. Right now it is not possible to represent negative running time. This needs to be possible, because it is valid to have DTS before segment start. If muxer is calling gst_segment_to_running_time manully, it can do that for the PTS only and then subtract the PTS/DTS delta from the running time. It needs to be aware that the subtraction may result in negative number and handle that. Some muxers use gst_collect_pads_clip_running_time. Here it gets bit tricky. since GstClockTime on buffer can't represent negative DTS, it means that DTS that's before segment start will simply be lost. This is not acceptable for any muxer that needs DTS. We'll need a way to represent negative DTS on the buffer. This might be the only place where we actually need negative DTS, or at least to have a way to represent DTS as difference from PTS. Maybe it could be set as meta on buffer? and if the muxer is interested in it, it will get the meta from buffer. Yes, it is hacky. Any other idea is welcome. matroskamux probably doesn't care, there are no DTS in matroska. avimux (anyone still using this?) probably does care to a degree. I don't feel strongly about this, it was never meant to contain streams with bframes and I don't care much about the hacks needed to get this working mp4mux cares. DTS is primary timestamp for every sample, it must be monotonous and starts from 0. If there is initial offset it can be represented in editlist atom. In any case, we will need to offset DTS running time (so that it starts from 0), however we can shift PTS backwards by same amount. It is legal to have negative CTTS (which represents DTS/PTS delta) in both MOV and ISO MP4. We could also keep the PTS delta instead and offset other streams using edit list. But it would probably be easier to just shift CTTS back by same amount as we adjust DTS. If we do this, the version in CTTS table should be set to 1. flvmux - no idea. the CTTS probably can't be negative so other streams will need to be shifted as well to preserve A/V sync. tsmuxer: This should be fairly simple. The actual mpeg ts timestamps never start from zero, there is enough space to accomodate negative DTS. 2.4 Demuxers avidemux - no difference here; IIRC DTS all start from 0 and there are no PTS matroska - doesn't care. no DTS tsdemux, asfdemux, psdemux - if they set segment->start to value of first PTS, then it is perfectly possible that there will be DTS that is before segment->start; If segment->start is 0, then the DTS would have to be negative. Easiest way around this would be to offset everything (DTS, PTS, segment->start) by arbitrary time (i.e. GST_SECOND) so that the DTS would never need to be negative. mp4demux: Here is where it gets interesting. Every video sample has a DTS (actually, it's stored as a duration of previous sample but that's not important) and CTTS. PTS(sample) = DTS(sample) + CTTS(sample). So far so good, except original QuickTime specification states that CTTS can be negative. It can also be negative with CTTS version 1 if MP4 specification. https://developer.apple.com/library/mac/documentation/QuickTime/QTFF/QTFFChap2/qtff2.html#//apple_ref/doc/uid/TP40000939-CH204-SW40 Does this mean that the demuxer should produce PTS < DTS? If you look at QT specs for Composition Shift Least Greatest Atom, it says that the atom contains the offset that needs to be applied to composition offset so that PTS >= DTS. If the atom is missing, the demuxer is supposed to look at the CTTS sample and find a value big enough so that the offsets are adjusted and resulting PTS >= DTS. This might be a bit tricky to implement since the IRRC demuxer never sees the entire CTTS table. still, hopefully for most files part of the table should be enough to determine the offset. So contrary to popular opinion, even for MP4 the demuxer should not produce DTS > PTS.
errata: GstClockTime is unsigned of course. If it was signed there would be lot less to worry about :)
Negative DTS for encoders could be handled in the base class. For example gst_video_encoder_set_min_pts(encoder, GST_SECOND) would ensure that handle_frame will never be passed PTS that is smaller than one second. If the base needs to adjust timestamps in order to accommodate this, it would also adjust the output segment start/stop accordingly, so that the change would be neutral in regards to stream/running time. This would mean that x264enc would simply tell the base class that it needs PTS at least one second and would not need to do anything special for negative DTS, as there wouldn't be any.
Created attachment 291595 [details] [review] collectpads: Add GstDtsOffsetMeta Allows retrieving DTS offset from buffer clipped to running time
Created attachment 291596 [details] [review] videoencoder: add set_min_pts Allows specifying smallest PTS that can be passed to handle_frame
Created attachment 291597 [details] [review] x264enc: use gst_video_encoder_set_min_pts instead of manually adjusting timestamps
Created attachment 291598 [details] [review] qtmux: handle DTS before segment start properly
Created attachment 291599 [details] [review] mpegtsmux: handle DTS before segment start properly
Created attachment 291689 [details] [review] mpegtsmux: handle DTS before segment start properly fixed problem with unstamped incoming buffer (i.e. paff stream) resulting in invalid outgoing timestamp
Created attachment 291800 [details] [review] mpegtsmux: handle DTS before segment start properly Fix returning TSMUX_NO_TS for pid
Review of attachment 291595 [details] [review]: ::: libs/gst/base/gstcollectpads.h @@ +396,3 @@ +struct _GstDtsOffsetMeta { + GstMeta meta; + GstClockTime dts_offset; // DTS(buffer) = PTS(buffer) - dts_offset; dts_offset may be greater than PTS Don't use C++ style comment please.
Review of attachment 291800 [details] [review]: ::: gst/mpegtsmux/tsmux/tsmuxstream.c @@ +103,3 @@ /* PTS & DTS associated with the contents of this buffer */ gint64 pts; + gint64 dts_offset; // if not TSMUX_NO_TS, dts = pts - dts_offset C++ style comment. ::: gst/mpegtsmux/tsmux/tsmuxstream.h @@ +223,3 @@ void tsmux_stream_add_data (TsMuxStream *stream, guint8 *data, guint len, + void *user_data, gint64 pts, gint64 dts_delta, + gboolean random_access); Oops, indent got broken here.
Honestly, I like what I'm seeing in these patches and I like the approach. There is nothing too hackish or even wrong with this method. I'll merge these in my local branches and start doing more testing. Some comments from those who haven't found the time (or haven't notice) would be greatly appreciated.
We've been using this for a while, even shipped software with this and it works well for us. One thing I'm not sure about is set_min_pts in GstVideoEncoder. There are multiple ways to approach this (shifting timestamps and segment start in encoder), this one is simplest to use in the encoder, but it might not be the best one.
To me this is simple, and allow picking an offset that makes it easy to debug in traces (like multiple of hours). When we try to apply the exact offset one needs, we endup with case where this offset might need to be re-adjusted, adding unneeded complexity. Keeping the changes minimal and simple is also important since this is compensating an API limitation in GStreamer. In the future (2.0) we will want to revisit this, fixing the API. It will be much easier to remove undeeded code if it's simple. Having a method is also nice, since the compiler will greatly help in porting (just remove the function).
Please don't merge this into 1.5 without explicit ACKs.
Some additions: - Add a unit test to ensure stream-time/running-time is respected after the shift - Add FLV support.
We discussed a bit around this and we came up with the idea that we should store the DTS/PTS delta in the GstBuffer directly. With DTS/PTS delta available this way, we don't need to offset. If the DTS goes negative, we can just set it to TIME_NONE. Element that are aware of possible negative DTS would simply only use the delta.
Store it in the buffer where?
A preliminary example, hopefully that helps. + GST_BUFFER_TS_DELTA(buf) .. @@ -252,6 +268,9 @@ struct _GstBuffer { + + /* more timestamp */ + GstClockTime ts_delta;
There is no padding in the buffer, can you do this? Also, when you offset segment start everything will just work. You'll get negative DTS working transparently in all components, and be able to represent those as GstClockTime. Whereas if you add another field to buffer then a) you have redundancy b) all components that so much as touch dts need to be aware of new field. Also, having TS delta instead of actual time is very inconvenient in most places. If you want to do some calculations, interpolation, etc, you'll have to convert it to actual time, which will have to be signed, so you won't be able to represent it a GstClockTime.
Yes, GstBuffer can't be derived from or allocated on the stack, we can add more fields to it, it will be backwards-compatible.
(In reply to Matej Knopp from comment #20) > b) all components that so much as touch dts need to be aware of new field. > Also, having TS delta instead of actual time is very inconvenient in most > places. If you want to do some calculations, interpolation, etc, you'll have > to convert it to actual time, which will have to be signed, so you won't be > able to represent it a GstClockTime. At the same time, elements that care, store this information in the form of a delta. Using signed integer in muxers is not avoidable as your patches demonstrate.
Yes. In *muxers*. Your approach would put this on *every* element that so much as looks at DTS. decoders, encoders, parsers. There's no such thing as element that cares. Every element that handles DTS would need to check for the delta. Parser can't throw away negative DTS just because it "doesn't care". So instead of GstClockTime dts = GST_BUFFER_DTS(buf) you would need to gint64 dts; if (GST_BUFFER_TS_DELTA_IS_VALID(buf) && GST_BUFFER_PTS_IS_VALID(buf)) dts = GST_BUFFER_PTS(buf) - GST_BUFFER_TS_DELTA(buf); else if (GST_BUFFER_DTS_IS_VALID(buf) dts = GST_BUFFER_DTS(buf) else dts = SOME_VALUE_REPRESENTING_INVALID_DTS_WHICH_CAN_NOT_BE_MINUS_1. And this would need to be done in parsers, decoders, encoders, etc, not just muxers. Would you also add ts_delta to GstVideoCodecFrame? This is an order of magnitude more disruptive than just shifting segment start. And it create very ugly redundancy at core of GStreamer.
My first reaction was that we can't do that... but: It's not allowed to sub-class GstBuffer, so that's not a problem. It's not viable to statically allocate a GstBuffer, because they're really a GstBufferImpl internally. It's not allowed to re-implement GstBuffer, and in any case the internal GstBufferImpl isn't public. So actually, unlike most GStreamer public structs, I don't think we need padding. However, the rest of Matej's points in comment #20 are valid, and I think storing dts offset makes a lot of calculations more complex. Particularly, having the ts_delta doesn't help with the muxer case where you want to convert to running time and the DTS is outside the segment. DTS running time isn't a fixed delta from the running PTS time when segment rate is taken into account. What we really need there is some new GstSegment API that allows calculating values outside the segment and that preserves the DTS <= PTS relationship. I think that API would remove the need for GstDtsOffsetMeta. If it weren't for GST_CLOCK_TIME_NONE = -1, it'd be easy to just use the unsigned DTS and ignore the wraparound, but that's not possible - so the offset approach Matej proposes seems good to me.
Jan, I agree 100% with the need of segment api. However when gst_collect_pads_clip_running_time is used, it returns buffer that has PTS and DTS converted to running time. So the muxers need some way to get the PTS/DTS delta from buffer. Muxers don't usually call the segment api directly.
Let's scratch this GstBuffer change idea. The issue raised with the current implementation is Iis that the resulting delta isn't scaled according to the rate. So for rate that are not 1.0, it will not generate valid filed. It's also common feeling here that the GstMeta is not the nicest way to extend CollectPads. Though, as Jan pointed me, there is padding in the collect data that we can consume, so this is more cosmetic issue.
Adding a new meta to incoming buffers to track that fields is weird, and not really what buffer metas are for. I think it would be better to use the padding in GstCollectData to store the extra field, and retrieve it after gst_collect_pads_clip_running_time() in the muxer implementations that care. We'll still want new GstSegment API - just storing the PTS - DTS is not correct if the input segment has rate != 1.0.
I'll be first to admit that GstDtsOffsetMeta is a horrible inelegant hack. I'm all for proper segment api and a way on collectpads to retrieve the delta. So. Who's going to do it? :)
So, gst_segment_offset_running_time() allows you to shift the timestamps backwards all the way negative using the offset field of the segment. When you get any running-time before 0, you'll get -1, meaning invalid. I'll add a method on the segment to actually get the unclipped negative timestamps, which you should then be able to use in elements that can deal with those negative timestamps.
(In reply to Wim Taymans from comment #29) > So, gst_segment_offset_running_time() allows you to shift the timestamps > backwards all the way negative using the offset field of the segment. When > you get any running-time before 0, you'll get -1, meaning invalid. > > I'll add a method on the segment to actually get the unclipped negative > timestamps, which you should then be able to use in elements that can deal > with those negative timestamps. That would be really appreciated, thanks.
The API I thought of was gint64 gst_segment_to_relative_running_time (const GstSegment *segment, GstFormat format, guint64 reference, guint64 position) which would return a running time as a delta from the passed reference time. Normally the user would use the PTS running time for that reference, but they could also pass 0 which would be mostly equivalent to gst_segment_to_running_time with a reduced gint64 range. That way, the caller chooses a gint64 range to operate on. Maybe that's overkill, and an unclipped version of gst_segment_to_running_time() that returns a gint64 is all that's needed.
I think this is useful only if there is a trap (some possible common error) when calculating CTTS from the obtained signed timestamp. Though, I do believe that if you have two running time, the remaining is a simple subtraction.
How about this: It returns some more info when clipping happened and when the returned value is negative (it still returns a positive guint64 that is to be negated). The negation is awkward but we need to keep the full guint64 range for running-time. commit 8c8b3818e44198d2369fbbbf103bc889083ba87b Author: Wim Taymans <wtaymans@redhat.com> Date: Wed Mar 18 11:31:51 2015 +0100 segment: add helper to get negative running-time Add a helper method to get a running-time with a little more features such as detecting if the value was before or after the segment and negative running-time. API: gst_segment_to_running_time_full() Fixes https://bugzilla.gnome.org/show_bug.cgi?id=740575
The only remaining problem is the segment. In this patchset, we shift foward the segment, so TS can be positive. It remains that DTS falls before the segment, in which case the new method will still return -1. What should the segment be ? The first PTS is expected to match running time 0, shall we also set "offset" so the segment starts in the negative values, but includes the first DTS ? I'm a bit confuse on how to include the DTS but avoid introducing a rendering delay.
Review of attachment 291596 [details] [review]: This patch is not fully correct yet. We need the first DTS to be inside the segment start/stop and the first PTS need to have running-time 0, while the first DTS would have a negative running time. Currently the first DTS is outside the segment, hence would get clipped.
Why do you need DTS to be inside segment start / stop?
Also, if you use offset for this (instead of adjusting segment start), will this not break stream time? I assume the stream time for PTS should start from 0, but it won't be the case anymore, will it?
So, a point was raised on IRC, that DTS must always be within segment boundaries. My patches would of course break this assumptions. Can someone comment on this? Why is this a requirement? I see DTS as something relative to PTS, which is sort of implementation detail of the codec. So when throwing away buffer because it's not within segment boundaries, you don't really care about DTS < segment.start as long as PTS >= segment.start, do you? (except for DTS only VFW streams of course). So why require that DTS >= segment.start?
GStreamer segments relate to presentation time. There's no requirement that DTS be within the segment - if there were none of this would be necessary.
So if DTS don't need to be inside segment start / stop, why use segment offset for the shift instead of just moving segment start time? if you use the offset field it will (at least for current implementation of gst_segment_to_stream_time) also move stream time, while if you just adjust segment start, stream time stays unaffected?
Though, gst_segment_to_running_time_full() would not work, we'd need to change our mind of the added GstSegment API.
(In reply to Matej Knopp from comment #40) > So if DTS don't need to be inside segment start / stop, why use segment > offset for the shift instead of just moving segment start time? if you use > the offset field it will (at least for current implementation of > gst_segment_to_stream_time) also move stream time, while if you just adjust > segment start, stream time stays unaffected? Good remark, I guess we could also allow DTS outside of the segment, It's not much different from adjusting the offset. We would need to add something to gst_segment_to_running_time_full() or make a new method to return the extrapolated time instead of returning _BEFORE and _AFTER. Is it worth it?
(In reply to Wim Taymans from comment #42) > (In reply to Matej Knopp from comment #40) > > So if DTS don't need to be inside segment start / stop, why use segment > > offset for the shift instead of just moving segment start time? if you use > > the offset field it will (at least for current implementation of > > gst_segment_to_stream_time) also move stream time, while if you just adjust > > segment start, stream time stays unaffected? > > Good remark, I guess we could also allow DTS outside of the segment, It's > not much different from adjusting the offset. > > We would need to add something to gst_segment_to_running_time_full() or make > a new method to return the extrapolated time instead of returning _BEFORE > and _AFTER. Is it worth it? When synching on DTS (e.g. decoding sync, or identity sync=1) it would be unfortunate if we would be dropping a buffer because it's outside of the segment. There is no way to actually know, so if it's before, we need to render it from now on. I'm not sure if that is even consistent in any direction across the code. In any case, it is needed, otherwise we can't calculate the offset, which must be in running time. The easy way is to get the DTS and PTS in running time, and substract it (if DTS the absolute of a negative value, add it).
> When synching on DTS (e.g. decoding sync, or identity sync=1) it would be > unfortunate if we would be dropping a buffer because it's outside of the > segment. There is no way to actually know, so if it's before, we need to > render it from now on. I'm not sure if that is even consistent in any > direction across the code. Yes, so in that case it's better to use the offset and not allow DTS before the segment start. If you get a valid DTS that is negative, you would render it ASAP, which is what the current behaviour is. Allowing valid DTS before the segment start is probably not a good idea on second thought. > > In any case, it is needed, otherwise we can't calculate the offset, which > must be in running time. The easy way is to get the DTS and PTS in running > time, and substract it (if DTS the absolute of a negative value, add it). You can do that with the current gst_segment_to_running_time_full(). What are you suggesting we do differently? Or did you mean to say that you would need a different method if we would ever allow invalid DTS being treated as valid?
Hold on a second :) In current code, every time you seek you'll get DTS that is before segment start (assuming stream with bframes of course). That's not new. No buffer should get dropped because DTS is before segment start. Code that drops such buffers should be fixed. Also, DTS running time of first X buffers must be negative, if PTS running time of first buffer is 0. Whatever element sync on DTS running time, it should be able to handle situation with negative running time.
(In reply to Wim Taymans from comment #44) > > > > In any case, it is needed, otherwise we can't calculate the offset, which > > must be in running time. The easy way is to get the DTS and PTS in running > > time, and substract it (if DTS the absolute of a negative value, add it). > > You can do that with the current gst_segment_to_running_time_full(). What > are you suggesting we do differently? Or did you mean to say that you would > need a different method if we would ever allow invalid DTS being treated as > valid? Sorry, in fact that second statement is only if DTS are outside the segment.
In any case, if you do need DTS to be within the segment and use offset for this, maybe gst_segment_to_stream_time needs to be fixed to deal offset as well.
(In reply to Matej Knopp from comment #45) > Hold on a second :) > > In current code, every time you seek you'll get DTS that is before segment > start (assuming stream with bframes of course). That's not new. It's not that consistent, x264enc was actually shifting it back and fort (causing a lot of backward jumps). Also, I'm not sure demuxers are very consistent either. I think what you get right now is a bit "random". > > No buffer should get dropped because DTS is before segment start. Code that > drops such buffers should be fixed. Well, if it was expected to be inside the segment, we could actually drop the one that should be dropped instead of wasting time and catching up very slowly. We could of course badger with a second check on the PTS (if present). So both ways, there is ways to make it more robust. > > Also, DTS running time of first X buffers must be negative, if PTS running > time of first buffer is 0. Whatever element sync on DTS running time, it > should be able to handle situation with negative running time. Of course, this does not change. Remains that we need a way to get the negative value of the DTS in running time, which is what we are trying to address. And that means we need to make a choice if DTS should be inside or outside of the segment. The current gst_segment_to_running_time() (and _full() variant) currently requires the the passed TS to be inside of the segment. I don't know if it was made this way for mathematical reason or simply for robustness. (In reply to Matej Knopp from comment #47) > In any case, if you do need DTS to be within the segment and use offset for > this, maybe gst_segment_to_stream_time needs to be fixed to deal offset as > well. It's likely a bug yes. You said you are using stream time for some use case you have. You could make a test case from it, using pad offset to demonstrate the issue.
How about this: it adds an option to allow DTS outside of the segment in both directions. Maybe with the separate clip method we don't actually ever have to check for clipping here and we can do away with the GstSegmentResult and simply return the sign as a gint... If would still be nice to know where the clipping happened but we could make a gst_segment_clip_full() that returns the sign as well. commit edf484ab6b10d04c334f2b32eb043cb749dc3af8 Author: Wim Taymans <wtaymans@redhat.com> Date: Thu Mar 19 17:36:36 2015 +0100 segment: add option to disable clipping Add a clip argument to gst_segment_to_running_time_full() to disable the checks against the segment boundaries. This makes it possible to generate an extrapolated running-time for timestamps outside of the segment. See https://bugzilla.gnome.org/show_bug.cgi?id=740575
So, now about the CollectPads. We need to have a way to represent negative DTS on buffer that gets into the muxer (after the collect pads clipping function is done with it). Could we maybe negate the DTS and set a special flag on buffer? i.e. instead of doing GST_BUFFER_DTS(buf) = -3, do GST_BUFFER_DTS(buf) = 3 and GST_BUFFER_FLAG_SET(buf, GST_BUFFER_FLAG_NEGATE_DTS) The caveat here is that the flag would only be set by CollectPads. No elements other than muxers should be expected to deal with negated DTS.
I think this is nicer API. commit bc282da83cbc8147b7c0ed59e3bd5a6011e90eb2 Author: Wim Taymans <wtaymans@redhat.com> Date: Fri Mar 20 09:00:47 2015 +0100 segment: remove the bounds check from _to_running_time_full() Do not do any checks for the start/stop in the new gst_segment_to_running_time_full() method, we can let this be done by the more capable gst_segment_clip() method. This allows us to remove the enum of results and only return the sign of the calculated running-time. We need to put the old clipping checks in the old gst_segment_to_running_time() still because they work slightly differently than the _clip methods. See https://bugzilla.gnome.org/show_bug.cgi?id=740575
Created attachment 300917 [details] [review] collectpads: Add new clip to running time function This new method does not clip the DTS to the segment. This allow having DTS outside the segment and handle negative DTS through the new member ABI.abi.dts_sign in GstCollectData structure.
Created attachment 300918 [details] [review] videoencoder: Add gst_video_encoder_set_min_pts() For streams with reordered frames this can be used to ensure that there is enough time to accomodate first DTS, which may be less than first PTS
Created attachment 300919 [details] [review] x264enc: Use gst_video_encoder_set_min_pts This method replace the manual adjustment of PTS and DTS to avoid negative DTS issues. Using this method will also update the segment so we don't loos sync.
Created attachment 300922 [details] [review] qtmux: handle DTS before segment start properly This is blindly ported to the new API, but it seems wrong to me, and produced muxed seems to agree with me.
Review of attachment 300922 [details] [review]: This patch does not work. It produce broken files. I don't think because CTTS can be negative that it's fine to make it negative. Also, I notice this first_pts code that should have been removed when ported to collect pad running time clip function. Running starts at zero, no need to reset to zero again. After this cleanup it should be eayser to integrate negative PTS, note that the sign should be passed along with the buffer somehow.
So I'm off for vacation till 21st,I which I could have finished this work before, sorry about that. So muxers still need to be ported, qtmux needed a little cleanup. One thing I notice, is that formats like QT and FLV (which enforce that first DTS is 0, need to be doubled check) will need to setup a segment that shift back the first PTS to running time zero. Otherwise, all streams will start with a delay.
Created attachment 300924 [details] [review] qtmux: CTTS is signed in version 1
Created attachment 300925 [details] [review] qtdemux: Handle DTS with negative running time As QT works with duration, simply bring back first DTS to 0 and shift forward the PTS of the same amount.
Created attachment 300926 [details] [review] qtmux: CTTS is signed in version 1
Created attachment 300927 [details] [review] qtmux: Handle DTS with negative running time As QT works with duration, simply bring back first DTS to 0 and shift forward the PTS of the same amount.
As Thiago pointed that fact to me, and now it seems to work indeed. See you in two weeks !
Comment on attachment 300926 [details] [review] qtmux: CTTS is signed in version 1 Author need to be set back to Matej before merging. I can't say if this is right or wrong, hence I've ask Thiago to look at it.
Created attachment 300929 [details] [review] collectpads: Add new clip to running time function This new method does not clip the DTS to the segment. This allow having DTS outside the segment and handle negative DTS through the new member ABI.abi.dts_sign in GstCollectData structure. Update: Also handle negative DTS in compare func
The compare function become a little bit obscure though, but it's still manageable. Some input about that would be nice.
I looked at the qt muxer patch briefly and it looks okay, but I'll have to test it I don't have time for that right now. I assume the mpegts muxer patch will need some work to port to new collect pads api.
Yes, I ran out of time for this one (and FLVMux). It's still a bit of an experiment to get rid of the GstMeta. A second thought was that instead of storing the sign, which make it a little harder. We could actually store the running time for PTS and DTS in the CollectData, storing it as gint64. As running time start at 0, we still support quite long runs. It's a probably more work in the deprecated collect pads, but it seems more convenient then having the sign stored (with risk of using the negtative value as being positive). This way we could reuse the same clip function instead of having a new one, since we could maintain the old behaviour while storing a little more information (running time as gint64). We would need to pick a gint64 value to represent invalid though (could be G_MAXINT64 like we do in the mpegtsmux patch). A couple of macro to help accessing these ABI.abi in the CollectData and for checking valid running time would also be nice.
Created attachment 301148 [details] [review] qtmux: Handle DTS with negative running time Minor fix to the patch. It still needs a little tweaking to properly use PTS to store the first_ts to be able to compare it at the end to add gaps to streams that need it. At the moment it is using the DTS but this is unrelated to this patch as the issue is also present without the DTS fixes.
Created attachment 301149 [details] [review] tests: qtmux: test for muxing with DTS outside the segment A simple unit test to help debugging qtmux/qtdemux
So I'm back now. Before I continue I'd like some feedback Shall we store the running time in gint64 instead of the sign ? That would remove a bit of complexity in the muxers at least, and remove the need for a new clip function. GstBuffer DTS/PTS value would be left as before, and we would find in CollectData: gint64 ABI.abi.pts gint64 ABI.abi.dts As signed running time.
How would GST_CLOCK_TIME_NONE be represented in ABI.abi.dts?
(In reply to Matej Knopp from comment #71) > How would GST_CLOCK_TIME_NONE be represented in ABI.abi.dts? I thought we could use G_MININT64. I think this is what you are using in mpegts patch.
I'm not sure if there is any progress on this, I didn't have much time to follow; There is issue with my qtmux patches. It looks like CTTS version 1 is not as widely supported as I thought. Despite being introduced to standard in 2009, it is not supported on iOS 7 or QT7 player. That probably means we can't use negative CTTS in MP4 :-/ (negative CTTS works with version=0 on Apple players, but that's probably just a quicktime legacy)
(In reply to Wim Taymans from comment #49) > How about this: it adds an option to allow DTS outside of the segment in > both directions. Maybe with the separate clip method we don't actually ever > have to check for clipping here and we can do away with the GstSegmentResult > and simply return the sign as a gint... If would still be nice to know where > the clipping happened but we could make a gst_segment_clip_full() that > returns the sign as well. > > commit edf484ab6b10d04c334f2b32eb043cb749dc3af8 > Author: Wim Taymans <wtaymans@redhat.com> > Date: Thu Mar 19 17:36:36 2015 +0100 > > segment: add option to disable clipping > > Add a clip argument to gst_segment_to_running_time_full() to disable > the checks against the segment boundaries. This makes it possible to > generate an extrapolated running-time for timestamps outside of the > segment. > > See https://bugzilla.gnome.org/show_bug.cgi?id=740575 Maybe this function could have simply returned a gint64 (gboolean to indicate an error maybe, or G_MIN64 to avoid having gint64* output param).
Review of attachment 301149 [details] [review]: In this patch you say you add a test, but also remove two others, which are still referenced. I suppose it's a mistake ?
Created attachment 304896 [details] [review] collectpads: Add negative DTS support Make gst_collect_pads_clip_running_time() function also store the signed DTS in the CollectData. This signed DTS value can be used by muxers to properly handle streams where DTS can be negative initially.
Created attachment 304897 [details] [review] qtmux: CTTS is signed in version 1
Created attachment 304898 [details] [review] qtmux: Handle DTS with negative running time As QT works with duration, simply bring back first DTS to 0 and shift forward the PTS of the same amount.
Comment on attachment 304897 [details] [review] qtmux: CTTS is signed in version 1 @matej, so this one should be dropped ? It does not seems strictly needed tbf.
So, there are three ways to handle it. Keep atom version 1 and signed (initially negative) CTTS; The file will conform to specification, but iOS 7 and QuickTime player 7 will not be able to play it. Not great. Change version back to 0 and use signed CTTS anyway. The files will not strictly conform to MP4 specification, but most players will likely play them, because CTTS was signed in original QuickTime specification. So mov files can contain negative CTTS and atom version = 0; This is what I'm doing right now, since we produce files that are mostly "transient" and are targeting Apple devices exclusively. If you don't want to have negative CTTS, you need to offset it by appropriate amount, but then you also need to add that amount to other stream edit lists (elst), so that you keep the A/V sync. Given that by the time CTTS is written you have all timestamps, the amount should be easy to determine. This is best option because it produces conforming files and doesn't need ctts atom version = 1; But it needs some changes and I don't currently have time to implement and test it right now. :-/
Created attachment 305012 [details] [review] flvmux: Add negative runtime DTS support This is done by using new feature of the CollectPad clip function which sets the DTS as a gint64 in the collected data. It also simplify the code a bit.
Created attachment 305013 [details] [review] mpegtsmux: Properly detect backward DTS There was code to detect backward dts, but the marker min_dts was never set. Setting it enable this feature that prevents potential integer overflow when generating TS.
Created attachment 305014 [details] [review] tsmux: Remove uneeded cast and cast macro
Created attachment 305015 [details] [review] tsmux: Add negative DTS support Use the saved DTS, make it signed and pass that to the stream muxer. This preserves the running time sign. All usage of -1 as invalid TS are now replaced with G_MININT64. Negative values will be seen as wrap-around point, but the delta between PTS and DTS will remain correct. Demuxers don't care about absolute values, they only cares about deltas.
Created attachment 305016 [details] [review] tsdemux: Segment start should match first PTS The segment should start at first PTS, and the vairable name lower_pts state so correctly. Though we where using the first DTS instead. This could lead to small desynchronization of video stream.
Created attachment 305018 [details] [review] collectpads: Add negative DTS support Make gst_collect_pads_clip_running_time() function also store the signed DTS in the CollectData. This signed DTS value can be used by muxers to properly handle streams where DTS can be negative initially.
Created attachment 305019 [details] [review] collectpads: Don't initially send an invalid DTS Sending a possibly invalid DTS may confuse the muxers, which will then think the DTS is going backward.
This is mostly it. I think flvmux sends a bad segment (start = first DTS instead of first PTS). Arguably this is a bug can can be fixed separately. Would be nice to visit Matroska to see if anything need to be done there and finally I need to resolve the QTMux thing. I don't see why we need sign/unsign value really considering it's CTS based like FLV. As Thiago mention, qtdemux might need a segment that account for the initial gap properly. Addtionnally: - Add some unit tests for tsmux, flvmux - Fix Thiago's test for qtmux - Maybe add a macro to access cdata.ABI.abi.dts ? I notice through the multiview work some discussion about usage of union and padding data. If I'm doing it wrong, I would appreciate advises here.
> I notice through the multiview work some discussion about usage of union and > padding data. If I'm doing it wrong, I would appreciate advises here. The padding/ABI union/struct stuff looks good to me.
Created attachment 305090 [details] [review] tests: qtmux: test for muxing with DTS outside the segment Updated patch for test that doesn't remove other tests.
Review of attachment 304898 [details] [review]: ::: gst/isomp4/gstqtmux.c @@ +3173,3 @@ + * MP4 however, thus we need to offset DTS so that it starts from 0. + * We can leave PTS unchanged, because negative CTTS are valid in MP4. + */ Ok, that comment explains the story about negative CTTS. It's a mistake here to assume that PTS can be smaller then DTS. If this was the case, we'd have a bug upstream, or the packet is terribly late (QT is not a streaming format, so that last sentence does not apply). I'll remove that comment, since it make no sense, and will add a fence for that case. Because PTS >= DTS, offsetting TS to make it positive will also ensure that PTS is positive (we shift both).
Review of attachment 304897 [details] [review]: Let's skip this one, as it's potentially unsafe. Feel free to bring back the subject in another bug.
Review of attachment 305090 [details] [review]: This test seems correct, but it requires qtdemux to be correct. Did you have a patch to fix the initial segment in QTDemux ?
Created attachment 305098 [details] [review] qtmux: Handle DTS with negative running time As QT works with duration, simply bring back first DTS to 0 and shift forward the PTS of the same amount.
Created attachment 305099 [details] [review] qtdemux: Adjust segment according to ctts offset In presence of a CTTS, the segment start/stop must be offset so the segment start/stop include the PTS. This is needed since the PTS cannot be negative in this format. This fixes issues where the running time of the first buffer isn't at the start.
(In reply to Nicolas Dufresne (stormer) from comment #88) > This is mostly it. I think flvmux sends a bad segment (start = first DTS > instead of first PTS). Replying to myself. The flvdemux element was already creating an appropriate segment using PTS.
Created attachment 305158 [details] [review] collectpads: Add negative DTS support Make gst_collect_pads_clip_running_time() function also store the signed DTS in the CollectData. This signed DTS value can be used by muxers to properly handle streams where DTS can be negative initially.
Review of attachment 305098 [details] [review]: This only seems to adjust one stream PTS/DTS. Won't this introduce A/V sync issues? If you move video presentation timestamps, you will need to delay other streams as well (using the edit list atoms).
(In reply to Matej Knopp from comment #98) > Review of attachment 305098 [details] [review] [review]: > > This only seems to adjust one stream PTS/DTS. Won't this introduce A/V sync > issues? If you move video presentation timestamps, you will need to delay > other streams as well (using the edit list atoms). True, and this was the part I kept from you original patch ;-P Just like flv, the offset should have been put somewhere shared between streams. Will fix, thanks for spotting.
So offsetting everything makes qtmux send gaps on all "other" streams, it could in fact be better to offset all the segments. I think it's a bit silly though, and it make more sense to be that in presence of CTTS only we do shift the segment on the stream that has it. I think that because it make no sense to me to have to rely on EDL support to be able to handle B-Frame streams. I'll stick with the initial idea for now, it's no worst then before, and we can revisit later. There would be no API break required. At least there is no shift when played from GStreamer.
So I think I better understand the idea of the signed CTTS in version 1 of that spec. Here's what it looks like: For video: CT(0) = DT(0) + CTTS(0) For audio: CT(0) = DT(0) Where DT(0) is defined as 0. The problem is that there is no way unless CTTS(0) is 0 to synchronize these streams. I'm not sure how they came up with that weird idea, but making CTTS signed may help solving this. It directly breaks the rule that DT(n) <= CT(n) though. Using that feature would mean that DT(n) is no longer the decoding time as stated in the spec. Instead it would be shifted forward a fixed value. The problem is that they forgot that this shift is hard to later on. The demuxer would have to traverse the samples to find the lowest negative CTTS value and that before producing anything. For this reason, I think the implemented method is nice since it covers the old spec (change that most muxer may have never notice). I think there is way our muxer can accommodate both muxer behaviour, but I would prefer to just keep moving forward and this and fix it properly though another bug. (if all CTTS are positive, it's quite clear the old muxing scheme was used).
Created attachment 305179 [details] [review] clock: Add signed time utilities Add utility to print signed value of time. This is useful to trace running time values in gint64 or GstClockTimeDiff values. Additionally, define GST_CLOCK_STIME_NONE to indicate an invalid signed time value and validation macro. New macros are: GST_CLOCK_STIME_NONE GST_CLOCK_STIME_IS_VALID GST_STIME_FORMAT GST_STIME_ARGS
Created attachment 305181 [details] [review] clock: Add new signed time macro to the doc
Created attachment 305182 [details] [review] collectpads: Add new macro to the doc
Attachment 305019 [details] pushed as 51549bf - collectpads: Don't initially send an invalid DTS Attachment 305158 [details] pushed as b5e4f7b - collectpads: Add negative DTS support Attachment 305179 [details] pushed as 5cab2e1 - clock: Add signed time utilities Attachment 305181 [details] pushed as 8e53031 - clock: Add new signed time macro to the doc Attachment 305182 [details] pushed as 7037341 - collectpads: Add new macro to the doc
Attachment 300918 [details] pushed as dc7b254 - videoencoder: Add gst_video_encoder_set_min_pts()
Attachment 305012 [details] pushed as 2274ca7 - flvmux: Add negative runtime DTS support Attachment 305090 [details] pushed as 74dcd85 - tests: qtmux: test for muxing with DTS outside the segment Attachment 305098 [details] pushed as 12181ef - qtmux: Handle DTS with negative running time Attachment 305099 [details] pushed as 135e516 - qtdemux: Adjust segment according to ctts offset
Attachment 305013 [details] pushed as 91cbaa5 - mpegtsmux: Properly detect backward DTS Attachment 305014 [details] pushed as 8432116 - tsmux: Remove uneeded cast and cast macro Attachment 305015 [details] pushed as e000a6f - tsmux: Add negative DTS support Attachment 305016 [details] pushed as 338f7e8 - tsdemux: Segment start should match first PTS
Attachment 300919 [details] pushed as 43d6ca8 - x264enc: Use gst_video_encoder_set_min_pts
I suppose there is a lot more that can be enhance, but let's have split bug for them when we find it.