GNOME Bugzilla – Bug 677535
isomp4/qtdemux: Add support for MPEG DASH tfdt box
Last modified: 2012-08-28 14:35:13 UTC
Created attachment 215710 [details] qtdemux DASH tfdt support MPEG DASH has defined a set of new boxes to specify duration, indexes and offsets of ISOBMFF fragments. The Track Fragment Base Media Decode Time (tfdt) Box can in particular be included inside a traf box to specify the absolute decode time, measured on the media timeline, of the first sample in decode order in the track fragment. This information can be used by the qtdemux to find out the current position of an MP4 fragment in the timeline and assign the correct timestamp to the output buffers (the current code always assumes that the offset is zero). This patch adds code for parsing and dumping tfdt boxes, and modifies qtdemux fragmented MP4 to add the tfdt decode time to the output buffers timestamps.
The new MP4 boxes are described in a amendment of ISO/IEC 14496-12:2008 that can be found there: www.3gpp.org/ftp/Inbox/LSs_from_external_bodies/ISO_IEC_JTC1_SG29_WG11/29n12310.zip
Created attachment 217030 [details] [review] Properly formatted patch (with also a fix on pending segment start time) Here is a properly formatted patch addressing the same issue. I have also added a fix to the start time of the segment sent when the fragment starts playing.
Sorry, two other modifications have crept in to allow event and queries to be forwarded upstream if the qtdemux cannot react/answer.
Created attachment 217170 [details] [review] Updated-patch-without-irrelevant-modifications Here is the updated patch (without the two irrelevant changes)
(In reply to comment #4) > Created an attachment (id=217170) [details] [review] > Updated-patch-without-irrelevant-modifications > > Here is the updated patch (without the two irrelevant changes) Looks like the decode time is only used at the very start of the stream (which could make sense since it should be sum of the previous durations according to spec otherwise), though one would expect the initial one to be 0 anyway unless the file in question is a partial one of some sort. However, there is also a // style comment, and more importantly, the lower level parsing code should not be fiddling with segments, which are quite intricate in qtdemux (in view of possible edit lists etc). And this segment is one only used in push mode, and is a global one that is being adjusted on stream specific values. Also, adjusting the segment that way basically mutes the 'decode time' effect altogether; generally it makes no real difference to send buffers with ts starting at 0 (and a segment as well) or sending buffers with ts starting at decode_time (and a segment as well). All in all, makes it feel this patch has some very specific (push-based) setting/context/use-case in mind above and beyond merely support for tfdt in qtdemux. If so, then e.g. adjusting the push-based segment elsewhere (e.g. in chain) based on the first buffer's timestamp or so might be a better approach.
(In reply to comment #5) > > However, there is also a // style comment, and more importantly, the lower > level parsing code should not be fiddling with segments, which are quite > intricate in qtdemux (in view of possible edit lists etc). And this segment is > one only used in push mode, and is a global one that is being adjusted on > stream specific values. Also, adjusting the segment that way basically mutes > the 'decode time' effect altogether; generally it makes no real difference to > send buffers with ts starting at 0 (and a segment as well) or sending buffers > with ts starting at decode_time (and a segment as well). The current trend (see HLSdemux) for dealing with fragmented (adaptive) content is to create a new decoding chain every time a new bitrate is selected. For DASH, this means that every time a new representation is selected, a new isomp4 demux is instantiated, with a stream composed of a moov box provided by the DASH initialization segment, and followed by several consecutive moof boxes starting at a specific offset (typically _not_ zero). Without parsing the tfdt box, the qtdemux will indeed start ts and segment as zero, and the playback will not be blocked. However, and it is maybe a bug, the downstream chain will lose the knowledge of the current playback position. > > All in all, makes it feel this patch has some very specific (push-based) > setting/context/use-case in mind above and beyond merely support for tfdt in > qtdemux. If so, then e.g. adjusting the push-based segment elsewhere (e.g. in > chain) based on the first buffer's timestamp or so might be a better approach. Would you mean fixing the segment right after the transition to QTDEMUX_STATE_MOVIE and before actually sending it ?
(In reply to comment #6) > (In reply to comment #5) > > > > However, there is also a // style comment, and more importantly, the lower > > level parsing code should not be fiddling with segments, which are quite > > intricate in qtdemux (in view of possible edit lists etc). And this segment is > > one only used in push mode, and is a global one that is being adjusted on > > stream specific values. Also, adjusting the segment that way basically mutes > > the 'decode time' effect altogether; generally it makes no real difference to > > send buffers with ts starting at 0 (and a segment as well) or sending buffers > > with ts starting at decode_time (and a segment as well). > > The current trend (see HLSdemux) for dealing with fragmented (adaptive) content > is to create a new decoding chain every time a new bitrate is selected. For > DASH, this means that every time a new representation is selected, a new isomp4 > demux is instantiated, with a stream composed of a moov box provided by the > DASH initialization segment, and followed by several consecutive moof boxes > starting at a specific offset (typically _not_ zero). > > Without parsing the tfdt box, the qtdemux will indeed start ts and segment as > zero, and the playback will not be blocked. However, and it is maybe a bug, the > downstream chain will lose the knowledge of the current playback position. Note that position tracking is done by the 'time/position' member in a segment (event), so rather than a segment (S, NONE, S) and buffer times starting at S, it is also possible (and maybe easier/cleaner) to use (0, NONE, S) and buffer times starting at 0. > > > > > All in all, makes it feel this patch has some very specific (push-based) > > setting/context/use-case in mind above and beyond merely support for tfdt in > > qtdemux. If so, then e.g. adjusting the push-based segment elsewhere (e.g. in > > chain) based on the first buffer's timestamp or so might be a better approach. > > Would you mean fixing the segment right after the transition to > QTDEMUX_STATE_MOVIE and before actually sending it ? (indeed) Somewhere around when _chain has the buffer timestamp at hand (and if it is the one of the first buffer needing a newsegment before sending the buffer). However, in combination with above remark it should suffice to have atom parsing dump the tfdt decode/position in some state variable, which is then used as the position parameter for the (initial) newsegment event (created in the _STATE_HEADER handling).
Created attachment 221121 [details] [review] isomp4: Add DASH tfdt box support
> Note that position tracking is done by the 'time/position' member in a segment > (event), so rather than a segment (S, NONE, S) and buffer times starting at S, > it is also possible (and maybe easier/cleaner) to use (0, NONE, S) and buffer > times starting at 0. ... > However, in combination with above remark it should suffice to have atom > parsing dump the tfdt decode/position in some state variable, which is then > used as the position parameter for the (initial) newsegment event (created in > the _STATE_HEADER handling). Thanks for pointing that out: this is makes indeed more sense adjusting only playback position (and it is alos simpler). I have still however a few questions about how it could be implemented. First, where would it make more sense to store the parsed decode/position ? In the qtdemux object as a new member ? Second, the new segment event is currently created when the moov has been parsed, but the tfdt is retrieved from the moof, leaving two options: A - Move event creation code so that it executes after the moof has been parsed B - Leave event creation code as is and Adjust time/position when the moov has been parsed. Which option would you suggest ? I have attached a temporary updated patch where I implement option B, but still leaving the event adjustment in the low-lvel parsing code (which I agree is bad design).
Not particularly overjoyed with having the segment stuff in the lower levels, but I guess it will do rather than messing around with such details ... commit 3dd40e3127ef0fabe062c5dfbdb7012f4967a82b Author: David Corvoysier <david.corvoysier@orange.com> Date: Tue Jul 3 17:50:24 2012 +0200 isomp4: add DASH tfdt box support MPEG DASH has defined a set of new boxes to specify duration, indexes and offsets of ISOBMFF fragments. The Track Fragment Base Media Decode Time (tfdt) Box can in particular be included inside a traf box to specify the absolute decode time, measured on the media timeline, of the first sample in decode order in the track fragment. This information can be used by the isomp4 demux to find out the current position of an MP4 fragment in the timeline. This patch adds code to isomp4 to: - parse the tfdt box - adjust the time/position member of the new segment sent when playback starts Fixes https://bugzilla.gnome.org/show_bug.cgi?id=677535