GNOME Bugzilla – Bug 729371
dash/qtdemux: Reported video size does not correspond to what avdec_h264 states
Last modified: 2017-04-12 09:35:32 UTC
When playbing http://rdmedia.bbc.co.uk/dash/ondemand/bbb/avc3/1/client_manifest-common_init.mpd , qtdemux states that the video has the following caps: width=(int)1920, height=(int)1080, framerate=(fraction)100/1 but avdec_h264 changes them to: width=(int)512, height=(int)288,framerate=(fraction)100/1 In gst-validate running: gst-validate-1.0 playbin uri=http://rdmedia.bbc.co.uk/dash/ondemand/bbb/avc3/1/client_manifest-common_init.mpd we get the follwoing issue reported: critical : a field in caps has an unexpected value Detected on <avdec_h264-0:src> at 0:00:00.285289601 Details : Field width from setcaps caps 'video/x-raw, format=(string)I420, width=(int)512, height=(int)288, pixel-aspect-ratio=(fraction)1/1, interlace-mode=(string)progressive, chroma-site=(string)mpeg2, colorimetry=(string)bt601, framerate=(fraction)100/1' is different from expected value in caps 'pending-fields, width=(int)1920, height=(int)1080, framerate=(fraction)100/1, pixel-aspect-ratio=(fraction)1/1;' Description : fields set on a sink pad should be propagated downstream via set caps Also I am wondering if framerate=100/1 is for real...
The problem seems to be because it uses a "common initialization fragment" that has the values for the highest bitrate, regardless of what bitrate is used. So qtdemux is outputing correct data AFAIK and avdec_h264 too. I still haven't found a way to fix this.
FWIW it seems this is a valid initialization segment according to some DASH profile.
adding more people to CC. Do you remember where/which spec mentions the unified initialization segment? I can't find it in the main spec.
There's text about common initialization segments in the DVB DASH profile: https://www.dvb.org/resources/public/standards/a168_dvb-dash.pdf (See sections 5.1.2 and 5.1.4).
Thanks. So I think we have a bug in the sample provided by BBC RD. "Initialization segments being common means that all Representations in an Adaption Set will have identically the same 'stsd' box. There will be one entry in the 'stsd' box for each Representation. Representations encoded with different "parameters" will use the sampledescription_index in the Track Fragment Header to identify which of the sample entries in the 'stsd' box is applicable to them." So the 'stsd' box in the moov needs to have multiple entries, while in this sample it has only one. Nonetheless I don't think we handle multi-stsd entries just yet. Fixing the sample is just the first step.
I think the link to the BBC stream has changed to: http://rdmedia.bbc.co.uk/dash/ondemand/bbb/2/client_manifest-common_init.mpd The index page is: http://rdmedia.bbc.co.uk/dash/ondemand/bbb/
I think Thiago got it right back in 2014 - dash/qtdemux will report the largest, but avdec_h264 will be reporting the actual values from the last SPS/PPS that it has decoded. I think that 5.1.4 of DVB DASH is the relevant section: 5.1.4 Signalling within the AVCSampleEntry in an initialization segment The AVCSampleEntry contained within the initialization segment of an H.264/AVC track contains a number of values indicating information about the encoded video. These values shall always be set such that the encoded video in all Representations using this initialization segment does not exceed the size, profile, level or compatibility indicated within the initialization segment, however the values are not required to indicate the exact size or nature of the specific stream.
Yes, but the stsd box needs to have multiple entries. One for each representation, this is how the initialization segment can actually have information on all of them. If you read the qtff or the mpeg4 file format specification you can see that the stsd box can contain multiple format definitions. The same trak can have different formats/resolutions/... I believe the sample needs to be fixed. I've started looking at qtdemux to check how to properly handle this. We have 2 options: 1) Create one qtdemux QtDemuxStream for each trak and then inside this have a new entity that contains the different formats. Seems more aligned with what the format allows one to do but what happens when we need to push a sample from another format in the same trak? Push a caps event and hope for the best? Do a pad group switch? Wait for decodebin3? 2) Expose each stsd entry as a different stream and pad and handle them independently. The drawback here is that the samples table is the same so it would be messy to keep track of positions in the different pads using a single samples list. Also pushing gaps to cover for empty spaces. Sounds unnatural. I'll start coding option 1 unless someone has a better alternative.
https://cgit.freedesktop.org/~thiagoss/gst-plugins-good/log/?h=qtdemux-stsd-all-entries Keeping the ongoing code here
Found the related bug: https://bugzilla.gnome.org/show_bug.cgi?id=607471
(In reply to Thiago Sousa Santos from comment #8) > Yes, but the stsd box needs to have multiple entries. One for each > representation, this is how the initialization segment can actually have > information on all of them. > > If you read the qtff or the mpeg4 file format specification you can see that > the stsd box can contain multiple format definitions. The same trak can have > different formats/resolutions/... > > I believe the sample needs to be fixed. > > > I've started looking at qtdemux to check how to properly handle this. We > have 2 options: > > 1) Create one qtdemux QtDemuxStream for each trak and then inside this have > a new entity that contains the different formats. Seems more aligned with > what the format allows one to do but what happens when we need to push a > sample from another format in the same trak? Push a caps event and hope for > the best? Do a pad group switch? Wait for decodebin3? > > 2) Expose each stsd entry as a different stream and pad and handle them > independently. The drawback here is that the samples table is the same so it > would be messy to keep track of positions in the different pads using a > single samples list. Also pushing gaps to cover for empty spaces. Sounds > unnatural. > > I'll start coding option 1 unless someone has a better alternative. I really wanna vote your option 1 (keep the # of pad based on trak #, and change only caps). Actually, I've encountered this issue quite for a long time, but did not have very nice way to solve this problem. So, I've dealt with this issue using work-around way (which is not suitable for share with other people....). Is there some progress on top of current work on gst-plugins-good master branch? Or, some reference which can be used for test? It's very helpful for me to implement gstreamer based DASH playback environment. Anyway, I hope that this issue is still on progress.
Thiagoss, it would be great if we could include those patches. Any chance you could rebase them against latest master ?
Patches updated. Running gst-validate locally doesn't seem to indicate any regressions. Will run some more tests but it seems to be working so far.
commit 26f9869668ec5f18e011cd6f936b6f62778b051f Author: Edward Hervey <edward@centricular.com> Date: Wed Apr 12 11:03:24 2017 +0200 qtdemux: Add out-of-bound check Make sure we don't read invalid memory commit 9ac3861a9bd91d534ba29933f05595e7416dfab7 Author: Thiago Santos <thiagoss@osg.samsung.com> Date: Wed Apr 27 12:17:37 2016 -0300 qtdemux: move parsing of tkhd out of stsd entry loop It needs only to be read once. commit cf6733a55eacf03dd096d95f74f224d8bc7d427b Author: Thiago Santos <thiagoss@osg.samsung.com> Date: Thu Apr 7 12:23:35 2016 -0300 qtdemux: check for a different stsd entry before pushing a sample Before pushing a sample, check if there was a change in the current stsd entry. This patch also assumes that the first stsd entry is used as default for the first sample. It might cause an uneeded caps renegotiation when this isn't the case. commit 86b427dc70562f891a551ffc9f96cefe1cafcddd Author: Thiago Santos <thiagoss@osg.samsung.com> Date: Wed Apr 6 12:55:18 2016 -0300 qtdemux: parse all stsd entries stsd can have multiple format entries, parse them all. This is required to play DVB DASH profile that uses multiple entries to identify the different available bitrates/options on dash streams The stream format-specific data is not stored into QtDemuxStreamStsdEntry commit 54e252e0955287bf6971fba89666e9fa4901e118 Author: Thiago Santos <thiagoss@osg.samsung.com> Date: Tue Apr 5 14:34:00 2016 -0300 qtdemux: rework stsd sample entries access Instead of using the stsd as a base pointer, use the actual stsd entry as the stsd can have multiple entries. This is rarely used for file playback but is a possible profile with in DVB DASH specs. This still doesn't support stsd with multiple entries but makes it easier to do so. commit bd32bcc36cc7b08a0aecbad3bcc5dbd7308b9e19 Author: Thiago Santos <thiagoss@osg.samsung.com> Date: Tue Apr 5 18:00:10 2016 -0300 qtdemux: get stsd child by index instead of type There might be multiple children with the same type