After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 729371 - dash/qtdemux: Reported video size does not correspond to what avdec_h264 states
dash/qtdemux: Reported video size does not correspond to what avdec_h264 states
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gst-plugins-bad
git master
Other Linux
: Normal normal
: 1.11.91
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks: 778337
 
 
Reported: 2014-05-02 08:58 UTC by Thibault Saunier
Modified: 2017-04-12 09:35 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Thibault Saunier 2014-05-02 08:58:36 UTC
When playbing http://rdmedia.bbc.co.uk/dash/ondemand/bbb/avc3/1/client_manifest-common_init.mpd , qtdemux states that the video has the following caps:

   width=(int)1920, height=(int)1080, framerate=(fraction)100/1

but avdec_h264 changes them to:

   width=(int)512, height=(int)288,framerate=(fraction)100/1

In gst-validate running:

  gst-validate-1.0 playbin uri=http://rdmedia.bbc.co.uk/dash/ondemand/bbb/avc3/1/client_manifest-common_init.mpd

we get the follwoing issue reported:

  critical : a field in caps has an unexpected value
             Detected on <avdec_h264-0:src> at 0:00:00.285289601
             Details : Field width from setcaps caps 'video/x-raw, format=(string)I420, width=(int)512, height=(int)288, pixel-aspect-ratio=(fraction)1/1, interlace-mode=(string)progressive, chroma-site=(string)mpeg2, colorimetry=(string)bt601, framerate=(fraction)100/1' is different from expected value in caps 'pending-fields, width=(int)1920, height=(int)1080, framerate=(fraction)100/1, pixel-aspect-ratio=(fraction)1/1;'
             Description : fields set on a sink pad should be propagated downstream via set caps

Also I am wondering if framerate=100/1 is for real...
Comment 1 Thiago Sousa Santos 2014-05-02 12:31:45 UTC
The problem seems to be because it uses a "common initialization fragment" that has the values for the highest bitrate, regardless of what bitrate is used. So qtdemux is outputing correct data AFAIK and avdec_h264 too. I still haven't found  a way to fix this.
Comment 2 Thiago Sousa Santos 2015-01-20 13:25:13 UTC
FWIW it seems this is a valid initialization segment according to some DASH profile.
Comment 3 Thiago Sousa Santos 2016-03-31 20:42:06 UTC
adding more people to CC.

Do you remember where/which spec mentions the unified initialization segment? I can't find it in the main spec.
Comment 4 Chris Bass 2016-03-31 21:29:39 UTC
There's text about common initialization segments in the DVB DASH profile:

https://www.dvb.org/resources/public/standards/a168_dvb-dash.pdf

(See sections 5.1.2 and 5.1.4).
Comment 5 Thiago Sousa Santos 2016-03-31 21:34:02 UTC
Thanks.

So I think we have a bug in the sample provided by BBC RD.

"Initialization segments being common means that all Representations in an Adaption Set will have identically the same 'stsd' box. There will be one entry in the 'stsd' box for each Representation. Representations encoded with different "parameters" will use the sampledescription_index in the Track Fragment Header to identify which of the sample entries in the 'stsd' box is applicable to them."

So the 'stsd' box in the moov needs to have multiple entries, while in this sample it has only one.

Nonetheless I don't think we handle multi-stsd entries just yet. Fixing the sample is just the first step.
Comment 6 A Ashley 2016-04-01 13:12:15 UTC
I think the link to the BBC stream has changed to:
http://rdmedia.bbc.co.uk/dash/ondemand/bbb/2/client_manifest-common_init.mpd

The index page is: 
http://rdmedia.bbc.co.uk/dash/ondemand/bbb/
Comment 7 A Ashley 2016-04-01 13:23:17 UTC
I think Thiago got it right back in 2014 - dash/qtdemux will report the largest, but avdec_h264 will be reporting the actual values from the last SPS/PPS that it has decoded.

I think that 5.1.4 of DVB DASH is the relevant section:

5.1.4 Signalling within the AVCSampleEntry in an initialization segment

The AVCSampleEntry contained within the initialization segment of an H.264/AVC track contains a number of values indicating information about the encoded video. These values shall always be set such that the encoded video in all Representations using this initialization segment does not exceed the size, profile, level or compatibility indicated within the initialization segment, however the values are not required to indicate the exact size or nature of the specific stream.
Comment 8 Thiago Sousa Santos 2016-04-07 02:03:36 UTC
Yes, but the stsd box needs to have multiple entries. One for each representation, this is how the initialization segment can actually have information on all of them.

If you read the qtff or the mpeg4 file format specification you can see that the stsd box can contain multiple format definitions. The same trak can have different formats/resolutions/... 

I believe the sample needs to be fixed.


I've started looking at qtdemux to check how to properly handle this. We have 2 options:

1) Create one qtdemux QtDemuxStream for each trak and then inside this have a new entity that contains the different formats. Seems more aligned with what the format allows one to do but what happens when we need to push a sample from another format in the same trak? Push a caps event and hope for the best? Do a pad group switch? Wait for decodebin3?

2) Expose each stsd entry as a different stream and pad and handle them independently. The drawback here is that the samples table is the same so it would be messy to keep track of positions in the different pads using a single samples list. Also pushing gaps to cover for empty spaces. Sounds unnatural.

I'll start coding option 1 unless someone has a better alternative.
Comment 9 Thiago Sousa Santos 2016-04-07 17:46:21 UTC
https://cgit.freedesktop.org/~thiagoss/gst-plugins-good/log/?h=qtdemux-stsd-all-entries

Keeping the ongoing code here
Comment 10 Thiago Sousa Santos 2016-04-08 20:33:33 UTC
Found the related bug: https://bugzilla.gnome.org/show_bug.cgi?id=607471
Comment 11 Seungha Yang 2016-06-27 16:40:03 UTC
(In reply to Thiago Sousa Santos from comment #8)
> Yes, but the stsd box needs to have multiple entries. One for each
> representation, this is how the initialization segment can actually have
> information on all of them.
> 
> If you read the qtff or the mpeg4 file format specification you can see that
> the stsd box can contain multiple format definitions. The same trak can have
> different formats/resolutions/... 
> 
> I believe the sample needs to be fixed.
> 
> 
> I've started looking at qtdemux to check how to properly handle this. We
> have 2 options:
> 
> 1) Create one qtdemux QtDemuxStream for each trak and then inside this have
> a new entity that contains the different formats. Seems more aligned with
> what the format allows one to do but what happens when we need to push a
> sample from another format in the same trak? Push a caps event and hope for
> the best? Do a pad group switch? Wait for decodebin3?
> 
> 2) Expose each stsd entry as a different stream and pad and handle them
> independently. The drawback here is that the samples table is the same so it
> would be messy to keep track of positions in the different pads using a
> single samples list. Also pushing gaps to cover for empty spaces. Sounds
> unnatural.
> 
> I'll start coding option 1 unless someone has a better alternative.

I really wanna vote your option 1 (keep the # of pad based on trak #, and change only caps). Actually, I've encountered this issue quite for a long time, but did not have very nice way to solve this problem. So, I've dealt with this issue using work-around way (which is not suitable for share with other people....). Is there some progress on top of current work on gst-plugins-good master branch? Or, some reference which can be used for test? 
It's very helpful for me to implement gstreamer based DASH playback environment.
Anyway, I hope that this issue is still on progress.
Comment 12 Edward Hervey 2017-04-11 08:47:27 UTC
Thiagoss, it would be great if we could include those patches. Any chance you could rebase them against latest master ?
Comment 13 Thiago Sousa Santos 2017-04-12 06:46:43 UTC
Patches updated. Running gst-validate locally doesn't seem to indicate any regressions. Will run some more tests but it seems to be working so far.
Comment 14 Edward Hervey 2017-04-12 09:35:32 UTC
commit 26f9869668ec5f18e011cd6f936b6f62778b051f
Author: Edward Hervey <edward@centricular.com>
Date:   Wed Apr 12 11:03:24 2017 +0200

    qtdemux: Add out-of-bound check
    
    Make sure we don't read invalid memory

commit 9ac3861a9bd91d534ba29933f05595e7416dfab7
Author: Thiago Santos <thiagoss@osg.samsung.com>
Date:   Wed Apr 27 12:17:37 2016 -0300

    qtdemux: move parsing of tkhd out of stsd entry loop
    
    It needs only to be read once.

commit cf6733a55eacf03dd096d95f74f224d8bc7d427b
Author: Thiago Santos <thiagoss@osg.samsung.com>
Date:   Thu Apr 7 12:23:35 2016 -0300

    qtdemux: check for a different stsd entry before pushing a sample
    
    Before pushing a sample, check if there was a change in the current
    stsd entry. This patch also assumes that the first stsd entry is
    used as default for the first sample. It might cause an uneeded
    caps renegotiation when this isn't the case.

commit 86b427dc70562f891a551ffc9f96cefe1cafcddd
Author: Thiago Santos <thiagoss@osg.samsung.com>
Date:   Wed Apr 6 12:55:18 2016 -0300

    qtdemux: parse all stsd entries
    
    stsd can have multiple format entries, parse them all.
    
    This is required to play DVB DASH profile that uses multiple entries
    to identify the different available bitrates/options on dash streams
    
    The stream format-specific data is not stored into QtDemuxStreamStsdEntry

commit 54e252e0955287bf6971fba89666e9fa4901e118
Author: Thiago Santos <thiagoss@osg.samsung.com>
Date:   Tue Apr 5 14:34:00 2016 -0300

    qtdemux: rework stsd sample entries access
    
    Instead of using the stsd as a base pointer, use the actual stsd
    entry as the stsd can have multiple entries. This is rarely used
    for file playback but is a possible profile with in DVB DASH specs.
    
    This still doesn't support stsd with multiple entries but makes it
    easier to do so.

commit bd32bcc36cc7b08a0aecbad3bcc5dbd7308b9e19
Author: Thiago Santos <thiagoss@osg.samsung.com>
Date:   Tue Apr 5 18:00:10 2016 -0300

    qtdemux: get stsd child by index instead of type
    
    There might be multiple children with the same type