GNOME Bugzilla – Bug 646327
h264parse: Drop data before receiving the first PPS/SPS
Last modified: 2016-05-28 19:06:09 UTC
Without this property, muxers that rely on codec-data being in the caps to write their initial header will be broken on byte-stream input going through h264parse to AVC.
Created attachment 184774 [details] [review]
patch to add property
Wouldn't it be better to keep the now dropped buffers around and push them before everything else when the codec-data is known... as discussed on IRC yesterday this would be best implemented in baseparse.
Why a property? It sounds like this should be default behaviour, and it should do the right thing based on downstream caps (ie. stream-format=xyz etc.)
One could keep those buffers and then send them later (even as baseparse functionality), but they are probably only useful if they happen to start with an IDR frame, in which case there is a good chance codec-data is around from the start as well (depending on alignment). That is not to say some GST_BASE_PARSE_FLOW_DELAY (and corresponding mechanics) wouldn't be useful.

So, assuming the typical case with codec-data and an IDR around from the start, downstream (muxer) caps should already arrange proper behaviour: arranging for alignment=au implies that SPS/PPS are aggregated into the first unit, so the full codec-data is in the caps by the first output buffer. In the remaining, more exotic cases, one "should know more what one is doing", though it might not hurt to have this enabled by default.

However, having it as a property is useful so h264parse can at least still be used for debugging purposes, to see what is happening with the NAL units (rather than hard-coding the discarding of some).
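For the debugging angle mentioned above (seeing what is happening with the NAL units), something like the following standalone helper can be used. This is an illustrative sketch, not part of h264parse: it scans an Annex B byte-stream for `00 00 01` start codes and collects each NAL unit type (the low 5 bits of the NAL header byte), e.g. to check whether SPS (7) and PPS (8) actually precede the first slice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative helper (not actual h264parse code): collect the NAL unit
 * types found after each 00 00 01 start code in an Annex B byte-stream.
 * Returns the number of NAL units found (at most max_out are stored). */
static size_t
collect_nal_types (const uint8_t *data, size_t len,
                   uint8_t *out, size_t max_out)
{
  size_t n = 0;

  for (size_t i = 0; i + 3 < len; i++) {
    if (data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1) {
      uint8_t nal_type = data[i + 3] & 0x1F;  /* low 5 bits of NAL header */
      if (n < max_out)
        out[n] = nal_type;
      n++;
      i += 3;  /* skip past the start code */
    }
  }
  return n;
}
```

Note this only handles the 3-byte start code form and ignores emulation prevention bytes, which is fine for a quick look at NAL ordering but not for a full parser.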
So I know very little about H264. The reason I didn't do it the queuing way is that I wasn't sure the buffers before we get NAL units with SPS/PPS necessarily have the same SPS/PPS as the buffers after, so it seemed safer to drop them... But as I said, my H264 knowledge is low/non-existent. The reason for a property is that Mark said on IRC the intention was to send out buffers ASAP, so I made a property that drops buffers until SPS/PPS is received and codec-data can be constructed, with the default being not to drop anything.
Does it make sense to send out buffers asap even if you don't have the codec_data yet? What can be done with them downstream?
One use case I can think of is if you are transmuxing to MPEG-TS and hit a glitch that causes the parser (or anything, really) to reset. If after the reset you start sending buffers immediately instead of waiting for a sync point (headers plus IDR; they usually arrive simultaneously), then a downstream listener to the MPEG-TS stream may be able to glitch less. A little far-fetched, I suppose. My general rule is to not coddle decoders, since they absolutely *must* handle garbage in the stream. Better to test and fix rather than work around. IMO, by default, drop all buffers before you get headers and an IDR. If you have an application that can do something useful with pre-header buffers, maybe then add a property.
It looks like the original issue with the codec_data in the caps is fixed now. Not sure about the skip-stuff-until-we-have-PPS/SPS-and/or-IDR
How should we proceed here? I agree with what David said in comment 7. There's also bug #699302 which has a patch to just skip all NALs until SPS and PPS are received.
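The "skip all NALs until SPS and PPS are received" policy from bug #699302 boils down to something like the following sketch. This is hypothetical, simplified code for illustration (the names and structure are not from gsth264parse.c); in the real element the SPS/PPS NALs themselves are additionally stored so codec-data/caps can be built from them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative NAL unit types from the H.264 spec. */
#define NAL_SLICE_IDR 5
#define NAL_SPS       7
#define NAL_PPS       8

typedef struct {
  bool have_sps;
  bool have_pps;
} ParseState;

/* Returns true if the NAL should be pushed downstream, false if dropped.
 * Everything is dropped until both SPS and PPS have been seen, so by the
 * time the first buffer goes out, complete caps can be set. */
static bool
process_nal (ParseState *st, uint8_t nal_type)
{
  if (nal_type == NAL_SPS)
    st->have_sps = true;
  else if (nal_type == NAL_PPS)
    st->have_pps = true;

  if (!st->have_sps || !st->have_pps)
    return false;  /* no complete caps yet: drop */

  return true;
}
```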
I think that's what we should do. It's the behaviour that's most compatible with the variety of different use cases. For specialist use cases people can disable it via the property then, if we add one (or we just wait until someone requests one). That's what we are doing for mpegvideoparse too now.
Now we just need to decide which patch looks better :) I would prefer to not add this property until someone actually has a valid use case for that.
By default I wouldn't drop buffers at all. H264 streams using intra refresh instead of regular IDRs have only one IDR, at the beginning. I have also seen some HLS streams where, contrary to what the spec recommends, fragments start with an IDR instead of SPS/PPS+IDR and the SPS/PPS is sent later.
How are these frames before SPS/PPS usable then?
There are two parts to this:
- not outputting any data before we are able to output proper (complete) caps
- whether to drop all data before then, send it delayed, or send it immediately

What we're trying to do here is get some really basic stuff right that doesn't work properly at all at the moment, like decoding and transmuxing from MPEG-TS/PS or RTP and such. For decoding on the desktop with libav it's less problematic if our caps are incomplete at first, but that doesn't work well in many other scenarios and is also not really right.

Andoni: am I correct to assume that even for intra-refresh streams you still get regular SPS/PPS? If so, how regular? If I'm not mistaken we'd just start outputting things a little later, which doesn't seem unreasonable, esp. with intra-refresh. I think it's reasonable to e.g. wait a full refresh cycle before starting to output images - so if we can detect that (can we?) then we could just cache the data and send it delayed.
(In reply to comment #13)
> How are these frames before SPS/PPS usable then?

Usually SPS/PPS won't change within the same stream, so the first GOP without SPS/PPS can be decoded once you receive them.
(In reply to comment #14)
> Andoni: am I correct to assume that even for intra-refresh streams you still
> get regular SPS/PPS? If so, how regular? If I'm not mistaken we'd just start
> outputting things a little later, which doesn't seem unreasonable, esp. with
> intra-refresh. I think it's reasonable to e.g. wait a full refresh cycle before
> starting to output images - so if we can detect that (can we?) then we could
> just cache the data and send it delayed.

Correct, SPS/PPS will be inserted in-band at the same frequency. The frequency can vary a lot, but it's usually on the order of one every 10 frames for live streams. Delaying the output of data could for instance fix the case in which you get a complete GOP and the SPS/PPS only later, instead of having to drop it. But dropping data should be an option in all cases.
Ok, so we're talking more about finetuning here. We can make dropping optional, but it should be the default IMHO. Maybe waiting/dropping should be handled separately.
*** Bug 699302 has been marked as a duplicate of this bug. ***
I prefer the patch from bug #699302 and think we should apply it, but maybe move the if have_foo checks to the beginning of the if() and niceify the GST_WARNING debug string (should it be a warning? It's expected really..) Reasons: (a) KISS, (b) it's what we do in mpegvideoparse. If there's a use case / need to not drop this data, that can be added later. But if you have byte-stream as input, dropping some data is par for the course imho.
Pushed the patch from bug #699302 as it's much simpler and also the same as what is done in other parsers.

commit ef0e051e7d20bcc55a523115510915f5ae7a4b65
Author: Ilya Smelykh <ilya.smelykh@gmail.com>
Date:   Fri Jun 7 12:10:08 2013 +0200

    h264parse: Wait until SPS/PPS before outputting any data

    https://bugzilla.gnome.org/show_bug.cgi?id=646327

Let's keep this open to fine-tune, or just close?
<slomo> __tim: would you agree that this bug can be closed and if anybody wants a property to select this and has a real use case for it can open a new bug?
<__tim> yes, please !
*** Bug 704156 has been marked as a duplicate of this bug. ***
Matej, is this still happening for you with latest git master?
Seems to be fixed for me, except this minor thing:

22:32:11 WARN gst.h264parse gsth264parse.c:898 - - - - - - - - no SPS/PPS yet, nal Type: 9, Size: 2 will be dropped

I'm not sure the parser should complain about dropping an AU delimiter.
Actually, this might not be such a good idea after all. I have a stream that has slices referencing other slices from before the SPS. With the parser dropping data before the first SPS, the stream starts to play, but after a few seconds there are glitches. I think it would be better to buffer the data until we have SPS/PPS rather than throw it away.
Sample video: https://s3.amazonaws.com/MatejK/Samples/1.001.ts

I'm not sure if it does this with the ffmpeg decoder, but with a hardware decoder (Apple devices) it decodes fine with the frames before the first SPS present; if that data is dropped, there are artifacts at 02s. Sometimes the decoder won't even recover and will just hang.
Matej, Looks like there is a permission restriction, I have a "Permission Denied" xml response.
Sorry, should be fixed.
Created attachment 254498 [details] [review]
Patch to queue data before SPS/PPS instead of dropping it

Rather experimental, just for testing.
Also, perhaps there should be a cap in baseparse so that really broken streams won't OOM.
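The queue-instead-of-drop idea, with a cap so a broken stream that never sends SPS/PPS can't queue unboundedly, could look roughly like this. This is a hypothetical policy sketch, not the attached patch: the names, the `Action` enum, and the `MAX_QUEUED` limit are all illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_QUEUED 64  /* illustrative limit, not a real baseparse value */

typedef enum {
  ACTION_QUEUE,           /* hold the NAL until caps are complete */
  ACTION_FLUSH_AND_PUSH,  /* push everything queued, then this NAL */
  ACTION_ERROR            /* give up: stream never delivered SPS/PPS */
} Action;

typedef struct {
  bool have_sps, have_pps;
  size_t n_queued;
} QueueState;

static Action
handle_nal (QueueState *st, uint8_t nal_type)
{
  if (nal_type == 7)  /* SPS */
    st->have_sps = true;
  if (nal_type == 8)  /* PPS */
    st->have_pps = true;

  if (st->have_sps && st->have_pps) {
    /* Caps are complete now. The queued GOP can usually be decoded
     * because SPS/PPS rarely change within the same stream. */
    return ACTION_FLUSH_AND_PUSH;
  }

  if (st->n_queued >= MAX_QUEUED)
    return ACTION_ERROR;  /* avoid unbounded memory use */

  st->n_queued++;
  return ACTION_QUEUE;
}
```

The trade-off discussed below still applies: queued pre-SPS frames may reference data the decoder never saw, so flushing them is only a best-effort improvement over dropping.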
I think it would be useful to have a property on h264parse controlling whether it should drop or buffer.
Upon more testing, this file is probably not the best to demonstrate the problem. Hardware decoders still seem to have problems with it. So while I still think there could be a case where keeping the frames before the SPS helps, I'm not sure this is it.
One of the reasons might be that there seem to be field-encoded frames, which the H264 parser doesn't currently handle very well (https://bugzilla.gnome.org/show_bug.cgi?id=704214).
It might make sense to allow that, I guess, but then most decoders are going to drop anything before the first keyframe anyway. And as Ilya said, there should also be a limit on the number of frames we queue before erroring out if no SPS/PPS was received. I'm not sure whether the newly introduced problems are going to be bigger than the problems that are solved by this.
In my experience most software decoders that I have tried will happily start decoding before they get an IDR frame, even if they output artifacts. However, I wasn't able to decode a stream without at least a single IDR on iOS devices, not even with a proper recovery point. The interesting thing is that it is enough to have just one IDR at the beginning of the stream and it works (even seeking, etc.).
I think we should close this again. Please open a new bug for any specific issues that still exist. For streams where SPS/PPS is in-band and not signalled out of band, we can't really just assume that the SPS/PPS will be valid for the data before it (it might, it might not). If you have a case where a slice refers to data before the SPS/PPS then there's clearly no IDR there; maybe that means we have to choose our sync point more strictly. In any case, I don't see us doing anything here after 2.5 years without any more progress.