Bug 606662 – h264: add stream-format to output caps

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 606662 - h264: add stream-format to output caps


Summary:	h264: add stream-format to output caps


Status:	RESOLVED FIXED

Product:	GStreamer
Classification:	Platform
Component:	gst-plugins-ugly
Version:	git master
Hardware:	Other All

Importance:	Normal critical
Target Milestone:	0.10.14
Assigned To:	Thiago Sousa Santos
QA Contact:	GStreamer Maintainers

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2010-01-11 19:18 UTC by Thiago Sousa Santos
Modified:	2012-02-10 14:18 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
Adds the stream-format to x264enc (2.02 KB, patch) 2010-01-11 19:19 UTC, Thiago Sousa Santos	rejected	Details \| Review
Adds new fields to x264enc (2.04 KB, patch) 2010-01-28 17:49 UTC, Thiago Sousa Santos	none	Details \| Review
adds new fields to the rtp (de)payloader (5.20 KB, patch) 2010-01-28 17:49 UTC, Thiago Sousa Santos	needs-work	Details \| Review
qtmux: Restrict h264 caps (2.35 KB, patch) 2010-01-28 18:58 UTC, Thiago Sousa Santos	committed	Details \| Review
rtph264pay, rtph264depay: add new stream-format fields to caps (updated) (5.32 KB, patch) 2010-01-28 19:21 UTC, Thiago Sousa Santos	none	Details \| Review
qtdemux patch (1.57 KB, patch) 2010-01-29 12:51 UTC, Thiago Sousa Santos	committed	Details \| Review
rtph264depay: replace property with caps negotiation (9.65 KB, patch) 2010-01-29 12:52 UTC, Thiago Sousa Santos	none	Details \| Review
matroskamux: Only accept avc, au aligned h264 data (1.31 KB, patch) 2010-02-01 12:18 UTC, Thiago Sousa Santos	none	Details \| Review
matroskademux: Adds new h264 fields to srcpad caps (1.62 KB, patch) 2010-02-01 12:18 UTC, Thiago Sousa Santos	committed	Details \| Review
x264enc: Replace 'avc-sample' with 'avc' in caps (1.40 KB, patch) 2010-02-22 23:00 UTC, Thiago Sousa Santos	committed	Details \| Review
h264parse: Replace 'avc-sample' with 'avc' in caps (1.48 KB, patch) 2010-02-22 23:02 UTC, Thiago Sousa Santos	committed	Details \| Review
qtmux: Rename 'avc-sample' to 'avc' in caps (689 bytes, patch) 2010-02-22 23:02 UTC, Thiago Sousa Santos	committed	Details \| Review
0001-mpegtsdemux-tsdemux-Add-byte-stream-to-h264-caps.patch (1.32 KB, patch) 2011-04-24 23:44 UTC, David Schleef	committed	Details \| Review
0001-matroskamux-Add-stream-format-to-h264-caps.patch (863 bytes, patch) 2011-04-24 23:46 UTC, David Schleef	committed	Details \| Review
0001-avimux-Add-stream-format-to-h264-caps.patch (854 bytes, patch) 2011-04-25 00:07 UTC, David Schleef	committed	Details \| Review

Description Thiago Sousa Santos 2010-01-11 19:18:35 UTC

Following Bug 604925 that added stream-format to aac, we could add the same to h264 caps. To specify if the stream has bytestream or avc format.

I'm opening this bug for discussion on what elements needs updating and the formats/names we are going to add.

Comment 1 Thiago Sousa Santos 2010-01-11 19:19:44 UTC

Created attachment 151182 [details] [review]
Adds the stream-format to x264enc

I'm attaching this patch for comments in the stream-format names before adding them to other elements.

Comment 2 Mark Nauwelaerts 2010-01-12 16:59:11 UTC

Current patch has 'bytestream' and 'avc-format' as options.

Since the field is already called stream-format, I would be more inclined to go for 'byte-stream' and 'avc-sample' as options.
[the former hyphen may be a matter of linguist opinion or symmetry preferences, but apparently x264enc's current option is called byte-stream].

Comment 3 Thiago Sousa Santos 2010-01-13 14:15:08 UTC

Module: gst-plugins-bad
Branch: master
Commit: eba1357244181d96ceff7f8e52980b2bd6e99c41
URL:    http://cgit.freedesktop.org/gstreamer/gst-plugins-bad/commit/?id=eba1357244181d96ceff7f8e52980b2bd6e99c41

Author: Thiago Santos <thiago.sousa.santos@collabora.co.uk>
Date:   Wed Jan 13 09:42:12 2010 -0300

h264parse: Add stream-format to caps

Adds the new stream-format field to h264parse caps

Fixes #606662

Module: gst-plugins-ugly
Branch: master
Commit: c77dfa7b9fd350825ab5de445fafc5ace10ddd17
URL:    http://cgit.freedesktop.org/gstreamer/gst-plugins-ugly/commit/?id=c77dfa7b9fd350825ab5de445fafc5ace10ddd17

Author: Thiago Santos <thiago.sousa.santos@collabora.co.uk>
Date:   Mon Jan 11 16:14:44 2010 -0300

x264enc: Add stream-format to output caps

Adds stream-format to output caps of x264enc that
informs if the stream is in bytestream of avc format.

Fixes #606662

Comment 4 Thiago Sousa Santos 2010-01-13 14:16:24 UTC

Comment on attachment 151182 [details] [review]
Adds the stream-format to x264enc

Commited a different version changing avc-format to avc-sample and bytestream to byte-stream

Comment 5 Mark Nauwelaerts 2010-01-13 20:57:36 UTC

It turns out that while previous commits bring more clarity, it leaves some cases open.  Specifically, while avc sample spec in fact requires each 'chunk'/buffer to be a complete AU (and does not make much sense if it is not), byte-stream (annex b) does not care at all.

So, 'avc-sample' specifies everything (in particular, also the current h264parse properties properties to make it so).  'byte-stream', on the other hand, still leaves open whether it is NALU-aligned, AU-aligned or not aligned at all
[note that this also sorts of conflicts with the parser incantation attribute parsed/frame, for which true or false may or may not really make semantic sense in combination with a stream-format choice].

It would be nice to get to a situation where we can drop h264parse's most recent properties and do it all caps-magical (such was the intent of 'stream-format').
Btw, it would also be nice that the default behaviour is at least changed to AU-merging (maybe even avc-sample).

To do so, (non-exhaustive) possibilities might be:
* also allow for 'byte-stream-au' and 'byte-stream-nalu' options
* add yet some other field that defines the 'level of parseness' (e.g. 'nalu', 'au', 'none')
* use a parsed/framed that is not simply true/false (so use parsed/framed as the  field in above option)

In either/any case, combination with (legacy) parsed/framed might not make sense, but then again:
* that field seems more used to magically summon a parser (or keep one at bay) rather than actually describing a data characteristic
* e.g. for aac, 'stream-format=raw' parsed='false' is sort-of not realistic either

Comment 6 Thiago Sousa Santos 2010-01-15 12:03:07 UTC

Setting it to blocker to get it solved before the next release cycle.

Comment 7 Felipe Contreras (banned) 2010-01-15 12:42:04 UTC

(In reply to comment #2)
> Current patch has 'bytestream' and 'avc-format' as options.
> 
> Since the field is already called stream-format, I would be more inclined to go
> for 'byte-stream' and 'avc-sample' as options.

I agree on 'byte-stream' as it's defined as "byte stream" in the H.264 spec. However, "avc-sample" doesn't ring any bell.

The non-bystestream format is defined in MPEG-4 part 15 [aka Advanced Video Coding (AVC) file format], therefore I agree with Thiago's original proposal of "avc-format".

Comment 8 Tim-Philipp Müller 2010-01-15 12:52:15 UTC

How about just "avc"? If the field is called "stream-format" already, we don't really need to repeate th "-format" bit in the name, do we? (no strong opinions either way, just thinking out loud)

Comment 9 Felipe Contreras (banned) 2010-01-15 13:06:54 UTC

(In reply to comment #5)
> It turns out that while previous commits bring more clarity, it leaves some
> cases open.  Specifically, while avc sample spec in fact requires each
> 'chunk'/buffer to be a complete AU (and does not make much sense if it is not),
> byte-stream (annex b) does not care at all.
> 
> So, 'avc-sample' specifies everything (in particular, also the current
> h264parse properties properties to make it so).  'byte-stream', on the other
> hand, still leaves open whether it is NALU-aligned, AU-aligned or not aligned
> at all

AFAICR the standard specifies that the decoder must always work with NALUs. So the only options are NALU-aligned, or AU-aligned.

All decoders that I've seen work on full access-units, however, for efficient transmissions over the network, a slice support would be nice. In such cases the decoder would not be receiving full access-units.
See: http://bul.ece.ubc.ca/VideoOverWLAN_Presentation_v1wbk.pdf

So I would say a boolean "slice-mode" would make sense. Most decoders would be slice-mode=false though, but if a depayloader can negotiate slice-mode=true, then it can work it's magic to make better use of the network. Of course this would require some way to set the actual slice size, not sure if that should go in the caps or a property.

Comment 10 Felipe Contreras (banned) 2010-01-15 13:08:49 UTC

(In reply to comment #8)
> How about just "avc"? If the field is called "stream-format" already, we don't
> really need to repeate th "-format" bit in the name, do we? (no strong opinions
> either way, just thinking out loud)

Good point; I change my vote to "avc".

Comment 11 Mark Nauwelaerts 2010-01-15 13:13:53 UTC

Other than (as mentioned previously and above) finding twice -format a bit (too) much, I have no particular strong opinion.

[so, just for the record/posterity, the motivation behind 'avc-sample':
Yes, it is defined in MPEG-4 part 15, more specifically in 
5.2.3 'AVC sample structure definition'
(which defines the layout of what should go into 1 (mp4) avc sample as using size prefixes etc), hence quite simply such kind of layout is an 'avc-sample' 'stream-format']

Comment 12 Mark Nauwelaerts 2010-01-15 20:12:05 UTC

(In reply to comment #9)
> (In reply to comment #5)
> > It turns out that while previous commits bring more clarity, it leaves some
> > cases open.  Specifically, while avc sample spec in fact requires each
> > 'chunk'/buffer to be a complete AU (and does not make much sense if it is not),
> > byte-stream (annex b) does not care at all.
> > 
> > So, 'avc-sample' specifies everything (in particular, also the current
> > h264parse properties properties to make it so).  'byte-stream', on the other
> > hand, still leaves open whether it is NALU-aligned, AU-aligned or not aligned
> > at all
> 
> AFAICR the standard specifies that the decoder must always work with NALUs. So
> the only options are NALU-aligned, or AU-aligned.

These are likely indeed the only ones that a decoder likes.  However, it is unlikely that either of these will happen to be in a stream of filesrc buffers taken from some raw h264 file, and that needs to have caps as well, not only the options that decoders would appreciate.

> 
> All decoders that I've seen work on full access-units, however, for efficient
> transmissions over the network, a slice support would be nice. In such cases
> the decoder would not be receiving full access-units.
> See: http://bul.ece.ubc.ca/VideoOverWLAN_Presentation_v1wbk.pdf
> 
> So I would say a boolean "slice-mode" would make sense. Most decoders would be
> slice-mode=false though, but if a depayloader can negotiate slice-mode=true,
> then it can work it's magic to make better use of the network. Of course this
> would require some way to set the actual slice size, not sure if that should go
> in the caps or a property.

In view of the above, other comments and some consistency with e.g. AAC case, it might then make sense to let 'framed' assume more values than just 'true' or 'false',  e.g. 'nalu', 'au' and 'false'/'none' whatever.  Arguably, 'nalu' or 'false' would not be "very legal" in 'avc[-whatever]' mode, but neither is parsed=false really legal for 'raw' AAC.

W.r.t. 'slice' note that not all NALUs represent a slice, which may as such introduce some confusion.

Comment 13 David Schleef 2010-01-15 21:10:36 UTC

H.264 in bytestream format is the same as, say, MPEG-2 video, which we've had no problem using 'parsed=true/false'.

If you want to do something for sub-frame video, creating a spec with no implementations is probably not the best way to go about it.

Decoders of bytestream format do not care whether the stream is parsed or not.  That's kind of the point.

Comment 14 Felipe Contreras (banned) 2010-01-16 11:30:01 UTC

(In reply to comment #13)
> H.264 in bytestream format is the same as, say, MPEG-2 video, which we've had
> no problem using 'parsed=true/false'.

Not really, H.264 can be in bytestream format and not parsed (raw H.264 data). I'm not sure if it's possible to have it in AVC format and unparsed; the format definitely wasn't intended to be used that way, but I think it's possible.

> If you want to do something for sub-frame video, creating a spec with no
> implementations is probably not the best way to go about it.

If you are referring to the slice mode proposal, AFAIK it's defined in the H.264 spec and we will definitely be pushing for that in Maemo 6.

> Decoders of bytestream format do not care whether the stream is parsed or not. 
> That's kind of the point.

Oh but they do: decoders out there don't work properly when parsed=false.

Comment 15 David Schleef 2010-01-16 18:26:09 UTC

(In reply to comment #14)
> (In reply to comment #13)
> > H.264 in bytestream format is the same as, say, MPEG-2 video, which we've had
> > no problem using 'parsed=true/false'.
> 
> Not really, H.264 can be in bytestream format and not parsed (raw H.264 data).

That's what I said.

> > If you want to do something for sub-frame video, creating a spec with no
> > implementations is probably not the best way to go about it.
> 
> If you are referring to the slice mode proposal, AFAIK it's defined in the
> H.264 spec and we will definitely be pushing for that in Maemo 6.

Pushing for what?  You already decode slices, because that's what frames are made up of.

> > Decoders of bytestream format do not care whether the stream is parsed or not. 
> > That's kind of the point.
> 
> Oh but they do: decoders out there don't work properly when parsed=false.

Then fix your decoder.  It's about 20 lines of code.

Comment 16 Felipe Contreras (banned) 2010-01-16 19:09:37 UTC

(In reply to comment #15)
> (In reply to comment #14)
> > Not really, H.264 can be in bytestream format and not parsed (raw H.264 data).
> 
> That's what I said.

So you agree that "stream-format" and "parsed" are orthogonal; good.

> Pushing for what?

Support to feed individual slices to the decoder; right now both FFmpeg and Texas Instrument's decoders assume they will receive whole access-units.

> Then fix your decoder.  It's about 20 lines of code.

Or we define the format that most decoders out there are already expecting.

Comment 17 David Schleef 2010-01-16 20:00:24 UTC

ffdec_h264 most definitely handles non-parsed bytestream h.264.  ewh264dec handles non-parsed bytestream h.264.  And looking beyond h.264, ffdec_mpeg2video, mpeg2dec, flumpeg2vdec, and ewmpeg2dec all handle unparsed MPEG-2.

And since these decoders all have internal parsers, you can send them individual slices instead of entire pictures.  Some even decode correctly if you send corrupted slices or skip some slices completely.

It is my opinion that you are designing API based on faulty assumptions.

Comment 18 Felipe Contreras (banned) 2010-01-17 11:04:38 UTC

(In reply to comment #17)
> ffdec_h264 most definitely handles non-parsed bytestream h.264.

That's because ffdec_h264 is wrongly adding a parser (only on byte-stream mode), see gst_ffmpegdec_open(). It is possible to generate a muxed clip that contains trimmed access-units, it would play fine on totem, but not ffplay (mplayer, vlc, and pretty much everything out there).

It is wrong to add an unnecessary parser because:
a) the demuxer already generates the right format
b) the deplayloader should generate the right format
c) if a parser is needed for whatever reason, it should be determined by the caps, and certainly not based on the stream-format

This is precisely the reason why rtph264dec+ffdec_h264 only works on bytestream mode:

/* FIXME, non-bytestream handling is freaking out ffmpeg. Apparently we need to
 * group all NAL units belonging to one frame together */

The proper solution would be:
 * remove the extra parser in ffdec_h264
 * add slice_mode=TRUE to the caps (or similar)
 * let rtph264dec buffer full access-units when such mode is on

Then both avc and byte-stream formats will work correctly.

> And since these decoders all have internal parsers, you can send them
> individual slices instead of entire pictures.  Some even decode correctly if
> you send corrupted slices or skip some slices completely.
> 
> It is my opinion that you are designing API based on faulty assumptions.

Note that when I say "decoder" I meant the actual decoder (FFmpeg's), not the "decoder wrapper" (gst-ffmpeg).

IMO you are making assumptions based on gst-ffmpeg, and not the real decoders.

Comment 19 David Schleef 2010-01-17 20:07:08 UTC

(In reply to comment #18)
> (In reply to comment #17)
> > ffdec_h264 most definitely handles non-parsed bytestream h.264.
> 
> That's because ffdec_h264 is wrongly adding a parser (only on byte-stream
> mode), see gst_ffmpegdec_open().

"wrongly"?  Not in the current way of doing things.

> It is possible to generate a muxed clip that
> contains trimmed access-units, it would play fine on totem, but not ffplay
> (mplayer, vlc, and pretty much everything out there).

I don't know what this means.  If this means "totem plays broken files better than ffplay", I'm ok with that.

> This is precisely the reason why rtph264dec+ffdec_h264 only works on bytestream
> mode:

Actually, I think it only works in bytestream mode because H.264 over RTP is only specified in bytestream mode.  Unless there's an RFC after 3984 that I don't know about.  It seems unlikely, though, since AVC format is fundamentally incapable of resynchronization after dropping packets.

> /* FIXME, non-bytestream handling is freaking out ffmpeg. Apparently we need to
>  * group all NAL units belonging to one frame together */

Of course.  That is how AVC format is defined.

Comment 20 Felipe Contreras (banned) 2010-01-18 00:30:02 UTC

(In reply to comment #19)
> (In reply to comment #18)
> > That's because ffdec_h264 is wrongly adding a parser (only on byte-stream
> > mode), see gst_ffmpegdec_open().
> 
> "wrongly"?  Not in the current way of doing things.

Yes, wrongly, because it should not be needed.

> > It is possible to generate a muxed clip that
> > contains trimmed access-units, it would play fine on totem, but not ffplay
> > (mplayer, vlc, and pretty much everything out there).
> 
> I don't know what this means.  If this means "totem plays broken files better
> than ffplay", I'm ok with that.

It means gst-ffmpeg is doing something nobody is doing. Therefore we shouldn't rely on that behavior.

> > This is precisely the reason why rtph264dec+ffdec_h264 only works on bytestream
> > mode:
> 
> Actually, I think it only works in bytestream mode because H.264 over RTP is
> only specified in bytestream mode.  Unless there's an RFC after 3984 that I
> don't know about.  It seems unlikely, though, since AVC format is fundamentally
> incapable of resynchronization after dropping packets.

RFC 3984 on section 1.1 clearly states that neither byte-stream nor avc are relevant:

---
Annex B of H.264 defines an encapsulation process to transmit such NAL units
over byte-stream oriented networks.  In the scope of this memo, Annex B is not
relevant.
---

This means it's up to the decoder/depayloader to agree on an interface: could be byte-stream, could be avc, could be something custom. It doesn't matter because the RTP stream is different anyway.

> > /* FIXME, non-bytestream handling is freaking out ffmpeg. Apparently we need to
> >  * group all NAL units belonging to one frame together */
> 
> Of course.  That is how AVC format is defined.

Right, and the decoders out there assume the same for byte-stream format.

Comment 21 Thiago Sousa Santos 2010-01-25 15:26:35 UTC

I guess we still have no decision on this.

Guess everyone agrees on 'avc' for avc sample/format.

The remaining problem is with byte-stream, that might be parsed in AU or NALU "formats".

I'll try to summarize here what I understood from the discussion:


* Option 1:
Use byte-stream-nalu and byte-stream-au, the problem here is what would mean having those formats with parsed=false.

This solution seems to be having the following combinations as 'valid':
byte-stream, parsed=false
byte-stream-nalu/au, parsed=true.


* Option 2:
Use format=byte-stream and allow parsed to be 'none', 'au', 'nalu'. This works as well, but it breaks our usual parsed=false/true logic.


* Option 3 (hybrid):
Use format=byte-stream, keep parsed=true/false but add a new alignment=au/nalu field.


I particularly don't like option 2 because parsed=true/false is already used for other formats and is kind of convention. But I have no strong opinion about 1 or 3, though I prefer 3.

Comment 22 Wim Taymans 2010-01-25 15:40:58 UTC

Regardless of what is currently done where and if it's right or wrong is rather irrelevant. What we want here is to describe the various ways H264 streams can be formatted and put into buffers.

At the lowest level there are NAL units which may be separated from eachother with:

- start codes (annex B). start codec between NAL units must be found to know
the length of the NAL unit.
- length prefix (AVC). each NAL unit is prefixed with the length. The size of
this length prefix is transmitted in an 'AVC Decoder Configuration
Record' that we pass along in codec_data in the caps.

Currently we implicitely assume AVC mode when we see a codec_data in the caps. AVC mode is easier to parse as you don't need to look for start codes. The codec_data is however more difficult to handle.

Decoders may or may not work at the NAL unit level. Usually decoders work at the AU (Access Unit) level, which is a collection of NALs that make up a complete frame. If the decoder doesn't handle NALs directly, a parser needs to be inserted to group NALs into AUs for the decoder.

The typical cases that we have to deal with are these:

- reading bytesteam H264 from a file, no NAL or AU grouping is done. NALs are
separated by startcodes
- mpegtsdemux/mpegpsdemux: bytestream format, essentially like a raw file no
AU or NAL grouping is done.
- qtdemux: AVC format, NALs are grouped into AU

IMO we need at least 6 reasonable possibilities:

- avc-raw: ungrouped length prefixed NALs (not a good idea as resync after a
lost packet is not possible).
- avc-nal: a GstBuffer contains one or more complete length prefixed NALs (a
depayloader could natuarally do this)
- avc-au: a GstBuffer contains one complete AU (qtdemux, ...)
- bytestream-raw: a buffer contains ungrouped startcode separated (incomplete)
NALs (like from a filesrc, mpegtsdemux, ...)
- bytestream-nal: startcoded prefixed complete NALs (a depayloader could
naturally do this)
- bytestream-au: one startcode prefixed complete AU

So what about:

format = (string enum) "byte-stream", "avc"
alignment = (string enum) "none", "nal", "au"

There is no point to have a 'parsed' option (if needed for backward compat, we can set this to TRUE). Tim tells me that we currently don't use parsed for h264, even.

Comment 23 Felipe Contreras (banned) 2010-01-25 15:44:23 UTC

(In reply to comment #21)
> * Option 3 (hybrid):
> Use format=byte-stream, keep parsed=true/false but add a new alignment=au/nalu
> field.

I like this option because as David agreed, 'parsed' is orthogonal.

However, I'm not sure about 'alignment'; decoders obviously would want to receive NALUs, so alignment=nalu is kind of redundant and can be easily changed to au-aligned=false. But I'm not sure about the 'au-aligned' name, few people would understand what au (or even access-unit means), perhaps 'frame-aligned'. Anyway I think it should be a boolean.

Comment 24 Felipe Contreras (banned) 2010-01-25 15:50:41 UTC

(In reply to comment #22)
> IMO we need at least 6 reasonable possibilities:
> 
>  - avc-raw: ungrouped length prefixed NALs (not a good idea as resync after a
>              lost packet is not possible).
>  - avc-nal: a GstBuffer contains one or more complete length prefixed NALs (a
>              depayloader could natuarally do this)

If it's not grouped into AU's, then it's not avc. AFAIU the spec is very clear about this. So only avc-au makes sense.

>  - avc-au:  a GstBuffer contains one complete AU (qtdemux, ...)

Comment 25 Wim Taymans 2010-01-25 16:10:57 UTC

(In reply to comment #23)
> (In reply to comment #21)
> > * Option 3 (hybrid):
> > Use format=byte-stream, keep parsed=true/false but add a new alignment=au/nalu
> > field.
> 
> I like this option because as David agreed, 'parsed' is orthogonal.
> 
> However, I'm not sure about 'alignment'; decoders obviously would want to
> receive NALUs, so alignment=nalu is kind of redundant and can be easily changed
> to au-aligned=false. But I'm not sure about the 'au-aligned' name, few people
> would understand what au (or even access-unit means), perhaps 'frame-aligned'.
> Anyway I think it should be a boolean.

I would like to keep the possibility to pass buffers around that are NALU but not AU aligned. Mainly because that is what a depayloader will output (so we need to be able to express that).

Comment 26 Felipe Contreras (banned) 2010-01-25 16:29:02 UTC

(In reply to comment #25)
> I would like to keep the possibility to pass buffers around that are NALU but
> not AU aligned.

That would be: parsed=true, frame-aligned=false.

> Mainly because that is what a depayloader will output (so we
> need to be able to express that).

Currently yes, but in Maemo we are using a patch from Mark to aggregate access-units; the depayloader could support both.

Comment 27 Tim-Philipp Müller 2010-01-25 19:08:57 UTC

I've been trying to make sense of this discussion, but a lot of it seems to be a bit off track, bogged down in specific details that don't really seem particularly decisive to the discussion in terms of what we should be able to express with our caps, and how to express that.

I'm quite in favour of wtay's suggestion of 

 format = (string enum) "byte-stream", "avc"
 alignment = (string enum) "none", "nal", "au"

Not all combinations may make sense - that's fine too IMHO.

I much prefer strings to booleans, and I especially dislike a parsed=true/false property. The above string enums are just so much more expressive and comprehensible than misc. booleans, and also easier to expand. I don't think it makes any difference in terms of performance or memory usage.

I also think it makes sense to be able to output and express NALs, since they are an integral part of the spec, and there may be disadvantages in aggregating them, or situations where that is not desirable.

Comment 28 Thiago Sousa Santos 2010-01-25 20:21:07 UTC

I also like Wim's idea.

Comment 29 Thiago Sousa Santos 2010-01-25 20:23:57 UTC

(In reply to comment #28)
> I also like Wim's idea.

Let me elaborate on that:

I find that it is like option 3 that was my current option, but removes the parsed attribute in favor of the 'none' alignment option, making it simpler. And also adds the alignment options for avc, that sort of makes sense too.

I agree with Tim that even though some combinations are not defined on specs, they make sense and could exist and help us in some situations.

Comment 30 Mark Nauwelaerts 2010-01-25 20:36:17 UTC

FWIW, ++ for latest suggestion and elaborations given above.

Comment 31 Thiago Sousa Santos 2010-01-28 17:49:10 UTC

Created attachment 152504 [details] [review]
Adds new fields to x264enc

Adds the new fields to x264enc.

Comment 32 Thiago Sousa Santos 2010-01-28 17:49:42 UTC

Created attachment 152505 [details] [review]
adds new fields to the rtp (de)payloader

Comment 33 Thiago Sousa Santos 2010-01-28 17:54:42 UTC

Now there is h264parse, IIRC it outputs NAL aligned buffers for either format, but I'd like confirmation on this.

And muxers:

qtmux needs avc, I'm not sure about the alignment.
matroskamux - ?
avimux - ?


Any other elements?

Comment 34 Wim Taymans 2010-01-28 18:07:20 UTC

qtmux needs AVC aligned at AU boundaries
I would guess both matroska and avi require something aligned to AU boundaries.

Comment 35 Mark Nauwelaerts 2010-01-28 18:15:41 UTC

h264parse outputs NAL aligned or AU aligned depending on the 'access-unit'
property.  It might/should be considered to remove the 'output-format' and
'access-unit' property in favour of caps negotiation (not yet released IIRC),
and possibly also 'split-packetized' (later on).

Comment 36 Mark Nauwelaerts 2010-01-28 18:21:54 UTC

Review of attachment 152505 [details] [review]:

The depayloader outputs 'au' or 'nal' aligned depending on the access-unit property
(as opposed to always 'nal').

If rtph264depay has not yet been released with the access-unit property, it might be removed in favour of caps-negotiation
(but that could/should perhaps be a separate patch).

Comment 37 Thiago Sousa Santos 2010-01-28 18:58:59 UTC

Created attachment 152509 [details] [review]
qtmux: Restrict h264 caps

Be more restrictive adding the new h264 caps fields,
qtmux only accepts AVC, AU aligned data

Fixes #606662

Comment 38 Thiago Sousa Santos 2010-01-28 19:09:49 UTC

(In reply to comment #36)
> Review of attachment 152505 [details] [review]:
> 
> The depayloader outputs 'au' or 'nal' aligned depending on the access-unit
> property
> (as opposed to always 'nal').
> 
> If rtph264depay has not yet been released with the access-unit property, it
> might be removed in favour of caps-negotiation
> (but that could/should perhaps be a separate patch).

No, it hasn't, just checked.

Comment 39 Thiago Sousa Santos 2010-01-28 19:21:35 UTC

Created attachment 152510 [details] [review]
rtph264pay, rtph264depay: add new stream-format fields to caps (updated)

Updated h264 patch, writing a new one to remove the property.

Comment 40 Stefan Sauer (gstreamer, gtkdoc dev) 2010-01-28 20:48:48 UTC

AVI does not really know about H264, I belive it wants a NALU bitstream according to annex B

Comment 41 Thiago Sousa Santos 2010-01-29 12:51:51 UTC

Created attachment 152563 [details] [review]
qtdemux patch

Comment 42 Thiago Sousa Santos 2010-01-29 12:52:24 UTC

Created attachment 152564 [details] [review]
rtph264depay: replace property with caps negotiation

Comment 43 Thiago Sousa Santos 2010-02-01 12:18:25 UTC

Created attachment 152721 [details] [review]
matroskamux: Only accept avc, au aligned h264 data

Adds new fields to templatecaps to only accept h264
if it is in avc format, and with au aligned frames

Fixes #606662

Comment 44 Thiago Sousa Santos 2010-02-01 12:18:51 UTC

Created attachment 152722 [details] [review]
matroskademux: Adds new h264 fields to srcpad caps

matroska should contain h264 in AVC format, with AU aligned data. We
should set that it in the output pad.

Fixes #606662

Comment 45 Thiago Sousa Santos 2010-02-22 23:00:52 UTC

Created attachment 154454 [details] [review]
x264enc: Replace 'avc-sample' with 'avc' in caps

Patch that should be applied before the release

Comment 46 Thiago Sousa Santos 2010-02-22 23:02:18 UTC

Created attachment 154455 [details] [review]
h264parse: Replace 'avc-sample' with 'avc' in caps

Patch to be applied before the release

Comment 47 Thiago Sousa Santos 2010-02-22 23:02:59 UTC

Created attachment 154456 [details] [review]
qtmux: Rename 'avc-sample' to 'avc' in caps

Another one to be applied before the release

Comment 48 Thiago Sousa Santos 2010-02-22 23:05:59 UTC

These 3 patches should be applied before the release so that we don't get stuck with 'avc-sample' when we agreed that 'avc' would be better.

We might as well revert the patches that added the caps for this release, too.


After the releases we can starting pushing the real patches.

Comment 49 Sebastian Dröge (slomo) 2010-02-25 17:06:16 UTC

Is this really a blocker bug for 0.10.14?

Comment 50 Tim-Philipp Müller 2010-02-25 17:21:11 UTC

> Is this really a blocker bug for 0.10.14?

Only the last three patches I think, ie. the

   stream-format=(string)avc-sample   =>   stream-format=(string)avc

changes.

Comment 51 Tim-Philipp Müller 2010-03-01 20:07:41 UTC

Comment on attachment 154454 [details] [review]
x264enc: Replace 'avc-sample' with 'avc' in caps

Please commit, but note that tests/check/elements/x264enc.c needs fixing as well.

Comment 52 Thiago Sousa Santos 2010-03-01 20:57:34 UTC

Comment on attachment 154454 [details] [review]
x264enc: Replace 'avc-sample' with 'avc' in caps

Committed, also updating the check tests.

Comment 53 Mark Nauwelaerts 2010-06-22 09:46:01 UTC

Added alignment attribute to x264enc:

commit 49f1373f1b00242badb61f871caffd4e9c397610
Author: Mark Nauwelaerts <mark.nauwelaerts@collabora.co.uk>
Date:   Tue Jun 22 11:41:42 2010 +0200

    x264enc: add new h264 caps attribute alignment
    
    See #606662.

Comment 54 Mark Nauwelaerts 2010-12-17 14:41:16 UTC

commit 4c368242c0a2105bd90c8945680ae23ff31d31fd
Author: Mark Nauwelaerts <mark.nauwelaerts@collabora.co.uk>
Date:   Fri Dec 17 15:38:15 2010 +0100

    rtph264depay: determine output h264 layout using caps negotiation
    
    ... thereby (partially) deprecating properties currently controlling whether
    or not byte-stream output or NAL/AU alignment (though properties still determine
    fallback if nothing specified in caps).
    
    Fixes #606662.

Comment 55 David Schleef 2011-01-02 23:59:01 UTC

Status:

  qtmux: ok
  x264enc: ok
  mpegtsmux: broken
  mpegtsdemux: broken
  h264parse: ok
  typefind: FIXED
  rtph264depay: ok
  rtph264pay: not in sink pad template
  matroskamux: not in sink pad template
  matroskademux: ok
  avidemux: broken
  avimux: not in sink pad template
  flvmux: not in sink pad template
  flvdemux: broken

These are all pretty trivial fixes that some bored person might want to handle before the upcoming freeze.


commit e3451e5ab8baa9eecc282f22a1385f03967a8c5f
Author: David Schleef <ds@schleef.org>
Date:   Sun Jan 2 15:48:47 2011 -0800

    typefind: Add stream-format to h264 caps

Comment 56 Tim-Philipp Müller 2011-04-13 15:39:31 UTC

Comment on attachment 152509 [details] [review]
qtmux: Restrict h264 caps

Pushed:

commit c385a464387a5cd623f67016f73a7389d2016c7a
Author: Thiago Santos <thiago.sousa.santos@collabora.co.uk>
Date:   Wed Apr 13 16:37:07 2011 +0100

    qtmux: restrict h264 some more to only accept AU-aligned AVC
    
    https://bugzilla.gnome.org/show_bug.cgi?id=606662

Wonder if we should refuse input caps as well if there is no stream-format or codec-data field at all?

Comment 57 Tim-Philipp Müller 2011-04-13 15:50:13 UTC

Comment on attachment 152722 [details] [review]
matroskademux: Adds new h264 fields to srcpad caps

Committed the else-assume-bytestream part, the first bit was already present. (The patch wasn't entirely right btw, the alignment=au bit should only be set in the if(codec_data) block, otherwise we might end up with stream-format=bytes-stream,alignment=au.

Comment 58 David Schleef 2011-04-24 23:44:24 UTC

Created attachment 186568 [details] [review]
0001-mpegtsdemux-tsdemux-Add-byte-stream-to-h264-caps.patch

 mpegtsdemux,tsdemux: Add byte-stream to h264 caps

Comment 59 David Schleef 2011-04-24 23:46:52 UTC

Created attachment 186569 [details] [review]
0001-matroskamux-Add-stream-format-to-h264-caps.patch

 matroskamux: Add stream-format to h264 caps

Comment 60 David Schleef 2011-04-25 00:07:54 UTC

Created attachment 186570 [details] [review]
0001-avimux-Add-stream-format-to-h264-caps.patch

 avimux: Add stream-format to h264 caps

Comment 61 David Schleef 2011-04-25 00:09:50 UTC

Status:

  qtmux: ok
  x264enc: ok
  mpegtsmux: ok
  mpegtsdemux: PATCH
  tsdemux: PATCH
  h264parse: ok
  typefind: ok
  rtph264depay: ok
  rtph264pay: not in sink pad template
  matroskamux: PATCH
  matroskademux: ok
  avidemux: broken
  avimux: PATCH
  flvmux: ok.  (Only has stream-format=avd, however)
  flvdemux: ok.  (Only has stream-format=avc, however)

Getting close...

Comment 62 Tim-Philipp Müller 2012-02-08 20:27:40 UTC

Comment on attachment 186568 [details] [review]
0001-mpegtsdemux-tsdemux-Add-byte-stream-to-h264-caps.patch

commit 40f3b4a6512a65e13ec44667fcc354d2eacaa4de
Author: David Schleef <ds@schleef.org>
Date:   Sun Apr 24 16:42:03 2011 -0700

    mpegtsdemux,tsdemux: Add byte-stream to h264 caps
    
    Fixes #606662.

Comment 63 Tim-Philipp Müller 2012-02-10 14:16:22 UTC

Let's close this bug:


commit 5b25f3737b488c3ddd7f5c2d4f9bd68a46e44da7
Author: Tim-Philipp Müller <tim.muller@collabora.co.uk>
Date:   Wed Feb 8 23:03:28 2012 +0000

    rtph264pay: add stream-format and alignment to h264 sink caps
    
    We're happy to accept both byte-stream and avc, advertise
    that on the sink caps and fix up _get_caps() function to
    not just return "video/x-h264".
    
    https://bugzilla.gnome.org/show_bug.cgi?id=606662

commit 6872b408730fd4eaacbb26fca79ebfb3f204e761
Author: Tim-Philipp Müller <tim.muller@collabora.co.uk>
Date:   Wed Feb 8 20:58:04 2012 +0000

    rtph264depay: add stream-format and alignment fields to src template caps
    
    Because we can. And so we get a warning if we try to output avc with
    nal alignment or somesuch.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=606662

commit dca42d4767adff3578e5d5990604766735ec1f9b
Author: Tim-Philipp Müller <tim.muller@collabora.co.uk>
Date:   Fri Feb 10 13:44:43 2012 +0000

    tests: clean up rtp-payloading test a little
    
    Feed data into the pipeline using appsrc instead of fdsrc and
    a pipe. Store unsigned byte values in guint8 instead of char.
    Getting rid of the capsfilter also helps to avoid 'format is
    not fully specified' warnings when pushing "video/x-h264" data
    into rtph264pay with fully specified h264 caps in the sink template.

Comment 64 Tim-Philipp Müller 2012-02-10 14:18:13 UTC

Comment on attachment 152510 [details] [review]
rtph264pay, rtph264depay: add new stream-format fields to caps (updated)

Tried to apply this, but as it turns out, most of the functionality was already in place in different form, so the patch is basically obsolete.

Some template caps fixes were still needed, but those I chose to do slightly differently, since stream-format=avc,alignment=nal is not a valid option.

Also needed to fix up the getcaps function a bit.