GNOME Bugzilla – Bug 163577
[RFC] Interlaced/progressive media support in GStreamer.
Last modified: 2009-02-19 15:14:53 UTC
[RFC] Expression of interlaced/progressive media in GStreamer.
================================================================

This is a write-up on how interlaced video can be handled in GStreamer.
Without the intent to be complete or final, I'd like this document to be a
starting point of a discussion and refinement process, with the ultimate goal
of introducing a sufficient notation/handling of interlaced video in
GStreamer. Where unsure about naming/something else, parts are in {} braces.

Naming:
-------
Frame {Picture}:   Top Field and Bottom Field (interleaved or sequential),
                   plus the information which of these fields is meant to be
                   displayed first. A Frame may also be one progressive image
                   representing image data from only one point in time.

Field:             Top Field or Bottom Field.

Top Field:         Sequence of interlaced scanlines containing the first
                   scanline (e.g. 0, 2, 4, 6, ...).

Bottom Field:      Sequence of interlaced scanlines containing the last
                   scanline (e.g. 1, 3, 5, 7, ...).

Sequential Layout: Layout of two fields in a frame such that all scanlines
                   of the first field in time appear before all the
                   scanlines of the second field. Just by knowing a frame
                   contains a sequential field order we still miss the
                   information whether the bottom or the top field comes
                   first (e.g. 1, 3, 5, ... 0, 2, 4, ...
                   or 0, 2, 4, ... 1, 3, 5, ...).

Interleaved Layout: Two fields interleaved (e.g. 1, 2, 3, 4, ...). In this
                   case it remains to be determined which field comes first
                   in time.

Top Field First (TFF): W.r.t. time in the case of interleaved order (the
                   Top Field has a timestamp lower than the Bottom Field).
                   W.r.t. spatial order in the case of sequential order of
                   fields. (Read: the Top Field is the first in that frame.)

Bottom Field First (BFF): W.r.t. time in the case of interleaved order (the
                   Bottom Field has a timestamp lower than the Top Field).
                   W.r.t. spatial order in the case of sequential order of
                   fields. (Read: the Bottom Field is the first in that
                   frame.)

Use cases:
----------
Deinterlacing:
  Displaying interlaced video on a progressive output device requires the
  video to be deinterlaced. There are various methods for deinterlacing a
  stream of fields. The simplest are weave and bob. The first combines two
  fields in one frame and thus halves the framerate; in addition, this
  method generates visible comb artefacts. The latter method interpolates
  the missing scanlines.

  In GStreamer all video is treated as progressive. Either interlaced video
  has been weaved before/during encoding {or is weaved by source plugins,
  v4lsrc?}.

Mixing:
  Mixing of interlaced video with different properties (TFF vs. BFF) is
  currently done on the basis of weaved fields. Each frame represents two
  different timestamps, so information from four different points in time
  may end up in one result of the mixing process. Mixing should just
  combine two time bases.

Implementation in GStreamer:
----------------------------
Assumptions:
* Currently every video element can handle interlaced video, if it comes in
  frames and the fields are interleaved.
* Sequential fields are not supported by now.

Requirements:
* An element which can process progressive video should not bail out on
  interlaced video served as interleaved frames. (e.g. ffmpegcolorspace is
  fine to accept anything w.r.t. interlaced video)
* If an element emits frames with sequential order of fields, any plugin
  connected to its source pads must signal somehow that it can process
  sequential field order. Otherwise the source must abort negotiation and
  return GST_PAD_LINK_FAILED.
* If an element emits single fields, any plugin connected to its source
  pads must signal that it can process single fields. Otherwise the source
  must abort negotiation and return GST_PAD_LINK_FAILED.
* Elements arbitrarily emitting frames or fields are evil.
  { They should either require the same capabilities as field-only emitting
    elements, or drop incidentally occurring single fields (mpegdemux:
    NTSC 3:2 pulldown media might contain incidental single fields in case
    of an encoding error).
  * Elements accepting single fields must also accept frames. }
* Field order and TFF/BFF are not expected to change during a stream. (If
  they should change, caps must be renegotiated; e.g. mpeg allows this to
  change)
* Supporting single fields per buffer requires the buffer to hold the
  information whether the contained field is a Top or Bottom field.
  {do we want to do that}

New properties for mime type video/*:
- (bool)   interlaced:      TRUE, FALSE
- (bool)   top field first: TRUE, FALSE
- (string) field order:     SEQUENTIAL, INTERLEAVED
{- (bool)  fields only:     TRUE, FALSE} // whether one field per buffer is
                                         // supported

With "interlaced=FALSE" the two latter properties have no meaning { are not
defined }. Where nothing else is stated, "interlaced=FALSE" is assumed.

Progressive sinks:
  Progressive sinks will have "interlaced=[TRUE, FALSE]" in their caps. An
  upstream deinterlacer will happily do its job and provide
  "interlaced=FALSE".

Interlaced sinks:
  Interlaced sinks will ask for "interlaced=TRUE" in their caps. Any
  upstream deinterlacer can just "weave" or pass through interlaced media.
  { (Interlaced sinks could require one field per buffer) }

Deinterlacer:
  The deinterlace plugin accepts interlaced and progressive video. In the
  case of progressive video, deinterlace will pass through all buffers. In
  the case of interlaced video, the deinterlacer might double the framerate
  according to the used deinterlace method and settings. Output of the
  deinterlacer will be "interlaced=FALSE" in the case of any method but
  "weave".

Open questions:
---------------
* Do we want to support fields on a per-buffer basis?
* Anyone seen any interlaced videosinks?
* Specific caps nego examples are missing, e.g. for
  "mpeg2dec ! ffmpegcolorspace ! deinterlace2 ! xvimagesink",
  "v4l2src ! deinterlace ! v4l2sink",
  "multifilesrc ! video/x-raw-yuv, format:fourcc=YUY2, interlaced:true, ...
   ! pngenc"

Thanks for reading all this!
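To make the Naming definitions above concrete, here is a small self-contained sketch that splits an interleaved frame into its Top Field (even scanlines) and Bottom Field (odd scanlines). The helper is purely illustrative, not GStreamer API; the buffer is simplified to a bare byte array for one 8-bit plane.

```c
#include <stddef.h>
#include <string.h>
#include <stdint.h>

/* Split an interleaved frame (height lines of 'width' bytes) into its two
 * fields. Per the definitions above, the Top Field holds scanlines
 * 0, 2, 4, ... and the Bottom Field holds scanlines 1, 3, 5, ... */
static void
split_fields (const uint8_t *frame, uint8_t *top, uint8_t *bottom,
    size_t width, size_t height)
{
  for (size_t line = 0; line < height; line++) {
    uint8_t *dst = (line % 2 == 0) ? top : bottom;
    memcpy (dst + (line / 2) * width, frame + line * width, width);
  }
}
```

Each field ends up with height/2 lines of the original width, matching the field buffers discussed later in this bug.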
Adjusting status accordingly; some (not so well informed, I guess) comments will follow shortly.
Some comments:

- Progressive sinks don't really have to support interlaced=[TRUE, FALSE]. If data comes in sequential layout, a progressive sink advertising support for interlaced video is doing the wrong thing and will display garbled data. Whether it supports interlaced=TRUE depends strictly on field_layout.
- top_field_first and field_order are named and typed confusingly; they should rather be (enum)field-layout = {SEQUENTIAL, INTERLACED} and (enum)field-order = {TOP_FIRST, BOTTOM_FIRST}. Also fields_only seems odd; wouldn't it be better as (int)buffer-fields, similar to buffer-frames used in float audio?
- I'm not sure, but incidental out-of-order field buffers may be useful for plugins doing inverse telecine and similar things; dropping them would mean loss of data which could be usefully consumed. So evil, but we may want to support it. I can see two ways to do it: a custom GstBufferFlag denoting whether it is a top or bottom field, which an element would have to check before processing the frame, or a custom GstEvent, sent before such an out-of-order field buffer to prepare the element for it. GstEvent is somewhat heavyweight, but may be appropriate if we consider it an exceptional occurrence. A flag requires a conscious check before doing anything with a field, but OTOH elements doing field-order-sensitive processing would need to keep the currently processed field somewhere anyway, so it may incur no additional work in the end. The flag sounds good after all, and shouldn't cause any harm when unused.
- What about timestamps? AFAICU, there may be two situations: one where source frames are full and only artificially split into fields before encoding (like an anime DVD, where the source is full 24fps video mastered into PAL/NTSC field data), and one where we have truly interlaced data with fields not summing up to frames (a real TV feed, for example). Should those be timestamped differently (i.e. even-odd have equal timestamps vs. even-odd have different timestamps)? That might have influence on the assumption that a progressive element is always able to process interleaved fielded data (not to mention that timestamping such data becomes difficult in the first place).
- Example negotiations would indeed be useful to check there aren't any problems with this solution. I hope it's on the TODO to deliver before finishing this RFC :)

Otherwise, with my limited knowledge of the topic at hand, this proposal seems good and rather easy to understand, except that it's yet another thing to audit all video elements against to check that they behave... Oh well, life sucks anyway.
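The two timestamping schemes distinguished above can be sketched as follows. This is a self-contained illustration with made-up function names, not GStreamer API; timestamps are in nanoseconds, like GstClockTime.

```c
#include <stdint.h>

/* Nanoseconds, mimicking GstClockTime. */
typedef uint64_t clock_time_t;

/* Duration of a single field at fps_n/fps_d frames per second:
 * half a frame duration. */
static clock_time_t
field_duration (int fps_n, int fps_d)
{
  return (clock_time_t) 1000000000ULL * fps_d / (2 * fps_n);
}

/* Scheme 1: artificially split progressive frames -- both fields carry
 * the originating frame's timestamp. */
static clock_time_t
field_ts_progressive (clock_time_t frame_ts, int field_index)
{
  (void) field_index;
  return frame_ts;
}

/* Scheme 2: truly interlaced feed -- the second field (field_index 1) is
 * one field duration later than the first. */
static clock_time_t
field_ts_interlaced (clock_time_t frame_ts, int field_index,
    int fps_n, int fps_d)
{
  return frame_ts + (clock_time_t) field_index * field_duration (fps_n, fps_d);
}
```

At 25 fps the field duration is 20 ms, so the two schemes place the second field of a frame 20 ms apart.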
Single fields need to be dropped because they are incompatible. We can opt to introduce this concept in 0.9. Pretty much all formats assume one buffer to be one frame. Interlace, -mode and field-order sound fine and are compatible, I think. You asked for an interlaced videosink, one such is v4lmjpegsink. Like Maciej says, sequential/interleaved don't have to be strings, enums is fine. Or even ints with macros.
Adding fields to a caps description and making them optional is bad news. It doesn't work. It just leads to problems that are literally impossible to solve. So any changes of this nature will have to wait for 0.9, or use a different media type in the caps.
On mathrick's comment: Thanks for your comments, Maciej.

- IMO progressive sinks should advertise "interlaced=FALSE" - this would be really progressive - and "interlaced=TRUE, field-layout=INTERLEAVED". Progressive sinks should do this so they can handle nothing less than before. In many cases elements will emit "interlaced=TRUE, field-layout=INTERLEAVED". We still want applications to work, even if they know nothing about interlaced/progressive source material.
- Yep, my naming sucks. Yours is truly less confusing. But it should be s/INTERLEAVED/INTERLACED/.
- Yep, I agree on the quality issue. What we just cannot do is change a plugin in a way that it emits single fields to elements which don't know anything about it - and that all the way downstream. Consider the following pipeline: A -> B -> C -> D. A can emit incidental fields. D doesn't know about buffer-fields. C does know, but just forwards them after processing. It must be known to A that some element downstream cannot handle buffer-fields. Is this possible with caps? Yes, but a decent caps/negotiation document is missing.
- Usually such artificially interlaced media is marked with a progressive flag. mpeg2dec, for example, will know about that if the media was encoded correctly. Thus it'd be mpeg2dec's job to negotiate on "interlaced=FALSE". However, that flag can be missing. Smart deinterlacers (adaptive methods) are still able to identify which fields belong together.
Ping - no one has really thought about this for 0.9. It is unlikely that this gets in, but you do have 5 days, so if there is a concrete proposal you might get it in. See bug #319388 for the schedule.
1. Adding interlaced video support to GStreamer would mean that all image/video handling elements need to be reviewed and their caps adjusted. Then e.g. a deinterlacer works reliably in all pipelines. Basically

   interlaced:   { TRUE, FALSE }
   field-layout: enum { INTERLEAVED, SEQUENTIAL }
   field-order:  enum { TOP_FIELD_FIRST, BOTTOM_FIELD_FIRST }

need to be added to the caps. If that's not possible to do at once for all elements, the deinterlacer will work in some cases; otherwise it would have to act on default settings, user settings, or just pass through.

Reviewed elements should have the following behaviour:
a) Upstream peer does not have an interlaced caps entry. Assume interlaced=FALSE and do not publish this assumption downstream (since we do not know if it is really progressive).
b) Element cares about interlacing. Negotiate the interlace settings and publish the result downstream.
c) Element doesn't care. Publish (copy) the interlace settings to its src pad(s).

2. In addition, we need to think about incidental fields. In mpeg, for example, it is possible that a single bottom/top field has been muxed into the stream. The problem is that the mpegdemuxer can't just create a buffer and send that single field downstream. How is xvimagesink supposed to handle a single field? It can't (in a sane way). In any case, xvimagesink needs to know that this buffer holds only one field. I agree with Maciej that a custom GstBufferFlag would be the best way to solve this. But even then, _all_ video/* handling elements _must_ check for this flag, and at least drop the single field if they can't handle it.

I can't think of pipelines that will fail if the changes under 1. are applied incrementally. I'm not sure if additions to caps will violate the freeze; if they do, it won't be doable in 5 days for 0.9. The changes under 2. will require at least the custom GstBufferFlag in the core. It won't do harm just sitting there. But with the first element emitting single fields, all elements must be made aware of it.

I hope this is a somewhat more concise summary of what needs to be done to get proper interlacing support into GStreamer.
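The three per-element negotiation behaviours (a/b/c above) could be sketched like this. This is a self-contained simulation with made-up types; real code would of course operate on GstCaps.

```c
#include <stdbool.h>

/* Simplified stand-in for the interlacing part of a caps structure
 * (hypothetical type for illustration only). */
typedef struct {
  bool has_interlaced;  /* is the 'interlaced' entry present at all? */
  bool interlaced;      /* its value, if present */
} interlace_caps;

typedef enum {
  ELEMENT_CARES,       /* rule b: negotiates the setting itself */
  ELEMENT_PASSTHROUGH  /* rule c: copies the setting downstream */
} element_kind;

/* Compute what an element publishes on its src pad, given what its sink
 * pad negotiated. */
static interlace_caps
publish_downstream (element_kind kind, interlace_caps sink,
    bool negotiated_interlaced)
{
  interlace_caps src = { false, false };

  if (!sink.has_interlaced) {
    /* Rule a: assume progressive internally, but publish nothing --
     * we don't actually know the source is progressive. */
    return src;
  }
  if (kind == ELEMENT_CARES) {
    /* Rule b: publish the result of our own negotiation. */
    src.has_interlaced = true;
    src.interlaced = negotiated_interlaced;
  } else {
    /* Rule c: copy the sink settings verbatim. */
    src = sink;
  }
  return src;
}
```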
Been thinking of adding 2 new custom flags to a GstBuffer, which get a definition based on the caps of the buffer. For video buffers the meaning would be:

  GST_BUFFER_FLAG_CUSTOM1 = GST_VIDEO_BUFFER_TFF
  GST_BUFFER_FLAG_CUSTOM2 = GST_VIDEO_BUFFER_BFF

The fact that a buffer is interlaced is therefore detectable because one of the flags is set. Otherwise the buffer contains a single frame/field. This does not need any caps changes, and elements can be incrementally updated to handle the interlacing flags.

Putting multiple video frames in one buffer could require a new caps property. The flags can still be used, though. Alternatively, the buffer size and the field size (as calculated from the description in the caps) can indicate that the buffer contains multiple samples (like we do now with audio). Alternatively, yet another new buffer flag could be added to denote the fact of 2 fields in this buffer.

Sounds a lot simpler than having to deal with lots of caps that can change for each buffer (try writing a transform_caps function for ffmpegcolorspace with all these properties :).
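A minimal sketch of the flag-aliasing idea above. The names mirror the proposal, but everything is defined locally for illustration; this is not the actual GStreamer API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Generic custom buffer flags, given a media-specific meaning by the caps
 * (local stand-ins for the proposed GstBuffer custom flags). */
#define BUFFER_FLAG_CUSTOM1 (1u << 0)
#define BUFFER_FLAG_CUSTOM2 (1u << 1)

/* Aliases for video buffers, per the proposal. */
#define VIDEO_BUFFER_TFF BUFFER_FLAG_CUSTOM1
#define VIDEO_BUFFER_BFF BUFFER_FLAG_CUSTOM2

/* A buffer is interlaced iff one of the field-order flags is set;
 * otherwise it contains a single frame/field. */
static bool
buffer_is_interlaced (uint32_t flags)
{
  return (flags & (VIDEO_BUFFER_TFF | VIDEO_BUFFER_BFF)) != 0;
}

static bool
buffer_is_tff (uint32_t flags)
{
  return (flags & VIDEO_BUFFER_TFF) != 0;
}
```

Since unset flags simply mean "not interlaced", untouched elements keep working, which is what makes the incremental update path possible.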
I've been thinking about this recently, too. I like the concept of adding buffer flags. It's clean, and expresses a per-buffer quantity in ways that caps cannot (duh). In particular, it expresses the MPEG2 TFF and BFF flags well, and duration handles NUM_FIELDS.

I don't think it's wise to attempt to put more than 2 fields into a buffer. I think it's best to just do what MPEG2 does: have two fields in the buffer, and say what duration to display the fields, alternating between them if necessary. And if the duration of the buffer is only the duration of one field, then the buffer would contain an extra unused field.

However, I've nearly convinced myself that video comprised only of fields should have a new caps type, e.g. video/x-field-yuv. Features would be:
- Buffers are individual fields with durations of 1/(2*framerate)
- Buffers must have the TFF or BFF flag set.
- Buffers would have (height/2) lines of size (width)
- Conversion to/from video/x-raw-yuv is done by interleaving/deinterleaving scanlines of two adjacent buffers

In any case, I foresee that we'll put a "video2progressive" filter in front of xvimagesink, which will detect interlaced video and convert fields at (framerate) into frames at (framerate*2) (or leave progressive video unaltered), and have xvimagesink render them appropriately.
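The scanline-interleaving conversion mentioned above (two adjacent field buffers woven into one video/x-raw-yuv frame) might look roughly like this for a single 8-bit plane. This is a hedged sketch with a simplified bare-array buffer layout, not GStreamer code.

```c
#include <stddef.h>
#include <string.h>
#include <stdint.h>

/* Weave two field buffers (height/2 lines of 'width' bytes each) into one
 * full frame (height lines). 'tff' says whether 'field_a' is the top
 * field, i.e. owns the even scanlines. */
static void
weave_fields (uint8_t *frame, const uint8_t *field_a, const uint8_t *field_b,
    size_t width, size_t height, int tff)
{
  const uint8_t *top = tff ? field_a : field_b;
  const uint8_t *bottom = tff ? field_b : field_a;

  for (size_t line = 0; line < height / 2; line++) {
    /* even scanline from the top field... */
    memcpy (frame + (2 * line) * width, top + line * width, width);
    /* ...odd scanline from the bottom field */
    memcpy (frame + (2 * line + 1) * width, bottom + line * width, width);
  }
}
```

The TFF/BFF buffer flag would decide which of the two adjacent field buffers plays the role of 'field_a' here.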
Any news on this?
Someone needs to implement the flags and write a correct deinterlacer element using those flags.
Created attachment 112446 [details] [review] patch adding generic media specific flags to core
Created attachment 112447 [details] [review] patch aliasing generic flags with definitions for video streams
Created attachment 112452 [details] [review] patch defining video buffer flags as media specific flags
Created attachment 112478 [details] interlacing caps proposal
Attaching a document I wrote a long time ago to further the discussion. It has some ideas, but got waylaid somewhere between conception and distribution.
Thaytan, that could also go to docs/design/draft-interlace.txt?
Created attachment 112561 [details] [review] replacement base patch
This is the implementation proposal we came up with during the Google SoC Mentor Summit 2008. Present were: David Schleef, Michael Smith, Edward Hervey, Timothy Terriberry and Christian Schaller. This implementation proposal is based on the review of all previous documentation and the feedback of the people present.

Interlacing support proposal for 0.10

In order to cover all the cases we need to add the following:

* A boolean caps property for video/x-raw-* streams: 'interlaced'.
  If not present (as currently), this implies 'interlaced=False'.
  If True, the GstBuffers contain a frame with two interlaced fields.
  If False, the GstBuffers contain a progressive frame.

* 2 GstBufferFlags (which only make sense if the 'interlaced' field is present in the caps AND True).

  Top Field First ==> #define GST_VIDEO_BUFFER_TFF GST_BUFFER_FLAG_MEDIA1
  If present, the even lines make up the top field. If not, the odd lines make up the top field.

  Repeat First Field ==> #define GST_VIDEO_BUFFER_RFF GST_BUFFER_FLAG_MEDIA2
  If present, the first field (defined above) should be repeated (ending up in the frame being displayed for 3 * field_duration). If not present, this is the classic case where the frame will be displayed for 2 * field_duration.
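The display-duration rule for the RFF flag above can be illustrated with a small self-contained helper. The flag defines below are local stand-ins mirroring the proposed ones, not the actual GStreamer definitions.

```c
#include <stdint.h>

/* Local stand-ins for the proposed video buffer flags. */
#define VIDEO_BUFFER_TFF (1u << 0)  /* Top Field First */
#define VIDEO_BUFFER_RFF (1u << 1)  /* Repeat First Field */

/* Display duration of an interlaced frame: 3 field durations when the
 * first field is repeated, 2 otherwise (the classic case). */
static uint64_t
frame_display_duration (uint32_t flags, uint64_t field_duration)
{
  int fields = (flags & VIDEO_BUFFER_RFF) ? 3 : 2;
  return (uint64_t) fields * field_duration;
}
```

With a 20 ms field (50 fields/s), a frame is shown for 40 ms normally and 60 ms when RFF is set, which is how 3:2 pulldown material stretches film frames.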
Edward, you missed the part where only one field is valid in the frame, which is why we needed 3 flags.
Didn't we end up figuring out that it was in fact the duplicate field of a previous/future frame? In which case it should go with that other frame (and the correct flags set).
No. MPEG2 has "field pictures", which might be uncleverly combined in a sequence with frame pictures and RFF pictures.
Created attachment 124938 [details] [review] Patch for core to define 3 media-specific buffer flags
Created attachment 124939 [details] [review] Patch for base specifying usage of interlaced flags in caps and buffers
The two patches above are the implementation of the proposal from comments #19 and #20. I also added convenience functions to parse/create caps with the interlaced flags.
wim, david, Jan: Are these patches good for committing? Or do you wish to see one or two example implementations ((de)muxer? [en|de]coder? transform element?).
Works for me. It might be more consistent to change gst_video_format_parse_caps_interlaced() to just parse the interlaced flag, similar to par and framerate. That way, it can be used on non-raw-video streams, such as video/mpeg.
good point.
Created attachment 129063 [details] [review] updated patch against -base Adjusted to David's suggestions.
Created attachment 129064 [details] [review] updated patch against -base previous patch was of course wrong (no changes).
commit 37d95a00e4c5fda2cf5dee0f9748dd8eb96a47ca Author: Edward Hervey <bilboed@bilboed.com> Date: Thu Feb 19 16:04:43 2009 +0100 GstBufferFlags: Add 3 new media-specific buffer flags. Partially fixes #163577
commit c44b06781752ae6d002e87e083ae993912ed6d09 Author: Edward Hervey <bilboed@bilboed.com> Date: Mon Jan 26 10:30:53 2009 +0100 video: Add flags for interlaced video along with convenience methods for interlaced caps. These three flags allow all known combinations of interlaced formats. They should only be used when the caps contain 'interlaced=True'. Fixes #163577 (yes, it's a 4 year old bug).