GNOME Bugzilla – Bug 621553
video: new media type for high bit-depth and float video
Last modified: 2012-06-27 07:42:09 UTC
I'm working with cameras that output high bit-depth and sometimes floating point video. While some of these might seem crazy, (almost) all of these will automatically be supported by the gst-opencv elements, so many operations including colorspace conversion will be possible.

Below are the MIME types and caps that I've come up with, omitting the common width, height, and framerate fields. I'm taking some liberty with the formatting of these caps to save space, so this isn't exactly how they'd appear in actual use.

Raw 16-bit, 32-bit grayscale, signed or unsigned (use existing caps, but add signed field):

video/x-raw-gray
  bpp: [ 1, 32 ]
  depth: { 8, 16, 32 }
  endianness: BYTE_ORDER
  signed: { TRUE, FALSE }

Raw floating point (half, single, and double) grayscale:

video/x-raw-gray-float
  bpp: { 16, 32, 64 }
  depth: { 16, 32, 64 }
  endianness: BYTE_ORDER

Raw 16-bits per channel RGB(A):

video/x-raw-rgb16
  bpp: { 48, 64 }
  depth: { 48, 64 }
  endianness: BYTE_ORDER
  red_mask: { 255, ... }
  green_mask: { 255, ... }
  blue_mask: { 255, ... }
  alpha_mask: { 255, ... }

Raw floating point RGB(A):

video/x-raw-rgb-float
  bpp: { 96, 128 }
  depth: { 96, 128 }
  endianness: BYTE_ORDER
  red_mask: { ?? }
  green_mask: { ?? }
  blue_mask: { ?? }
  alpha_mask: { ?? }

For the floating point grayscale we could also create separate types such as video/x-raw-gray-half, video/x-raw-gray-single (or float), video/x-raw-gray-double.

For RGB(A)16 we can't (and probably shouldn't) re-use the existing video/x-raw-rgb since the masks are only 32-bits and we need at least 64-bits.

For RGB(A)F it might be crazy to think of adding half and double formats here. The other big problem is we don't have a 128-bit integer yet so to have masks we'd need two integers for each channel. Since there aren't many (any?) using this type now we could just force them to use RGB order and forget the masks entirely.

I would appreciate any feedback. Thanks!

-Josh
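To make the mask-width problem concrete, here is a minimal Python sketch that builds per-channel bit masks for a packed pixel and checks whether they fit in a 32-bit integer. The helper names `channel_masks` and `fits_in_32_bits` are hypothetical and not part of GStreamer, and the layout (first channel in the most significant bits) is an assumption made only for illustration.

```python
def channel_masks(order, bits_per_channel):
    """Return {channel: mask} for a packed pixel.

    Assumption: the first channel in `order` occupies the most
    significant bits of the pixel word.
    """
    full = (1 << bits_per_channel) - 1
    count = len(order)
    masks = {}
    for index, channel in enumerate(order):
        shift = (count - 1 - index) * bits_per_channel
        masks[channel] = full << shift
    return masks


def fits_in_32_bits(mask):
    """True if the mask can be stored in an unsigned 32-bit integer."""
    return mask <= 0xFFFFFFFF


masks8 = channel_masks("RGBA", 8)    # classic 32-bit RGBA
masks16 = channel_masks("RGBA", 16)  # 16 bits per channel, 64-bit pixel
masks32 = channel_masks("RGBA", 32)  # 32-bit float per channel, 128-bit pixel

for name, masks in (("8", masks8), ("16", masks16), ("32", masks32)):
    widest = max(mask.bit_length() for mask in masks.values())
    print(f"{name} bits/channel: widest mask needs {widest} bits")
```

With 8 bits per channel every mask fits in 32 bits; with 16 bits per channel the widest mask needs 64 bits, and with 32 bits per channel it needs 128, which is the limitation described above.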
> Raw 16-bit, 32-bit grayscale, signed or unsigned (use existing caps,
> but add signed field):

Signed gray is just like unsigned gray, only with a different range? I've never seen that anywhere, is that common?

> Raw floating point (half, single, and double) grayscale:
> video/x-raw-gray-float
>   bpp: { 16, 32, 64 }
>   depth: { 16, 32, 64 }
>   endianness: BYTE_ORDER

I don't think anyone ever uses bpp != depth, so we should drop the bpp property.

> Raw 16-bits per channel RGB(A):
> video/x-raw-rgb16
>   bpp: { 48, 64 }
>   depth: { 48, 64 }
>   endianness: BYTE_ORDER
>   red_mask: { 255, ... }
>   green_mask: { 255, ... }
>   blue_mask: { 255, ... }
>   alpha_mask: { 255, ... }

We need to come up with a better way to specify this than using masks. Also, I'm not sure whether there are any 12bpp formats in existence, because if there weren't we could start thinking in terms of bytes and not bits, and that'd mean we could drop endianness. Otherwise, do we need a way to swap 48-bit numbers?

> Raw floating point RGB(A):
> video/x-raw-rgb-float
>   bpp: { 96, 128 }
>   depth: { 96, 128 }
>   endianness: BYTE_ORDER
>   red_mask: { ?? }
>   green_mask: { ?? }
>   blue_mask: { ?? }
>   alpha_mask: { ?? }

I don't think bpp is necessary here (see gray format). And the masks issue is the same as above, too. What we need for this format is a way to specify the order of the components, isn't it? I'd suggest something like a FOURCC for that, so that one could say order="RGBA" or order="xRGB". Would that make sense?

> For RGB(A)16 we can't (and probably shouldn't) re-use the existing
> video/x-raw-rgb since the masks are only 32-bits and we need at least
> 64-bits.

I think it'd be nice if we could find a way to specify caps so that we can describe all formats in existence and don't need different caps for 5, 8, 10, 16 and whatever number of bits per component, so that we can use a single format for GStreamer 1.0.
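The component-order idea could look roughly like the Python sketch below, which unpacks one pixel using an order string instead of bit masks. The function `unpack_pixel`, its parameters, and the convention that "x" marks an unused padding component are assumptions made for illustration; none of this is an existing GStreamer API.

```python
import struct


def unpack_pixel(pixel_bytes, order, component_format="H", byte_order="<"):
    """Split one pixel into named components using an order string.

    order            -- e.g. "RGBA" or "xRGB"; "x" is treated as padding
    component_format -- struct code for one component:
                        "H" = 16-bit unsigned, "f" = 32-bit float
    byte_order       -- "<" little endian, ">" big endian
    """
    layout = byte_order + component_format * len(order)
    values = struct.unpack(layout, pixel_bytes)
    return {name: value for name, value in zip(order, values) if name != "x"}


# A 64-bit xRGB pixel with 16 bits per component.
pixel16 = struct.pack("<4H", 0, 1000, 2000, 3000)
print(unpack_pixel(pixel16, "xRGB"))

# A 128-bit RGBA pixel with one 32-bit float per component.
pixel_float = struct.pack("<4f", 0.25, 0.5, 0.75, 1.0)
print(unpack_pixel(pixel_float, "RGBA", component_format="f"))
```

Because the order string only names whole components, the same description works for 16-bit integer and float pixels without needing 64-bit or 128-bit masks.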
(In reply to comment #1)
> > Raw 16-bit, 32-bit grayscale, signed or unsigned (use existing caps,
> > but add signed field):
>
> Signed gray is just like unsigned gray, only with a different range?
> I've never seen that anywhere, is that common?

True. Signed images aren't common, but unfortunately they do occur. Most of my interest lies in the use of scientific cameras and frame grabbers. The frame grabber I'm using now (National Instruments IMAQ card) will spit out video as signed 16-bit integers. Thankfully I haven't come across a camera that actually produces that, and I hope to never do so!

> > Raw floating point (half, single, and double) grayscale:
> > video/x-raw-gray-float
> >   bpp: { 16, 32, 64 }
> >   depth: { 16, 32, 64 }
> >   endianness: BYTE_ORDER
>
> I don't think anyone ever uses bpp != depth, so we should drop the bpp
> property.

I would agree for this case. However, for video/x-raw-gray I certainly use bpp != depth, as 10-, 12-, or 14-bit sensor data is often stored in 16 bits. I like to allow users to adjust the intensity mapping to 8-bit, and if values only go up to 1023 and I don't use bpp=10 then the user experience will suffer when using a control such as a slider.

> > Raw 16-bits per channel RGB(A):
> > video/x-raw-rgb16
> >   bpp: { 48, 64 }
> >   depth: { 48, 64 }
> >   endianness: BYTE_ORDER
> >   red_mask: { 255, ... }
> >   green_mask: { 255, ... }
> >   blue_mask: { 255, ... }
> >   alpha_mask: { 255, ... }
>
> We need to come up with a better way to specify this than using masks. Also,
> I'm not sure whether there are any 12bpp formats in existence, because if
> there weren't we could start thinking in terms of bytes and not bits, and
> that'd mean we could drop endianness. Otherwise, do we need a way to swap
> 48-bit numbers?

See next response.

> > Raw floating point RGB(A):
> > video/x-raw-rgb-float
> >   bpp: { 96, 128 }
> >   depth: { 96, 128 }
> >   endianness: BYTE_ORDER
> >   red_mask: { ?? }
> >   green_mask: { ?? }
> >   blue_mask: { ?? }
> >   alpha_mask: { ?? }
>
> I don't think bpp is necessary here (see gray format).
> And the masks issue is the same as above, too. What we need for this format is
> a way to specify the order of the components, isn't it?
> I'd suggest something like a FOURCC for that, so that one could say
> order="RGBA" or order="xRGB". Would that make sense?

I suppose specifying the order of components would suffice. I'd hate to imagine a format requiring masks to specify red, green, and blue data interwoven for every pixel (!). That sounds good to me.

> > For RGB(A)16 we can't (and probably shouldn't) re-use the existing
> > video/x-raw-rgb since the masks are only 32-bits and we need at least
> > 64-bits.
>
> I think it'd be nice if we could find a way to specify caps so that we can
> describe all formats in existence and don't need different caps for 5, 8, 10, 16
> and whatever number of bits per component, so that we can use a single format
> for GStreamer 1.0.

I agree, I'd like to use video/x-raw-rgb regardless of bits. If we forget or fix the issue of channel masks, there shouldn't be any confusion that depth=48 or 64 indicates 16 bits per channel. Of course this would mean redefining video/x-raw-rgb throughout all elements, something I'm certainly not qualified to speak on. :)
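The point about 10-, 12-, or 14-bit sensor data stored in 16 bits can be illustrated with a small Python sketch of an intensity mapping to 8-bit. The function `to_8bit` and its `significant_bits` parameter are hypothetical names chosen to stay neutral about which caps field (bpp or depth) carries the number of meaningful bits.

```python
def to_8bit(sample, significant_bits):
    """Linearly map an unsigned sample to the 0..255 range.

    significant_bits -- number of bits that actually carry data,
                        e.g. 10 for a 10-bit sensor stored in 16 bits
    """
    max_in = (1 << significant_bits) - 1
    sample = min(max(sample, 0), max_in)  # clamp out-of-range input
    # Integer scaling with rounding to the nearest output level.
    return (sample * 255 + max_in // 2) // max_in


full_scale_10bit = 1023

# Knowing the data is 10-bit, full scale maps to white.
print(to_8bit(full_scale_10bit, significant_bits=10))

# Treating the same sample as 16-bit leaves it almost black.
print(to_8bit(full_scale_10bit, significant_bits=16))
```

If the element only knows the 16-bit container size, a full-scale 10-bit image occupies just the bottom few output levels, which is the poor slider experience described above.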
Since this bug report, GStreamer got support for 64-bit ARGB as well as a bunch of 16-bit YCbCr formats. These are simple extensions of 8-bit video to larger bit depths: they still follow BT-470 or BT-709, just with more bits. This is the direction I want to follow for video/x-raw-yuv and video/x-raw-rgb.

The important part of x-raw-yuv and x-raw-rgb, in this case, is that they assume a whole bunch of things that aren't necessarily true for framegrabbers and/or float video. For example, the range/excursion of values in float video, or the response curve for greyscale video, or the RGB chromaticities of the sensor. Sometimes, this information isn't even known.

It isn't terribly important to try to support every format out there. Pick one you need, and implement it.

The plan for float video is much the way you describe above, except with no bpp, and with some definition of the response curves and excursions (string "linear", "bt709", etc.), and chromaticities for components. Once you have the chromaticities on the components, the RGB ordering is redundant.
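As a rough illustration of what a response-curve string could select, the Python sketch below maps the names "linear" and "bt709" from the comment above to decoding functions. The lookup table `TRANSFER_TO_LINEAR` and the helper `to_linear` are hypothetical and are not GStreamer API; the curve itself uses the standard BT.709 constants (4.5, 0.018, 1.099, 0.099, 0.45), and sample values are assumed to be normalized to 0..1.

```python
def bt709_oetf(linear):
    """Encode normalized linear light (0..1) with the BT.709 curve."""
    if linear < 0.018:
        return 4.5 * linear
    return 1.099 * linear ** 0.45 - 0.099


def bt709_inverse_oetf(encoded):
    """Decode a BT.709-encoded value (0..1) back to linear light."""
    if encoded < 4.5 * 0.018:
        return encoded / 4.5
    return ((encoded + 0.099) / 1.099) ** (1 / 0.45)


# Hypothetical lookup keyed by the strings suggested in the thread.
TRANSFER_TO_LINEAR = {
    "linear": lambda value: value,
    "bt709": bt709_inverse_oetf,
}


def to_linear(value, transfer):
    """Convert a normalized sample to linear light for the named curve."""
    return TRANSFER_TO_LINEAR[transfer](value)


encoded = bt709_oetf(0.5)
print(encoded, to_linear(encoded, "bt709"), to_linear(0.5, "linear"))
```

A float format that carries such a string lets downstream elements decide whether any decoding is needed before doing arithmetic on the samples.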
I think this bug can be closed now with all the new video stuff in 0.11/1.0, no? I think it covers basically everything but float formats - please file a separate bug for anything specific that's still missing, or re-open this one, thanks!
I think indeed everything is covered: range, matrix, transfer function, primaries. There is no floating point format yet but it can now be added quite easily.