GNOME Bugzilla – Bug 703346
Integrate shaders from eglglessink
Last modified: 2013-07-19 08:00:00 UTC
eglglessink currently has more optimized shaders than gst-plugins-gl and also has shaders for more color formats. We should integrate them into gst-plugins-gl.
The existing shaders are updated now, and a reordering shader for the different RGB formats was added for GLES2. Matthew is working on adding the additional shaders.
Created attachment 249483 [details] [review] upload: overhaul and new video formats
Review of attachment 249483 [details] [review]: ::: gst-libs/gst/gl/gstglupload.c @@ +126,3 @@ + "uniform vec2 tex_scale0;\n" + "uniform vec2 tex_scale1;\n" + "uniform vec2 tex_scale2;\n" tex_scale2 is not used @@ +142,3 @@ + +/* Channel reordering for XYZ <-> ZYX conversion */ +static const char *frag_REORDER_opengl = { With desktop GL no reordering is necessary, right? It might be more optimal to use the integrated support for the different RGBA orders @@ +147,3 @@ + "uniform vec2 tex_scale0;\n" + "uniform vec2 tex_scale1;\n" + "uniform vec2 tex_scale2;\n" tex_scale are not used @@ +161,3 @@ + "uniform vec2 tex_scale0;\n" + "uniform vec2 tex_scale1;\n" + "uniform vec2 tex_scale2;\n" tex_scale are not used @@ +183,3 @@ + " yuv.x = texture2DRect(Ytex, gl_TexCoord[0].xy * tex_scale0).%c;\n" + " yuv.y = texture2DRect(UVtex, gl_TexCoord[0].xy * tex_scale1).%c;\n" + " yuv.z = texture2DRect(UVtex, gl_TexCoord[0].xy * tex_scale2).%c;\n" YUY2,etc only have a single stride. Why three scales? Two might make sense though @@ +202,3 @@ + "uniform vec2 tex_scale0;\n" + "uniform vec2 tex_scale1;\n" + "uniform vec2 tex_scale2;\n" Scales not used @@ +216,3 @@ + "uniform vec2 tex_scale0;\n" + "uniform vec2 tex_scale1;\n" + "uniform vec2 tex_scale2;\n" Scales not used @@ +237,3 @@ + "uniform vec2 tex_scale0;\n" + "uniform vec2 tex_scale1;\n" + "uniform vec2 tex_scale2;\n" Scales not used @@ +260,3 @@ + "uniform vec2 tex_scale0;\n" + "uniform vec2 tex_scale1;\n" + "uniform vec2 tex_scale2;\n" Scales not used @@ +282,3 @@ + "uniform vec2 tex_scale0;\n" + "uniform vec2 tex_scale1;\n" + "uniform vec2 tex_scale2;\n" Scales not used @@ +334,2 @@ const gchar *AYUV; + const gchar *NV12_NV21; NV12 shaders can also be used for NV16 and NV24 (same format, just chroma plane of a slightly different size) @@ +1127,3 @@ + tex[1].type = GL_UNSIGNED_BYTE; + tex[1].width = GST_ROUND_UP_2 (in_width) / 2; + tex[1].height = GST_ROUND_UP_2 (in_height) / 2; Ideally you would just use the information from the GstVideoFrame / GstVideoInfo for this everywhere @@ +1227,3 @@ + gl->BindTexture (GL_TEXTURE_RECTANGLE_ARB, upload->in_texture[1]); + gl->TexSubImage2D (GL_TEXTURE_RECTANGLE_ARB, 0, 0, 0, + GST_ROUND_UP_2 (in_width) / 2, in_height, Same here, GstVideoInfo/GstVideoFrame @@ +1228,3 @@ + gl->TexSubImage2D (GL_TEXTURE_RECTANGLE_ARB, 0, 0, 0, + GST_ROUND_UP_2 (in_width) / 2, in_height, + GL_BGRA, GL_UNSIGNED_INT_8_8_8_8_REV, upload->data[0]); GL_RGBA, GL_UNSIGNED_BYTE? @@ +1247,3 @@ + gl->TexSubImage2D (GL_TEXTURE_RECTANGLE_ARB, 0, 0, 0, + in_width / 2, in_height / 2, + GL_LUMINANCE_ALPHA, GL_UNSIGNED_BYTE, upload->data[1]); If available it is more optimal to use GL_R8 and GL_RG8 here (red, red/green), instead of LUM/ALPHA. Everywhere @@ +1441,3 @@ + tex_scaling[2] = tex_scaling[4] = 0.5; + else if (v_format == GST_VIDEO_FORMAT_Y41B) + tex_scaling[2] = tex_scaling[4] = 0.25; That's not what the point of the scaling was :) The scaling was used to handle arbitrary strides, together with GL_UNPACK. The scales should be at 1.0 for all the different planes, and GL should do bilinear interpolation for us instead of us worrying in the shaders about the different subsampling. Maybe drop the tex_scaling bits everywhere for now until we support arbitrary strides. That makes the code simpler ::: gst-libs/gst/gl/gstglupload.h @@ +95,3 @@ #define GST_GL_UPLOAD_FORMATS "{ RGB, RGBx, RGBA, BGR, BGRx, BGRA, xRGB, " \ + "xBGR, ARGB, ABGR, Y444, I420, YV12, Y42B, " \ + "Y41B, NV12, NV21, YUY2, UYVY, AYUV }" Btw, you can easily also support YVYU (same as YUY2 just chroma components swapped)
(In reply to comment #3) > Review of attachment 249483 [details] [review]: > > ::: gst-libs/gst/gl/gstglupload.c > @@ +126,3 @@ > + "uniform vec2 tex_scale0;\n" > + "uniform vec2 tex_scale1;\n" > + "uniform vec2 tex_scale2;\n" > > tex_scale2 is not used Doesn't really matter much (actually not at all). > @@ +142,3 @@ > + > +/* Channel reordering for XYZ <-> ZYX conversion */ > +static const char *frag_REORDER_opengl = { > > With desktop GL no reordering is necessary, right? It might be more optimal to > use the integrated support for the different RGBA orders Yea sure you can try and mix and match texture formats to get the result however chances are you'll hit the slow path in most drivers because it has to convert CPU side to a format that the GPU can use. It's simpler to use a shader and may even be faster. If you really want to check, profile it. > @@ +147,3 @@ > + "uniform vec2 tex_scale0;\n" > + "uniform vec2 tex_scale1;\n" > + "uniform vec2 tex_scale2;\n" > > tex_scale are not used > > @@ +161,3 @@ > + "uniform vec2 tex_scale0;\n" > + "uniform vec2 tex_scale1;\n" > + "uniform vec2 tex_scale2;\n" > > tex_scale are not used > > @@ +183,3 @@ > + " yuv.x = texture2DRect(Ytex, gl_TexCoord[0].xy * tex_scale0).%c;\n" > + " yuv.y = texture2DRect(UVtex, gl_TexCoord[0].xy * tex_scale1).%c;\n" > + " yuv.z = texture2DRect(UVtex, gl_TexCoord[0].xy * tex_scale2).%c;\n" > > YUY2,etc only have a single stride. Why three scales? Two might make sense > though I just put them in based on what was in the others. > @@ +202,3 @@ > + "uniform vec2 tex_scale0;\n" > + "uniform vec2 tex_scale1;\n" > + "uniform vec2 tex_scale2;\n" > > Scales not used > > @@ +216,3 @@ > + "uniform vec2 tex_scale0;\n" > + "uniform vec2 tex_scale1;\n" > + "uniform vec2 tex_scale2;\n" > > Scales not used > > @@ +237,3 @@ > + "uniform vec2 tex_scale0;\n" > + "uniform vec2 tex_scale1;\n" > + "uniform vec2 tex_scale2;\n" > > Scales not used > > @@ +260,3 @@ > + "uniform vec2 tex_scale0;\n" > + "uniform vec2 tex_scale1;\n" > + "uniform vec2 tex_scale2;\n" > > Scales not used > > @@ +282,3 @@ > + "uniform vec2 tex_scale0;\n" > + "uniform vec2 tex_scale1;\n" > + "uniform vec2 tex_scale2;\n" > > Scales not used > > @@ +334,2 @@ > const gchar *AYUV; > + const gchar *NV12_NV21; > > NV12 shaders can also be used for NV16 and NV24 (same format, just chroma plane > of a slightly different size) Can do that later. > @@ +1127,3 @@ > + tex[1].type = GL_UNSIGNED_BYTE; > + tex[1].width = GST_ROUND_UP_2 (in_width) / 2; > + tex[1].height = GST_ROUND_UP_2 (in_height) / 2; > > Ideally you would just use the information from the GstVideoFrame / > GstVideoInfo for this everywhere Correct, another patch. > @@ +1227,3 @@ > + gl->BindTexture (GL_TEXTURE_RECTANGLE_ARB, upload->in_texture[1]); > + gl->TexSubImage2D (GL_TEXTURE_RECTANGLE_ARB, 0, 0, 0, > + GST_ROUND_UP_2 (in_width) / 2, in_height, > > Same here, GstVideoInfo/GstVideoFrame > > @@ +1228,3 @@ > + gl->TexSubImage2D (GL_TEXTURE_RECTANGLE_ARB, 0, 0, 0, > + GST_ROUND_UP_2 (in_width) / 2, in_height, > + GL_BGRA, GL_UNSIGNED_INT_8_8_8_8_REV, upload->data[0]); > > GL_RGBA, GL_UNSIGNED_BYTE? That might need changing. Anyone got a big-endian machine? > @@ +1247,3 @@ > + gl->TexSubImage2D (GL_TEXTURE_RECTANGLE_ARB, 0, 0, 0, > + in_width / 2, in_height / 2, > + GL_LUMINANCE_ALPHA, GL_UNSIGNED_BYTE, upload->data[1]); > > If available it is more optimal to use GL_R8 and GL_RG8 here (red, red/green), > instead of LUM/ALPHA. Everywhere Do that later. This is the general case. > @@ +1441,3 @@ > + tex_scaling[2] = tex_scaling[4] = 0.5; > + else if (v_format == GST_VIDEO_FORMAT_Y41B) > + tex_scaling[2] = tex_scaling[4] = 0.25; > > That's not what the point of the scaling was :) The scaling was used to handle > arbitrary strides, together with GL_UNPACK. The scales should be at 1.0 for all > the different planes, and GL should do bilinear interpolation for us instead of > us worrying in the shaders about the different subsampling. The bilinear interpolation comes from the texture parameters, not from what you specify in GL_UNPACK/strides. Arbitrary strides should come with the patch that enables GstVideoInfo/GstVideoFrame from the caller. > Maybe drop the tex_scaling bits everywhere for now until we support arbitrary > strides. That makes the code simpler And breaks the resulting image for video formats with subsampling. > ::: gst-libs/gst/gl/gstglupload.h > @@ +95,3 @@ > #define GST_GL_UPLOAD_FORMATS "{ RGB, RGBx, RGBA, BGR, BGRx, BGRA, xRGB, " \ > + "xBGR, ARGB, ABGR, Y444, I420, YV12, Y42B, " \ > + "Y41B, NV12, NV21, YUY2, UYVY, AYUV }" > > Btw, you can easily also support YVYU (same as YUY2 just chroma components > swapped) Later.
(In reply to comment #4) > > @@ +142,3 @@ > > + > > +/* Channel reordering for XYZ <-> ZYX conversion */ > > +static const char *frag_REORDER_opengl = { > > > > With desktop GL no reordering is necessary, right? It might be more optimal to > > use the integrated support for the different RGBA orders > > Yea sure you can try and mix and match texture formats to get the result > however chances are you'll hit the slow path in most drivers because it has to > convert CPU side to a format that the GPU can use. It's simpler to use a > shader and may even be faster. If you really want to check, profile it. Ack, just a general comment :) > > @@ +334,2 @@ > > const gchar *AYUV; > > + const gchar *NV12_NV21; > > > > NV12 shaders can also be used for NV16 and NV24 (same format, just chroma plane > > of a slightly different size) > > Can do that later. Yes > > @ +1127,3 @@ > > + tex[1].type = GL_UNSIGNED_BYTE; > > + tex[1].width = GST_ROUND_UP_2 (in_width) / 2; > > + tex[1].height = GST_ROUND_UP_2 (in_height) / 2; > > > > Ideally you would just use the information from the GstVideoFrame / > > GstVideoInfo for this everywhere > > Correct, another patch. Yes > > @@ +1228,3 @@ > > + gl->TexSubImage2D (GL_TEXTURE_RECTANGLE_ARB, 0, 0, 0, > > + GST_ROUND_UP_2 (in_width) / 2, in_height, > > + GL_BGRA, GL_UNSIGNED_INT_8_8_8_8_REV, upload->data[0]); > > > > GL_RGBA, GL_UNSIGNED_BYTE? > > That might need changing. Anyone got a big-endian machine? Well, there's no GL_BGRA and GL_UNSIGNED_INT_8_8_8_8_REV on GLES. And it works fine with that on little endian machines (both are mapped to the other ones) :) > > @@ +1247,3 @@ > > + gl->TexSubImage2D (GL_TEXTURE_RECTANGLE_ARB, 0, 0, 0, > > + in_width / 2, in_height / 2, > > + GL_LUMINANCE_ALPHA, GL_UNSIGNED_BYTE, upload->data[1]); > > > > If available it is more optimal to use GL_R8 and GL_RG8 here (red, red/green), > > instead of LUM/ALPHA. Everywhere > > Do that later. This is the general case. Yes > > @@ +1441,3 @@ > > + tex_scaling[2] = tex_scaling[4] = 0.5; > > + else if (v_format == GST_VIDEO_FORMAT_Y41B) > > + tex_scaling[2] = tex_scaling[4] = 0.25; > > > > That's not what the point of the scaling was :) The scaling was used to handle > > arbitrary strides, together with GL_UNPACK. The scales should be at 1.0 for all > > the different planes, and GL should do bilinear interpolation for us instead of > > us worrying in the shaders about the different subsampling. > > The bilinear interpolation comes from the texture parameters, not from what you > specify in GL_UNPACK/strides. Arbitrary strides should come with the patch > that enables GstVideoInfo/GstVideoFrame from the caller. Sure > > Maybe drop the tex_scaling bits everywhere for now until we support arbitrary > > strides. That makes the code simpler > > And breaks the resulting image for video formats with subsampling. For the planar formats you'll have different width/height values depending on the subsampling in glTexImage2D(). However in the shader you'll always go from 0.0 to 1.0 in all directions and no scaling is required there (1.0 would be at the end of the texture, i.e. the pixel at 1/4 width for Y41B chroma planes). For the packed and semi-planar formats the scaling will be required but that should IMHO happen inside the shader for now. Just remove the texScale stuff until we add support for arbitrary strides. See http://cgit.freedesktop.org/gstreamer/gst-plugins-bad/tree/ext/eglgles/gstegladaptation.c?h=0.10#n156 and the related code. That's the non-abitrary-stride versions from the past that worked just fine :) > > ::: gst-libs/gst/gl/gstglupload.h > > @@ +95,3 @@ > > #define GST_GL_UPLOAD_FORMATS "{ RGB, RGBx, RGBA, BGR, BGRx, BGRA, xRGB, " \ > > + "xBGR, ARGB, ABGR, Y444, I420, YV12, Y42B, " \ > > + "Y41B, NV12, NV21, YUY2, UYVY, AYUV }" > > > > Btw, you can easily also support YVYU (same as YUY2 just chroma components > > swapped) > > Later. Yes
(In reply to comment #5) > (In reply to comment #4) > > > Maybe drop the tex_scaling bits everywhere for now until we support arbitrary > > > strides. That makes the code simpler > > > > And breaks the resulting image for video formats with subsampling. > > For the planar formats you'll have different width/height values depending on > the subsampling in glTexImage2D(). However in the shader you'll always go from > 0.0 to 1.0 in all directions and no scaling is required there (1.0 would be at > the end of the texture, i.e. the pixel at 1/4 width for Y41B chroma planes). And that will fail with the TEXTURE_RECTANLGE target which we currently use for Desktop GL. TexCoords with TEXTURE_RECTANGLE are unnormalized i.e. go from 0 to width/height NOT 0 to 1.0. You may want to instead check for the NPOT extension and use that if you want normalized TexCoords. It could also be implemented using multitexturing in the shader i.e. use gl_TexCoord[i]. IMHO it's just a different way of doing the same thing and doesn't really matter. > For the packed and semi-planar formats the scaling will be required but that > should IMHO happen inside the shader for now. Just remove the texScale stuff > until we add support for arbitrary strides. It is happening in the shader already using tex_scaling. > See > http://cgit.freedesktop.org/gstreamer/gst-plugins-bad/tree/ext/eglgles/gstegladaptation.c?h=0.10#n156 > and the related code. That's the non-abitrary-stride versions from the past > that worked just fine :) See above.
(In reply to comment #6) > (In reply to comment #5) > > (In reply to comment #4) > > > > Maybe drop the tex_scaling bits everywhere for now until we support arbitrary > > > > strides. That makes the code simpler > > > > > > And breaks the resulting image for video formats with subsampling. > > > > For the planar formats you'll have different width/height values depending on > > the subsampling in glTexImage2D(). However in the shader you'll always go from > > 0.0 to 1.0 in all directions and no scaling is required there (1.0 would be at > > the end of the texture, i.e. the pixel at 1/4 width for Y41B chroma planes). > > And that will fail with the TEXTURE_RECTANLGE target which we currently use for > Desktop GL. TexCoords with TEXTURE_RECTANGLE are unnormalized i.e. go from 0 > to width/height NOT 0 to 1.0. You may want to instead check for the NPOT > extension and use that if you want normalized TexCoords. It could also be > implemented using multitexturing in the shader i.e. use gl_TexCoord[i]. IMHO > it's just a different way of doing the same thing and doesn't really matter. Ah! Sorry for the confusion then, I only really know GLES shaders. It all makes sense then
Created attachment 249582 [details] [review] v2 Fixed the minor issues.
Merged it all from the git branch