I'm writing a media player framework for Apple TV, using OpenGL ES and ffmpeg. Conversion to RGBA is required for rendering with OpenGL ES, and software conversion using swscale is unbearably slow, so based on information on the internet I came up with two ideas: using NEON (like here), or using fragment shaders with GL_LUMINANCE and GL_LUMINANCE_ALPHA textures.
As I know almost nothing about OpenGL, the second option still doesn't work :)
Can you give me any pointers on how to proceed? Thank you in advance.
It is most definitely worthwhile learning OpenGL ES 2.0 shaders:
- Uploading YCbCr saves you 25% bus bandwidth if your video has 4:2:0 sampled chrominance.
- (Use the same texture coordinates for both the Y and C{b,r} textures, in effect stretching the chrominance texture out over the same area.)
- Pushing YCbCr textures to the GPU is fast (no data copy or swizzling) with the texture cache (see the CVOpenGLESTextureCache* API functions). You will save 1-2 data copies compared to NEON.
I am using these techniques to great effect in my super-fast iPhone camera app, SnappyCam.
You are on the right track for implementation: use a GL_LUMINANCE texture for Y and a GL_LUMINANCE_ALPHA texture for CbCr if your CbCr is interleaved. Otherwise, if all of your YCbCr components are noninterleaved, use three GL_LUMINANCE textures.
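To make the texture choice above concrete, here is a minimal sketch that works out how many textures a 4:2:0 frame needs and how large each one is. The names `PlaneDesc`, `ChromaLayout`, `describe_planes_420` and `plane_bytes` are hypothetical helpers for illustration, not part of OpenGL ES or ffmpeg. It assumes chroma dimensions are rounded up for odd frame sizes. In ffmpeg terms, the bi-planar case would correspond to AV_PIX_FMT_NV12 and the planar case to AV_PIX_FMT_YUV420P.

```c
#include <assert.h>
#include <stddef.h>

/* One texture's worth of image data. */
typedef struct {
    int width;           /* texels per row */
    int height;          /* number of rows */
    int bytes_per_texel; /* 1 = GL_LUMINANCE, 2 = GL_LUMINANCE_ALPHA */
} PlaneDesc;

/* Hypothetical layout tag, not an ffmpeg or OpenGL type. */
typedef enum {
    LAYOUT_BIPLANAR, /* Y plane + interleaved CbCr plane */
    LAYOUT_PLANAR    /* separate Y, Cb and Cr planes */
} ChromaLayout;

/* Fills `planes` (room for 3 entries) for a 4:2:0 frame and returns
 * the number of textures needed (2 or 3). */
int describe_planes_420(int width, int height, ChromaLayout layout,
                        PlaneDesc planes[3])
{
    assert(width > 0 && height > 0 && planes != NULL);

    /* Chroma is subsampled by 2 in both directions (assumption: round up). */
    int cw = (width + 1) / 2;
    int ch = (height + 1) / 2;

    planes[0] = (PlaneDesc){ width, height, 1 };  /* Y: GL_LUMINANCE */
    if (layout == LAYOUT_BIPLANAR) {
        planes[1] = (PlaneDesc){ cw, ch, 2 };     /* CbCr: GL_LUMINANCE_ALPHA */
        return 2;
    }
    planes[1] = (PlaneDesc){ cw, ch, 1 };         /* Cb: GL_LUMINANCE */
    planes[2] = (PlaneDesc){ cw, ch, 1 };         /* Cr: GL_LUMINANCE */
    return 3;
}

/* Size in bytes of a tightly packed plane. */
size_t plane_bytes(const PlaneDesc *p)
{
    return (size_t)p->width * (size_t)p->height * (size_t)p->bytes_per_texel;
}
```

Either layout carries the same total amount of data per frame; the difference is only in how many textures you bind and sample.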
Creating two textures for 4:2:0 bi-planar YCbCr (where CbCr is interleaved) is straightforward:
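    // Y plane: one byte per texel at full resolution.
    glBindTexture(GL_TEXTURE_2D, texture_y);
    // ES 2.0 treats a texture without mipmaps (or a non-power-of-two one) as
    // incomplete unless the min filter is non-mipmapped and wrapping is clamped.
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
    glTexImage2D(
        GL_TEXTURE_2D,
        0,                  // Mipmap level
        GL_LUMINANCE,       // Texture format (8-bit)
        width,
        height,
        0,                  // No border
        GL_LUMINANCE,       // Source format (8-bit)
        GL_UNSIGNED_BYTE,   // Source data type
        NULL                // Allocate storage only; upload later
    );

    // Interleaved CbCr plane: two bytes per texel at half resolution.
    glBindTexture(GL_TEXTURE_2D, texture_cbcr);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
    glTexImage2D(
        GL_TEXTURE_2D,
        0,                  // Mipmap level
        GL_LUMINANCE_ALPHA, // Texture format (16-bit)
        width / 2,
        height / 2,
        0,                  // No border
        GL_LUMINANCE_ALPHA, // Source format (16-bit)
        GL_UNSIGNED_BYTE,   // Source data type
        NULL                // Allocate storage only; upload later
    );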
You would then use glTexSubImage2D() or the iOS 5 texture cache to update these textures on every frame.
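As a rough sketch of the glTexSubImage2D() route: OpenGL ES 2.0 has no GL_UNPACK_ROW_LENGTH, so each plane has to be tightly packed before upload. Decoded ffmpeg frames often have padded rows (the AVFrame linesize can exceed the visible width), so a repacking step may be needed. `pack_plane`, `upload_biplanar_420` and the `HAVE_GLES2` build flag are hypothetical names for illustration. The GL part assumes the iOS header path and textures already created with the matching sizes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copies `rows` rows of `row_bytes` bytes each from a padded source
 * (row pitch `src_stride` bytes) into a tightly packed destination. */
void pack_plane(uint8_t *dst, const uint8_t *src,
                int src_stride, int row_bytes, int rows)
{
    assert(dst != NULL && src != NULL && src_stride >= row_bytes);
    for (int r = 0; r < rows; ++r) {
        memcpy(dst + (size_t)r * (size_t)row_bytes,
               src + (size_t)r * (size_t)src_stride,
               (size_t)row_bytes);
    }
}

#ifdef HAVE_GLES2 /* hypothetical build flag: define when building for iOS/tvOS */
#include <OpenGLES/ES2/gl.h>

/* Uploads tightly packed Y and interleaved CbCr planes into existing textures. */
void upload_biplanar_420(GLuint texture_y, GLuint texture_cbcr,
                         const uint8_t *y_packed, const uint8_t *cbcr_packed,
                         int width, int height)
{
    /* Rows are byte-aligned, not 4-byte aligned. */
    glPixelStorei(GL_UNPACK_ALIGNMENT, 1);

    glBindTexture(GL_TEXTURE_2D, texture_y);
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height,
                    GL_LUMINANCE, GL_UNSIGNED_BYTE, y_packed);

    glBindTexture(GL_TEXTURE_2D, texture_cbcr);
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width / 2, height / 2,
                    GL_LUMINANCE_ALPHA, GL_UNSIGNED_BYTE, cbcr_packed);
}
#endif
```

If the decoder already delivers rows with no padding, the repacking step can be skipped and the plane pointers passed straight to the upload function. For the CbCr plane, `row_bytes` is twice the chroma width because each texel holds two bytes.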
I'd also recommend passing the texture coordinates in a 2D varying that spans the texture coordinate space (x: [0,1], y: [0,1]) and using it unmodified in the fragment shader, so that you avoid any dependent texture reads. The end result is super-fast and, in my experience, barely loads the GPU at all.
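A shader pair along these lines might look like the sketch below, embedded as C strings ready to hand to glShaderSource(). The attribute, uniform and varying names (`a_position`, `a_texcoord`, `v_texcoord`, `u_tex_y`, `u_tex_cbcr`) are assumptions, and the conversion coefficients assume BT.601 video-range input; HD content is usually BT.709 and would need a different matrix. The C function `ycbcr601_to_rgb` is a hypothetical CPU reference of the same arithmetic, useful for checking the shader output against known values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Vertex shader: passes the texture coordinate through unchanged. */
const char *const kVertexShaderSrc =
    "attribute vec4 a_position;\n"
    "attribute vec2 a_texcoord;\n"
    "varying vec2 v_texcoord;\n"
    "void main() {\n"
    "    gl_Position = a_position;\n"
    "    v_texcoord = a_texcoord;\n"
    "}\n";

/* Fragment shader: samples both textures with the same, unmodified varying
 * (no dependent texture reads). GL_LUMINANCE puts Y in .r; GL_LUMINANCE_ALPHA
 * puts Cb in .r and Cr in .a. Assumes BT.601 video range. */
const char *const kFragmentShaderSrc =
    "precision mediump float;\n"
    "varying vec2 v_texcoord;\n"
    "uniform sampler2D u_tex_y;\n"
    "uniform sampler2D u_tex_cbcr;\n"
    "void main() {\n"
    "    float y = 1.164 * (texture2D(u_tex_y, v_texcoord).r - 16.0 / 255.0);\n"
    "    vec2 c = texture2D(u_tex_cbcr, v_texcoord).ra - vec2(128.0 / 255.0);\n"
    "    gl_FragColor = vec4(y + 1.596 * c.y,\n"
    "                        y - 0.392 * c.x - 0.813 * c.y,\n"
    "                        y + 2.017 * c.x,\n"
    "                        1.0);\n"
    "}\n";

/* Rounds to nearest and clamps to [0, 255]. */
static uint8_t clamp_u8(float v)
{
    if (v < 0.0f) return 0;
    if (v > 255.0f) return 255;
    return (uint8_t)(v + 0.5f);
}

/* CPU reference of the shader arithmetic on 8-bit samples. */
void ycbcr601_to_rgb(uint8_t y, uint8_t cb, uint8_t cr, uint8_t rgb[3])
{
    assert(rgb != NULL);
    float yf  = 1.164f * ((float)y - 16.0f);
    float cbf = (float)cb - 128.0f;
    float crf = (float)cr - 128.0f;
    rgb[0] = clamp_u8(yf + 1.596f * crf);
    rgb[1] = clamp_u8(yf - 0.392f * cbf - 0.813f * crf);
    rgb[2] = clamp_u8(yf + 2.017f * cbf);
}
```

Because both samplers use `v_texcoord` directly, the GPU can prefetch the texels, and bilinear filtering on the half-size CbCr texture performs the chroma upsampling.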