Decode video frames on iPhone GPU

simon.d · Feb 17, 2012 · Viewed 7.1k times

I'm looking for the fastest way to decode the frames of a local MPEG-4 video on the iPhone. I'm only interested in the luminance values of the pixels in every 10th frame; I don't need to render the video anywhere.

I've tried ffmpeg, AVAssetReader, AVAssetImageGenerator, OpenCV, and MPMoviePlayer, but they're all too slow. The fastest I can get is ~2x (two minutes of video scanned in one minute). I'd like something closer to 10x.

Assuming my attempts above didn't use the GPU, is there any way to accomplish my goal with something that does run on the GPU? OpenGL seems mostly geared toward rendering output, but I've seen it used to filter incoming video. Maybe that's an option?

Thanks in advance!

Answer

Duncan C · Feb 26, 2012

If you are willing to use an iOS 5-only solution, take a look at the ChromaKey sample app from the 2011 WWDC session on AVCaptureSession.

That demo captures video at 30 FPS from the built-in camera and passes each frame to OpenGL ES as a texture. It then uses OpenGL ES to manipulate the frame, and optionally writes the result out to an output video file.

The code uses some serious low-level magic to bind a Core Video pixel buffer (CVPixelBuffer) from an AVCaptureSession to an OpenGL ES texture, so the two share memory on the graphics hardware.
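The "low-level magic" is most likely the CVOpenGLESTextureCache API introduced in iOS 5, which lets a pixel buffer back an OpenGL ES texture without a copy. A minimal Swift sketch of that binding, assuming the caller supplies an EAGLContext and a decoded pixel buffer (names and error handling are simplified here):

```swift
import CoreVideo
import OpenGLES

// Sketch of the zero-copy binding: wrap a decoded CVPixelBuffer as an
// OpenGL ES texture via CVOpenGLESTextureCache (iOS 5+).
func bindLumaTexture(_ pixelBuffer: CVPixelBuffer,
                     in context: EAGLContext) -> CVOpenGLESTexture? {
    var cache: CVOpenGLESTextureCache?
    CVOpenGLESTextureCacheCreate(kCFAllocatorDefault, nil, context, nil, &cache)
    guard let textureCache = cache else { return nil }

    var texture: CVOpenGLESTexture?
    // Plane 0 of a biplanar Y'CbCr buffer is the luminance plane, so a
    // single-channel GL_LUMINANCE texture maps straight onto it.
    CVOpenGLESTextureCacheCreateTextureFromImage(
        kCFAllocatorDefault,
        textureCache,
        pixelBuffer,
        nil,
        GLenum(GL_TEXTURE_2D),
        GL_LUMINANCE,
        GLsizei(CVPixelBufferGetWidthOfPlane(pixelBuffer, 0)),
        GLsizei(CVPixelBufferGetHeightOfPlane(pixelBuffer, 0)),
        GLenum(GL_LUMINANCE),
        GLenum(GL_UNSIGNED_BYTE),
        0,                        // plane index: the Y plane
        &texture)

    if let texture = texture {
        glBindTexture(CVOpenGLESTextureGetTarget(texture),
                      CVOpenGLESTextureGetName(texture))
    }
    return texture
}
```

In a real decode loop you would create the texture cache once and reuse it across frames, flushing it with CVOpenGLESTextureCacheFlush as you go, rather than recreating it per frame as this sketch does.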

It should be fairly straightforward to change the input side to read from a movie file rather than the camera.
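One way to do that is to swap the capture session for AVAssetReader, which plays the same role for local files and hands back the same CVPixelBuffer type. A sketch under those assumptions (the pixel format request and the every-10th-frame sampling come from the question):

```swift
import AVFoundation

// Sketch: read decoded frames from a local movie file with AVAssetReader,
// the file-based counterpart of AVCaptureSession's video data output.
func readFrames(from url: URL) throws {
    let asset = AVAsset(url: url)
    guard let track = asset.tracks(withMediaType: .video).first else { return }

    let reader = try AVAssetReader(asset: asset)
    let output = AVAssetReaderTrackOutput(
        track: track,
        outputSettings: [
            // Ask the decoder for biplanar Y'CbCr so plane 0 is luminance.
            kCVPixelBufferPixelFormatTypeKey as String:
                kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange
        ])
    reader.add(output)
    reader.startReading()

    var frameIndex = 0
    while let sampleBuffer = output.copyNextSampleBuffer() {
        defer { frameIndex += 1 }
        guard frameIndex % 10 == 0,   // only every 10th frame
              let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)
        else { continue }
        // Hand `pixelBuffer` to the texture cache, or read its Y plane here.
        _ = pixelBuffer
    }
}
```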

You could probably set up the session to deliver frames in Y'CbCr (YUV) form rather than RGB, in which case the Y plane is exactly the luminance you're after. Failing that, it would be a pretty simple matter to write a shader that converts each pixel's RGB value to a luminance value.
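If the frames do arrive in the biplanar Y'CbCr format requested in the sketch above, no shader is needed at all: plane 0 can be read directly. A sketch of pulling those luminance bytes out on the CPU:

```swift
import CoreVideo

// Sketch: with a biplanar Y'CbCr pixel buffer, luminance is plane 0,
// so no RGB-to-luma conversion is required.
func luminanceBytes(of pixelBuffer: CVPixelBuffer) -> [UInt8] {
    CVPixelBufferLockBaseAddress(pixelBuffer, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(pixelBuffer, .readOnly) }

    let width = CVPixelBufferGetWidthOfPlane(pixelBuffer, 0)
    let height = CVPixelBufferGetHeightOfPlane(pixelBuffer, 0)
    let bytesPerRow = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 0)
    let base = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0)!
        .assumingMemoryBound(to: UInt8.self)

    var luma = [UInt8]()
    luma.reserveCapacity(width * height)
    for row in 0..<height {
        // Rows may be padded, so step by bytesPerRow, not width.
        let rowStart = base + row * bytesPerRow
        luma.append(contentsOf: UnsafeBufferPointer(start: rowStart, count: width))
    }
    return luma
}
```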

You should be able to do all of this for every frame, not just every 10th frame.