At a high level, I created an app that lets a user point his or her iPhone camera around and see video frames that have been processed with visual effects. Additionally, the user can tap a button to take a freeze-frame of the current preview as a high-resolution photo that is saved in their iPhone library.
To do this, the app follows this procedure:
1) Create an AVCaptureSession
captureSession = [[AVCaptureSession alloc] init];
[captureSession setSessionPreset:AVCaptureSessionPreset640x480];
2) Hook up an AVCaptureDeviceInput using the back-facing camera.
videoInput = [[[AVCaptureDeviceInput alloc] initWithDevice:backFacingCamera error:&error] autorelease];
[captureSession addInput:videoInput];
3) Hook up an AVCaptureStillImageOutput to the session to be able to capture still frames at Photo resolution.
stillOutput = [[AVCaptureStillImageOutput alloc] init];
[stillOutput setOutputSettings:[NSDictionary
dictionaryWithObject:[NSNumber numberWithInt:kCVPixelFormatType_32BGRA]
forKey:(id)kCVPixelBufferPixelFormatTypeKey]];
[captureSession addOutput:stillOutput];
4) Hook up an AVCaptureVideoDataOutput to the session to be able to capture individual video frames (CVImageBuffers) at a lower resolution
videoOutput = [[AVCaptureVideoDataOutput alloc] init];
[videoOutput setVideoSettings:[NSDictionary dictionaryWithObject:[NSNumber numberWithInt:kCVPixelFormatType_32BGRA] forKey:(id)kCVPixelBufferPixelFormatTypeKey]];
[videoOutput setSampleBufferDelegate:self queue:dispatch_get_main_queue()];
[captureSession addOutput:videoOutput];
5) As video frames are captured, the delegate's method is called with each new frame as a CVImageBuffer:
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
{
CVImageBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
[self.delegate processNewCameraFrame:pixelBuffer];
}
6) Then the delegate processes/draws them:
- (void)processNewCameraFrame:(CVImageBufferRef)cameraFrame {
CVPixelBufferLockBaseAddress(cameraFrame, 0);
int bufferHeight = CVPixelBufferGetHeight(cameraFrame);
int bufferWidth = CVPixelBufferGetWidth(cameraFrame);
glClear(GL_COLOR_BUFFER_BIT);
glGenTextures(1, &videoFrameTexture_);
glBindTexture(GL_TEXTURE_2D, videoFrameTexture_);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, bufferWidth, bufferHeight, 0, GL_BGRA, GL_UNSIGNED_BYTE, CVPixelBufferGetBaseAddress(cameraFrame));
glBindBuffer(GL_ARRAY_BUFFER, [self vertexBuffer]);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, [self indexBuffer]);
glDrawElements(GL_TRIANGLE_STRIP, 4, GL_UNSIGNED_SHORT, BUFFER_OFFSET(0));
glBindBuffer(GL_ARRAY_BUFFER, 0);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 0);
[[self context] presentRenderbuffer:GL_RENDERBUFFER];
glDeleteTextures(1, &videoFrameTexture_);
CVPixelBufferUnlockBaseAddress(cameraFrame, 0);
}
This all works and leads to the correct results. I can see a video preview of 640x480 processed through OpenGL. It looks like this:
However, if I capture a still image from this session, its resolution will also be 640x480. I want it to be high resolution, so in step one I change the preset line to:
[captureSession setSessionPreset:AVCaptureSessionPresetPhoto];
This correctly captures still images at the highest resolution for the iPhone4 (2592x1936).
However, the video preview (as received by the delegate in steps 5 and 6) now looks like this:
I've confirmed that every other preset (High, medium, low, 640x480, and 1280x720) previews as intended. However, the Photo preset seems to send buffer data in a different format.
I've also confirmed that the data being sent to the buffer at the Photo preset is actually valid image data by taking the buffer and creating a UIImage out of it instead of sending it to openGL:
CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
CGContextRef context = CGBitmapContextCreate(CVPixelBufferGetBaseAddress(cameraFrame), bufferWidth, bufferHeight, 8, bytesPerRow, colorSpace, kCGBitmapByteOrder32Little | kCGImageAlphaPremultipliedFirst);
CGImageRef cgImage = CGBitmapContextCreateImage(context);
UIImage *anImage = [UIImage imageWithCGImage:cgImage];
This shows an undistorted video frame.
I've done a bunch of searching and can't seem to fix it. My hunch is that it's a data format issue. That is, I believe that the buffer is being set correctly, but with a format that this line doesn't understand:
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, bufferWidth, bufferHeight, 0, GL_BGRA, GL_UNSIGNED_BYTE, CVPixelBufferGetBaseAddress(cameraFrame));
My hunch was that changing the external format from GL_BGRA to something else would help, but it doesn't... and through various means it looks like the buffer is actually in GL_BGRA.
Does anyone know what's going on here? Or do you have any tips on how I might go about debugging why this is happening? (What's super weird is that this happens on an iphone4 but not on an iPhone 3GS ... both running ios4.3)
This was a doozy.
As Lio Ben-Kereth pointed out, the padding is 48 as you can see from the debugger
(gdb) po pixelBuffer
<CVPixelBuffer 0x2934d0 width=852 height=640 bytesPerRow=3456 pixelFormat=BGRA
# => 3456 - 852 * 4 = 48
OpenGL can compensate for this, but OpenGL ES cannot (more info here openGL SubTexturing)
So here is how I'm doing it in OpenGL ES:
(CVImageBufferRef)pixelBuffer // pixelBuffer containing the raw image data is passed in
/* ... */
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, videoFrameTexture_);
int frameWidth = CVPixelBufferGetWidth(pixelBuffer);
int frameHeight = CVPixelBufferGetHeight(pixelBuffer);
size_t bytesPerRow, extraBytes;
bytesPerRow = CVPixelBufferGetBytesPerRow(pixelBuffer);
extraBytes = bytesPerRow - frameWidth*4;
GLubyte *pixelBufferAddr = CVPixelBufferGetBaseAddress(pixelBuffer);
if ( [[captureSession sessionPreset] isEqualToString:@"AVCaptureSessionPresetPhoto"] )
{
glTexImage2D( GL_TEXTURE_2D, 0, GL_RGBA, frameWidth, frameHeight, 0, GL_BGRA, GL_UNSIGNED_BYTE, NULL );
for( int h = 0; h < frameHeight; h++ )
{
GLubyte *row = pixelBufferAddr + h * (frameWidth * 4 + extraBytes);
glTexSubImage2D( GL_TEXTURE_2D, 0, 0, h, frameWidth, 1, GL_BGRA, GL_UNSIGNED_BYTE, row );
}
}
else
{
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, frameWidth, frameHeight, 0, GL_BGRA, GL_UNSIGNED_BYTE, pixelBufferAddr);
}
Before, I was using AVCaptureSessionPresetMedium
and getting 30fps. In AVCaptureSessionPresetPhoto
I'm getting 16fps on an iPhone 4. The looping for the sub-texture does not seem to affect the frame rate.
I'm using an iPhone 4 on iOS 5.