iOS CVImageBuffer distorted from AVCaptureSessionDataOutput with AVCaptureSessionPresetPhoto

sotangochips · Jun 30, 2011

At a high level, I created an app that lets a user point his or her iPhone camera around and see video frames that have been processed with visual effects. Additionally, the user can tap a button to take a freeze-frame of the current preview as a high-resolution photo that is saved in their iPhone library.

To do this, the app follows this procedure:

1) Create an AVCaptureSession

captureSession = [[AVCaptureSession alloc] init];
[captureSession setSessionPreset:AVCaptureSessionPreset640x480];

2) Hook up an AVCaptureDeviceInput using the back-facing camera.

videoInput = [[[AVCaptureDeviceInput alloc] initWithDevice:backFacingCamera error:&error] autorelease];
[captureSession addInput:videoInput];

3) Hook up an AVCaptureStillImageOutput to the session to be able to capture still frames at Photo resolution.

stillOutput = [[AVCaptureStillImageOutput alloc] init];
[stillOutput setOutputSettings:[NSDictionary
    dictionaryWithObject:[NSNumber numberWithInt:kCVPixelFormatType_32BGRA]
    forKey:(id)kCVPixelBufferPixelFormatTypeKey]];
[captureSession addOutput:stillOutput];

4) Hook up an AVCaptureVideoDataOutput to the session to be able to capture individual video frames (CVImageBuffers) at a lower resolution

videoOutput = [[AVCaptureVideoDataOutput alloc] init];
[videoOutput setVideoSettings:[NSDictionary dictionaryWithObject:[NSNumber numberWithInt:kCVPixelFormatType_32BGRA] forKey:(id)kCVPixelBufferPixelFormatTypeKey]];
[videoOutput setSampleBufferDelegate:self queue:dispatch_get_main_queue()];
[captureSession addOutput:videoOutput];

5) As video frames are captured, the delegate's method is called with each new frame as a CVImageBuffer:

- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
{
    CVImageBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    [self.delegate processNewCameraFrame:pixelBuffer];
}

6) Then the delegate processes/draws them:

- (void)processNewCameraFrame:(CVImageBufferRef)cameraFrame {
    CVPixelBufferLockBaseAddress(cameraFrame, 0);
    int bufferHeight = CVPixelBufferGetHeight(cameraFrame);
    int bufferWidth = CVPixelBufferGetWidth(cameraFrame);

    glClear(GL_COLOR_BUFFER_BIT);

    glGenTextures(1, &videoFrameTexture_);
    glBindTexture(GL_TEXTURE_2D, videoFrameTexture_);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);

    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, bufferWidth, bufferHeight, 0, GL_BGRA, GL_UNSIGNED_BYTE, CVPixelBufferGetBaseAddress(cameraFrame));

    glBindBuffer(GL_ARRAY_BUFFER, [self vertexBuffer]);
    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, [self indexBuffer]);

    glDrawElements(GL_TRIANGLE_STRIP, 4, GL_UNSIGNED_SHORT, BUFFER_OFFSET(0));

    glBindBuffer(GL_ARRAY_BUFFER, 0);
    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 0);
    [[self context] presentRenderbuffer:GL_RENDERBUFFER];

    glDeleteTextures(1, &videoFrameTexture_);

    CVPixelBufferUnlockBaseAddress(cameraFrame, 0);
}

This all works and leads to the correct results. I can see a video preview of 640x480 processed through OpenGL. It looks like this:

[Image: correct 640x480 preview]

However, if I capture a still image from this session, its resolution will also be 640x480. I want it to be high resolution, so in step one I change the preset line to:

[captureSession setSessionPreset:AVCaptureSessionPresetPhoto];

This correctly captures still images at the highest resolution for the iPhone 4 (2592x1936).

However, the video preview (as received by the delegate in steps 5 and 6) now looks like this:

[Image: distorted preview with the Photo preset]

I've confirmed that every other preset (High, Medium, Low, 640x480, and 1280x720) previews as intended. Only the Photo preset seems to send buffer data in a different layout.

I've also confirmed that the data being sent to the buffer at the Photo preset is actually valid image data by taking the buffer and creating a UIImage out of it instead of sending it to OpenGL:

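size_t bytesPerRow = CVPixelBufferGetBytesPerRow(cameraFrame); // presumably obtained like this; includes any row padding
CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
CGContextRef context = CGBitmapContextCreate(CVPixelBufferGetBaseAddress(cameraFrame), bufferWidth, bufferHeight, 8, bytesPerRow, colorSpace, kCGBitmapByteOrder32Little | kCGImageAlphaPremultipliedFirst);
CGImageRef cgImage = CGBitmapContextCreateImage(context);
UIImage *anImage = [UIImage imageWithCGImage:cgImage];
CGImageRelease(cgImage);       // the UIImage retains what it needs
CGContextRelease(context);
CGColorSpaceRelease(colorSpace);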

This shows an undistorted video frame.

I've done a bunch of searching and can't seem to fix it. My hunch is that it's a data format issue. That is, I believe that the buffer is being set correctly, but with a format that this line doesn't understand:

glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, bufferWidth, bufferHeight, 0, GL_BGRA, GL_UNSIGNED_BYTE, CVPixelBufferGetBaseAddress(cameraFrame));

My hunch was that changing the external format from GL_BGRA to something else would help, but it doesn't... and through various means it looks like the buffer is actually in GL_BGRA.

Does anyone know what's going on here? Or do you have any tips on how I might go about debugging why this is happening? (What's super weird is that this happens on an iPhone 4 but not on an iPhone 3GS, both running iOS 4.3.)

Answer

Dex · Oct 31, 2011

This was a doozy.

As Lio Ben-Kereth pointed out, each row of the buffer carries 48 bytes of padding, as you can see from the debugger:

(gdb) po pixelBuffer
<CVPixelBuffer 0x2934d0 width=852 height=640 bytesPerRow=3456 pixelFormat=BGRA
# => 3456 - 852 * 4 = 48
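
To make the arithmetic above concrete, here is a small C sketch. The function names are hypothetical and are not part of any Apple or OpenGL API; it assumes 4 bytes per pixel (BGRA). It also shows why the picture looks sheared: if the uploader assumes tightly packed rows, each successive row starts a little further off from where it really begins.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes of padding at the end of each row of a 4-byte-per-pixel buffer. */
static size_t row_padding(size_t bytes_per_row, size_t width_px) {
    assert(bytes_per_row >= width_px * 4);
    return bytes_per_row - width_px * 4;
}

/* How many pixels row `row` is shifted by when the reader wrongly
 * assumes rows are exactly width * 4 bytes apart (no padding). */
static size_t skew_pixels(size_t padding_bytes, size_t row) {
    return row * padding_bytes / 4;
}
```

With the values from the debugger output, row_padding(3456, 852) is 48, so under these assumptions every row would drift by a further 12 pixels, which is consistent with a diagonal smear.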

Desktop OpenGL can compensate for this (via the GL_UNPACK_ROW_LENGTH pixel-store setting), but OpenGL ES 2.0 has no such setting (more info in the related question "openGL SubTexturing").

So here is how I'm doing it in OpenGL ES:

// pixelBuffer (a CVImageBufferRef) containing the raw image data is passed in

/* ... */
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, videoFrameTexture_);

int frameWidth = CVPixelBufferGetWidth(pixelBuffer);
int frameHeight = CVPixelBufferGetHeight(pixelBuffer);

size_t bytesPerRow, extraBytes;

bytesPerRow = CVPixelBufferGetBytesPerRow(pixelBuffer);
extraBytes = bytesPerRow - frameWidth*4;

GLubyte *pixelBufferAddr = CVPixelBufferGetBaseAddress(pixelBuffer);

if ( [[captureSession sessionPreset] isEqualToString:AVCaptureSessionPresetPhoto] )
{

    glTexImage2D( GL_TEXTURE_2D, 0, GL_RGBA, frameWidth, frameHeight, 0, GL_BGRA, GL_UNSIGNED_BYTE, NULL );

    for( int h = 0; h < frameHeight; h++ )
    {
        GLubyte *row = pixelBufferAddr + h * (frameWidth * 4 + extraBytes);
        glTexSubImage2D( GL_TEXTURE_2D, 0, 0, h, frameWidth, 1, GL_BGRA, GL_UNSIGNED_BYTE, row );
    }
}
else
{
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, frameWidth, frameHeight, 0, GL_BGRA, GL_UNSIGNED_BYTE, pixelBufferAddr);
}
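
An alternative to uploading one row at a time, sketched here in plain C, is to copy the padded rows into a tightly packed scratch buffer and then upload that buffer with a single glTexImage2D call. This is not what the code above does; pack_rows and its parameters are hypothetical names, it assumes 4 bytes per pixel, and it costs one extra copy per frame, so it would need to be measured before preferring it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy a padded 4-byte-per-pixel image into a tightly packed destination.
 * dst must hold at least width_px * 4 * height_px bytes. */
static void pack_rows(uint8_t *dst, const uint8_t *src,
                      size_t width_px, size_t height_px,
                      size_t src_bytes_per_row) {
    const size_t tight = width_px * 4;   /* bytes per row without padding */
    assert(src_bytes_per_row >= tight);
    for (size_t y = 0; y < height_px; y++) {
        /* Source rows are src_bytes_per_row apart; destination rows are tight. */
        memcpy(dst + y * tight, src + y * src_bytes_per_row, tight);
    }
}
```

In the code above, src would be pixelBufferAddr and src_bytes_per_row would be bytesPerRow; the packed dst could then be passed as the last argument of glTexImage2D in place of NULL.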

Before, I was using AVCaptureSessionPresetMedium and getting 30fps. With AVCaptureSessionPresetPhoto I'm getting 16fps on an iPhone 4. The per-row loop for the sub-texture upload does not seem to affect the frame rate.

I'm using an iPhone 4 on iOS 5.