iOS: Real Time OCR on top of live camera feed (similar to iTunes Redeem Gift Card)

boliva · Sep 30, 2013 · Viewed 22.8k times

Is there a way to accomplish something similar to what the iTunes and App Store Apps do when you redeem a Gift Card using the device camera, recognizing a short string of characters in real time on top of the live camera feed?

[Image: iTunes App Redeem Gift Card UI]

I know that in iOS 7 there is now the AVMetadataMachineReadableCodeObject class which, AFAIK, only represents barcodes. I'm more interested in detecting and reading the contents of a short string. Is this possible using publicly available API methods, or with some third-party SDK that you might know of?

There is also a video of the process in action:

https://www.youtube.com/watch?v=c7swRRLlYEo


Answer

Donovan · Nov 26, 2014

I'm working on a project that does something similar to the App Store's redeem-with-camera feature you mentioned.

A great starting place for processing live video is a project I found on GitHub. It uses the AVFoundation framework, and you implement the AVCaptureVideoDataOutputSampleBufferDelegate methods to receive each video frame.

Once you have the image stream (video), you can use OpenCV to process the frames. You need to determine the area of the image you want to OCR before you run it through Tesseract. You will have to experiment with the filtering, but the broad steps with OpenCV are:

  • Convert the images to grayscale using cv::cvtColor(inputMat, outputMat, CV_RGBA2GRAY); (newer OpenCV releases name this constant cv::COLOR_RGBA2GRAY).
  • Threshold the images to eliminate unnecessary elements. You pick a cutoff value; pixels on one side of it become white and pixels on the other side become black, which leaves a two-tone image.
  • Determine the lines that form the boundary of the box (or whatever you are processing). You can either create a "bounding box" if you have eliminated everything but the desired area, or use the HoughLines algorithm (or the probabilistic version, HoughLinesP). Using this, you can determine line intersection to find corners, and use the corners to warp the desired area to straighten it into a proper rectangle (if this step is necessary in your application) prior to OCR.
  • Process that portion of the image with the Tesseract OCR library to get the resulting text. It is also possible to create training files for letters in OpenCV so you can read the text without Tesseract; this could be faster but could also be a lot more work. In the App Store case, they are doing something similar to display the text that was read, overlaid on top of the original image. This adds to the cool factor, so it just depends on what you need.
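
To make the first three steps concrete, here is a minimal, dependency-free C++ sketch of what they compute. It is not the OpenCV API: GrayImage, toGray, binarize, boundingBox, and intersect are hypothetical names made up for illustration, and in a real app you would more likely reach for cv::cvtColor, cv::threshold, cv::boundingRect, and cv::HoughLinesP. It also assumes tightly packed RGBA input and that the region you care about ends up white on black after thresholding.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal image type: row-major, one byte per pixel.
struct GrayImage {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> pixels;

    std::uint8_t at(int x, int y) const {
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
};

// Step 1: RGBA -> grayscale using the common BT.601 luma weights.
// Assumption: `rgba` holds width * height * 4 tightly packed bytes.
GrayImage toGray(const std::vector<std::uint8_t>& rgba, int width, int height) {
    GrayImage out;
    out.width = width;
    out.height = height;
    out.pixels.resize(static_cast<std::size_t>(width) * height);
    for (std::size_t i = 0; i < out.pixels.size(); ++i) {
        const double r = rgba[4 * i], g = rgba[4 * i + 1], b = rgba[4 * i + 2];
        out.pixels[i] = static_cast<std::uint8_t>(0.299 * r + 0.587 * g + 0.114 * b + 0.5);
    }
    return out;
}

// Step 2: fixed threshold. Pixels brighter than `thresh` become white, the rest black.
GrayImage binarize(const GrayImage& in, std::uint8_t thresh) {
    GrayImage out = in;
    for (auto& p : out.pixels) p = (p > thresh) ? 255 : 0;
    return out;
}

// Step 3a: bounding box (inclusive coordinates) of all non-black pixels.
struct Box { bool valid; int left, top, right, bottom; };

Box boundingBox(const GrayImage& bin) {
    Box b{false, bin.width, bin.height, -1, -1};
    for (int y = 0; y < bin.height; ++y) {
        for (int x = 0; x < bin.width; ++x) {
            if (bin.at(x, y) == 0) continue;  // background pixel
            b.valid = true;
            b.left = std::min(b.left, x);
            b.top = std::min(b.top, y);
            b.right = std::max(b.right, x);
            b.bottom = std::max(b.bottom, y);
        }
    }
    return b;
}

// Step 3b: intersection of the infinite lines through (a1, a2) and (b1, b2),
// e.g. two segments found by a Hough transform. Returns false if (nearly) parallel.
struct Point { double x, y; };

bool intersect(Point a1, Point a2, Point b1, Point b2, Point& out) {
    const double d = (a1.x - a2.x) * (b1.y - b2.y) - (a1.y - a2.y) * (b1.x - b2.x);
    if (std::abs(d) < 1e-9) return false;
    const double ca = a1.x * a2.y - a1.y * a2.x;
    const double cb = b1.x * b2.y - b1.y * b2.x;
    out.x = (ca * (b1.x - b2.x) - (a1.x - a2.x) * cb) / d;
    out.y = (ca * (b1.y - b2.y) - (a1.y - a2.y) * cb) / d;
    return true;
}
```

Once you have four corners from intersections like these, cv::getPerspectiveTransform and cv::warpPerspective can straighten the region before it goes to Tesseract.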

Some other hints:

  • I used the book "Instant OpenCV" to get started quickly with this. It was pretty helpful.
  • Download OpenCV for iOS from OpenCV.org/downloads.html
  • I have found adaptive thresholding to be very useful; you can read all about it by searching for "OpenCV adaptiveThreshold". Also, if your image has few intermediate tones between its light and dark elements (a histogram with two clear peaks), you can use Otsu's binarization, which automatically determines the threshold value from the histogram of the grayscale image.
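
For reference, here is roughly what Otsu's method computes, as a small dependency-free C++ sketch. otsuThreshold is a hypothetical helper written for illustration, not an OpenCV function; in OpenCV you would normally get the same effect by passing the cv::THRESH_OTSU flag to cv::threshold. The sketch assumes an 8-bit grayscale buffer and treats pixels less than or equal to the returned value as one class.

```cpp
#include <array>
#include <cstdint>
#include <vector>

// Returns the cutoff t in [0, 255] that maximizes the between-class variance
// when pixels <= t form one class and pixels > t form the other.
int otsuThreshold(const std::vector<std::uint8_t>& gray) {
    // Histogram of the 256 possible gray levels.
    std::array<long long, 256> hist{};
    for (std::uint8_t p : gray) ++hist[p];

    const double total = static_cast<double>(gray.size());
    double sumAll = 0.0;
    for (int i = 0; i < 256; ++i) sumAll += i * static_cast<double>(hist[i]);

    double weightLow = 0.0;  // number of pixels at or below the candidate cutoff
    double sumLow = 0.0;     // sum of their gray levels
    double bestVariance = -1.0;
    int best = 0;

    for (int t = 0; t < 256; ++t) {
        weightLow += static_cast<double>(hist[t]);
        sumLow += t * static_cast<double>(hist[t]);
        if (weightLow == 0.0) continue;       // no pixels in the low class yet
        const double weightHigh = total - weightLow;
        if (weightHigh == 0.0) break;         // every pixel is already in the low class

        const double meanLow = sumLow / weightLow;
        const double meanHigh = (sumAll - sumLow) / weightHigh;
        const double diff = meanLow - meanHigh;
        const double variance = weightLow * weightHigh * diff * diff;
        if (variance > bestVariance) {        // strict '>' keeps the first maximum
            bestVariance = variance;
            best = t;
        }
    }
    return best;
}
```

On a frame with dark text on a light card (or the reverse), the returned cutoff should land somewhere between the two histogram peaks, which is why it works well when there is little in between.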