Image preprocessing for text recognition

Osiris · Jul 13, 2012 · Viewed 43.3k times

What's the best set of image preprocessing operations to apply to images for text recognition in EmguCV?

I've included two sample images here.

Applying a low or high pass filter won't be suitable, as the text may be of any size. I've tried median and bilateral filters, but they don't seem to affect the image much.

The ideal result would be a binary image with all the text white, and most of the rest black. This image would then be sent to the OCR engine.

Thanks

Answer

karlphillip · Jul 13, 2012

There's no such thing as the best set. Keep in mind that digital images can be acquired by different capture devices, and each device can embed its own preprocessing system (filters) and other characteristics that can drastically change the image and even add noise to it. So every case has to be treated (preprocessed) differently.

However, there are common operations that can be used to improve the detection. For instance, a very basic one would be to convert the image to grayscale and apply a threshold to binarize it. Another technique I've used before is the bounding box, which allows you to detect the text region. To remove noise from the image you might be interested in the erode/dilate operations. I demonstrate some of these operations in this post, and a rough sketch combining them follows below.
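To give you an idea of how those pieces fit together, here is a minimal sketch of such a pipeline: grayscale, a fixed inverted threshold (assuming dark text on a lighter background, so the text becomes the white foreground), an erode/dilate pass to clean up noise, and bounding boxes from contours as candidate text regions. The threshold value and the 3x3 kernel are just placeholders you would tune for your own images:

#include <opencv2/opencv.hpp>

int main(int argc, char** argv)
{
    cv::Mat img = cv::imread(argv[1]);

    // Grayscale + inverted fixed threshold: dark text becomes white foreground
    cv::Mat gray, bin;
    cv::cvtColor(img, gray, cv::COLOR_BGR2GRAY);
    cv::threshold(gray, bin, 128, 255, cv::THRESH_BINARY_INV);

    // Erode then dilate (opening) to knock out small specks of noise
    cv::Mat kernel = cv::getStructuringElement(cv::MORPH_RECT, cv::Size(3, 3));
    cv::erode(bin, bin, kernel);
    cv::dilate(bin, bin, kernel);

    // Bounding boxes around the remaining blobs: candidate text regions
    std::vector<std::vector<cv::Point> > contours;
    cv::findContours(bin.clone(), contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);
    for (size_t i = 0; i < contours.size(); i++)
        cv::rectangle(img, cv::boundingRect(contours[i]), cv::Scalar(0, 255, 0), 2);

    cv::imwrite("regions.png", img);
    return 0;
}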

Also, there are other interesting posts about OCR and OpenCV that you should take a look at:

Now, just to show you a simple approach that can be used with your sample image, this is the result of inverting the color and applying a threshold:

#include <opencv2/opencv.hpp>

int main(int argc, char** argv)
{
    // Load the input image and invert it so the dark text becomes bright
    cv::Mat new_img = cv::imread(argv[1]);
    cv::bitwise_not(new_img, new_img);

    // Binarize: pixels above the threshold become white, everything else black
    double thres = 100;
    double color = 255;
    cv::threshold(new_img, new_img, thres, color, cv::THRESH_BINARY);

    cv::imwrite("inv_thres.png", new_img);
    return 0;
}
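The fixed threshold of 100 happens to work for this sample; if you convert to grayscale first, you can also let OpenCV choose the threshold automatically with Otsu's method by passing cv::THRESH_BINARY | cv::THRESH_OTSU to cv::threshold().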