How can I implement OCR on a website using PHP?

Moshe · Jan 31, 2010 · Viewed 38.4k times

Are there any free OCR libraries that work with PHP or Python on a Linux server? The idea is to be able to upload an image and pull out characters from it, or allow users to "draw characters", and parse them out of said image.

Answer

nategood · Jan 31, 2010

Since you're on a Linux box, I would highly recommend OCRopus, an open source project sponsored by Google.

It's not PHP, but I think it will be your best option, and you can call it from within PHP via exec. It's mature and has a lot of options. From the project site:

The OCRopus engine is based on two research projects: a high-performance handwriting recognizer developed in the mid-90's and deployed by the US Census bureau, and novel high-performance layout analysis methods.

There is also another open source project, Tesseract. I've used it in the past as well and have been pleased with the results. It supports training, restricting recognition to a limited alphabet, and more.
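
For illustration, here is a minimal sketch of calling the Tesseract command-line tool from a script, in Python since the question allows either language; the same idea applies to PHP's exec. It assumes the `tesseract` binary is installed and on the PATH, and that your version accepts `stdout` as the output base and the `tessedit_char_whitelist` config variable (support for the whitelist varies between Tesseract versions). The function names `build_tesseract_command` and `ocr_image` are made up for this example.

```python
import shutil
import subprocess
from typing import List, Optional


def build_tesseract_command(image_path: str, lang: str = "eng",
                            whitelist: Optional[str] = None) -> List[str]:
    """Build the argument list for the tesseract command-line tool."""
    # Using "stdout" as the output base asks tesseract to print the
    # recognized text instead of writing it to <outputbase>.txt.
    cmd = ["tesseract", image_path, "stdout", "-l", lang]
    if whitelist:
        # Restrict recognition to these characters ("limiting your alphabet").
        cmd += ["-c", f"tessedit_char_whitelist={whitelist}"]
    return cmd


def ocr_image(image_path: str, lang: str = "eng",
              whitelist: Optional[str] = None, timeout: int = 60) -> str:
    """Run tesseract on an image file and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract binary not found on PATH")
    # Passing a list (no shell) keeps an uploaded file name from being
    # interpreted as shell syntax.
    result = subprocess.run(
        build_tesseract_command(image_path, lang, whitelist),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError if tesseract exits non-zero
    )
    return result.stdout.strip()
```

With Tesseract installed, something like `ocr_image("upload.png", whitelist="0123456789")` should return whatever digits were recognized in the uploaded image. For user uploads, it is safer to save the file under a name you generate yourself rather than trusting the original file name.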