7-Segment Display OCR

Karim · Feb 20, 2012

I'm building an iOS application that takes a picture and runs OCR on it using Tesseract (an OCR library). It works very well with cleanly printed numbers and characters in common fonts.

The problem I am having is that if I try it on a 7-segment display, it gives very poor results.

So my question is: Does anyone know how I can approach this problem? Is there a way for Tesseract to recognize these characters?

Answer

Matt K · May 11, 2012

I too had great difficulty getting Tesseract to recognize digits in images of LCD displays.

I had some marginal success by preprocessing the images with ImageMagick to overlay a copy of the image on itself with a slight vertical shift to fill in the gaps between segments:

$ composite -compose Multiply -geometry +0+3 foo.tif foo.tif foo2.png
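To show why the shifted overlay helps, here is a minimal pure-Python sketch of the same idea on a grayscale image stored as a list of rows. It is an illustration only, not what ImageMagick does internally; the function name multiply_with_vertical_shift and the list-of-lists image format are assumptions made for this example. With dark segments on a light background, multiplying each pixel by the pixel a few rows above it smears the dark areas downward, which closes small vertical gaps.

```python
def multiply_with_vertical_shift(pixels, shift):
    """Multiply a grayscale image by a copy of itself moved down by `shift` rows.

    pixels: list of rows, each a list of ints in 0..255 (0 = black).
    Rows that the shifted copy does not cover are left unchanged.
    """
    height = len(pixels)
    out = []
    for y, row in enumerate(pixels):
        src_y = y - shift  # row of the shifted copy that lands on row y
        if 0 <= src_y < height:
            overlay = pixels[src_y]
            # Multiply blend: dark in either layer gives dark in the result.
            out.append([a * b // 255 for a, b in zip(row, overlay)])
        else:
            out.append(list(row))  # no overlap here, keep the original row
    return out


# A one-pixel-wide column: dark segment, 2-pixel light gap, dark segment.
column = [[0], [0], [255], [255], [0], [0]]
filled = multiply_with_vertical_shift(column, 2)
print([row[0] for row in filled])
```

The shift should be roughly the size of the gap between segments. A larger shift also thickens the horizontal segments, so it may need tuning per display.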

In the end, though, my saving grace was the "Seven Segment Optical Character Recognition" binary: http://www.unix-ag.uni-kl.de/~auerswal/ssocr/

Many thanks to the author, Erik Auerswald, for this code!
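For readers wondering why a dedicated seven-segment tool can do better than a general OCR engine: once you know which segments of a digit are lit, recognition reduces to a table lookup. The sketch below shows only that final lookup step, assuming the conventional a to g segment labels (a = top, b = top right, c = bottom right, d = bottom, e = bottom left, f = top left, g = middle). It is not taken from ssocr, and the names SEGMENTS_TO_DIGIT, decode_digit and decode_display are hypothetical.

```python
# Lit segments (sorted letters a..g) mapped to the digit they display.
SEGMENTS_TO_DIGIT = {
    "abcdef": "0",
    "bc": "1",
    "abdeg": "2",
    "abcdg": "3",
    "bcfg": "4",
    "acdfg": "5",
    "acdefg": "6",
    "abc": "7",
    "abcdefg": "8",
    "abcdfg": "9",
}


def decode_digit(lit_segments):
    """Return the digit for a collection of lit segment letters, or '?'."""
    key = "".join(sorted(set(lit_segments)))
    return SEGMENTS_TO_DIGIT.get(key, "?")


def decode_display(digits):
    """Decode a sequence of per-digit segment sets into a string."""
    return "".join(decode_digit(d) for d in digits)


print(decode_display(["bc", "abdeg", "abcdefg"]))
```

Detecting which segments are lit in a photo is the hard part and is not covered here.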