I'm building an iOS application (take a picture and run OCR on it) using Tesseract (an OCR library) and it is working very well with well written numbers and characters (using usual fonts).
The problem I am having is that if I try it on a 7-Segment Display, it gives very very bad results.
So my question is: Does anyone know how I can approach this problem? Is there a way for Tesseract to recognize these characters?
I too had great difficulty in getting tesseract to recognize digits from images of LCD displays.
I had some marginal success by preprocessing the images with ImageMagick to overlay a copy of the image on itself with a slight vertical shift to fill in the gaps between segments:
$ composite -compose Multiply -geometry +0+3 foo.tif foo.tif foo2.png
In the end, though, my saving grace was the "Seven Segment Optical Character Recognition" binary: http://www.unix-ag.uni-kl.de/~auerswal/ssocr/
Many thanks to the author, Erik Auerswald, for this code!