Best way to recognize characters in screenshot?

Tomek picture Tomek · Nov 17, 2010 · Viewed 8.3k times · Source

What would you recommend for recognizing all characters from a screenshot? The screenshot is perfectly clear (only black text on a white background), also I can choose any standard font for the text (installed on Windows). I have tried some OCR ways (Tesseract and such), but it made mistakes in recognizing some characters (that baffled me, as the text is without slightest noise, and the fonts were some most common ones - Courier New, Fixedsys etc.), and I need it to be 100% accurate. Is there some library available for this specific purpose, some pattern recognition or something? or should I get the screenshot with some monospaced font, and iterate through the image moving to the right +font_size pixels and then comparing captured thing to in-memory representation of letters and number of same font in the same size? What would be the best approach to this problem? Thank you very much in advance.

UPDATE: I've finally managed to get 100% accuracy by training Tesseract with monospaced font (Courier New) in exact size that I'm screenshotting. Hope that helps someone in the future :)

Answer

blade picture blade · Aug 6, 2016

Since this is the first result on Google for tesseract recognize screenshot, let me do bit of necromancy and add a much simpler solution.

Tesseract expects images at around 300 dpi or more and standard dpi for Windows is 96. Which means you need to rescale the image to 300%. After that, the results improve dramatically.

100%
1x scale
Result: Whal would you recommend for recognizing all characters from a screensnor 7

200%
2x scale
Result: What would you recommend for recognizing all chamcters from a screenth ?

300%
3x scale
Result: What would you recommend for recognizing all characters from a screenshot ?

Anything above 300% works just as well.