TensorFlow - Text recognition in image

A. Attia · Feb 15, 2017 · Viewed 12.8k times

I am new to TensorFlow and to deep learning. I am trying to recognize text in natural scene images. I used to work with an OCR engine, but I would like to use deep learning. The text always has the same format: ABC-DEF 88:88.

What I have done so far is recognize each character/digit separately. I cropped the image around every character (so each picture gives me 10 characters) to build my training and test sets, then built a neural network with two convolutional layers. So my training set was a set of character images, and the labels were just the characters/digits.

But I want to go further. What I would like to do is give the full picture as input and get the entire text as output (not one character at a time, as in my previous model).

Thank you in advance for any help.

Answer

soloice · Feb 15, 2017

The difficulty is that you don't know where the text is. One solution: given an image, use a sliding window to crop different parts of it, then use a classifier to decide whether each cropped area contains text. If it does, run your character/digit recognizer on that area to tell which characters/digits they are.
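To make the sliding-window idea concrete, here is a minimal pure-Python sketch. Everything in it is an assumption for illustration: the image is a plain list of pixel rows, boxes are `(top, left, height, width)` tuples, and `has_text` is a hypothetical stand-in for whatever text/no-text classifier you train (in practice a TensorFlow model working on arrays).

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Image = Sequence[Sequence[float]]  # assumed: grayscale image as rows of pixels
Box = Tuple[int, int, int, int]    # assumed: (top, left, height, width)


def sliding_windows(img_h: int, img_w: int,
                    win_h: int, win_w: int, stride: int) -> Iterator[Box]:
    """Yield every win_h x win_w window that fits fully inside the image."""
    for top in range(0, img_h - win_h + 1, stride):
        for left in range(0, img_w - win_w + 1, stride):
            yield (top, left, win_h, win_w)


def crop(image: Image, box: Box) -> List[List[float]]:
    """Cut the pixels covered by box out of the image."""
    top, left, h, w = box
    return [list(row[left:left + w]) for row in image[top:top + h]]


def detect_text(image: Image, win_h: int, win_w: int, stride: int,
                has_text: Callable[[List[List[float]]], bool]) -> List[Box]:
    """Return the windows that the (hypothetical) classifier flags as text."""
    img_h, img_w = len(image), len(image[0])
    return [box
            for box in sliding_windows(img_h, img_w, win_h, win_w, stride)
            if has_text(crop(image, box))]


# Toy usage: a 4x6 "image" with a bright 2x3 patch standing in for text,
# and a fake classifier that fires only when the whole crop is bright.
toy = [[0.0] * 6 for _ in range(4)]
for r in (1, 2):
    for c in (2, 3, 4):
        toy[r][c] = 1.0

hits = detect_text(toy, win_h=2, win_w=3, stride=1,
                   has_text=lambda patch: all(p == 1.0 for row in patch for p in row))
```

Each box in `hits` is a region you would then split into characters and pass to your existing character/digit recognizer. With a real classifier, neighbouring windows will usually fire together, so you would probably keep only the highest-scoring one.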

So you need to train another classifier: given a cropped image (the crops should be slightly larger than your text area), decide whether there is text inside.
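One way to get crops that are "slightly larger" than the text is to pad each annotated text box by a few pixels and clamp it to the image. This is only a sketch: the `(top, left, height, width)` box layout, the helper name `expand_box` and the margin value are assumptions of mine, not something fixed by the approach.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # assumed: (top, left, height, width)


def expand_box(box: Box, margin: int, img_h: int, img_w: int) -> Box:
    """Grow box by margin pixels on every side without leaving the image."""
    top, left, h, w = box
    new_top = max(0, top - margin)
    new_left = max(0, left - margin)
    new_bottom = min(img_h, top + h + margin)
    new_right = min(img_w, left + w + margin)
    return (new_top, new_left, new_bottom - new_top, new_right - new_left)


# A 30x100 text area inside a 200x300 image, padded by 5 pixels.
padded = expand_box((10, 20, 30, 100), margin=5, img_h=200, img_w=300)
```

The padded size is also a reasonable choice for the sliding-window size at detection time, so the classifier sees crops of the same scale in training and in use.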

Just construct a training set (positive samples are text areas; negative samples are other areas randomly cropped from the full images) and train it.
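A rough sketch of how that training set could be assembled for one image, assuming you already have the text boxes annotated as `(top, left, height, width)`. The function names, the "no overlap at all" rule for negatives and the retry limit are illustrative choices on my part; you could equally allow a small overlap.

```python
import random
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # assumed: (top, left, height, width)


def overlaps(a: Box, b: Box) -> bool:
    """True if the two boxes share at least one pixel."""
    a_top, a_left, a_h, a_w = a
    b_top, b_left, b_h, b_w = b
    return (a_top < b_top + b_h and b_top < a_top + a_h
            and a_left < b_left + b_w and b_left < a_left + a_w)


def sample_negative_boxes(img_h: int, img_w: int, win_h: int, win_w: int,
                          text_boxes: List[Box], count: int,
                          rng: random.Random, max_tries: int = 1000) -> List[Box]:
    """Randomly pick windows that do not touch any annotated text box."""
    negatives: List[Box] = []
    tries = 0
    while len(negatives) < count and tries < max_tries:
        tries += 1
        top = rng.randint(0, img_h - win_h)    # randint is inclusive on both ends
        left = rng.randint(0, img_w - win_w)
        candidate = (top, left, win_h, win_w)
        if not any(overlaps(candidate, t) for t in text_boxes):
            negatives.append(candidate)
    return negatives


def build_labeled_boxes(img_h: int, img_w: int, win_h: int, win_w: int,
                        text_boxes: List[Box], negatives_per_image: int,
                        rng: random.Random) -> List[Tuple[Box, int]]:
    """Label text boxes 1 and random background boxes 0, then shuffle."""
    samples = [(box, 1) for box in text_boxes]
    samples += [(box, 0) for box in sample_negative_boxes(
        img_h, img_w, win_h, win_w, text_boxes, negatives_per_image, rng)]
    rng.shuffle(samples)
    return samples


# One 200x300 image with a single 20x120 text area, plus 5 background crops.
rng = random.Random(0)
text_boxes = [(40, 50, 20, 120)]
samples = build_labeled_boxes(200, 300, 20, 120, text_boxes, 5, rng)
```

You would then crop the pixels for every box and feed the (crop, label) pairs to a binary classifier, for example a small conv net like the one you already built for characters.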