I am interested in the TEXT_DETECTION feature of the Google Vision API, and it works impressively. But it seems that TEXT_DETECTION only gives exact results when the text is in English. In my case, I want to use TEXT_DETECTION in a fairly narrow context, for example detecting text on advertising banners in a specific language (Vietnamese in my case). Can I train the model on my own data collection to get more accurate results? And how would I implement this?
Besides TEXT_DETECTION of the Google Vision API, Google also has Optical Character Recognition (OCR) software that uses Tesseract as a dependency. As far as I know, they use different algorithms to detect text. I used both Google Docs and TEXT_DETECTION of the Google Vision API to read text (in Vietnamese) from a picture. Google Docs gave a good result, but the Vision API didn't. Why doesn't the Google Vision API inherit the advantages of Google's OCR?
I want to say something more about Google Vision API Text Detection; maybe a Google expert here can read this. As Google announced, their TEXT_DETECTION was supposed to be fantastic: "Even though the words in this image were slanted and unclear, the OCR extracts the words and their positions correctly. It even picks up the word "beacon" on the presenter's t-shirt". But for some of my pictures, the results were really odd. For example, in this picture, even though the words "Kem Oxit" are very large and in the center of the picture, they were not recognized. Or in this picture, the red text "HOA CHAT NGOC VIET" in the center was not recognized either. There must be something wrong with the text detection algorithm.
Did you experiment with LanguageHints (link to documentation)?
Vietnamese is in the list of supported languages; if the text is always in Vietnamese, passing this hint should improve the quality of text detection.
If this doesn't help, there is no way to improve the quality of text detection by supplying your own training examples.
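To illustrate, here is a minimal sketch of an `images:annotate` request body with `languageHints` set to Vietnamese (`"vi"` is the BCP-47 code). The bucket path is a placeholder, and the helper function name is my own; the resulting JSON would be POSTed to the Vision API's `images:annotate` endpoint with your API key or OAuth credentials.

```python
import json


def build_text_detection_request(image_uri, language_hints):
    """Build a Vision API annotate request body.

    image_uri: a publicly readable or GCS image location (placeholder here).
    language_hints: list of BCP-47 language codes, e.g. ["vi"] for Vietnamese.
    """
    return {
        "requests": [
            {
                "image": {"source": {"imageUri": image_uri}},
                "features": [{"type": "TEXT_DETECTION"}],
                "imageContext": {"languageHints": language_hints},
            }
        ]
    }


# Hypothetical bucket path for illustration only.
body = build_text_detection_request("gs://my-bucket/banner.jpg", ["vi"])
print(json.dumps(body, indent=2))
```

With the official `google-cloud-vision` Python client, the equivalent is passing `image_context={"language_hints": ["vi"]}` to `client.text_detection(...)`.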