Tesseract OCR Engine Cube mode - Training Tesseract

George Melidis · May 16, 2013 · Viewed 16k times

Can you explain what Cube mode and the Cube data files are in the Tesseract OCR engine, and what the advantage of using them is?

And how can I train Tesseract for Greek to get better results?

Answer

Siarhei Yakushevich · Nov 21, 2013

For those who might still be interested: Tesseract's website has standard trained data sets for different languages.

https://code.google.com/p/tesseract-ocr/downloads/list?num=100&start=100

The training procedure is described here (for version 3.01):

https://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract3
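
As a rough sketch of what that procedure looks like in practice, the Python snippet below only builds the list of commands, in order; it does not run them. The tool names (tesseract, unicharset_extractor, mftraining, cntraining, combine_tessdata) and their arguments are as I remember them from the 3.01 wiki page, so check them against the page for your version. The font name "arial" and the function name training_commands are just placeholders I made up; "ell" is the Tesseract language code for Greek.

```python
from typing import List


def training_commands(lang: str, font: str, exp_nums: List[int]) -> List[List[str]]:
    """Build the Tesseract 3.01-style training commands as argv lists.

    Nothing is executed here. Manual steps (correcting box files,
    writing font_properties, renaming outputs) are noted in comments.
    """
    # Training files follow the pattern [lang].[fontname].exp[num]
    bases = [f"{lang}.{font}.exp{n}" for n in exp_nums]
    cmds: List[List[str]] = []

    # 1. Make a box file for each training image, then correct it by hand.
    for base in bases:
        cmds.append(["tesseract", f"{base}.tif", base, "batch.nochop", "makebox"])

    # 2. Run Tesseract in training mode to produce a .tr file per image.
    for base in bases:
        cmds.append(["tesseract", f"{base}.tif", base, "nobatch", "box.train"])

    # 3. Extract the character set from the corrected box files.
    cmds.append(["unicharset_extractor"] + [f"{base}.box" for base in bases])

    # 4. Cluster the features. A hand-written font_properties file is
    #    assumed to exist in the working directory.
    tr_files = [f"{base}.tr" for base in bases]
    cmds.append(
        ["mftraining", "-F", "font_properties", "-U", "unicharset",
         "-O", f"{lang}.unicharset"] + tr_files
    )
    cmds.append(["cntraining"] + tr_files)

    # 5. After renaming the outputs (inttemp, normproto, pffmtable, ...)
    #    with a "<lang>." prefix, merge them into <lang>.traineddata.
    cmds.append(["combine_tessdata", f"{lang}."])
    return cmds


if __name__ == "__main__":
    for argv in training_commands("ell", "arial", [0]):
        print(" ".join(argv))
```

Dictionary data (word lists turned into DAWG files) is optional and left out here to keep the sketch short.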

As for Cube, it is a second recognition engine, separate from the standard Tesseract one. It uses more resources and is slower, but it gives better results.
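
To make the "which engine" choice concrete: as far as I know, the Tesseract 3.x C++ API selects the engine through an OcrEngineMode enum passed to TessBaseAPI::Init. The Python snippet below just mirrors those enum values for illustration. It is not a Tesseract binding, and the describe helper and the description strings are my own.

```python
from enum import IntEnum


class OcrEngineMode(IntEnum):
    """Mirror of the Tesseract 3.x OcrEngineMode values (illustration only)."""
    TESSERACT_ONLY = 0            # standard engine only, fastest
    CUBE_ONLY = 1                 # Cube only, slower
    TESSERACT_CUBE_COMBINED = 2   # run both and combine the results
    DEFAULT = 3                   # let the config / language data decide


# Informal summaries, not official documentation text.
_DESCRIPTIONS = {
    OcrEngineMode.TESSERACT_ONLY: "Tesseract only: fastest",
    OcrEngineMode.CUBE_ONLY: "Cube only: slower, more resources",
    OcrEngineMode.TESSERACT_CUBE_COMBINED: "Tesseract + Cube combined: slowest",
    OcrEngineMode.DEFAULT: "Default: taken from the configuration",
}


def describe(mode: OcrEngineMode) -> str:
    """Return a short human-readable summary of an engine mode."""
    return _DESCRIPTIONS[OcrEngineMode(mode)]


if __name__ == "__main__":
    for mode in OcrEngineMode:
        print(int(mode), mode.name, "-", describe(mode))
```

Note that Cube only works for languages that ship Cube data, so the combined mode is not available everywhere.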

Data files are a set of files that are ultimately merged into a single trained data file.