How can I run tesseract with multiple languages at once?

pars · Jun 24, 2014 · Viewed 18.2k times

I have to analyze an image that contains both English and Japanese text. When I run tesseract with the default language (-l eng), some Japanese characters are lost. If I run it with Japanese instead (-l jpn), some English characters are lost (e.g. "Email").

How can I run a single process that recognizes both English and Japanese characters?

Answer

tobltobs · Dec 22, 2014

Since tesseract 3.02 it is possible to specify multiple languages for the -l parameter.

-l lang The language to use. If none is specified, English is assumed. Multiple languages may be specified, separated by plus characters. Tesseract uses 3-character ISO 639-2 language codes.

An example:

tesseract myscan.png out -l deu+eng
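
For the English and Japanese case from the question, the same pattern should apply with the codes eng and jpn. The following is a minimal sketch, not a definitive recipe: join_langs is a hypothetical helper introduced here for illustration, myscan.png and out are the placeholder names from the example above, and it assumes the jpn language data is installed alongside eng.

```shell
#!/bin/sh
# join_langs is a hypothetical helper (not part of tesseract): it joins
# language codes with '+' to build the value for the -l option.
join_langs() {
    old_ifs=$IFS
    IFS='+'
    joined="$*"        # "$*" joins the arguments with the first character of IFS
    IFS=$old_ifs
    printf '%s\n' "$joined"
}

langs=$(join_langs eng jpn)

# Assumption: myscan.png exists and the eng and jpn language data are installed.
# "out" is the output base name, so the recognized text is written to out.txt.
if command -v tesseract >/dev/null 2>&1 && [ -f myscan.png ]; then
    tesseract myscan.png out -l "$langs"
else
    echo "would run: tesseract myscan.png out -l $langs"
fi
```

If tesseract or the image is missing, the script only prints the command it would have run, which makes it easy to check the -l value before processing real scans.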