I recently used tesseract OCR with python and I kept getting an error when I was trying to import image_to_string
from tesseract.
Code causing the problem:
# Perform OCR using tesseract-ocr library
from tesseract import image_to_string
image = Image.open('input-NEAREST.tif')
print image_to_string(image)
Error caused by above code:
Traceback (most recent call last):
file "./captcha.py", line 52, in <module>
from tesseract import image_to_string
ImportError: cannot import name image_to_string
I've verified that the tesseract module is installed:
digital_alchemy@roaming-gnome /home $ pydoc modules | grep 'tesseract'
Hdf5StubImagePlugin _tesseract gzip sipconfig
ORBit cairo mako tesseract
I believe that I've grabbed all the required packages but unfortunately I'm just stuck at this point. It appears that the function is not in the module.
Any help greatly appreciated.
Another possibility that seems to have worked for me is to modify pytesseract so that instead of import Image it has from PIL import Image
Code that works in PyCharm after modifying pytesseract:
from pytesseract import image_to_string
from PIL import Image
im = Image.open(r'C:\Users\<user>\Downloads\dashboard-test.jpeg')
print(im)
print(image_to_string(im))
Pytesseract I installed via the package management built into PyCharm