I am trying to use pytesseract for OCR (extracting text from an image). I installed pytesseract successfully with the command -
pip install pytesseract
When I try to install it again, it clearly says -
Requirement already satisfied (use --upgrade to upgrade):
pytesseract in ./site-packages
This means pytesseract is installed successfully. When I try to import this package in my IPython notebook using -
import pytesseract
It throws an error -
ImportError: No module named pytesseract
Why is that happening?
To use Python-tesseract (it requires Python 2.5+ or Python 3.x), first install Pillow (the maintained fork of PIL) and pytesseract through pip:
pip install Pillow
pip install pytesseract
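If pip reports the package as installed but the notebook still raises ImportError, one common cause is that pip installed into a different Python interpreter than the one the notebook kernel runs. The sketch below is only a quick check under that assumption; the helper name module_available is made up for illustration and is not part of pytesseract.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if the running interpreter can locate the top-level module `name`."""
    return importlib.util.find_spec(name) is not None


# Path of the interpreter the notebook kernel is actually running
print(sys.executable)

# Whether that interpreter can see pytesseract
print("pytesseract importable:", module_available("pytesseract"))
```

If this prints False, installing with that same interpreter (for example by running `python -m pip install pytesseract` with the path printed above) is one way to make sure the package lands where the notebook looks for it.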
Then download and install the Tesseract OCR engine itself (pytesseract is only a wrapper around it):
https://sourceforge.net/projects/tesseract-ocr-alt/?source=typ_redirect
As far as I know, the installer automatically adds it to your PATH variable.
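Since the PATH behavior is not guaranteed, it may be worth checking whether the executable can actually be found. A minimal sketch using only the standard library; the helper name find_tesseract is hypothetical, and the executable name "tesseract" is an assumption about how the installer names the binary.

```python
import shutil


def find_tesseract(name="tesseract"):
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(name)


path = find_tesseract()
if path is None:
    # Not on PATH: tesseract_cmd will have to be set by hand
    print("tesseract not found on PATH")
else:
    print("tesseract found at", path)
```

When the lookup returns None, setting pytesseract.pytesseract.tesseract_cmd to the full path of the executable is the usual fallback.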
Then use it like this:
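import pytesseract
from PIL import Image
# Point pytesseract at the Tesseract executable (needed if it is not on PATH)
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe'
# Open the image and print the recognized text
img = Image.open('Capture.PNG')
print(pytesseract.image_to_string(img))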
I hope this helps :)