Install tesseract/pytesser on Mac OS X

user3684792 · May 28, 2014

I am trying to install Tesseract (and additionally pytesser) on OS X 10.9, with Anaconda as the default Python. I have looked around online, but I can't get any of the tutorials to work, as they all seem to be out of date (Homebrew doesn't have a formula for leptonica, for instance). I have been struggling to install this for the best part of a week with no luck at all.

Has anyone managed to succeed recently? If so, how did you do it?

Thanks

Edit: Strangely, the brew formula for leptonica has spluttered into life. I now get the fairly strange error below.

brew install tesseract
==> Downloading https://bitbucket.org/3togo/python-tesseract/downloads/tesseract
Already downloaded: /Library/Caches/Homebrew/tesseract-3.03-rc1.tar.gz
==> ./configure --prefix=/usr/local/Cellar/tesseract/3.03-rc1
checking for leptonica... yes
checking for pixCreate in -llept... yes
checking leptonica version >= 1.70... configure: error: in `/private/tmp/tesseract-19Ol/tesseract-3.03':
configure: error: leptonica 1.70 or higher is required
See `config.log' for more details

READ THIS: https://github.com/Homebrew/homebrew/wiki/troubleshooting

That is, configure finds leptonica but then rejects its version, so the install still fails. I will check config.log as instructed.
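
For anyone trying to reproduce this, one way to see which leptonica version is actually being picked up is to ask the library itself. The sketch below is only a rough diagnostic: it assumes liblept can be located by `ctypes.util.find_library` and that it exports `getLeptonicaVersion()`. The helper names `parse_leptonica_version` and `installed_leptonica_version` are made up for illustration, and the 1.70 threshold is taken from the configure error above.

```python
import ctypes
import ctypes.util
import re

# Minimum version demanded by the configure error above.
REQUIRED = (1, 70)


def parse_leptonica_version(text):
    """Extract (major, minor) from a string such as 'leptonica-1.69'."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_leptonica_version():
    """Return the version string reported by the loadable liblept, or None.

    Assumption: the library exports getLeptonicaVersion(), which returns a
    C string. The string is not freed here, which is acceptable for a
    one-off diagnostic.
    """
    name = ctypes.util.find_library("lept")
    if name is None:
        return None
    try:
        lib = ctypes.CDLL(name)
        func = lib.getLeptonicaVersion
    except (OSError, AttributeError):
        return None
    func.restype = ctypes.c_char_p
    raw = func()
    return raw.decode("ascii", "replace") if raw else None


if __name__ == "__main__":
    reported = installed_leptonica_version()
    if reported is None:
        print("liblept could not be loaded")
    else:
        version = parse_leptonica_version(reported)
        ok = version is not None and version >= REQUIRED
        print("%s (new enough: %s)" % (reported, ok))
```

If this reports something older than 1.70, a stale copy of leptonica is probably being found ahead of the one Homebrew just built.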

Edit 2:

Upon trying to import the library in python I get this:

import tesseract
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "//anaconda/lib/python2.7/site-packages/python-tesseract_0.8-3.0-py2.7_macosx-10.9-intel.egg/tesseract.py", line 28, in <module>
    _tesseract = swig_import_helper()
  File "//anaconda/lib/python2.7/site-packages/python-tesseract_0.8-3.0-py2.7_macosx-10.9-intel.egg/tesseract.py", line 24, in swig_import_helper
    _mod = imp.load_module('_tesseract', fp, pathname, description)
ImportError: dlopen(//anaconda/lib/python2.7/site-packages/python-tesseract_0.8-3.0-py2.7_macosx-10.9-intel.egg/_tesseract.so, 2): Library not loaded: /usr/local/lib/libtesseract.3.dylib
  Referenced from: //anaconda/lib/python2.7/site-packages/python-tesseract_0.8-3.0-py2.7_macosx-10.9-intel.egg/_tesseract.so
  Reason: image not found
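
The ImportError says the Python extension expects /usr/local/lib/libtesseract.3.dylib and cannot find it, which fits the failed brew build above. A small sketch along these lines might help confirm whether the file is missing, is a broken symlink, or exists somewhere else. The helper names `describe_library` and `find_candidates` are hypothetical, and the `/usr/local/Cellar` search root is assumed from the `--prefix` in the brew output.

```python
import fnmatch
import os

# Path taken from the ImportError above.
MISSING = "/usr/local/lib/libtesseract.3.dylib"


def describe_library(path):
    """Report whether path exists, and where it points if it is a symlink."""
    if os.path.islink(path):
        target = os.path.realpath(path)
        state = "ok" if os.path.exists(target) else "broken symlink"
        return "%s -> %s (%s)" % (path, target, state)
    if os.path.exists(path):
        return "%s (regular file)" % path
    return "%s (missing)" % path


def find_candidates(root, pattern="libtesseract*.dylib"):
    """Walk root and collect any files whose names match pattern."""
    found = []
    for dirpath, _dirs, files in os.walk(root):
        for name in fnmatch.filter(files, pattern):
            found.append(os.path.join(dirpath, name))
    return sorted(found)


if __name__ == "__main__":
    print(describe_library(MISSING))
    # Assumed Homebrew location, based on the --prefix shown in the log.
    for candidate in find_candidates("/usr/local/Cellar"):
        print("found: " + candidate)
```

If nothing is found, the library was never built, and the leptonica version error has to be fixed before the Python import can work.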

To be honest, I am a complete amateur with respect to this kind of behind-the-scenes installation and had to google extensively to even get this far. I would be really grateful if someone with a bit of knowledge could shed light on the obvious things to try. I feel as though I have exhausted the web looking for solutions, and I am getting close to considering this library unusable and attempting to write my own OCR library, which is 100% not a job I am looking forward to. Alternatively, if anyone knows of any decent Python OCR libraries with good support and install maintenance, I would love to know about them. (From my searching I suspect that Tesseract is by far the best known, which is why it is so frustrating that the install is so tricky.)

I will happily provide any more info about my system etc. to any warrior willing to have a crack at helping with this.

Thanks!