Top "OCR" questions

Optical Character Recognition, usually abbreviated to OCR, is the mechanical or electronic translation of scanned images of handwritten, typewritten or printed text into machine-encoded text.

PDF and text layer

According to this site http://www.searchable-pdf.com/content.php?lang=en&c=61, a PDF can be searchable when …

pdf ocr scanning
Alternative to Tesseract OCR Training?

For the past 3 months I've been trying to train Tesseract to identify a collection of images I have, due …

ocr tesseract
Does Tesseract's hOCR output really contain bounding boxes and confidence levels for each character?

In the Tesseract FAQ they say you can: How can I get the coordinates and confidence of each character? There …

ocr tesseract hocr
Page layout analysis using Tesseract?

Tesseract 3 is able to perform page layout analysis. However, I couldn't find any sample code or documentation on how to …

c++ image-processing ocr tesseract
ABBYY Mobile OCR Engine for iPhone

I am looking to use/buy an OCR solution for my next iPhone app. Searching through the answers on this …

iphone mobile ocr abbyy
Open source OCR tool available in the market

Is there any open source OCR library written in .NET, or written in any other language but usable in …

.net ocr
Understanding Freeman chain codes for OCR

Note that I'm really looking for an answer to my question. I am not looking for a link to some …

algorithm ocr
Pytesser set character whitelist

Does anyone know how to set the character whitelist for Pytesseract? I want it to only output A-z and 0-9. …

python ocr tesseract pytesser
What OCR options exist beyond Tesseract?

I've used Tesseract a bit and its results leave much to be desired. I'm currently detecting very small images (35x15, …

php python ruby ocr tesseract
Extracting tables from a pdf

I'm trying to get the data from the tables in this PDF. I've tried pdfminer and pypdf with a little …

python python-2.7 ocr pdfminer pdf-parsing