I'm using OpenCV to extract a subimage of a scanned document and would like to use tesseract to perform OCR over this subimage.
Tesseract offers two methods for text recognition that I could use, but so far I haven't been able to get either of them to work.
A.) How can I convert a cv::Mat into a PIX*? (PIX is a datatype of Leptonica.)
Based on vasile's code below, this is essentially my current code:
cv::Mat image = cv::imread("c:/image.png");
cv::Mat subImage = image(cv::Rect(50, 200, 300, 100));
int depth;
if(subImage.depth() == CV_8U)
depth = 8;
//other cases not considered yet
PIX* pix = pixCreateHeader(subImage.size().width, subImage.size().height, depth);
pix->data = (l_uint32*) subImage.data;
tesseract::TessBaseAPI tess;
STRING text;
if(tess.ProcessPage(pix, 0, 0, &text))
{
std::cout << text.string();
}
It doesn't crash, but the OCR result is still wrong. It should recognize the one word in my sample image, but instead it returns unreadable characters.
The method PIX_HEADER doesn't exist, so I used pixCreateHeader, but it doesn't take the number of channels as an argument. So how can I set the number of channels?
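A possible reason for the garbage output, sketched under assumptions: as far as I understand it, Leptonica keeps pixel data in 32-bit words, pads every row to a whole number of words, and packs pixels from the most significant byte down. A raw cv::Mat buffer is plain byte-ordered rows, so pointing pix->data at it would not match that layout (and cv::imread returns a 3-channel image by default, which does not fit a depth of 8 either). The helpers below, wordsPerLine and packGray8, are hypothetical names for illustration and not Leptonica functions; they only show what the packed layout would look like for an 8-bit grayscale image.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of 32-bit words per row when rows are padded to whole words.
// (Hypothetical helper; mirrors how a packed raster would size its rows.)
int wordsPerLine(int width, int depthBits) {
    assert(width > 0 && depthBits > 0);
    return (width * depthBits + 31) / 32;
}

// Packs 8-bit grayscale rows into 32-bit words, most significant byte first.
// srcStep is the distance in bytes between row starts in the source buffer,
// which for a cv::Mat sub-image would be the parent's row stride.
std::vector<std::uint32_t> packGray8(const std::uint8_t* src, int width,
                                     int height, std::size_t srcStep) {
    assert(src != nullptr && height > 0);
    const int wpl = wordsPerLine(width, 8);
    std::vector<std::uint32_t> packed(
        static_cast<std::size_t>(wpl) * static_cast<std::size_t>(height), 0u);
    for (int y = 0; y < height; ++y) {
        const std::uint8_t* row = src + static_cast<std::size_t>(y) * srcStep;
        for (int x = 0; x < width; ++x) {
            // Byte 0 of each group of four lands in bits 31..24, byte 3 in bits 7..0.
            const int shift = 24 - 8 * (x % 4);
            const std::size_t index =
                static_cast<std::size_t>(y) * static_cast<std::size_t>(wpl) +
                static_cast<std::size_t>(x / 4);
            packed[index] |= static_cast<std::uint32_t>(row[x]) << shift;
        }
    }
    return packed;
}
```

For a single row holding the bytes 1, 2, 3, 4, 5, this produces the two words 0x01020304 and 0x05000000: the fifth pixel starts a new word and the rest of that word is padding. In real code it would probably be safer to let Leptonica allocate the image and write pixels through its own accessors than to reproduce this layout by hand.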
B.) How can I use cv::Mat for TesseractRect()?
Tesseract offers another method for text recognition with this signature:
char * TessBaseAPI::TesseractRect (
const UINT8 * imagedata,
int bytes_per_pixel,
int bytes_per_line,
int left,
int top,
int width,
int height
)
Currently I am using the following code, but it also returns unreadable characters (although different ones than the code above):
char* cr = tess.TesseractRect(
subImage.data,
subImage.channels(),
subImage.channels() * subImage.size().width,
0,
0,
subImage.size().width,
subImage.size().height);
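tesseract::TessBaseAPI tess;
// Assumption: the engine must be initialized before SetImage, e.g. tess.Init(NULL, "eng").
cv::Mat sub = image(cv::Rect(50, 200, 300, 100));
// Pass the sub-image's real row stride (sub.step1() for 8-bit data), not channels * width.
tess.SetImage((uchar*)sub.data, sub.size().width, sub.size().height, sub.channels(), sub.step1());
tess.Recognize(0);
// The caller owns the returned buffer and should release it with delete[].
const char* out = tess.GetUTF8Text();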
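Why the stride argument matters here, as a minimal sketch: a cv::Mat created with image(cv::Rect(...)) shares the parent's buffer, so consecutive rows of the sub-image are separated by the parent's row length, not by channels * width of the sub-image. The code below does not use OpenCV or Tesseract; RoiView, makeGradient, makeRoi and byteAt are hypothetical stand-ins that only imitate how a reader walks the buffer given a bytes-per-line value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a cv::Mat sub-image: a pointer into the parent
// buffer plus the parent's row stride in bytes.
struct RoiView {
    const std::uint8_t* data;  // first byte of the ROI's top-left pixel
    int width;                 // ROI width in pixels
    int height;                // ROI height in pixels
    int channels;              // bytes per pixel (8-bit channels assumed)
    std::size_t step;          // bytes between the starts of consecutive rows
};

// Builds a single-channel test image whose pixel value is 10 * y + x.
std::vector<std::uint8_t> makeGradient(int width, int height) {
    std::vector<std::uint8_t> img(
        static_cast<std::size_t>(width) * static_cast<std::size_t>(height));
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            img[static_cast<std::size_t>(y) * width + x] =
                static_cast<std::uint8_t>(10 * y + x);
    return img;
}

// Creates a view on a rectangle of the parent without copying any pixels.
RoiView makeRoi(const std::vector<std::uint8_t>& parent, int parentWidth,
                int channels, int x, int y, int w, int h) {
    const std::size_t step =
        static_cast<std::size_t>(parentWidth) * static_cast<std::size_t>(channels);
    assert(step > 0 && parent.size() % step == 0);
    const std::size_t offset = static_cast<std::size_t>(y) * step +
                               static_cast<std::size_t>(x) * channels;
    return RoiView{parent.data() + offset, w, h, channels, step};
}

// Reads one byte the way a consumer would, given its idea of bytes per line.
std::uint8_t byteAt(const RoiView& roi, std::size_t bytesPerLine,
                    int row, int col, int ch) {
    const std::size_t index = static_cast<std::size_t>(row) * bytesPerLine +
                              static_cast<std::size_t>(col) * roi.channels +
                              static_cast<std::size_t>(ch);
    return roi.data[index];
}
```

With an 8x4 gradient and a 3x2 view starting at (2, 1), reading row 1, column 0 with the parent stride of 8 returns 22, the pixel at (2, 2). Reading the same position with a stride of 3 (channels * width of the view) returns 15, which is a pixel from the wrong row. That kind of row skew would explain unreadable OCR output from the TesseractRect() call in the question, and passing subImage.step there as bytes_per_line might fix it in the same way.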