Reputation: 3905
I'm using Tesseract OCR for an application I'm writing. I just want to recognize the text in some areas of a picture I get from time to time. The basic calls work at the moment:
tesseract::TessBaseAPI api;
api.SetPageSegMode(tesseract::PSM_AUTO);            // Segmentation on auto
api.Init("/usr/local/share/", "eng");               // path = parent directory of tessdata
FILE* pFile = fopen("/home/myname/test.bmp", "rb"); // Open picture
PIX* image = pixReadStreamBmp(pFile);               // Image format from Leptonica
fclose(pFile);
api.SetImage(image);                                // Hand the image to Tesseract
char* textOutput = api.GetUTF8Text();               // Run the OCR and get the text
delete[] textOutput;                                // GetUTF8Text allocates the buffer
pixDestroy(&image);
So far this code works fine, but at some points the OCR isn't as accurate as I would wish. I don't want to train a new language for my purpose, so I wanted to know whether there is any possibility to increase the accuracy through some API calls. Any suggestions are welcome! Best regards,
Tobias
Upvotes: 1
Views: 5641
Reputation: 41
Yes, that is correct, the OCR doesn't always work properly out of the box. If you want more accuracy, execute the following code:
/*
 * word_OCR.cpp
 *
 *  Created on: Jun 23, 2016
 *      Author: pratik
 */
#include <opencv2/opencv.hpp>
#include <tesseract/baseapi.h>
#include <leptonica/allheaders.h>
#include <iostream>

using namespace std;
using namespace cv;

int main(int argc, char **argv)
{
    Pix *image = pixRead(argv[1]);
    if (image == 0) {
        cout << "Cannot load input file!\n";
        return 1;
    }

    tesseract::TessBaseAPI tess;
    if (tess.Init("/usr/share/tesseract/tessdata", "eng")) {
        fprintf(stderr, "Could not initialize tesseract.\n");
        exit(1);
    }

    tess.SetImage(image);
    tess.Recognize(0);

    // Walk the recognition results word by word.
    tesseract::ResultIterator *ri = tess.GetIterator();
    tesseract::PageIteratorLevel level = tesseract::RIL_WORD;
    if (ri != 0) {
        do {
            const char *word = ri->GetUTF8Text(level);
            cout << word << endl;
            delete[] word;
        } while (ri->Next(level));
        delete ri;
    }

    tess.End();
    pixDestroy(&image);
    return 0;
}
This extracts the text word by word from the image and prints each word as output; in my experience the accuracy is around 90-95%.
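If you also want to see how sure Tesseract is about each word, the same iterator exposes a per-word confidence. A minimal sketch, assuming the loop above and just extending its body:

    const char *word = ri->GetUTF8Text(level);
    float conf = ri->Confidence(level);   // 0-100, higher means more certain
    cout << word << " (" << conf << "%)" << endl;
    delete[] word;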
Upvotes: 0
Reputation: 634
For me, just scaling up the image improved accuracy to almost 100%. Tesseract's documentation also states somewhere that for best results you need 300 dpi or more.
So I added:
PIX* ocrimage = pixScale(image, 4.167, 4.167);
api.SetImage(ocrimage);
(4.167 ~ dpi increase from 72 to 300)
Note that I also tried api.SetSourceResolution(..) instead, to tell Tesseract that my image has a lower dpi, but somehow that doesn't give results as good as scaling the image up by the equivalent amount.
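For comparison, that alternative looked roughly like this (just a sketch, reusing the api and image variables from the question; SetSourceResolution has to come after SetImage):

    api.SetImage(image);          // keep the original, unscaled image
    api.SetSourceResolution(72);  // tell Tesseract its real (low) resolution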
Upvotes: 0
Reputation: 52646
You may need to enhance the image before passing it to Tesseract.
Smoothing the image removes noise and reduces false results.
Character height works best when it is around 30 to 40 pixels.
Although Tesseract works on grayscale images, binary images tend to give better results. For thresholding, use adaptive thresholding.
It is also good to have enough space between words.
You can get further tips from the Tesseract forum.
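A rough sketch of that kind of preprocessing with OpenCV (the file name, scale factor and threshold parameters are placeholders you will need to tune for your own images):

    #include <opencv2/opencv.hpp>
    #include <tesseract/baseapi.h>
    #include <iostream>

    int main()
    {
        // Load as grayscale; "input.png" is a placeholder file name.
        cv::Mat src = cv::imread("input.png", cv::IMREAD_GRAYSCALE);
        if (src.empty()) return 1;

        // Scale up so that characters end up roughly 30-40 px high.
        cv::Mat scaled;
        cv::resize(src, scaled, cv::Size(), 3.0, 3.0, cv::INTER_CUBIC);

        // Light smoothing to suppress noise.
        cv::Mat blurred;
        cv::GaussianBlur(scaled, blurred, cv::Size(3, 3), 0);

        // Adaptive thresholding to get a clean binary image.
        cv::Mat binary;
        cv::adaptiveThreshold(blurred, binary, 255,
                              cv::ADAPTIVE_THRESH_GAUSSIAN_C,
                              cv::THRESH_BINARY, 31, 10);

        tesseract::TessBaseAPI api;
        if (api.Init("/usr/share/tesseract/tessdata", "eng")) return 1;

        // Hand the preprocessed buffer directly to Tesseract.
        api.SetImage(binary.data, binary.cols, binary.rows, 1,
                     static_cast<int>(binary.step));
        char *text = api.GetUTF8Text();
        std::cout << text;
        delete[] text;
        api.End();
        return 0;
    }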
Upvotes: 2