Reputation: 7304
I use Tesseract OCR to read text from images. My binary image is clear and sharp, but the OCR result is wrong: the actual digits are 05820, yet Tesseract reads them as 05320. What could be wrong in my implementation? The image and the Tesseract code I used are below.
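![enter image description here][1]

    int OCR::textRecognition(void) {
        tesseract::TessBaseAPI tess;
        tess.Init(NULL, "eng", tesseract::OEM_DEFAULT);
        tess.SetPageSegMode(tesseract::PSM_SINGLE_BLOCK);
        // extText is an 8-bit single-channel cv::Mat; step is the row stride in bytes
        // (it equals cols only when the Mat is continuous)
        tess.SetImage((uchar*)extText.data, extText.cols, extText.rows, 1,
                      static_cast<int>(extText.step));
        // Get the text
        char* out = tess.GetUTF8Text();
        std::cout << out << std::endl;
        delete[] out;  // GetUTF8Text returns a buffer the caller must free
        return SUCCESS;
    }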
Upvotes: 0
Views: 279
Reputation: 10850
Try training Tesseract on the font you plan to work with. It should dramatically improve accuracy. You can use SerakTesseractTrainer to do this. Here is a YouTube tutorial: http://www.youtube.com/watch?v=47rgBL9NZkM
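Once training has produced a `.traineddata` file, the code has to load it instead of `eng`. Here is a rough sketch of how that might look. The language name `myfont`, the helper names, and the `WITH_TESSERACT` compile switch are all made up for illustration. It also assumes Tesseract 4 or later, where the data path passed to `Init` is the directory that holds the `.traineddata` files; as far as I know, older 3.x releases expected the parent of `tessdata` instead, so adjust for your version.

```cpp
#include <fstream>
#include <iostream>
#include <string>

// Tesseract looks for "<lang>.traineddata" in its data directory.
// Both helpers below are hypothetical convenience functions, not Tesseract API.
std::string trainedDataPath(const std::string& dataDir, const std::string& lang) {
    std::string path = dataDir;
    if (!path.empty() && path.back() != '/') path += '/';
    return path + lang + ".traineddata";
}

// True if the trained data file can be opened for reading.
bool trainedDataExists(const std::string& dataDir, const std::string& lang) {
    std::ifstream f(trainedDataPath(dataDir, lang), std::ios::binary);
    return f.good();
}

// Build with -DWITH_TESSERACT and link against tesseract to enable this part.
#ifdef WITH_TESSERACT
#include <tesseract/baseapi.h>

// Runs OCR on an 8-bit single-channel image using a custom trained language.
// Returns an empty string if the trained data is missing or Init fails.
std::string recognizeWithCustomFont(const unsigned char* pixels, int width,
                                    int height, int bytesPerLine,
                                    const std::string& dataDir,
                                    const std::string& lang) {
    if (!trainedDataExists(dataDir, lang)) {
        std::cerr << "Missing " << trainedDataPath(dataDir, lang) << std::endl;
        return "";
    }
    tesseract::TessBaseAPI tess;
    // Init returns 0 on success.
    if (tess.Init(dataDir.c_str(), lang.c_str(), tesseract::OEM_DEFAULT) != 0) {
        return "";
    }
    tess.SetPageSegMode(tesseract::PSM_SINGLE_BLOCK);
    tess.SetImage(pixels, width, height, 1, bytesPerLine);
    char* out = tess.GetUTF8Text();
    std::string text = out ? out : "";
    delete[] out;  // the caller owns the buffer returned by GetUTF8Text
    tess.End();
    return text;
}
#endif
```

With the `cv::Mat` from the question, the call would be something like `recognizeWithCustomFont(extText.data, extText.cols, extText.rows, static_cast<int>(extText.step), "tessdata", "myfont")`. Checking for the file first is only a convenience: it gives a clearer message than a failed `Init` when the trained data is in the wrong place.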
Upvotes: 2