Reading error using Tesseract OCR

Question

I use Tesseract OCR to read text. My binary image is clear, but the OCR output contains an error: the actual number is 05820, but it is read as 05320. The image is very clear and sharp, so what could be wrong in my implementation? I have attached the image and the Tesseract code I used.

    int OCR::textRecognition(void){
        tesseract::TessBaseAPI tess;
        tess.Init(NULL, "eng", tesseract::OEM_DEFAULT);
        tess.SetPageSegMode(tesseract::PSM_SINGLE_BLOCK);

        // 1 byte per pixel, extText.cols bytes per line
        // (this assumes a continuous 8-bit single-channel image)
        tess.SetImage((uchar*)extText.data, extText.cols, extText.rows, 1, extText.cols);
        // Get the text
        char* out = tess.GetUTF8Text();
        std::cout << out << std::endl;
        // GetUTF8Text allocates the result with new[], so release it here
        delete[] out;
        return SUCCESS;
    }


Andrey Smorodov · Accepted Answer

Try training Tesseract on the font you plan to work with. It should dramatically improve accuracy. You can use SerakTesseractTrainer to do this. Here is a YouTube tutorial: http://www.youtube.com/watch?v=47rgBL9NZkM
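Once training has produced a language file, the recognition code only needs to load it in place of "eng". Below is a minimal sketch of how that might look, not a definitive implementation. The function name `recognizeWithFont` and the language name passed as `fontLang` are hypothetical; `fontLang` has to match the name of the `.traineddata` file your training run produced. The API object is a template parameter only so the sketch compiles without the Tesseract headers; in practice it would be a `tesseract::TessBaseAPI`.

```cpp
#include <cctype>
#include <string>

// Sketch: run OCR with a custom-trained language instead of "eng".
// Api is expected to be tesseract::TessBaseAPI (assumption: only Init,
// SetImage and GetUTF8Text are used, with their usual signatures).
// fontLang is a hypothetical name; it must match <fontLang>.traineddata.
template <typename Api>
std::string recognizeWithFont(Api& tess, const char* datapath, const char* fontLang,
                              const unsigned char* pixels, int width, int height,
                              int bytesPerLine) {
    // Init returns 0 on success and non-zero if the language data cannot be loaded.
    if (tess.Init(datapath, fontLang) != 0)
        return std::string();

    // 8-bit single-channel image: 1 byte per pixel.
    tess.SetImage(pixels, width, height, 1, bytesPerLine);

    // GetUTF8Text allocates the result with new[]; release it after copying.
    char* raw = tess.GetUTF8Text();
    std::string text = raw ? raw : "";
    delete[] raw;

    // Drop the trailing whitespace (newlines) at the end of the result.
    while (!text.empty() && std::isspace(static_cast<unsigned char>(text.back())))
        text.pop_back();
    return text;
}
```

With the code from the question, the call might look like `recognizeWithFont(tess, NULL, "myfont", extText.data, extText.cols, extText.rows, (int)extText.step)`, where "myfont" stands in for whatever name you gave your trained font. An empty result then means either the language data failed to load or nothing was recognized.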
