Ege Yıldırım

Reputation: 435

Decreasing Tesseract OCR Execution time

I made an optical character recognition program using Tesseract, but it runs slower than intended. I'm using the ocrb traineddata that I found on GitHub, and I believe that creating my own, smaller trained data will increase execution speed. I'm relatively new to OCR, so do you have any tips for improving time efficiency? Maybe an alternative to Pix?

It runs in 0.1 seconds on my computer; it would be perfect if it were below 0.066 seconds. Here is my function:

std::string imageToText() {
    tesseract::TessBaseAPI api;
    api.Init("./tessdata", "ocrb_int");
    Pix* image = pixRead("randommrz.jpg");
    api.SetImage(image);

    return api.GetUTF8Text();
}

I'm also aware that nothing is freed here (the Pix and the returned text both leak).

Upvotes: 2

Views: 5943

Answers (3)

Ege Yıldırım

Reputation: 435

Okay, so I was able to reduce the running time to 0.036 seconds. I'm writing down my steps for future devs :)

  1. I used CMake and included only the necessary libraries. I don't know how much this affected the running time, but I guess it's good practice.

  2. I trained my own data with Tesseract. It took a lot of trial and error, but I finally managed to create a new one with a good time/accuracy tradeoff.

  3. I did the preprocessing manually with OpenCV and didn't use Leptonica (pixRead() etc.). The first problem I encountered in this step was that SetImage() takes a Pix* object, but there is an overload that accepts a raw pixel buffer, which you can use like this:

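    tesseract::TessBaseAPI* tess = new tesseract::TessBaseAPI();
    // tess->Init(...) with your tessdata path and language must be called before SetImage().
    cv::Mat image = cv::imread("filename.png");  // loaded as 3-channel, 8-bit BGR
    /*
    Preprocessing
    */
    // Arguments: pixel data, width, height, bytes per pixel, bytes per line.
    tess->SetImage(image.data, image.cols, image.rows, 3, image.step);

    return tess->GetUTF8Text();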

Hope this helps!
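The answer does not say which preprocessing was applied, so the following is only an illustrative sketch of one common step: converting an interleaved 8-bit BGR buffer (the layout cv::imread produces by default) to 8-bit grayscale without OpenCV. The function name bgrToGray and the integer luma weights are assumptions made for this example. The raw-buffer SetImage() overload also accepts 8-bit grayscale, in which case bytes per pixel is 1 and bytes per line equals the width.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: converts an interleaved 8-bit BGR image to 8-bit grayscale.
// bytesPerLine is the stride of the source rows (cv::Mat::step for a cv::Mat).
std::vector<unsigned char> bgrToGray(const unsigned char* bgr, int width, int height,
                                     int bytesPerLine) {
    assert(width > 0 && height > 0 && bytesPerLine >= width * 3);
    std::vector<unsigned char> gray(static_cast<std::size_t>(width) * height);
    for (int y = 0; y < height; ++y) {
        const unsigned char* row = bgr + static_cast<std::size_t>(y) * bytesPerLine;
        for (int x = 0; x < width; ++x) {
            const int b = row[3 * x];
            const int g = row[3 * x + 1];
            const int r = row[3 * x + 2];
            // Integer approximation of 0.299 R + 0.587 G + 0.114 B, rounded to nearest.
            gray[static_cast<std::size_t>(y) * width + x] =
                static_cast<unsigned char>((299 * r + 587 * g + 114 * b + 500) / 1000);
        }
    }
    return gray;
}
```

The result could then be passed as tess->SetImage(gray.data(), width, height, 1, width). Whether grayscale input actually speeds up recognition for a given model is something to measure rather than assume.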

Upvotes: 3

user898678

Reputation: 3328

If you are interested in improving speed, you have to measure each step as well as the whole process. This gives you a better picture of what you can expect:

  1. Tesseract initialization: smaller data (ocrb_int) can help, but disk access is a limitation (e.g. loading the Tesseract library). Removing some Tesseract features could also decrease the size of the library (search the internet for how).
  2. Image loading: disk access and image size are your limitations. You can experiment with different image formats.
  3. The OCR process itself: image preprocessing can help at this stage (but it has a cost of its own).
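One way to measure the three steps above separately is a small timing helper built on std::chrono. This is a minimal sketch: the name meanSeconds is made up for this example, and the steps to time (initialization, image loading, recognition) would be passed in as callables wrapping the real Tesseract calls.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper: runs `step` `runs` times and returns the mean
// wall-clock time per run in seconds.
double meanSeconds(const std::function<void()>& step, int runs) {
    assert(runs > 0);
    using Clock = std::chrono::steady_clock;  // monotonic, suited to measuring intervals
    const auto start = Clock::now();
    for (int i = 0; i < runs; ++i) {
        step();
    }
    const std::chrono::duration<double> elapsed = Clock::now() - start;
    return elapsed.count() / runs;
}

// Intended use (the lambda bodies stand in for the real calls):
//   double tInit = meanSeconds([&] { /* api.Init(...)            */ }, 10);
//   double tLoad = meanSeconds([&] { /* pixRead or cv::imread    */ }, 10);
//   double tOcr  = meanSeconds([&] { /* SetImage + GetUTF8Text   */ }, 10);
```

Averaging over several runs smooths out disk caching effects; the first run of each step is often slower than the rest, so it can be worth looking at it separately.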

Upvotes: 2

Nick Gkloumpos

Reputation: 151

According to this, Tesseract has some OpenCL support, so running it on a GPU could improve performance.

If that is not an option, you could minimize the data to be processed. Try not to process the full image but only the part containing text. Text detectors such as CRAFT can find that part; however, I could find mainly Python implementations.
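Once the text region is known, cutting it out of a raw pixel buffer is straightforward. The sketch below assumes an interleaved 8-bit image and a rectangle that has already been found by some detector (not shown); the name cropRegion and its parameter list are made up for this example. Tesseract's TessBaseAPI::SetRectangle(left, top, width, height) is an alternative that restricts recognition to a sub-rectangle of the image without copying.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: copies the rectangle (left, top, width, height) out of an
// interleaved 8-bit image into a tightly packed buffer.
std::vector<unsigned char> cropRegion(const unsigned char* data, int imageWidth,
                                      int imageHeight, int bytesPerPixel, int bytesPerLine,
                                      int left, int top, int width, int height) {
    assert(left >= 0 && top >= 0 && width > 0 && height > 0);
    assert(left + width <= imageWidth && top + height <= imageHeight);
    const std::size_t rowBytes = static_cast<std::size_t>(width) * bytesPerPixel;
    std::vector<unsigned char> out(rowBytes * height);
    for (int y = 0; y < height; ++y) {
        // Start of the requested span within source row (top + y).
        const unsigned char* src = data +
            static_cast<std::size_t>(top + y) * bytesPerLine +
            static_cast<std::size_t>(left) * bytesPerPixel;
        std::memcpy(out.data() + y * rowBytes, src, rowBytes);
    }
    return out;
}
```

The cropped buffer has a stride of width * bytesPerPixel, so it can be handed to the raw-buffer SetImage() overload with that value as bytes per line.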

Upvotes: 0
