Vamsi

Reputation: 153

Pytesseract is very slow for real time OCR, any way to optimise my code?

I'm trying to create a real-time OCR in Python using mss and pytesseract.

So far, I've been able to capture my entire screen at a steady 30 FPS. If I capture a smaller area of around 500x500, I get 100+ FPS.

However, as soon as I add the line text = pytesseract.image_to_string(img), the rate drops to 0.8 FPS. Is there any way I could optimise my code to get a better FPS? The code does detect the text; it's just extremely slow.

from mss import mss
import cv2
import numpy as np
from time import time
import pytesseract

pytesseract.pytesseract.tesseract_cmd = r'C:\Users\Vamsi\AppData\Local\Programs\Tesseract-OCR\tesseract.exe'

with mss() as sct:
    # Part of the screen to capture
    monitor = {"top": 200, "left": 200, "width": 500, "height": 500}

    while "Screen capturing":
        begin_time = time()

        # Get raw pixels from the screen, save it to a Numpy array
        img = np.array(sct.grab(monitor))

        # Finds text from the images
        text = pytesseract.image_to_string(img)

        # Display the picture
        cv2.imshow("Screen Capture", img)

        # Display FPS
        print('FPS {}'.format(1 / (time() - begin_time)))

        # Press "q" to quit
        if cv2.waitKey(25) & 0xFF == ord("q"):
            cv2.destroyAllWindows()
            break

Upvotes: 11

Views: 23263

Answers (4)

Pete

Reputation: 6723

I had the same problem. OCRing a document in the native desktop environment took 5 seconds, while the same document took 200+ seconds when running in Docker on the same machine.

The solution turned out to be adding:

ENV OMP_THREAD_LIMIT=1

to my Dockerfile.

This disables multithreading in Tesseract. Why that makes it faster in Docker, I couldn't tell you, but it brings performance close to native for me.
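
Outside Docker, the same limit can be tried from Python by setting the variable in the script's own environment before the first OCR call. A minimal sketch, assuming pytesseract is installed; `ocr` is a hypothetical helper name, and whether the limit helps on a given machine is something to measure rather than assume:

```python
import os

# Set this before the first OCR call: pytesseract launches tesseract as a
# child process, and the child inherits this process's environment.
os.environ["OMP_THREAD_LIMIT"] = "1"


def ocr(image):
    """Run Tesseract on a PIL image, NumPy array, or image file path."""
    import pytesseract  # imported lazily so the snippet loads without it

    return pytesseract.image_to_string(image)
```

Timing a few calls to `ocr(img)` with and without the environment line should show quickly whether it makes a difference for you.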

Upvotes: 2

Punnerud

Reputation: 8021

After looking at the pytesseract code, I see that it converts the image format and saves the image locally before feeding it to tesseract. By changing from PNG to JPG I got a 3x speedup (9.5 to 3 seconds/image). I guess there is more optimization that could be done in the Python part of the code.
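
One way to try the JPG route without patching pytesseract is to write the JPEG yourself and pass the file path, since pytesseract accepts a path as well as an image object. A rough sketch, assuming Pillow is installed; `save_as_jpeg` and `ocr_via_jpeg` are hypothetical helper names, and any speedup will depend on image size and disk:

```python
import os
import tempfile


def save_as_jpeg(pil_image, quality=90):
    """Write a PIL image to a temporary .jpg file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".jpg")
    os.close(fd)  # Pillow reopens the file by name
    # JPEG has no alpha channel, so convert to plain RGB first.
    pil_image.convert("RGB").save(path, format="JPEG", quality=quality)
    return path


def ocr_via_jpeg(pil_image):
    """OCR a PIL image through a JPEG file that we encode ourselves."""
    import pytesseract  # imported lazily so the helpers load without it

    path = save_as_jpeg(pil_image)
    try:
        # Given a file path, pytesseract hands that file to tesseract
        # instead of re-encoding the image itself.
        return pytesseract.image_to_string(path)
    finally:
        os.remove(path)
```

JPEG is lossy, so it is worth checking that recognition quality on small text does not drop at the chosen `quality` setting.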

Upvotes: 2

Pasindu Ranasinghe

Reputation: 237

You can use EasyOCR ("easyocr"), a lightweight Python package for OCR applications. It is very fast and reliable, and supports 70+ languages, including English, Chinese, Japanese, Korean, and Hindi, with more being added.

pip install easyocr

Check this out: https://huggingface.co/spaces/tomofi/EasyOCR
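
A minimal sketch of how EasyOCR could slot into a capture loop, assuming the package and its models are installed. `load_reader` and `confident_text` are hypothetical helper names, and the 0.5 confidence cut-off is an arbitrary choice for illustration:

```python
def load_reader(languages=("en",), gpu=False):
    """Build the EasyOCR reader once; loading the models is the slow part."""
    import easyocr  # imported lazily so the helpers load without it

    return easyocr.Reader(list(languages), gpu=gpu)


def confident_text(results, min_conf=0.5):
    """Join the recognised strings whose confidence is at least min_conf.

    `results` is the list of (bounding_box, text, confidence) entries
    returned by Reader.readtext.
    """
    return " ".join(text for _box, text, conf in results if conf >= min_conf)
```

Create the reader before the loop with `reader = load_reader()`, then call `confident_text(reader.readtext(frame))` per frame, where `frame` is a 3-channel NumPy array such as `img[:, :, :3]` from the mss capture. Whether this reaches real-time rates will depend heavily on whether a GPU is available.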

Upvotes: 1

user898678

Reputation: 3328

pytesseract is not efficient "by default": it wraps the tesseract executable, saves temporary files to disk, and so on. If you are serious about performance, you need to use the Tesseract API directly (e.g. via tesserocr or by writing a custom API wrapper).
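
For the tesserocr route, the gain comes from initialising the engine once and handing images over in memory. A minimal sketch, assuming tesserocr and Pillow are installed and the default tessdata location is usable; `grab_pil` and `ocr_frames` are hypothetical helper names:

```python
def grab_pil(sct, monitor):
    """Grab a screen region with mss and return it as an RGB PIL image."""
    from PIL import Image  # imported lazily so the helpers load without it

    shot = sct.grab(monitor)
    # mss delivers BGRA bytes; the "BGRX" raw mode reorders them to RGB.
    return Image.frombytes("RGB", shot.size, shot.bgra, "raw", "BGRX")


def ocr_frames(frames):
    """Yield the recognised text for each PIL image, reusing one engine."""
    from tesserocr import PyTessBaseAPI

    # The engine is created once here, not once per frame.
    with PyTessBaseAPI() as api:
        for frame in frames:
            api.SetImage(frame)
            yield api.GetUTF8Text()
```

Inside `with mss() as sct:`, something like `for text in ocr_frames(grab_pil(sct, monitor) for _ in range(100)): print(text)` would OCR 100 captures without spawning a process or writing a temporary file for each one.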

Upvotes: 0
