Reputation: 153
I'm trying to create a real-time OCR in Python using mss and pytesseract.
So far, I've been able to capture my entire screen at a steady 30 FPS. If I capture a smaller area of around 500x500, I get 100+ FPS.
However, as soon as I include this line of code, text = pytesseract.image_to_string(img), boom: 0.8 FPS. Is there any way I could optimise my code to get a better FPS? The code does detect text; it's just extremely slow.
from mss import mss
import cv2
import numpy as np
from time import time
import pytesseract

pytesseract.pytesseract.tesseract_cmd = r'C:\Users\Vamsi\AppData\Local\Programs\Tesseract-OCR\tesseract.exe'

with mss() as sct:
    # Part of the screen to capture
    monitor = {"top": 200, "left": 200, "width": 500, "height": 500}
    while "Screen capturing":
        begin_time = time()
        # Get raw pixels from the screen, save it to a Numpy array
        img = np.array(sct.grab(monitor))
        # Finds text from the images
        text = pytesseract.image_to_string(img)
        # Display the picture
        cv2.imshow("Screen Capture", img)
        # Display FPS
        print('FPS {}'.format(1 / (time() - begin_time)))
        # Press "q" to quit
        if cv2.waitKey(25) & 0xFF == ord("q"):
            cv2.destroyAllWindows()
            break
Upvotes: 11
Views: 23263
Reputation: 6723
I had this same problem. OCRing a document in the native desktop environment took 5 seconds, while the same document took 200+ seconds when running in Docker on the same machine.
The solution turned out to be adding:
ENV OMP_THREAD_LIMIT=1
to my Dockerfile.
This disables multithreading in Tesseract. Why that makes it faster in Docker, I couldn't tell you, but it brings the time down close to native performance for me.
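If you are not using Docker, the same variable can be set from Python before the OCR call. This is only a minimal sketch: it assumes pytesseract is installed and that your Tesseract build uses OpenMP (otherwise the variable has no effect), and the ocr helper is a hypothetical name used for illustration.

```python
import os

# Set this before any OCR call: pytesseract launches tesseract as a
# subprocess, and the subprocess inherits this process's environment.
os.environ["OMP_THREAD_LIMIT"] = "1"


def ocr(img):
    """Hypothetical helper: run Tesseract on one image via pytesseract."""
    # Imported lazily; assumes pytesseract and the tesseract binary are installed.
    import pytesseract
    return pytesseract.image_to_string(img)
```

Whether this helps outside Docker depends on the machine, so it is worth timing the loop with and without it.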
Upvotes: 2
Reputation: 8021
After looking at the pytesseract code, I see that it converts the image format and saves the image locally before feeding it to tesseract. By changing from PNG to JPG I got a 3x speedup (from 9.5 to 3 seconds/image). I guess there is more optimization that could be done in the Python part of the code.
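To check whether a change like this pays off on your own images, it helps to measure seconds per image before and after. A small timing sketch using only the standard library; seconds_per_call is a hypothetical helper name, not part of pytesseract.

```python
from time import perf_counter


def seconds_per_call(fn, repeats=5):
    """Return the average wall-clock seconds taken by one call to fn()."""
    start = perf_counter()
    for _ in range(repeats):
        fn()
    return (perf_counter() - start) / repeats
```

For example, seconds_per_call(lambda: pytesseract.image_to_string(img)) gives a per-image figure that can be compared across image formats or settings.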
Upvotes: 2
Reputation: 237
You can use EasyOCR (easyocr), a lightweight Python package for OCR applications. It is very fast and reliable, and it supports 70+ languages, including English, Chinese, Japanese, Korean, and Hindi, with more being added.
pip install easyocr
Check this out: https://huggingface.co/spaces/tomofi/EasyOCR
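A minimal usage sketch, assuming easyocr is installed and English is the only language needed. read_text is a hypothetical helper name; the reader argument is there so the Reader (which loads its models on creation) can be built once and reused for every frame.

```python
def read_text(image, reader=None):
    """Return the list of text strings EasyOCR finds in `image`."""
    if reader is None:
        # Imported lazily; assumes `pip install easyocr` has been run.
        import easyocr
        # Creating a Reader loads the models, so reuse it across frames.
        reader = easyocr.Reader(["en"])
    # detail=0 returns only the recognised strings, without boxes or scores.
    return reader.readtext(image, detail=0)
```

In a capture loop, create reader = easyocr.Reader(["en"]) once before the loop and call read_text(img, reader) for each frame, where img is a file path or an image array.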
Upvotes: 1
Reputation: 3328
pytesseract is not efficient "by default": it wraps the tesseract executable, saves temporary files to disk, etc. If you are serious about performance, you need to use the Tesseract API directly (e.g. via tesserocr or by creating a custom API wrapper).
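A rough sketch of what this could look like with tesserocr, assuming tesserocr is installed and each frame is a PIL image. ocr_frames and its api_factory parameter are hypothetical names for illustration; the point is that the engine is initialised once and images are passed in memory rather than through temporary files.

```python
def ocr_frames(frames, api_factory=None):
    """Yield the recognised text for each PIL image in `frames`."""
    if api_factory is None:
        # Imported lazily; assumes tesserocr and its Tesseract libraries are installed.
        from tesserocr import PyTessBaseAPI
        api_factory = PyTessBaseAPI
    # The engine is created once and reused for every frame.
    with api_factory() as api:
        for frame in frames:
            api.SetImage(frame)      # pass the image in memory
            yield api.GetUTF8Text()  # run recognition and return the text
```

With mss, a grabbed screenshot can be turned into a PIL image with Image.frombytes("RGB", shot.size, shot.bgra, "raw", "BGRX") before it is passed in.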
Upvotes: 0