Reputation: 21
I am using pytesseract to read numbers from the screen in real time. The image contains mostly digits, dots, and two letters (M and R), as shown below. The numbers keep changing in real time, but the letters M and R always stay in the same place. The background is always green with black text.
As you can see, the numbers in the image are very clear, but the result pytesseract returns is not satisfactory. Sometimes it reads a 7 as a 1. I would like to find algorithms that help improve the OCR result.
Currently I am using Pillow to convert the image to grayscale, and I have also tried resizing the image larger or smaller, but that has not improved the result much. I also applied a filter to the image as shown below, but the result is still not 100% correct.
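import cv2
import pytesseract as tess
scale_factor = 2  # example value only; the factor actually used was not stated
img = cv2.imread('screenshot.png')
img = cv2.resize(img, None, fx=scale_factor, fy=scale_factor, interpolation=cv2.INTER_CUBIC)
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Smooth with a bilateral filter, then binarize with Otsu's threshold
img = cv2.threshold(cv2.bilateralFilter(img, 5, 75, 75), 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
text = tess.image_to_string(img)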
Please suggest any algorithms that would help improve this OCR result.
Upvotes: 1
Views: 612
Reputation: 7995
You can detect the text easily by applying simple thresholding (an inverted binary threshold with Otsu's method).
Code:
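import cv2
import pytesseract
# Load the screenshot and convert it to grayscale
img = cv2.imread("UEWHj.png")
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Inverted binary threshold; Otsu's method picks the threshold value automatically
thr = cv2.threshold(gry, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Run OCR on the thresholded image
txt = pytesseract.image_to_string(thr)
print(txt)
# Show the thresholded image for visual inspection
cv2.imshow("thr", thr)
cv2.waitKey(0)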
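Since the question says the image only ever contains digits, dots, and the letters M and R, one further thing that may help with the 7-versus-1 confusion is restricting the characters Tesseract is allowed to output. The following is a minimal sketch, not a tested fix for this particular image. The helper names `build_tesseract_config` and `read_display` are hypothetical, the page segmentation mode 6 (a single uniform block of text) is an assumption about the layout, and support for `tessedit_char_whitelist` varies between Tesseract versions.

```python
def build_tesseract_config(allowed_chars="0123456789.MR", psm=6):
    """Build a Tesseract config string that limits output to allowed_chars.

    psm=6 assumes one uniform block of text; this is a guess about the layout.
    """
    return f"--psm {psm} -c tessedit_char_whitelist={allowed_chars}"


def read_display(path, config=None):
    """Threshold the image as in the code above, then OCR it with a whitelist."""
    # Imported here so the config helper can be used without OpenCV installed.
    import cv2
    import pytesseract

    img = cv2.imread(path)
    if img is None:
        raise FileNotFoundError(path)
    gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Same inverted Otsu threshold as in the code above
    thr = cv2.threshold(gry, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
    if config is None:
        config = build_tesseract_config()
    return pytesseract.image_to_string(thr, config=config).strip()
```

Calling `read_display("UEWHj.png")` would then return only characters from the whitelist. If the text is known to sit on a single line, passing `build_tesseract_config(psm=7)` as `config` may be worth trying instead.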
Upvotes: 1