Reputation: 147

Pytesseract image to text problem in Python

Please check the following image:

I am using the following code to extract text from the image.

import cv2
import pytesseract

img = cv2.imread("img.png")
txt = pytesseract.image_to_string(img)

But the result differs from the text in the image. It returns:

+BuFl

But it should be:

+Bu#L

I don't know what the problem is. I am pretty new to Pytesseract.

Can anyone help me sort out the problem?

Thank you very much.

Upvotes: 1

Answers (1)

Ahx

Reputation: 8005

One way to solve this is to apply Otsu thresholding.

Unlike simple thresholding, where you choose a fixed threshold value by hand, Otsu's method determines the threshold automatically from the image histogram.
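To make "automatically" concrete, here is a minimal pure-Python sketch of the idea behind Otsu's method: try every candidate threshold and keep the one that maximizes the between-class variance of the two resulting pixel groups. The function name `otsu_threshold` and the sample pixel values are made up for illustration; this is not OpenCV's implementation, and the exact value OpenCV returns may follow a slightly different convention.

```python
def otsu_threshold(pixels):
    """Return a threshold t for 8-bit values; pixels <= t form the dark class."""
    # Build the 256-bin intensity histogram.
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the dark class so far
    sum0 = 0  # sum of intensities in the dark class so far
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue  # dark class still empty
        if w1 == 0:
            break     # bright class empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance (up to a constant factor of 1 / total**2).
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Hypothetical "image": dark text pixels around 30-40, light background around 200-220.
pixels = [30] * 60 + [40] * 40 + [200] * 50 + [220] * 50
t = otsu_threshold(pixels)
binary = [255 if p > t else 0 for p in pixels]
print(t)
```

No threshold is supplied by the caller; the split between the dark and light groups comes entirely from the histogram.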

The following code applies Otsu's threshold with OpenCV before running OCR:

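import cv2
import pytesseract


img = cv2.imread("Tqom8.png")  # Load the image
img = cv2.resize(img, (0, 0), fx=0.5, fy=0.5)  # Downscale to half size
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # Convert to gray
# With THRESH_OTSU the threshold argument (0) is ignored and computed from the image
thr = cv2.threshold(gray, 0, 128, cv2.THRESH_OTSU)[1]
# Run OCR on the thresholded image; --psm 6 assumes a single uniform block of text
txt = pytesseract.image_to_string(thr, config='--psm 6')
print(pytesseract.__version__)
print(txt)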

Result:

0.3.8
+Bu#L

Also make sure to read the Tesseract documentation page "Improving the quality of the output".

Upvotes: 1
