Reputation: 147
Please check the following image:
I am using the following code to extract text from the image.
import cv2
import pytesseract
img = cv2.imread("img.png")
txt = pytesseract.image_to_string(img)
But the result differs from the text in the image.
I get the following result:
+BuFl
But it should be:
+Bu#L
I don't know what the problem is. I am pretty new to pytesseract.
Can anyone help me sort out the problem?
Thank you very much.
Upvotes: 1
Views: 155
Reputation: 7985
One way to solve this is to apply Otsu's thresholding.
Otsu's method finds the threshold value automatically from the image histogram, unlike simple thresholding, where you have to pick a fixed value by hand.
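To give an idea of how the threshold can be found automatically, here is a minimal pure-Python sketch of the idea behind Otsu's method: try every candidate threshold and keep the one that maximises the between-class variance of the two resulting pixel groups. The function name `otsu_threshold` and the sample pixel values are made up for illustration; this is not OpenCV's implementation, and in practice you would just call `cv2.threshold` with `cv2.THRESH_OTSU`.

```python
def otsu_threshold(pixels):
    """Return a threshold t (0-255) that maximises between-class variance.

    Pixels with value <= t form the "dark" class, the rest the "bright" class.
    `pixels` is a flat list of 8-bit grayscale values (illustrative input).
    """
    # Build the 256-bin intensity histogram.
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, 0.0
    w0 = 0    # number of pixels in the dark class so far
    sum0 = 0  # sum of intensities in the dark class so far
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # no dark pixels yet, nothing to separate
        w1 = total - w0
        if w1 == 0:
            break     # every pixel is in the dark class, stop
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance (up to a constant factor of 1 / total**2).
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Hypothetical tiny "image": four dark pixels and four bright ones.
pixels = [12, 15, 10, 14, 200, 210, 205, 198]
t = otsu_threshold(pixels)
binary = [255 if p > t else 0 for p in pixels]
print(t, binary)
```

No threshold value is passed in anywhere; it is derived from the pixel values alone, which is why the `0` given to `cv2.threshold` is ignored when the Otsu flag is set.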
The result of applying Otsu's threshold will be:
import cv2
import pytesseract
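img = cv2.imread("Tqom8.png")  # Load the image
img = cv2.resize(img, (0, 0), fx=0.5, fy=0.5)  # Scale the image to half size
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # Convert to grayscale
# Otsu picks the threshold itself; 255 is the value assigned to pixels above it
thr = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
# Run OCR on the thresholded image; --psm 6 assumes a single uniform block of text
txt = pytesseract.image_to_string(thr, config='--psm 6')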
print(pytesseract.__version__)
print(txt)
Result:
0.3.8
+Bu#L
Also make sure to read the "Improving the quality of the output" page in the Tesseract documentation.
Upvotes: 1