Reputation: 43
I am trying to detect the letter in this image, but pytesseract doesn't seem to recognize it.
import cv2
import pytesseract as tess
img = cv2.imread("letter.jpg")
imggray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
print(tess.image_to_string(imggray))
This is the image in question:
Upvotes: 0
Views: 180
Reputation: 5924
Preprocessing the image (e.g. inverting it) should help, and you can also take advantage of the config options
of pytesseract's image_to_string.
For instance, something along these lines:
import pytesseract
import cv2 as cv
import requests
import numpy as np
# read the image directly from its hosted URL
response = requests.get('https://i.sstatic.net/LGFAu.jpg')
nparr = np.frombuffer(response.content, np.uint8)
img = cv.imdecode(nparr, cv.IMREAD_GRAYSCALE)
# simple inversion as preprocessing
neg_img = cv.bitwise_not(img)
# invoke tesseract with options
text = pytesseract.image_to_string(neg_img, config='--psm 7')
print(text)
This should parse the letter correctly; '--psm 7' tells Tesseract to treat the image as a single line of text.
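The two ideas above (invert when needed, then pass config options) can be sketched as small helpers. This is only an illustration under stated assumptions: invert_if_dark, build_config and read_letter are hypothetical names, the 127 cutoff is an arbitrary midpoint, and restricting output to uppercase letters assumes the image really holds one capital letter. '--psm 10' (single character) and '-c tessedit_char_whitelist=...' are standard Tesseract options, though whitelist support can vary between Tesseract versions.

```python
import string

import numpy as np


def invert_if_dark(gray: np.ndarray, threshold: float = 127.0) -> np.ndarray:
    """Return a dark-text-on-light-background version of an 8-bit grayscale image.

    Assumption: a low mean intensity means the background is dark.
    """
    if gray.mean() < threshold:
        return 255 - gray  # same effect as cv2.bitwise_not on uint8 data
    return gray


def build_config(psm: int = 10, whitelist: str = string.ascii_uppercase) -> str:
    """Build a Tesseract config string; psm 10 treats the image as one character."""
    config = f"--psm {psm}"
    if whitelist:
        config += f" -c tessedit_char_whitelist={whitelist}"
    return config


def read_letter(gray: np.ndarray) -> str:
    """Run OCR on a grayscale image (needs pytesseract and a Tesseract install)."""
    import pytesseract  # imported lazily so the helpers above work without it

    return pytesseract.image_to_string(
        invert_if_dark(gray), config=build_config()
    ).strip()
```

With the image loaded as in the snippet above, read_letter(img) would be the call to try; whether psm 10 or psm 7 works better depends on the image.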
Have a look at related questions for some additional info about preprocessing and tesseract options:
Why does pytesseract fail to recognise digits from image with darker background?
Why does pytesseract fail to recognize digits in this simple image?
Why does tesseract fail to read text off this simple image?
Upvotes: 2
Reputation: 8005
@Davide Fiocco's answer is definitely correct.
I just want to show another way of doing it, with adaptive thresholding.
When you apply adaptive thresholding,
the result will be:
Now when you read it:
txt = pytesseract.image_to_string(thr, config="--psm 7")
print(txt)
Result:
B
Code:
import cv2
import pytesseract
img = cv2.imread("LGFAu.jpg")
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
thr = cv2.adaptiveThreshold(gry, 252, cv2.ADAPTIVE_THRESH_MEAN_C,
                            cv2.THRESH_BINARY_INV, 11, 2)
txt = pytesseract.image_to_string(thr, config="--psm 7")
print(txt)
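To make the last two arguments (block size 11, constant 2) less opaque, here is a rough NumPy sketch of what mean adaptive thresholding with THRESH_BINARY_INV computes: each pixel is compared against the mean of its 11x11 neighbourhood minus 2. The function name adaptive_mean_threshold_inv is hypothetical, and this is not OpenCV's implementation; its rounding and border handling may differ slightly, so treat it as an explanation rather than a drop-in replacement.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def adaptive_mean_threshold_inv(
    gray: np.ndarray, max_value: int = 255, block_size: int = 11, c: float = 2
) -> np.ndarray:
    """Inverted mean adaptive threshold of an 8-bit grayscale image.

    A pixel becomes max_value when it is at or below (local mean - c),
    and 0 otherwise, so dark strokes end up bright on a black background.
    """
    if block_size < 3 or block_size % 2 == 0:
        raise ValueError("block_size must be an odd number >= 3")

    pad = block_size // 2
    # Replicate edge pixels so every pixel has a full neighbourhood.
    padded = np.pad(gray.astype(np.float64), pad, mode="edge")
    # Shape (H, W, block_size, block_size): one window per pixel.
    windows = sliding_window_view(padded, (block_size, block_size))
    local_threshold = windows.mean(axis=(-2, -1)) - c

    return np.where(gray <= local_threshold, max_value, 0).astype(np.uint8)
```

Because the threshold is local, a uniformly lit region maps to 0 everywhere, and only pixels noticeably darker than their surroundings survive.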
Upvotes: 1