Reputation: 11
I have a small image file that was cropped and saved from a larger original image based on a matching criterion. I need to extract the text from this cropped image, but no matter what I try, pytesseract returns nothing for it. Is there something else I can try?
import cv2
import pytesseract
from pytesseract import Output

img = cv2.imread('rois/roi11.jpg')
data = pytesseract.image_to_boxes(img, output_type=Output.DICT)
print(data)
I have also tried scaling the image up and applying a threshold, with no luck:
import cv2
import pytesseract
img = cv2.imread('rois/roi11.jpg')
img2 = cv2.resize(img, (0, 0), fx=2, fy=2)
gry = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)
thr = cv2.threshold(gry, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
data = pytesseract.image_to_string(thr)
print(data)
Upvotes: 1
Views: 61
Reputation: 591
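This code works for me. `--psm 7` tells Tesseract to treat the image as a single line of text, and `gry` is the grayscale image from your code: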
config_tesseract = '--tessdata-dir tessdata --psm 7'
thr = cv2.threshold(gry, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
texto = pytesseract.image_to_string(thr, lang='por', config=config_tesseract)
print(texto)
Upvotes: 0