Reputation: 45
I have attached an image at 300 DPI. I am using the code below to extract text, but I am getting no text. Does anyone know what the issue is?
from PIL import Image
import pytesseract

finalImg = Image.open('withdpi.jpg')
text = pytesseract.image_to_string(finalImg)
Upvotes: 0
Views: 365
Reputation: 7995
Let's observe what your code is doing.
We need to see which part of the text is localized and detected.
To understand the code's behavior, we will use the image_to_data function, which shows which parts of the image are detected.
from PIL import Image, ImageDraw
import pytesseract

# Open the image and convert it to gray-scale
finalImg = Image.open('hP5Pt.jpg').convert('L')
# Initialize ImageDraw class for displaying the detected rectangles in the image
finalImgDraw = ImageDraw.Draw(finalImg)
# OCR detection
d = pytesseract.image_to_data(finalImg, output_type=pytesseract.Output.DICT)
# Number of detected regions
n_boxes = len(d['level'])
# For each detected part
for i in range(n_boxes):
    # Get the localized region
    (x, y, w, h) = (d['left'][i], d['top'][i], d['width'][i], d['height'][i])
    # rectangle() expects two corner points, so derive the bottom-right corner
    shape = [(x, y), (x + w, y + h)]
    # Draw the region
    finalImgDraw.rectangle(shape, outline="red")
# Display
finalImg.show()
# OCR "psm 6: Assume a single uniform block of text."
txt = pytesseract.image_to_string(finalImg, config="--psm 6")
# Result
print(txt)
Result:
i
I
So the displayed image shows that nothing meaningful is detected, and the OCR output is not the desired result.
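To check what image_to_data reported without relying only on the drawn rectangles, the returned dictionary can be filtered for entries that actually contain text. The following is a minimal sketch, not part of the original answer: `detected_words` and `min_conf` are hypothetical names, and the `sample` dictionary is hand-written to imitate the shape of a `pytesseract.Output.DICT` result rather than taken from the real image. It assumes that structural entries (page, block, line) carry empty text and a confidence of -1, and that `conf` values may arrive as strings depending on the pytesseract version.

```python
def detected_words(data, min_conf=0.0):
    """Return (text, confidence, box) for each non-empty entry of an
    image_to_data DICT result; box is (x0, y0, x1, y1)."""
    words = []
    for i, text in enumerate(data["text"]):
        conf = float(data["conf"][i])  # conf may be a string in some versions
        if not text.strip() or conf < min_conf:
            continue  # skip structural entries and low-confidence hits
        x0, y0 = data["left"][i], data["top"][i]
        # Convert width/height into the bottom-right corner
        x1, y1 = x0 + data["width"][i], y0 + data["height"][i]
        words.append((text, conf, (x0, y0, x1, y1)))
    return words


# Hand-written sample in the shape of pytesseract.Output.DICT (not real OCR output)
sample = {
    "text": ["", " ", "i", "I"],
    "conf": ["-1", "-1", "12", "35.5"],
    "left": [0, 0, 10, 40],
    "top": [0, 0, 20, 22],
    "width": [100, 100, 5, 8],
    "height": [50, 50, 12, 14],
}
for word in detected_words(sample):
    print(word)
```

With a real image, the dictionary `d` returned by `pytesseract.image_to_data(..., output_type=pytesseract.Output.DICT)` could be passed in place of `sample`. A short list of low-confidence single characters would suggest that the text region itself is not being localized.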
There might be various reasons.
Here are some facts about the input image:
Binary image.
Big rectangle artifact.
Text is a little bit dilated.
We can't know whether the image requires pre-processing without testing.
We are sure that the big black rectangle is an artifact, so we need to remove it. One solution is to select only part of the image.
To select part of the image, we need to use crop and some trial and error to find the ROI.
If we split the image into two halves by height, we don't want the half containing the artifact.
At first glance, we want (0 -> height/2). If you play with the values, you can see that the exact text location is between (height/6 -> height/4).
Result will be:
$1,582
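The trial-and-error step amounts to turning two fractions of the image height into a crop box. A minimal sketch of that arithmetic follows; `band_box` is a hypothetical helper, and the 600x1200 size is an assumed example, not the size of the actual image. `Fraction` is used so that values such as 1/6 do not pick up floating-point rounding before truncation.

```python
from fractions import Fraction


def band_box(size, top, bottom):
    """Return a (left, upper, right, lower) box covering the full width
    between two fractions of the image height."""
    width, height = size
    return (0, int(height * Fraction(top)), width, int(height * Fraction(bottom)))


# Assumed example size; with a real image this would be finalImg.size
size = (600, 1200)
print(band_box(size, 0, Fraction(1, 2)))               # first guess: top half
print(band_box(size, Fraction(1, 6), Fraction(1, 4)))  # narrowed band
```

The resulting tuple has the (left, upper, right, lower) layout that Pillow's `Image.crop` accepts, so the band can be tightened step by step while checking the OCR output each time.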
Code:
from PIL import Image, ImageDraw
import pytesseract

# Open the image and convert it to gray-scale
finalImg = Image.open('hP5Pt.jpg').convert('L')
# Get width and height of the image
w, h = finalImg.size
# Keep only the band that contains the desired text
finalImg = finalImg.crop((0, int(h/6), w, int(h/4)))
# Initialize ImageDraw class for displaying the detected rectangles in the image
finalImgDraw = ImageDraw.Draw(finalImg)
# OCR detection
d = pytesseract.image_to_data(finalImg, output_type=pytesseract.Output.DICT)
# Number of detected regions
n_boxes = len(d['level'])
# For each detected part
for i in range(n_boxes):
    # Get the localized region
    (x, y, w, h) = (d['left'][i], d['top'][i], d['width'][i], d['height'][i])
    # rectangle() expects two corner points, so derive the bottom-right corner
    shape = [(x, y), (x + w, y + h)]
    # Draw the region
    finalImgDraw.rectangle(shape, outline="red")
# Display
finalImg.show()
# OCR "psm 6: Assume a single uniform block of text."
txt = pytesseract.image_to_string(finalImg, config="--psm 6")
# Result
print(txt)
If you can't get the same result as mine, check which Tesseract version pytesseract is using:
print(pytesseract.get_tesseract_version())
For me the result is 4.1.1
Upvotes: 1