Reputation: 45

Issue recognizing text in image with pytesseract python module

I have attached an image with 300 DPI. I am using the code below to extract text, but I get no text back. Does anyone know what the issue is?

finalImg = Image.open('withdpi.jpg')
text = pytesseract.image_to_string(finalImg)

image to extract text from

Upvotes: 0

Answers (1)

Ahx

Reputation: 7995

Let's look at what your code is doing.

We need to see which parts of the image are localized and detected as text.
To understand the code's behavior, we will use the image_to_data function.
image_to_data will show what part of the image is detected.
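
As a side note, with output_type=pytesseract.Output.DICT the returned dictionary also carries a text and a conf entry per region, so the boxes can be filtered before they are drawn. The following is a minimal sketch and not part of the original answer: confident_words and the sample dictionary are hypothetical, made up to mirror the keys used in this answer, and it assumes that rows which are not words come back with empty text and a conf of -1.

```python
def confident_words(data, min_conf=0.0):
    """Return (text, (x0, y0, x1, y1), conf) for each non-empty detection.

    `data` is assumed to have the shape of the dictionary returned by
    pytesseract.image_to_data(..., output_type=pytesseract.Output.DICT).
    """
    results = []
    for i, text in enumerate(data["text"]):
        # conf may be an int, a float, or a string depending on the version
        conf = float(data["conf"][i])
        if not text.strip() or conf < min_conf:
            continue
        x, y = data["left"][i], data["top"][i]
        w, h = data["width"][i], data["height"][i]
        # Convert (left, top, width, height) to two corners
        results.append((text, (x, y, x + w, y + h), conf))
    return results


# Hypothetical sample, for illustration only
sample = {
    "text": ["", "$1,582", " "],
    "conf": ["-1", 91.0, 12],
    "left": [0, 10, 50],
    "top": [0, 20, 60],
    "width": [100, 40, 5],
    "height": [80, 15, 5],
}
print(confident_words(sample))
```

The corner form (x0, y0, x1, y1) is what ImageDraw.rectangle expects, which is why the helper converts width and height into a second corner.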

from PIL import Image, ImageDraw
import pytesseract

# Open the image and convert it to gray-scale
finalImg = Image.open('hP5Pt.jpg').convert('L')

# Draw on a copy so the rectangles do not interfere with the OCR below
preview = finalImg.copy()
finalImgDraw = ImageDraw.Draw(preview)

# OCR detection
d = pytesseract.image_to_data(finalImg, output_type=pytesseract.Output.DICT)

# Number of detected regions
n_boxes = len(d['level'])

# For each detected region
for i in range(n_boxes):

    # Get the localized region
    (x, y, w, h) = (d['left'][i], d['top'][i], d['width'][i], d['height'][i])

    # ImageDraw.rectangle expects two corners, not (width, height)
    shape = [(x, y), (x + w, y + h)]

    # Draw the region
    finalImgDraw.rectangle(shape, outline="red")

# Display the image with all detected regions
preview.show()

# OCR "psm 6: Assume a single uniform block of text."
txt = pytesseract.image_to_string(finalImg, config="--psm 6")

# Result
print(txt)

Result:

```
i
I
```

So, as the displayed image itself shows, nothing useful is detected. The code does not work as-is, and the output is not the desired result.
There might be various reasons.
Here are some facts about the input image:
- Binary image.
- Big rectangle artifact.
- Text is a little bit dilated.

We can't know whether the image requires pre-processing without testing.
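
One way to test that is to apply a candidate pre-processing step and compare the OCR output before and after. As a sketch only, assuming dark text on a light background: since the text looks slightly dilated, a max filter can thin the strokes. thin_dark_text and the synthetic demo image are hypothetical, and whether this helps on the image in question is untested here.

```python
from PIL import Image, ImageFilter


def thin_dark_text(img, size=3):
    """Thin dark strokes on a light background with a max filter.

    Each pixel becomes the brightest value in its size x size
    neighborhood, so dark strokes shrink by size // 2 on every side.
    """
    return img.convert("L").filter(ImageFilter.MaxFilter(size))


# Synthetic demo: a 3-pixel-wide black vertical stroke on white
demo = Image.new("L", (9, 9), 255)
for x in range(3, 6):
    for y in range(9):
        demo.putpixel((x, y), 0)

thinned = thin_dark_text(demo)
# The stroke is reduced to its center column
print(thinned.getpixel((3, 4)), thinned.getpixel((4, 4)), thinned.getpixel((5, 4)))
```

The thinned image can then be passed to pytesseract.image_to_string in place of the original to see whether recognition changes.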
We are sure that the big black rectangle is an artifact, and we need to remove it. One solution is to select only part of the image.
To select part of the image, we use crop and some trial and error to find the ROI.
- Split the image into two halves by height. We don't want the half that contains the artifact.
- At first glance, we want (0 -> height/2). If you play with the values, you can see that the text is actually located between (height/6 -> height/4).
Result will be:
```
$1,582
```
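
The band arithmetic described above can be isolated into a small helper, which makes the trial-and-error step easier to repeat with other fractions. This is a sketch under assumptions: band_box is a hypothetical name, and the 600 x 1200 size is an invented example rather than the size of the actual image.

```python
from fractions import Fraction


def band_box(width, height, top, bottom):
    """Return a (left, upper, right, lower) box for a horizontal band.

    `top` and `bottom` are fractions of the image height, with
    0 <= top < bottom <= 1. The result is suitable for Image.crop.
    """
    top, bottom = Fraction(top), Fraction(bottom)
    if not 0 <= top < bottom <= 1:
        raise ValueError("expected 0 <= top < bottom <= 1")
    return (0, int(height * top), width, int(height * bottom))


# Band between height/6 and height/4 for an invented 600 x 1200 image
print(band_box(600, 1200, Fraction(1, 6), Fraction(1, 4)))  # (0, 200, 600, 300)
```

Fraction is used instead of float so that the example bounds come out exact. The returned tuple can be passed directly to finalImg.crop.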
Code:

from PIL import Image, ImageDraw
import pytesseract

# Open the image and convert it to gray-scale
finalImg = Image.open('hP5Pt.jpg').convert('L')

# Get width and height of the image
w, h = finalImg.size

# Keep only the band that contains the desired text
finalImg = finalImg.crop((0, int(h/6), w, int(h/4)))

# Draw on a copy so the rectangles do not interfere with the OCR below
preview = finalImg.copy()
finalImgDraw = ImageDraw.Draw(preview)

# OCR detection
d = pytesseract.image_to_data(finalImg, output_type=pytesseract.Output.DICT)

# Number of detected regions
n_boxes = len(d['level'])

# For each detected region
for i in range(n_boxes):

    # Get the localized region
    (x, y, w, h) = (d['left'][i], d['top'][i], d['width'][i], d['height'][i])

    # ImageDraw.rectangle expects two corners, not (width, height)
    shape = [(x, y), (x + w, y + h)]

    # Draw the region
    finalImgDraw.rectangle(shape, outline="red")

# Display the cropped image with all detected regions
preview.show()

# OCR "psm 6: Assume a single uniform block of text."
txt = pytesseract.image_to_string(finalImg, config="--psm 6")

# Result
print(txt)

If you don't get the same result as I do, check which Tesseract version pytesseract is using:

print(pytesseract.get_tesseract_version())

For me, the result is 4.1.1.

Upvotes: 1
