PSEUDO

Reputation: 113

Detect only horizontal text with Tesseract

I have an image with some horizontal and some vertical text, and I'm detecting the text using Tesseract OCR. Here is the array Tesseract returns:

'text': ['', '', '', '', 'Some', 'other', 'text', 'horizontal', '', '', '', 'JEDIY9A', ']xO]', 'WOPUeI', 'BWOS', 'SI', 'SIUL']

As you can see, it only detects the horizontal text correctly. Is there a way to force Tesseract to detect only horizontal text? I could then rotate the image by 90 degrees and pass it in again to detect the vertical text (which would now be horizontal).

Or is there a simpler solution?

Image with horizontal & vertical text

Upvotes: 1

Views: 3943

Answers (4)

Bapuji Nakka

Reputation: 31

boxes = pytesseract.image_to_boxes(img, config='--psm 12')

PSM 12 (sparse text with OSD) helped me detect only the horizontal text.
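As a rough sketch of how the result of image_to_boxes could be consumed: pytesseract returns it as plain text, one character per line in the form "char left bottom right top page", with coordinates measured from the bottom-left corner. The helper name parse_boxes and the sample string are made up for illustration and are not real output from the image in the question.

```python
def parse_boxes(box_text):
    """Parse image_to_boxes output into (char, left, bottom, right, top) tuples."""
    boxes = []
    for line in box_text.splitlines():
        parts = line.split()
        if len(parts) != 6:
            continue  # skip blank or unexpected lines
        char, coords = parts[0], parts[1:5]
        boxes.append((char, *map(int, coords)))
    return boxes


# Hypothetical sample in the box-file layout (not taken from the question's image)
sample = "S 10 20 30 40 0\no 32 20 48 36 0"
print(parse_boxes(sample))

# With a real image you would pass the string returned by
# pytesseract.image_to_boxes(img, config='--psm 12') instead of `sample`.
```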

Upvotes: 0

Esraa Abdelmaksoud

Reputation: 1689

You may consider manipulating the page segmentation modes. However, there's no guarantee it won't detect the rest of the text and recognize it incorrectly. So, I suggest that you use PSM 6 to read your image as a block and return the output as a data frame as follows:

import pytesseract
from pytesseract import Output
df = pytesseract.image_to_data(image, output_type=Output.DATAFRAME, config='--psm 6')

When you have a look at the output, you'll find a confidence score and line number for the text. You may use any of these to set a threshold and then recreate a list as you wish from the output like this:

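import pandas as pd  # pandas must be installed for Output.DATAFRAME
filter_text = df[df['conf'] >= 90]  # keep rows at or above the confidence threshold
text_list = filter_text['text'].tolist()  # build the list from the filtered rows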

You may also consider the spacing between the text using the column "left" for more accuracy. Data frames are always helpful. You just need to make the best use of them. :)
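A minimal sketch of the same idea without pandas, assuming the data is requested with output_type=Output.DICT instead of a data frame. It keeps words at or above a confidence threshold and groups them by block, paragraph and line number. The function name confident_lines, the default threshold of 60 and the confidence values in the sample are all invented for illustration. There is also no guarantee that the garbled vertical words score low on a real image.

```python
from collections import defaultdict


def confident_lines(data, min_conf=60.0):
    """Group words with confidence >= min_conf into lines of text."""
    lines = defaultdict(list)
    for i, word in enumerate(data["text"]):
        # conf may be an int, float or str depending on the pytesseract version
        conf = float(data["conf"][i])
        if conf < min_conf or not word.strip():
            continue
        key = (data["block_num"][i], data["par_num"][i], data["line_num"][i])
        lines[key].append(word)
    # Sort by (block, paragraph, line) so lines come out in reading order
    return [" ".join(words) for _, words in sorted(lines.items())]


# Hypothetical sample shaped like image_to_data(..., output_type=Output.DICT);
# the words come from the question, the confidence values are invented.
sample = {
    "text": ["", "Some", "other", "text", "horizontal", "JEDIY9A", "]xO]"],
    "conf": [-1, 96, 95, 96, 93, 21, 12],
    "block_num": [1, 1, 1, 1, 1, 2, 2],
    "par_num": [1, 1, 1, 1, 1, 1, 1],
    "line_num": [0, 1, 1, 1, 1, 1, 1],
}
print(confident_lines(sample))
```

The threshold needs tuning per image, so it is worth printing the conf column first to see how the correct and garbled words actually score.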

Upvotes: 0

Ahx

Reputation: 7995

pytesseract is not rotation-invariant. Therefore you need to do additional preprocessing to read the vertical text.

For instance, rotate the grayscale image 90 degrees clockwise:

gry = cv2.rotate(gry, cv2.ROTATE_90_CLOCKWISE)

Then read the rotated image:

print(pytesseract.image_to_string(gry).split("\n")[0])

Result:

This is some random text vertical

So how can you read both texts?

  • First, read the horizontal text:

      import cv2
      import pytesseract

      img = cv2.imread("7IIrb.jpg")
      gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
      txt = pytesseract.image_to_string(gry).split("\n")[0]
      print(txt)

  • Then rotate the image 90 degrees clockwise and read it again:

      gry = cv2.rotate(gry, cv2.ROTATE_90_CLOCKWISE)
      txt = pytesseract.image_to_string(gry).split("\n")[0]
      print(txt)

  • Result:

      Some other text horizontal
      This is some random text vertical


Code:


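import cv2
import pytesseract

# Load the image and convert it to grayscale
img = cv2.imread("7IIrb.jpg")
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Read the horizontal text (first line of the OCR output)
txt = pytesseract.image_to_string(gry).split("\n")[0]
print(txt)

# Rotate 90 degrees clockwise so the vertical text becomes horizontal, then read again
gry = cv2.rotate(gry, cv2.ROTATE_90_CLOCKWISE)
txt = pytesseract.image_to_string(gry).split("\n")[0]
print(txt)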
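The two passes above can also be wrapped in one helper. This is only a sketch: the function name read_horizontal_and_vertical and its ocr/rotate parameters are hypothetical, added so the logic can be tried without Tesseract installed. By default it falls back to pytesseract.image_to_string and cv2.rotate, as in the code above. Unlike split("\n")[0], it skips leading blank lines before taking the first line.

```python
def read_horizontal_and_vertical(gray, ocr=None, rotate=None):
    """Return (horizontal_text, vertical_text) for a grayscale image."""
    if ocr is None:
        import pytesseract  # imported lazily so the helper can be tested with a stub
        ocr = pytesseract.image_to_string
    if rotate is None:
        import cv2

        def rotate(image):
            return cv2.rotate(image, cv2.ROTATE_90_CLOCKWISE)

    def first_line(text):
        # Return the first non-empty line of the OCR output, or "" if there is none
        for line in text.splitlines():
            if line.strip():
                return line.strip()
        return ""

    horizontal = first_line(ocr(gray))          # pass 1: image as loaded
    vertical = first_line(ocr(rotate(gray)))    # pass 2: image rotated 90 degrees clockwise
    return horizontal, vertical


# Usage with the real libraries (needs the image file and a Tesseract install):
#   import cv2
#   gry = cv2.cvtColor(cv2.imread("7IIrb.jpg"), cv2.COLOR_BGR2GRAY)
#   print(read_horizontal_and_vertical(gry))
```

Whether clockwise or counter-clockwise is the right direction depends on how the vertical text is oriented in your image.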

Upvotes: 0

Sachin Rajput

Reputation: 248

Read about the page segmentation modes; you will find it there. There is one valid value of --psm that does exactly what you want:

Page segmentation modes:
  0    Orientation and script detection (OSD) only.
  1    Automatic page segmentation with OSD.
  2    Automatic page segmentation, but no OSD, or OCR.
  3    Fully automatic page segmentation, but no OSD. (Default)
  4    Assume a single column of text of variable sizes.
  5    Assume a single uniform block of vertically aligned text.
  6    Assume a single uniform block of text.
  7    Treat the image as a single text line.
  8    Treat the image as a single word.
  9    Treat the image as a single word in a circle.
 10    Treat the image as a single character.
 11    Sparse text. Find as much text as possible in no particular order.
 12    Sparse text with OSD.
 13    Raw line. Treat the image as a single text line,
                        bypassing hacks that are Tesseract-specific.

Try --psm 6 or 12.

Alternatively, the answer to "How do I detect vertical text with OpenCV for extraction" describes a solution that could work for you.
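To compare the suggested modes side by side, something like the following sketch could help. The helper name compare_psm_modes and its ocr parameter are hypothetical. By default it calls pytesseract.image_to_string with config='--psm N', and it rejects values outside the 0 to 13 range listed above.

```python
def compare_psm_modes(image, modes=(6, 12), ocr=None):
    """Run OCR once per page segmentation mode and return {mode: text}."""
    if ocr is None:
        import pytesseract  # imported lazily so the helper can be tested with a stub
        ocr = pytesseract.image_to_string
    results = {}
    for mode in modes:
        if not 0 <= mode <= 13:
            raise ValueError(f"--psm must be between 0 and 13, got {mode}")
        results[mode] = ocr(image, config=f"--psm {mode}")
    return results


# Usage with the real library (needs the image file and a Tesseract install):
#   import cv2
#   img = cv2.imread("7IIrb.jpg")
#   for mode, text in compare_psm_modes(img).items():
#       print(mode, repr(text))
```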

Upvotes: 3
