Reputation: 113
I have an image with some horizontal and some vertical text, and I'm detecting the text with Tesseract OCR. Here is the array Tesseract returns:
'text': ['', '', '', '', 'Some', 'other', 'text', 'horizontal', '', '', '', 'JEDIY9A', ']xO]', 'WOPUeI', 'BWOS', 'SI', 'SIUL']
As you can see, it only detects the horizontal text correctly. Is there a way to force Tesseract to detect only horizontal text? I could then rotate the image by 90 degrees and pass it in again to detect the vertical text (which would now be horizontal).
Or is there a simpler solution?
Upvotes: 1
Views: 3943
Reputation: 31
boxes = pytesseract.image_to_boxes(img, config='--psm 12')
PSM 12 helped to detect the horizontal text only.
Upvotes: 0
Reputation: 1689
You may consider manipulating the page segmentation modes. However, there's no guarantee it won't detect the rest of the text and recognize it incorrectly. So, I suggest that you use PSM 6 to read your image as a block and return the output as a data frame as follows:
import pytesseract
from pytesseract import Output
df = pytesseract.image_to_data(image, output_type=Output.DATAFRAME, config='--psm 6')
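When you look at the output, you'll find a confidence score and a line number for each piece of text. You can use either of these to set a threshold and then rebuild a list from the output like this:
import pandas as pd
# Keep only the rows whose confidence is at least 90 (the threshold is a value to tune)
filter_text = df[df['conf'] >= 90]
# Build the list from the filtered rows, not from the full data frame
text_list = filter_text['text'].tolist()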
You may also consider the spacing between the text using the column "left" for more accuracy. Data frames are always helpful. You just need to make the best use of them. :)
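The same filtering idea can be sketched without pandas by asking for Output.DICT instead of Output.DATAFRAME. This is only a minimal sketch: the function name words_by_line, the default threshold of 60, and the confidence values in the sample are assumptions made up for illustration, and the sample dictionary only mimics the shape that image_to_data returns.

```python
from collections import defaultdict


def words_by_line(data, min_conf=60):
    """Group OCR words into text lines, dropping low-confidence words.

    `data` is assumed to have the shape returned by
    pytesseract.image_to_data(..., output_type=Output.DICT).
    """
    lines = defaultdict(list)
    for i, word in enumerate(data["text"]):
        word = str(word).strip()
        conf = float(data["conf"][i])  # conf may arrive as int, float, or str
        if not word or conf < min_conf:
            continue
        key = (data["block_num"][i], data["par_num"][i], data["line_num"][i])
        lines[key].append((data["left"][i], word))
    # Order lines by (block, paragraph, line) and words from left to right.
    return [" ".join(w for _, w in sorted(words))
            for _, words in sorted(lines.items())]


# Made-up stand-in for real OCR output; the confidences are invented.
sample = {
    "text": ["", "Some", "other", "text", "horizontal", "JEDIY9A", "]xO]"],
    "conf": [-1, 96, 95, 96, 93, 12, 8],
    "block_num": [1, 1, 1, 1, 1, 2, 2],
    "par_num": [1, 1, 1, 1, 1, 1, 1],
    "line_num": [0, 1, 1, 1, 1, 1, 1],
    "left": [0, 10, 60, 120, 170, 300, 300],
}
print(words_by_line(sample))
```

With real output you would likely need to tune min_conf per image, since vertical text read as horizontal does not always score low.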
Upvotes: 0
Reputation: 7995
pytesseract is not rotation-invariant. Therefore, you need some additional preprocessing to read the vertical text.
For instance, rotate the image 90 degrees clockwise:
gry = cv2.rotate(gry, cv2.ROTATE_90_CLOCKWISE)
Then read the first line of the OCR output:
print(pytesseract.image_to_string(gry).split("\n")[0])
Result:
This is some random text vertical
So how can you read both texts in the same script?
First, read the horizontal text:
import cv2
import pytesseract
img = cv2.imread("7IIrb.jpg")
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
txt = pytesseract.image_to_string(gry).split("\n")[0]
print(txt)
Then rotate the image 90 degrees clockwise and read it again:
gry = cv2.rotate(gry, cv2.ROTATE_90_CLOCKWISE)
txt = pytesseract.image_to_string(gry).split("\n")[0]
print(txt)
Result:
Some other text horizontal
This is some random text vertical
Code:
import cv2
import pytesseract
img = cv2.imread("7IIrb.jpg")
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
txt = pytesseract.image_to_string(gry).split("\n")[0]
print(txt)
gry = cv2.rotate(gry, cv2.ROTATE_90_CLOCKWISE)
txt = pytesseract.image_to_string(gry).split("\n")[0]
print(txt)
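Taking only the first line with split("\n")[0] works for this particular image. A possibly more general variant, sketched here under assumptions, is to run image_to_data on each orientation and keep only the confident words of each pass. The helper names confident_text and read_passes, the threshold of 60, and the sample confidences and garbage tokens are all hypothetical; the commented lines show how the passes might be built with the real libraries.

```python
def confident_text(data, min_conf=60):
    """Join the words of one OCR pass whose confidence is at least min_conf."""
    words = []
    for word, conf in zip(data["text"], data["conf"]):
        word = str(word).strip()
        if word and float(conf) >= min_conf:
            words.append(word)
    return " ".join(words)


def read_passes(passes, min_conf=60):
    """Map each pass label to the confident text found in that pass."""
    return {label: confident_text(data, min_conf)
            for label, data in passes.items()}


# With OpenCV and pytesseract installed, the passes could be built like this:
#   gry = cv2.cvtColor(cv2.imread("7IIrb.jpg"), cv2.COLOR_BGR2GRAY)
#   rot = cv2.rotate(gry, cv2.ROTATE_90_CLOCKWISE)
#   passes = {
#       "horizontal": pytesseract.image_to_data(gry, output_type=Output.DICT),
#       "vertical": pytesseract.image_to_data(rot, output_type=Output.DICT),
#   }
# Made-up stand-in data so the sketch runs without either library:
passes = {
    "horizontal": {
        "text": ["Some", "other", "text", "horizontal", "JEDIY9A"],
        "conf": [96, 95, 96, 93, 10],
    },
    "vertical": {
        "text": ["This", "is", "some", "random", "text", "vertical", "xyz|"],
        "conf": [95, 96, 94, 92, 96, 91, 7],
    },
}
for label, text in read_passes(passes).items():
    print(label, "->", text)
```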
Upvotes: 0
Reputation: 248
Read about the page segmentation modes; you will find it there. There is one PSM value that does exactly what you want:
Page segmentation modes:
0 Orientation and script detection (OSD) only.
1 Automatic page segmentation with OSD.
2 Automatic page segmentation, but no OSD, or OCR.
3 Fully automatic page segmentation, but no OSD. (Default)
4 Assume a single column of text of variable sizes.
5 Assume a single uniform block of vertically aligned text.
6 Assume a single uniform block of text.
7 Treat the image as a single text line.
8 Treat the image as a single word.
9 Treat the image as a single word in a circle.
10 Treat the image as a single character.
11 Sparse text. Find as much text as possible in no particular order.
12 Sparse text with OSD.
13 Raw line. Treat the image as a single text line, bypassing hacks that are Tesseract-specific.
Try --psm 6 or 12.
Alternatively, the solution in this answer could work for you: How do I detect vertical text with OpenCV for extraction
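Several of the modes above mention OSD. As a rough sketch of how that could be used, pytesseract.image_to_osd returns a plain-text report, and the suggested rotation can be pulled out of its "Rotate:" line. The helper name rotation_from_osd is hypothetical and the sample report below is illustrative, not output from the image in the question. Note that OSD reports a single orientation per image, so on an image mixing horizontal and vertical text it is probably only useful after cropping each region.

```python
import re


def rotation_from_osd(osd_text):
    """Return the 'Rotate' value in degrees from an OSD report, or None."""
    match = re.search(r"^Rotate:\s*(\d+)", osd_text, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


# Illustrative report in the format image_to_osd is assumed to produce;
# with pytesseract installed: osd = pytesseract.image_to_osd(cropped_region)
sample_osd = """Page number: 0
Orientation in degrees: 270
Rotate: 90
Orientation confidence: 21.27
Script: Latin
Script confidence: 4.14
"""
print(rotation_from_osd(sample_osd))
```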
Upvotes: 3