Reputation: 21
I am currently trying out the Google Vision API to extract text from an image of a form. The API extracts everything on the form, even though I have set up a region of interest (ROI) around the specific text I want. Is there a way to extract only the text at a specific location instead of the text from the whole image?
Upvotes: 2
Views: 1178
Reputation: 131
def get_text_within(document, x1, y1, x2, y2):
    """Return the text of every symbol lying fully inside the rectangle (x1, y1)-(x2, y2).

    `document` is a text annotation with pages/blocks/paragraphs/words/symbols
    (e.g. response.full_text_annotation).
    """
    text = ""
    for page in document.pages:
        for block in page.blocks:
            for paragraph in block.paragraphs:
                for word in paragraph.words:
                    for symbol in word.symbols:
                        # Axis-aligned extent of the symbol's four-vertex bounding box.
                        xs = [v.x for v in symbol.bounding_box.vertices]
                        ys = [v.y for v in symbol.bounding_box.vertices]
                        min_x, max_x = min(xs), max(xs)
                        min_y, max_y = min(ys), max(ys)
                        # Keep the symbol only if it sits entirely inside the rectangle.
                        if min_x >= x1 and max_x <= x2 and min_y >= y1 and max_y <= y2:
                            text += symbol.text
                            # Break types: 1 = SPACE, 2 = SURE_SPACE,
                            # 3 = EOL_SURE_SPACE, 5 = LINE_BREAK.
                            # Newer google-cloud-vision releases name this field `type_`.
                            break_type = symbol.property.detected_break.type
                            if break_type in (1, 3):
                                text += ' '
                            if break_type == 2:
                                text += '\t'
                            if break_type == 5:
                                text += '\n'
    return text
Upvotes: 0
Reputation: 60
The Google Vision API cannot extract text from only a specific location of an image; it always extracts the text from the whole image. To get the text from a specific location, you could crop the image to that region before passing it to the API. Another option is to filter the results of the API call using the positions of the four bounding vertices associated with each piece of text.
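As a rough illustration of the filtering option, here is a minimal sketch that does not call the API at all. The `Word` class, the `words_in_roi` helper, and the sample coordinates are hypothetical stand-ins for the annotations and bounding vertices that a real response would contain. It keeps a word when the centre of its bounding box falls inside the ROI, which is one possible rule among several.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[int, int]


@dataclass
class Word:
    """Hypothetical stand-in for one annotation: its text and four corner points."""
    text: str
    vertices: List[Point]


def words_in_roi(words: List[Word], x1: int, y1: int, x2: int, y2: int) -> List[Word]:
    """Return the words whose bounding-box centre lies inside the ROI rectangle."""
    kept = []
    for word in words:
        xs = [x for x, _ in word.vertices]
        ys = [y for _, y in word.vertices]
        # Centre of the axis-aligned box around the four vertices.
        cx = (min(xs) + max(xs)) / 2
        cy = (min(ys) + max(ys)) / 2
        if x1 <= cx <= x2 and y1 <= cy <= y2:
            kept.append(word)
    return kept


# Made-up sample data: two words on the first line of a form, one further down.
words = [
    Word("Name:", [(10, 10), (60, 10), (60, 30), (10, 30)]),
    Word("Alice", [(70, 10), (120, 10), (120, 30), (70, 30)]),
    Word("Date:", [(10, 200), (60, 200), (60, 220), (10, 220)]),
]

# ROI covering only the top strip of the form.
roi_text = " ".join(w.text for w in words_in_roi(words, 0, 0, 200, 50))
print(roi_text)  # prints "Name: Alice"
```

Testing the centre rather than requiring full containment is more forgiving when the ROI is drawn tightly around the text; requiring all four vertices to be inside is stricter and may drop words that slightly overlap the ROI border.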
You can find more info on what is possible to do with the Google Vision API here.
Upvotes: 0