Reputation: 83
I'm using Google Cloud Vision OCR to detect text in an image. I tried reading .confidence on the text annotations that Google returned, but it is always 0.0.
response = client.document_text_detection(image=image_googlecloud)
texts = response.text_annotations
texts[0].confidence == 0.0
This is part of the output of the response variable (the last few lines):
y: 2657
}
}
text: "E"
confidence: 1.0
}
confidence: 0.9900000095367432
}
confidence: 0.9900000095367432
}
block_type: TEXT
confidence: 0.9900000095367432
}
}
When I print the response variable, it contains all the confidence values (all greater than 0.0), but when I try to get the confidence of a particular word with the method above, it returns 0.0. Is there a way to get the confidence of each word?
Upvotes: 1
Views: 599
Reputation: 1552
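DOCUMENT_TEXT_DETECTION returns the extracted text in this hierarchy:
TextAnnotation -> Page -> Block -> Paragraph -> Word -> Symbol.
The confidence values you see when printing the response belong to this hierarchy, which lives in response.full_text_annotation. The flat entries in response.text_annotations do not appear to have their confidence field populated for OCR results, which is why texts[0].confidence reads 0.0. So to get the confidence of each word, you have to iterate through the structural components.
The code below prints the confidence of each word.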
Text in my image: “GOOD MORNING A JOURNEY OF A THOUSAND MILES BEGIN WITH A SINGLE STEP.”
code:
def detect_document_uri(uri):
    """Detects document features in the file located in Google Cloud
    Storage."""
    from google.cloud import vision
    client = vision.ImageAnnotatorClient()
    image = vision.Image()
    image.source.image_uri = uri

    response = client.document_text_detection(image=image)

    for page in response.full_text_annotation.pages:
        for block in page.blocks:
            for paragraph in block.paragraphs:
                for word in paragraph.words:
                    words = ''.join([
                        symbol.text for symbol in word.symbols
                    ])
                    print('Words: {} (confidence: {})'.format(
                        words, word.confidence))

    if response.error.message:
        raise Exception(
            '{}\nFor more info on error messages, check: '
            'https://cloud.google.com/apis/design/errors'.format(
                response.error.message))

detect_document_uri("gs://your_bucket_name/image.jpg")
Code for an image on the local machine:
def detect_document(path):
    """Detects document features in an image."""
    from google.cloud import vision
    import io
    client = vision.ImageAnnotatorClient()

    with io.open(path, 'rb') as image_file:
        content = image_file.read()

    image = vision.Image(content=content)

    response = client.document_text_detection(image=image)

    for page in response.full_text_annotation.pages:
        for block in page.blocks:
            for paragraph in block.paragraphs:
                for word in paragraph.words:
                    word_text = ''.join([
                        symbol.text for symbol in word.symbols
                    ])
                    print('Word text: {} (confidence: {})'.format(
                        word_text, word.confidence))

    if response.error.message:
        raise Exception(
            '{}\nFor more info on error messages, check: '
            'https://cloud.google.com/apis/design/errors'.format(
                response.error.message))

detect_document("path of image from local machine")
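If you would rather collect the confidences than print them, the same traversal can return (word, confidence) pairs. This is only a sketch: word_confidences, low_confidence_words, the 0.9 threshold, and the _fake_* helpers are names I made up for illustration and are not part of the Vision library. The fake objects just mimic the page/block/paragraph/word/symbol shape so the example runs without calling the API; with a real response you would pass the result of client.document_text_detection(image=image) instead.

```python
from types import SimpleNamespace


def word_confidences(response):
    """Return (word_text, confidence) pairs by walking
    pages -> blocks -> paragraphs -> words -> symbols."""
    results = []
    for page in response.full_text_annotation.pages:
        for block in page.blocks:
            for paragraph in block.paragraphs:
                for word in paragraph.words:
                    text = ''.join(symbol.text for symbol in word.symbols)
                    results.append((text, word.confidence))
    return results


def low_confidence_words(response, threshold=0.9):
    """Keep only the words whose confidence is below the (assumed) threshold."""
    return [(text, conf) for text, conf in word_confidences(response)
            if conf < threshold]


def _fake_word(text, confidence):
    # Hypothetical stand-in for a Word message, so the sketch runs offline.
    symbols = [SimpleNamespace(text=ch) for ch in text]
    return SimpleNamespace(symbols=symbols, confidence=confidence)


def _fake_response(words):
    # Hypothetical stand-in for the response: one page, one block, one paragraph.
    paragraph = SimpleNamespace(words=words)
    block = SimpleNamespace(paragraphs=[paragraph])
    page = SimpleNamespace(blocks=[block])
    return SimpleNamespace(
        full_text_annotation=SimpleNamespace(pages=[page]))


fake = _fake_response([_fake_word("GOOD", 0.99), _fake_word("MORNING", 0.85)])
print(word_confidences(fake))
print(low_confidence_words(fake))
```

Returning a list instead of printing makes it easy to flag words that may need manual review; the cutoff itself is something you would have to tune for your own images.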
Upvotes: 1