Google Cloud Vision accuracy for each text returns 0.0

Question

I'm using Google Cloud Vision OCR to detect text in an image. I tried reading .confidence on the text annotations that Google returned, but it always comes back as 0.0.

response = client.document_text_detection(image=image_googlecloud)
texts = response.text_annotations

print(texts[0].confidence)  # always prints 0.0

This is part of the output from printing the response variable (the last few lines):
                y: 2657
              }
            }
            text: "E"
            confidence: 1.0
          }
          confidence: 0.9900000095367432
        }
        confidence: 0.9900000095367432
      }
      block_type: TEXT
      confidence: 0.9900000095367432
    }
  }

When I print the response variable, it contains all the confidence values (all greater than 0.0), but when I try to get the confidence of a particular word with the code above, it returns 0.0. Is there a way to get the confidence of each word?

Sandeep Mohanty · Accepted Answer

DOCUMENT_TEXT_DETECTION follows this hierarchy for extracted text structure:

TextAnnotation -> Page -> Block -> Paragraph -> Word -> Symbol.

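The entries in response.text_annotations are EntityAnnotation objects, and their confidence field does not appear to be populated for text detection, so it reads as the default 0.0. The confidence values you see in the printed response live under response.full_text_annotation, so to get the confidence of each word you have to iterate through its structural components.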
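As a minimal sketch of that traversal, the snippet below walks the same Page -> Block -> Paragraph -> Word -> Symbol nesting on stand-in objects, so it runs without credentials. The helper name iter_word_confidences, the fake_annotation object, and the confidence values in it are made up for illustration; with a real response you would pass response.full_text_annotation instead.

```python
from types import SimpleNamespace


def iter_word_confidences(full_text_annotation):
    """Yield (word_text, confidence) for every word in the annotation."""
    for page in full_text_annotation.pages:
        for block in page.blocks:
            for paragraph in block.paragraphs:
                for word in paragraph.words:
                    # A word has no text field of its own here; join its symbols.
                    text = ''.join(symbol.text for symbol in word.symbols)
                    yield text, word.confidence


def _word(text, confidence):
    # Hypothetical stand-in for a Vision API Word message.
    symbols = [SimpleNamespace(text=ch) for ch in text]
    return SimpleNamespace(symbols=symbols, confidence=confidence)


# Made-up stand-in for response.full_text_annotation (illustrative values only).
fake_annotation = SimpleNamespace(pages=[
    SimpleNamespace(blocks=[
        SimpleNamespace(paragraphs=[
            SimpleNamespace(words=[_word('GOOD', 0.99), _word('MORNING', 0.98)])
        ])
    ])
])

results = list(iter_word_confidences(fake_annotation))
for text, confidence in results:
    print('Word text: {} (confidence: {})'.format(text, confidence))
```

Returning pairs from a generator instead of printing inside the loop makes it easier to reuse the values later, for example to filter or sort them.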

The code below prints the confidence of each word.

Text in my image: “GOOD MORNING A JOURNEY OF A THOUSAND MILES BEGIN WITH A SINGLE STEP.”

Code for an image in Cloud Storage:

def detect_document_uri(uri):
    """Detects document features in the file located in Google Cloud
    Storage."""
    from google.cloud import vision
    client = vision.ImageAnnotatorClient()
    image = vision.Image()
    image.source.image_uri = uri

    response = client.document_text_detection(image=image)

    # Fail early if the API reported an error.
    if response.error.message:
        raise Exception(
            '{}\nFor more info on error messages, check: '
            'https://cloud.google.com/apis/design/errors'.format(
                response.error.message))

    # Walk Page -> Block -> Paragraph -> Word and print each word's confidence.
    for page in response.full_text_annotation.pages:
        for block in page.blocks:
            for paragraph in block.paragraphs:
                for word in paragraph.words:
                    # A word's text is the concatenation of its symbols.
                    word_text = ''.join([
                        symbol.text for symbol in word.symbols
                    ])
                    print('Words: {} (confidence: {})'.format(
                        word_text, word.confidence))

detect_document_uri("gs://your_bucket_name/image.jpg")

Code for an image on the local machine:

def detect_document(path):
    """Detects document features in an image."""
    from google.cloud import vision
    client = vision.ImageAnnotatorClient()

    # Read the raw image bytes from disk.
    with open(path, 'rb') as image_file:
        content = image_file.read()

    image = vision.Image(content=content)

    response = client.document_text_detection(image=image)

    # Fail early if the API reported an error.
    if response.error.message:
        raise Exception(
            '{}\nFor more info on error messages, check: '
            'https://cloud.google.com/apis/design/errors'.format(
                response.error.message))

    # Walk Page -> Block -> Paragraph -> Word and print each word's confidence.
    for page in response.full_text_annotation.pages:
        for block in page.blocks:
            for paragraph in block.paragraphs:
                for word in paragraph.words:
                    # A word's text is the concatenation of its symbols.
                    word_text = ''.join([
                        symbol.text for symbol in word.symbols
                    ])
                    print('Word text: {} (confidence: {})'.format(
                        word_text, word.confidence))

detect_document("path of image from local machine")
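If the goal is to act on the per-word confidence rather than just print it, one option is to collect (word, confidence) pairs and flag the uncertain ones. The sketch below assumes such pairs have already been gathered; the function name low_confidence_words, the 0.9 threshold, and the sample values are hypothetical and not something the Vision API prescribes.

```python
def low_confidence_words(word_confidences, threshold=0.9):
    """Return the (word, confidence) pairs whose confidence is below threshold."""
    return [(text, conf) for text, conf in word_confidences if conf < threshold]


# Made-up sample pairs, standing in for values read from word.confidence.
sample = [('GOOD', 0.99), ('M0RNING', 0.62), ('STEP', 0.95)]

flagged = low_confidence_words(sample)
for text, conf in flagged:
    print('Check manually: {} (confidence: {})'.format(text, conf))
```

The right threshold depends on the images and on how costly a misread word is, so treat 0.9 as a placeholder to tune.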
