Stephen

Reputation: 1617

How to encourage text detection in Google Cloud Document AI

While using the Google Cloud console to label some documents for Document AI training, I received the error message "Cannot create labels with empty values".

The image very clearly shows a well-contrasted, printed "1929". The OCR itself is capable of detecting this text, as it appears in the raw text output for this image.

What (if anything) can be done to help encourage the AI to detect this text?

Upvotes: 0

Views: 158

Answers (1)

Holt Skinner

Reputation: 2234

I wouldn't entirely agree that the "1929" in the scanned image is well-contrasted, but it is strange that this is not being detected when highlighting with the bounding box. I do notice that in the bounding box you have in the image, the "tail" of the 9 character is not entirely contained. It could be useful to try a larger bounding box to contain the full number, or make a correction to the detected value by manually inputting the value on the left-hand side of the labeling console.
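
To check the bounding-box theory, one option is to compare the box you drew against the box OCR assigned to the token. The following is a minimal sketch, assuming you have the processor output as a JSON-style dict (`text`, `pages[].tokens[].layout`) with normalized vertices. The helper names and the sample values are hypothetical, and I'm not claiming the labeling console applies exactly this containment rule.

```python
def token_text(document, token):
    """Return the text a token covers, based on its text anchor offsets."""
    segments = token.get("layout", {}).get("textAnchor", {}).get("textSegments", [])
    # In the JSON form, offsets are strings and startIndex is omitted when it is 0.
    return "".join(
        document["text"][int(seg.get("startIndex", 0)):int(seg["endIndex"])]
        for seg in segments
    )


def token_box(token):
    """Return (x_min, y_min, x_max, y_max) from the token's normalized vertices."""
    vertices = token["layout"]["boundingPoly"]["normalizedVertices"]
    xs = [v.get("x", 0.0) for v in vertices]
    ys = [v.get("y", 0.0) for v in vertices]
    return min(xs), min(ys), max(xs), max(ys)


def box_contains(outer, inner):
    """True if the outer box fully encloses the inner box."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def find_tokens(document, target):
    """Return the boxes of all OCR tokens whose text equals target."""
    hits = []
    for page in document.get("pages", []):
        for token in page.get("tokens", []):
            if token_text(document, token).strip() == target:
                hits.append(token_box(token))
    return hits


# Made-up sample data, only to illustrate the shape of the input.
sample_document = {
    "text": "Year 1929\n",
    "pages": [{
        "tokens": [
            {"layout": {
                "textAnchor": {"textSegments": [{"endIndex": "5"}]},
                "boundingPoly": {"normalizedVertices": [
                    {"x": 0.30, "y": 0.10}, {"x": 0.38, "y": 0.10},
                    {"x": 0.38, "y": 0.15}, {"x": 0.30, "y": 0.15}]}}},
            {"layout": {
                "textAnchor": {"textSegments": [{"startIndex": "5", "endIndex": "10"}]},
                "boundingPoly": {"normalizedVertices": [
                    {"x": 0.40, "y": 0.10}, {"x": 0.52, "y": 0.10},
                    {"x": 0.52, "y": 0.16}, {"x": 0.40, "y": 0.16}]}}},
        ]
    }],
}

tight_label = (0.40, 0.10, 0.52, 0.15)   # cuts off the bottom of the token
larger_label = (0.39, 0.09, 0.53, 0.17)  # fully encloses the token

for box in find_tokens(sample_document, "1929"):
    print("tight label contains token:", box_contains(tight_label, box))
    print("larger label contains token:", box_contains(larger_label, box))
```

If the token's box extends past the label you drew, that would be consistent with the clipped "tail" and would suggest redrawing the box slightly larger.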


Update: The documentation says NOT to correct values that OCR reads incorrectly when the document will be used for training, so the manual-correction suggestion above does not apply in that case:

"If the value of the label is not correctly detected by OCR, don't manually correct the value. That would render it unusable for training purposes."

Upvotes: 0
