Yuriy Chachora

Reputation: 769

Specifying the document language in Google Document AI API

I'm trying to parse a handwritten document with Google Cloud Document AI. The document contains Cyrillic characters, but Document AI occasionally recognizes words as Latin characters. Is there a way to specify the language of the document, so that it tries to recognize words in that particular language regardless of the confidence?

Upvotes: 3

Views: 768

Answers (2)

Holt Skinner

Reputation: 2234

There was a recent update to Document AI that supports the languageHints parameter, which allows you to specify a language. Note: This only works when using the v1beta3 endpoint with the Document OCR processor at this time.

If the language is supported, then provide the BCP-47 code for the language in the processOptions field when sending the processing request.
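As a rough sketch of what such a request might look like, the snippet below builds the JSON body and URL for a REST `process` call using only the standard library. The nesting `processOptions.ocrConfig.hints.languageHints` is my understanding of the request shape and should be checked against the current API reference. The project, location, and processor ID values, the helper names, and the `uk` language code are placeholders chosen for illustration.

```python
import base64
import json
from typing import List


def build_process_request(content: bytes, mime_type: str, language_hints: List[str]) -> dict:
    """Build a JSON body for a Document AI process call with language hints.

    Assumption: hints are nested under processOptions.ocrConfig.hints.
    """
    return {
        "rawDocument": {
            # The REST API expects the raw file bytes as a base64 string.
            "content": base64.b64encode(content).decode("ascii"),
            "mimeType": mime_type,
        },
        "processOptions": {
            "ocrConfig": {"hints": {"languageHints": language_hints}},
        },
    }


def build_process_url(project: str, location: str, processor: str) -> str:
    """Build the v1beta3 process URL for a processor (all IDs are placeholders)."""
    return (
        f"https://{location}-documentai.googleapis.com/v1beta3/"
        f"projects/{project}/locations/{location}/processors/{processor}:process"
    )


if __name__ == "__main__":
    # Placeholder bytes; in practice, read the scanned document from disk.
    body = build_process_request(b"placeholder bytes", "application/pdf", ["uk"])
    print(build_process_url("my-project", "eu", "my-processor-id"))
    print(json.dumps(body, indent=2))
```

The body would then be sent as an authenticated POST to that URL. Replace `uk` with the BCP-47 code that matches your document, provided it appears in the list of supported languages.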

Upvotes: 2

Prajna Rai T

Reputation: 1818

These are the languages supported in Document AI.

Currently, it's not possible to tell Document AI which language to use when recognizing the words in a document. It can only detect the language automatically.

If you would like the ability to specify the document language to be implemented, you can open a new feature request on the issue tracker describing your requirement.

Upvotes: 2
