Bijaya Rai

Reputation: 55

Getting the list of fields detected by the Invoice Parser Google Document AI processor

Invoice Parser documentation lists the fields that this processor attempts to find in a document. https://cloud.google.com/document-ai/docs/processors-list?hl=en_US#processor_invoice-processor

I am only interested in some of the fields in the Invoice processor. Is there a way to get the list of fields and ask the processor to find data points for only a selected set of them?

For example, I am only interested in invoice_date and invoice_id, so I would like the processor to find just these two fields and not spend time finding the others.

I found an endpoint that retrieves the processor's details but it does not get the field list. https://cloud.google.com/document-ai/docs/reference/rest/v1beta3/projects.locations.processors/get

This process endpoint does not have anything to specify the fields that I am interested in. https://cloud.google.com/document-ai/docs/reference/rest/v1beta3/projects.locations.processors/process

My research indicates that this feature does not exist, but I am hoping someone can tell me that I am wrong.

Upvotes: 0

Views: 811

Answers (1)

Holt Skinner

Reputation: 2234

Currently, it's not possible to limit the specific entities that are extracted by a specialized processor. The processor will always process the entire document and extract everything it can find.

However, a new parameter has recently been added to the process method in the REST API that lets you provide a fieldMask to limit which fields are returned in the Document object. This has also been added to the How-to Guide for making a processing request. For now, it only works with the REST API (not the client libraries) and with online processing.
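As a rough illustration of what a request with fieldMask might look like, here is a minimal sketch that only builds the URL and JSON body for the REST process method. The helper name build_process_request, the project, location and processor IDs, and the placeholder file bytes are all hypothetical, and the "v1" API version is an assumption (the question links the v1beta3 reference). The mask value "entities" asks for the entities field of the Document only.

```python
import base64
import json
from typing import Any, Dict, Tuple


def build_process_request(
    project_id: str,
    location: str,
    processor_id: str,
    file_bytes: bytes,
    mime_type: str = "application/pdf",
    field_mask: str = "entities",
    api_version: str = "v1",  # assumption; the question links the v1beta3 docs
) -> Tuple[str, Dict[str, Any]]:
    """Build the URL and JSON body for an online :process call (hypothetical helper)."""
    url = (
        f"https://{location}-documentai.googleapis.com/{api_version}"
        f"/projects/{project_id}/locations/{location}"
        f"/processors/{processor_id}:process"
    )
    body = {
        # The file is sent inline as base64-encoded bytes.
        "rawDocument": {
            "content": base64.b64encode(file_bytes).decode("ascii"),
            "mimeType": mime_type,
        },
        # Restricts which Document fields come back in the response.
        "fieldMask": field_mask,
    }
    return url, body


# Placeholder IDs and file content, for illustration only.
url, body = build_process_request(
    "my-project", "us", "my-processor-id", b"%PDF-1.4 placeholder"
)
print(url)
print(json.dumps(body, indent=2))
```

The body would then be sent as a POST with an OAuth 2.0 access token in an Authorization: Bearer header. Note that the mask only trims the response; it does not change what the processor extracts.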

This isn't exactly what you're looking for, but it can significantly reduce the size of the Document response if you don't need fields like content.
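Since the processor always extracts every entity it can, narrowing the result to invoice_date and invoice_id has to happen on the client. A minimal sketch, assuming the REST response's document has been parsed into a dict whose entities list carries type and mentionText keys; the helper name select_entities and the sample values are made up for illustration.

```python
from typing import Any, Dict, Iterable, List


def select_entities(
    document: Dict[str, Any], wanted_types: Iterable[str]
) -> List[Dict[str, Any]]:
    """Return only the entities whose type is in wanted_types (hypothetical helper)."""
    wanted = set(wanted_types)
    return [e for e in document.get("entities", []) if e.get("type") in wanted]


# Made-up sample shaped like the "document" part of a process response.
sample_document = {
    "entities": [
        {"type": "invoice_id", "mentionText": "INV-0001"},
        {"type": "supplier_name", "mentionText": "Example Supplier Ltd"},
        {"type": "invoice_date", "mentionText": "2021-03-15"},
    ]
}

selected = select_entities(sample_document, ["invoice_date", "invoice_id"])
for entity in selected:
    print(entity["type"], "=", entity["mentionText"])
```

This does not save any processing time on the server, but it keeps the rest of your code from having to deal with entities you do not need.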

Upvotes: 0
