Reputation: 55
Invoice Parser documentation lists the fields that this processor attempts to find in a document. https://cloud.google.com/document-ai/docs/processors-list?hl=en_US#processor_invoice-processor
I am only interested in some of the fields in the Invoice processor. Is there a way to get the list of fields and ask the processor to extract only a selected set of them?
For example, if I am only interested in invoice_date and invoice_id, the processor would find just these two fields and not waste time looking for the others.
I found an endpoint that retrieves the processor's details, but it does not return the field list. https://cloud.google.com/document-ai/docs/reference/rest/v1beta3/projects.locations.processors/get
The process endpoint does not have any parameter for specifying the fields that I am interested in. https://cloud.google.com/document-ai/docs/reference/rest/v1beta3/projects.locations.processors/process
My research indicates that this feature does not exist, but I am hoping someone can tell me that I am wrong.
Upvotes: 0
Views: 811
Reputation: 2234
Currently, it's not possible to limit the specific entities that are extracted by a specialized processor. The processor will always process the entire document and extract everything it can find.
However, a new parameter has recently been added to the process method in the REST API that lets you provide a fieldMask to limit the fields that will be returned in the Document object. This has also been added to the How-to Guide for making a processing request. Currently, this only works for the REST API (not client libraries) and online processing.
This isn't exactly what you're looking for, but it can significantly limit the size of the Document response if you don't need fields like content.
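As a rough illustration, here is a minimal Python sketch of such a REST request using only the standard library. It assumes the regional endpoint pattern `{location}-documentai.googleapis.com`, the `v1` API version, and that a fieldMask value of `entities` is accepted as a top-level Document field; the project, location and processor IDs are placeholders, and the helper names (`build_process_request`, `send_process_request`, `pick_entities`) are made up for this example. Since the processor still extracts every entity, the last helper filters the response on the client side for the types you care about.

```python
import base64
import json
import urllib.request


def build_process_request(project_id, location, processor_id,
                          file_bytes, mime_type, field_mask,
                          api_version="v1"):
    """Return (url, body) for an online process call.

    The URL pattern and api_version default are assumptions; check them
    against the REST reference for your processor.
    """
    url = (
        f"https://{location}-documentai.googleapis.com/{api_version}/"
        f"projects/{project_id}/locations/{location}/"
        f"processors/{processor_id}:process"
    )
    body = {
        "rawDocument": {
            # The REST API expects the file content base64-encoded.
            "content": base64.b64encode(file_bytes).decode("ascii"),
            "mimeType": mime_type,
        },
        # Limits which Document fields come back, e.g. "entities".
        "fieldMask": field_mask,
    }
    return url, body


def send_process_request(url, body, access_token):
    """POST the request; access_token is an OAuth 2.0 bearer token."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def pick_entities(response, wanted_types):
    """Keep only entities whose type is in wanted_types (client-side filter)."""
    entities = response.get("document", {}).get("entities", [])
    return [
        {"type": e.get("type"), "mentionText": e.get("mentionText")}
        for e in entities
        if e.get("type") in wanted_types
    ]
```

A call might look like `url, body = build_process_request("my-project", "us", "my-processor-id", pdf_bytes, "application/pdf", "entities")`, followed by `pick_entities(send_process_request(url, body, token), {"invoice_date", "invoice_id"})`. One way to obtain the token is `gcloud auth print-access-token`.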
Upvotes: 0