Document Intelligence custom model not applying

Question

I am new to coding and am trying to use my custom extraction (template) model from a Python script to analyze more documents. I built the model in Document Intelligence Studio, applied my own labels, trained it, and tested it, and it seemed to work fine. I then used the model ID in a Python app to read PDFs from a local folder, but the documents were not analyzed with the labels from my model. Instead, the output looked like auto-labeling, which I don't want; if I did, I would have used a sample model. Is the fault in my code, or somewhere else? Here is the code (mostly what Document Intelligence Studio provides; maybe I should change it?):

from azure.core.credentials import AzureKeyCredential
from azure.ai.formrecognizer import DocumentAnalysisClient

endpoint = ""
key = ""

model_id = ""
formUrl = ""

# Initialize the DocumentAnalysisClient with endpoint and key
document_analysis_client = DocumentAnalysisClient(
    endpoint=endpoint, credential=AzureKeyCredential(key)
)

# Make sure your document's type is included in the list of document types the custom model can analyze
poller = document_analysis_client.begin_analyze_document_from_url(model_id, formUrl)
result = poller.result()

for idx, document in enumerate(result.documents):
    print("--------Analyzing document #{}--------".format(idx + 1))
    print("Document has type {}".format(document.doc_type))
    print("Document has confidence {}".format(document.confidence))
    print("Document was analyzed by model with ID {}".format(result.model_id))
    for name, field in document.fields.items():
        # Fall back to the raw text only when no parsed value exists (keeps 0 and False)
        field_value = field.value if field.value is not None else field.content
        print("......found field of type '{}' with value '{}' and with confidence {}".format(field.value_type, field_value, field.confidence))


# iterate over tables, lines, and selection marks on each page
for page in result.pages:
    print("\nLines found on page {}".format(page.page_number))
    for line in page.lines:
        print("...Line '{}'".format(line.content.encode('utf-8')))
    for word in page.words:
        print(
            "...Word '{}' has a confidence of {}".format(
                word.content.encode('utf-8'), word.confidence
            )
        )
    for selection_mark in page.selection_marks:
        print(
            "...Selection mark is '{}' and has a confidence of {}".format(
                selection_mark.state, selection_mark.confidence
            )
        )

for i, table in enumerate(result.tables):
    print("\nTable {} can be found on page:".format(i + 1))
    for region in table.bounding_regions:
        # Print the page number itself, not the table index
        print("...{}".format(region.page_number))
    for cell in table.cells:
        print(
            "...Cell[{}][{}] has content '{}'".format(
                cell.row_index, cell.column_index, cell.content.encode('utf-8')
            )
        )
print("-----------------------------------")

I tried changing the model types in Document Intelligence Studio, but nothing helped, so I figure it has something to do with my code. Maybe I have to specify the labeling somewhere? I don't know.
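
One thing that may be worth checking before changing anything in Studio: as far as I understand, the labeled fields of a custom model come back under result.documents, while result.pages and result.tables hold the general layout output (lines, words, tables) that is returned alongside them. The long page and table printout in the script above could therefore look like auto-labeling even when the custom model did run. The sketch below only illustrates that check. summarize_custom_result and analyze_local_pdf are hypothetical helper names, not part of the SDK. The one SDK call used is begin_analyze_document, which takes a file stream instead of a URL and so suits PDFs in a local folder. The object in the demo is fake stand-in data so the block runs without an Azure resource.

```python
from types import SimpleNamespace


def summarize_custom_result(result, expected_model_id):
    """Collect only the labeled fields from an analysis result.

    Raises ValueError if the result was produced by a different model
    than the one we expected (for example, a prebuilt model ID).
    """
    if result.model_id != expected_model_id:
        raise ValueError(
            "Result came from model '{}', expected '{}'".format(
                result.model_id, expected_model_id
            )
        )
    summary = []
    for document in result.documents:
        fields = {}
        for name, field in document.fields.items():
            # Prefer the parsed value; fall back to the raw text content.
            value = field.value if field.value is not None else field.content
            fields[name] = value
        summary.append(
            {
                "doc_type": document.doc_type,
                "confidence": document.confidence,
                "fields": fields,
            }
        )
    return summary


def analyze_local_pdf(client, model_id, path):
    """Hypothetical helper: analyze a local file with a DocumentAnalysisClient.

    Assumes `client` is an azure.ai.formrecognizer.DocumentAnalysisClient.
    """
    with open(path, "rb") as f:
        poller = client.begin_analyze_document(model_id, document=f)
        # Wait for the result while the file is still open.
        return poller.result()


if __name__ == "__main__":
    # Fake stand-in for an AnalyzeResult, for illustration only.
    fake_result = SimpleNamespace(
        model_id="my-custom-model",
        documents=[
            SimpleNamespace(
                doc_type="my-custom-model",
                confidence=0.97,
                fields={
                    "InvoiceNumber": SimpleNamespace(value="A-1001", content="A-1001"),
                    "Total": SimpleNamespace(value=None, content="12.50"),
                },
            )
        ],
    )
    for doc in summarize_custom_result(fake_result, "my-custom-model"):
        print(doc)
```

If the summary lists your own label names, the custom model is being applied and the rest of the output is just the layout section of the same result. If it raises, or the documents list is empty, the model ID passed to the client would be the first thing to re-check.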
