Reputation: 1
I'd like to have an Azure Function that runs whenever a new document is added to an Azure Cosmos DB container and automatically adds that document to an existing Azure AI Search index, so that the new document becomes searchable.
The challenge is that the new documents could contain new fields or even nested new fields that don't exist in the index. My understanding is that we need to first add these fields to the index before adding the documents.
My question is whether there is an easy way to handle such use cases. I could manually compare the fields of the new documents with the existing fields in the index and add the missing fields to the index. However, handling nested fields and making this robust could be cumbersome. I would imagine a function for this already exists, because when you first create the index, Azure is able to extract the nested fields for you to pick from.
Edit:
Function code is added:
@app.cosmos_db_trigger(arg_name="docs",
                       container_name="xxx",
                       database_name="xxxx",
                       lease_container_name="leases",
                       create_lease_container_if_not_exists="true",
                       connection="xxx")
def cosmosdb_trigger(docs: func.DocumentList):
    logging.info('Python CosmosDB triggered.')
    if docs:
        logging.info(f'{len(docs)} documents modified.')
        # Initialize the search client
        service_name = 'xxx'
        index_name = 'xxx'
        search_client = SearchClient(endpoint=f'https://{service_name}.search.windows.net',
                                     index_name=index_name,
                                     credential=AzureKeyCredential(os.getenv('xxx')))
        logging.info('search_client is initiated')
        documents = [doc.to_dict() for doc in docs]
        # todo: since documents may contain fields that don't exist in the index, update the fields of index here before uploading documents.
        search_client.upload_documents(documents=documents)
        logging.info(f'Documents sent to {service_name}')
Upvotes: 0
Views: 888
Reputation: 3478
The approach below handles new documents added to Azure Cosmos DB that must be added automatically to an existing Azure AI Search index, including documents with new fields or new nested fields.
The Cosmos DB trigger fires whenever the Azure Cosmos DB container changes. When a new document is added to the container, the function receives that document.
The code below extracts the fields and nested fields from the document and compares them with the fields that already exist in the Azure AI Search index.
If it finds any new fields or nested fields, it updates the index schema to include them, and then it adds the document to the index.
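The answer does not show the extract_fields and compare_fields helpers, so here is a minimal sketch of what they might look like. It is an illustration under stated assumptions, not the answerer's implementation: nested fields are written as slash-separated paths (the notation Azure AI Search uses for sub-fields of complex types), and keys starting with an underscore are assumed to be Cosmos DB system properties such as _rid, _etag, and _ts that should not be indexed.

```python
from typing import Iterable, List, Set


def extract_fields(document: dict, prefix: str = "") -> Set[str]:
    """Return every field path in a document, e.g. {"id", "address/city"}."""
    paths: Set[str] = set()
    for key, value in document.items():
        # Assumption: underscore-prefixed keys are Cosmos DB system
        # properties (_rid, _etag, _ts, ...) and are skipped.
        if key.startswith("_"):
            continue
        path = f"{prefix}/{key}" if prefix else key
        paths.add(path)
        if isinstance(value, dict):
            # Nested object: recurse to collect its sub-fields.
            paths |= extract_fields(value, path)
        elif isinstance(value, list):
            # Array of objects: collect the sub-fields of every element.
            for item in value:
                if isinstance(item, dict):
                    paths |= extract_fields(item, path)
    return paths


def compare_fields(document_fields: Iterable[str],
                   existing_fields: Iterable[str]) -> List[str]:
    """Return the field paths present in the document but not in the index."""
    return sorted(set(document_fields) - set(existing_fields))


doc = {
    "id": "1",
    "_ts": 1700000000,
    "address": {"city": "Oslo", "geo": {"lat": 59.9}},
    "tags": [{"name": "a"}],
}
existing = {"id", "address", "address/city"}
print(compare_fields(extract_fields(doc), existing))
# ['address/geo', 'address/geo/lat', 'tags', 'tags/name']
```

The list of existing paths would have to be built from the index definition in the same slash-separated form, by walking each field and its sub-fields.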
Packages used: azure.core, azure.cosmos, and azure-search-documents.
The trigger code is adapted from the documentation page "Azure Cosmos DB trigger for Functions".
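from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient

# extract_fields, get_existing_fields, compare_fields, update_index_schema and
# add_document_to_index are helper functions that are not shown here.
def cosmos_db_trigger(documents):
    index_name = "index"
    search_service_endpoint = "Search service url"
    search_api_key = "admin keys"
    # SearchIndexClient manages the index schema; create it once per invocation.
    search_client = SearchIndexClient(search_service_endpoint, AzureKeyCredential(search_api_key))
    for document in documents:
        # This assumes the document payload is stored under a 'data' key.
        new_document = document['data']
        # Collect the field names (including nested ones) of the incoming document.
        fields = extract_fields(new_document)
        existing_fields = get_existing_fields(search_client, index_name)
        new_fields = compare_fields(fields, existing_fields)
        # Extend the index schema before uploading, so the upload does not hit unknown fields.
        if new_fields:
            update_index_schema(search_client, index_name, new_fields)
        add_document_to_index(search_client, index_name, new_document)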
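The schema update step also needs a type for each new field, which the code above leaves open. The following is a rough sketch of how field definitions could be inferred from sample values. The function names infer_edm_type and build_field_definition are hypothetical, and so are the choices to map every integer to Edm.Int64, to fall back to Edm.String for None, empty lists, and anything unrecognised, and to look only at the first element of an array. The resulting dictionaries follow the name/type/fields shape of an Azure AI Search index definition.

```python
from typing import Any, Dict


def infer_edm_type(value: Any) -> str:
    """Guess an Azure AI Search primitive type from a sample Python value."""
    # bool must be tested before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "Edm.Boolean"
    if isinstance(value, int):
        return "Edm.Int64"
    if isinstance(value, float):
        return "Edm.Double"
    # Assumption: strings, None and anything else are indexed as strings.
    return "Edm.String"


def build_field_definition(name: str, value: Any) -> Dict[str, Any]:
    """Build a (possibly nested) field definition from a sample value."""
    if isinstance(value, dict):
        # Nested object becomes a complex field with sub-fields.
        return {
            "name": name,
            "type": "Edm.ComplexType",
            "fields": [build_field_definition(k, v) for k, v in value.items()],
        }
    if isinstance(value, list) and value:
        first = value[0]  # Assumption: the first element is representative.
        if isinstance(first, dict):
            return {
                "name": name,
                "type": "Collection(Edm.ComplexType)",
                "fields": [build_field_definition(k, v) for k, v in first.items()],
            }
        return {"name": name, "type": f"Collection({infer_edm_type(first)})"}
    return {"name": name, "type": infer_edm_type(value)}


print(build_field_definition("address", {"city": "Oslo", "lat": 59.9}))
```

With the Python SDK, equivalent field objects can be created with SimpleField and ComplexField from azure.search.documents.indexes.models, appended to the fields of the index returned by SearchIndexClient.get_index, and saved with SearchIndexClient.create_or_update_index. Inferring types from a single document is fragile, so a real deployment would probably want to validate or override the guessed types before changing the index.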
Upvotes: 0