Reputation: 21
I'm working on an Azure Function that triggers when a document is marked as deleted in an Azure Cosmos DB container. When this happens, I want to delete the associated messages in Cosmos DB and also remove corresponding vector entries from Azure Cognitive Search.
Here's the relevant part of my code for the Azure Cognitive Search deletion:
import azure.functions as func
from azure.cosmos import CosmosClient, exceptions
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
import os
import logging

chat_deletion_trigger = func.Blueprint()

@chat_deletion_trigger.cosmos_db_trigger(
    connection="CosmosDBConnectionString",
    database_name="ChatDatabase",
    container_name="Chats",
    lease_container_name="leases",
    create_lease_container_if_not_exists=True,
    arg_name="documents"
)
def ChatDeletionTrigger(documents: func.DocumentList) -> None:
    logging.info("ChatDeletionTrigger function started processing.")
    cosmos_client = CosmosClient.from_connection_string(os.environ['CosmosDBConnectionString'])
    database_name = 'ChatDatabase'
    database = cosmos_client.get_database_client(database_name)
    messages_container = database.get_container_client('Messages')
    search_client = SearchClient(
        endpoint=os.getenv('SearchServiceEndpoint'),
        index_name="azureblob-index",
        credential=AzureKeyCredential(os.getenv('SearchServiceKey'))
    )
    for document in documents:
        chat_id = document.get('id')
        is_deleted = document.get('isDeleted', False)
        if not chat_id:
            logging.error("Document does not have an 'id' field. Skipping document.")
            continue
        if not is_deleted:
            logging.info(f"Chat with id: {chat_id} is not marked as deleted. Skipping document.")
            continue
        logging.info(f"Processing deletion of messages and vector entries associated with chat id: {chat_id}")
        try:
            messages_container.scripts.execute_stored_procedure(
                sproc="deleteMessagesByChatId",
                params=[chat_id],
                partition_key=chat_id
            )
            logging.info(f"Deleted messages associated with chatId {chat_id} using stored procedure.")
        except exceptions.CosmosHttpResponseError as e:
            logging.error(f"Error deleting messages for chat id {chat_id}: {str(e)}")
        try:
            results = search_client.search(search_text="", filter=f"chatId eq '{chat_id}'")
            document_ids = [doc['documentId'] for doc in results]
            if document_ids:
                delete_actions = [{"@search.action": "delete", "documentId": doc_id} for doc_id in document_ids]
                batch = {"value": delete_actions}
                search_client.index_documents(batch=batch)
                logging.info(f"Deleted vector entries for chatId: {chat_id}")
            else:
                logging.info(f"No vector entries found for chatId: {chat_id}")
        except Exception as e:
            logging.error(f"Error deleting vector entries for chat id {chat_id}: {str(e)}")
    logging.info("ChatDeletionTrigger function completed processing.")
Issue: When I attempt to delete the vector entries associated with a chatId, I receive the following error: Error deleting vector entries for chat id {chat_id}: SearchClient.index_documents() missing 1 required positional argument: 'batch'
I checked the Azure Cognitive Search documentation but didn't find a clear explanation for this issue.
Question: How can I correctly pass the batch of delete actions to the index_documents method in Azure Cognitive Search? Am I structuring the batch parameter incorrectly, or is there something else I'm missing? Any help would be appreciated!
Upvotes: 0
Views: 129
Reputation: 21
To fix the error you get when deleting vector entries from Azure Cognitive Search, you need to structure the batch of delete actions correctly before passing it to the index_documents method.
The error occurs because the index_documents method expects an IndexDocumentsBatch object. In your original code, you're passing a plain dictionary (batch = {"value": delete_actions}), which is not a type the method accepts.
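For context, a dictionary of the form {"value": [...]} with "@search.action" entries is the shape of the REST API request body, whereas the Python SDK builds that envelope itself from plain documents. The following is a minimal sketch, assuming the index key field is documentId as in the question. The helper name rest_payload_to_key_documents is hypothetical and not part of the SDK.

```python
from typing import Any, Dict, List


def rest_payload_to_key_documents(
    payload: Dict[str, Any], key_field: str = "documentId"
) -> List[Dict[str, Any]]:
    """Reduce a REST-style {"value": [...]} payload to key-only documents.

    Hypothetical helper for illustration; `key_field` is assumed to be the
    key field of the search index.
    """
    documents = []
    for action in payload.get("value", []):
        if action.get("@search.action", "delete") != "delete":
            # Only delete actions are converted; anything else is skipped.
            continue
        if key_field not in action:
            raise KeyError(f"action is missing key field {key_field!r}")
        documents.append({key_field: action[key_field]})
    return documents


rest_style = {
    "value": [
        {"@search.action": "delete", "documentId": "a1"},
        {"@search.action": "delete", "documentId": "b2"},
    ]
}
key_documents = rest_payload_to_key_documents(rest_style)
print(key_documents)  # [{'documentId': 'a1'}, {'documentId': 'b2'}]
```

A list like key_documents is the kind of input the SDK's delete helpers work with, for example IndexDocumentsBatch.add_delete_actions(key_documents), which is not executed in this sketch.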
You should use the IndexDocumentsBatch class to create a batch of delete actions and then pass it to the index_documents method. Here's how you can modify your code to fix the issue:
import azure.functions as func
from azure.cosmos import CosmosClient, exceptions
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient, IndexDocumentsBatch
import os
import logging

chat_deletion_trigger = func.Blueprint()

@chat_deletion_trigger.cosmos_db_trigger(
    connection="CosmosDBConnectionString",
    database_name="ChatDatabase",
    container_name="Chats",
    lease_container_name="leases",
    create_lease_container_if_not_exists=True,
    arg_name="documents"
)
def ChatDeletionTrigger(documents: func.DocumentList) -> None:
    logging.info("ChatDeletionTrigger function started processing.")
    # Initialize Cosmos DB client
    cosmos_client = CosmosClient.from_connection_string(os.environ['CosmosDBConnectionString'])
    database_name = 'ChatDatabase'
    database = cosmos_client.get_database_client(database_name)
    messages_container = database.get_container_client('Messages')
    # Initialize Azure Cognitive Search client
    search_client = SearchClient(
        endpoint=os.getenv('SearchServiceEndpoint'),
        index_name="azureblob-index",  # Replace with your actual index name
        credential=AzureKeyCredential(os.getenv('SearchServiceKey'))
    )
    for document in documents:
        chat_id = document.get('id')
        is_deleted = document.get('isDeleted', False)
        if not chat_id:
            logging.error("Document does not have an 'id' field. Skipping document.")
            continue
        # Check if the chat is marked as deleted
        if not is_deleted:
            logging.info(f"Chat with id: {chat_id} is not marked as deleted. Skipping document.")
            continue
        logging.info(f"Processing deletion of messages and vector entries associated with chat id: {chat_id}")
        try:
            # Call the stored procedure to delete all messages by chatId
            messages_container.scripts.execute_stored_procedure(
                sproc="deleteMessagesByChatId",
                params=[chat_id],
                partition_key=chat_id  # Assuming partition key is chatId
            )
            logging.info(f"Deleted messages associated with chatId {chat_id} using stored procedure.")
        except exceptions.CosmosHttpResponseError as e:
            logging.error(f"Error deleting messages for chat id {chat_id}: {str(e)}")
        try:
            # Fetch the documents to identify the correct document IDs
            results = search_client.search(search_text="", filter=f"chatId eq '{chat_id}'")
            # Create a list of key-only documents to delete
            document_ids = [{"metadata_storage_path": doc['metadata_storage_path']} for doc in results]
            if document_ids:
                # Create a batch and add delete actions
                batch = IndexDocumentsBatch()
                batch.add_delete_actions(*document_ids)  # Use correct key field
                # Execute the batch delete operation
                search_client.index_documents(batch=batch)
                logging.info(f"Deleted vector entries for chatId: {chat_id}")
            else:
                logging.info(f"No vector entries found for chatId: {chat_id}")
        except Exception as e:
            logging.error(f"Error deleting vector entries for chat id {chat_id}: {str(e)}")
    logging.info("ChatDeletionTrigger function completed processing.")
IndexDocumentsBatch: The code now creates an IndexDocumentsBatch object and adds delete actions using the add_delete_actions method.
Key field: Each delete action identifies its document by the index's key field (metadata_storage_path in this case).
index_documents: The index_documents method now correctly receives the batch of delete actions.
Upvotes: 0
Reputation: 6487
Modify your code as shown below to use the delete_documents method, which gives the expected response.
import azure.functions as func
import logging
from azure.cosmos import CosmosClient
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
import os

app = func.FunctionApp()

@app.cosmos_db_trigger(arg_name="documents", container_name="Chats",
                       database_name="ChatDatabase", connection="CosmosDBConnectionString",
                       lease_container_name="leases", create_lease_container_if_not_exists=True)
def ChatDeletionTrigger(documents: func.DocumentList):
    logging.info('Python CosmosDB triggered.')
    cosmos_client = CosmosClient.from_connection_string(os.environ['CosmosDBConnectionString'])
    database_name = 'ChatDatabase'
    database = cosmos_client.get_database_client(database_name)
    messages_container = database.get_container_client('Chats')
    search_client = SearchClient(
        endpoint="https://******.search.windows.net",
        index_name="azureblob-index",
        credential=AzureKeyCredential("SearchServiceKey")
    )
    for document in documents:
        chat_id = document.get('id')
        if not chat_id:
            logging.error("Document does not have an 'id' field. Skipping document.")
            continue
        logging.info(f"Processing deletion of messages and vector entries associated with chat id: {chat_id}")
        try:
            results = search_client.search(search_text="", filter=f"chatId eq '{chat_id}'")
            document_ids = [doc['documentId'] for doc in results]
            if document_ids:
                delete_actions = [{"@search.action": "delete", "documentId": doc_id} for doc_id in document_ids]
                search_client.delete_documents(documents=delete_actions)
                logging.info(f"Deleted vector entries for chatId: {chat_id}")
            else:
                logging.info(f"No vector entries found for chatId: {chat_id}")
        except Exception as e:
            logging.error(f"Error deleting vector entries for chat id {chat_id}: {str(e)}")
    logging.info("ChatDeletionTrigger function completed processing.")
With this change, I am able to delete a document by its document ID. Console output from the run:
Azure Functions Core Tools
Core Tools Version: 4.0.5907 Commit hash: N/A +807e89766a92b14fd07b9f0bc2bea1d8777ab209 (64-bit)
Function Runtime Version: 4.834.3.22875
[2024-08-28T10:34:44.071Z] 0.02s - Debugger warning: It seems that frozen modules are being used, which may
[2024-08-28T10:34:44.073Z] 0.00s - make the debugger miss breakpoints. Please pass -Xfrozen_modules=off
[2024-08-28T10:34:44.073Z] 0.00s - to python to disable frozen modules.
[2024-08-28T10:34:44.074Z] 0.00s - Note: Debugging will proceed. Set PYDEVD_DISABLE_FILE_VALIDATION=1 to disable this validation.
[2024-08-28T10:34:44.165Z] Worker process started and initialized.
Functions:
ChatDeletionTrigger: cosmosDBTrigger
For detailed output, run func with --verbose flag.
[2024-08-28T10:34:49.093Z] Host lock lease acquired by instance ID '0000000000000000000000000D2022A4'.
[2024-08-28T10:35:06.852Z] Executing 'Functions.ChatDeletionTrigger' (Reason='New changes on container Chats at 2024-08-28T10:35:06.8097381Z', Id=5d5c5b7b-bc06-4383-bb77-67bfaab2dc32)
[2024-08-28T10:35:06.963Z] Python CosmosDB triggered.
[2024-08-28T10:35:07.748Z] Request URL: 'https://******.documents.azure.com:443/'
Request method: 'GET'
Request headers:
'Cache-Control': 'no-cache'
'x-ms-version': 'REDACTED'
'x-ms-documentdb-query-iscontinuationexpected': 'REDACTED'
'x-ms-date': 'REDACTED'
'authorization': 'REDACTED'
'Accept': 'application/json'
'Content-Length': '0'
'User-Agent': 'azsdk-python-cosmos/4.7.0 Python/3.11.9 (Windows-10-10.0.22631-SP0)'
No body was attached to the request
[2024-08-28T10:35:08.655Z] Response status: 200
Response headers:
'Cache-Control': 'no-store, no-cache'
'Pragma': 'no-cache'
'Transfer-Encoding': 'chunked'
'Content-Type': 'application/json'
'Content-Location': 'REDACTED'
'Server': 'Microsoft-HTTPAPI/2.0'
'x-ms-max-media-storage-usage-mb': 'REDACTED'
'x-ms-media-storage-usage-mb': 'REDACTED'
'x-ms-databaseaccount-consumed-mb': 'REDACTED'
'x-ms-databaseaccount-reserved-mb': 'REDACTED'
'x-ms-databaseaccount-provisioned-mb': 'REDACTED'
'Strict-Transport-Security': 'REDACTED'
'x-ms-gatewayversion': 'REDACTED'
'Date': 'Wed, 28 Aug 2024 10:35:06 GMT'
[2024-08-28T10:35:08.668Z] Processing deletion of messages and vector entries associated with chat id: 123
[2024-08-28T10:35:08.672Z] Request URL: 'https://******.search.windows.net/indexes('azureblob-index')/docs/search.post.search?api-version=REDACTED'
Request method: 'POST'
Request headers:
'Content-Type': 'application/json'
'Content-Length': '43'
'api-key': 'REDACTED'
'Accept': 'application/json;odata.metadata=none'
'x-ms-client-request-id': '32787a9a-6529-11ef-898d-7c214ae5d066'
'User-Agent': 'azsdk-python-search-documents/11.5.1 Python/3.11.9 (Windows-10-10.0.22631-SP0)'
A body is sent with the request
[2024-08-28T10:35:09.711Z] Response status: 200
Response headers:
'Transfer-Encoding': 'chunked'
'Content-Type': 'application/json; odata.metadata=none; odata.streaming=true; charset=utf-8'
'Content-Encoding': 'REDACTED'
'Vary': 'REDACTED'
'Server': 'Microsoft-IIS/10.0'
'Strict-Transport-Security': 'REDACTED'
'Preference-Applied': 'REDACTED'
'OData-Version': 'REDACTED'
'request-id': '32787a9a-6529-11ef-898d-7c214ae5d066'
'elapsed-time': 'REDACTED'
'Date': 'Wed, 28 Aug 2024 10:35:08 GMT'
[2024-08-28T10:35:09.724Z] Request URL: 'https://******.search.windows.net/indexes('azureblob-index')/docs/search.index?api-version=REDACTED'
Request method: 'POST'
Request headers:
'Content-Type': 'application/json'
'Content-Length': '63'
'api-key': 'REDACTED'
'Accept': 'application/json;odata.metadata=none'
'x-ms-client-request-id': '3319fce2-6529-11ef-89cd-7c214ae5d066'
'User-Agent': 'azsdk-python-search-documents/11.5.1 Python/3.11.9 (Windows-10-10.0.22631-SP0)'
A body is sent with the request
[2024-08-28T10:35:09.980Z] Response status: 200
Response headers:
'Transfer-Encoding': 'chunked'
'Content-Type': 'application/json; odata.metadata=none; odata.streaming=true; charset=utf-8'
'Content-Encoding': 'REDACTED'
'Vary': 'REDACTED'
'Server': 'Microsoft-IIS/10.0'
'Strict-Transport-Security': 'REDACTED'
'Preference-Applied': 'REDACTED'
'OData-Version': 'REDACTED'
'request-id': '3319fce2-6529-11ef-89cd-7c214ae5d066'
'elapsed-time': 'REDACTED'
'Date': 'Wed, 28 Aug 2024 10:35:08 GMT'
[2024-08-28T10:35:09.984Z] ChatDeletionTrigger function completed processing.
[2024-08-28T10:35:09.983Z] Deleted vector entries for chatId: 123
[2024-08-28T10:35:10.048Z] Executed 'Functions.ChatDeletionTrigger' (Succeeded, Id=5d5c5b7b-bc06-4383-bb77-67bfaab2dc32, Duration=3214ms)
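The log above shows the HTTP status of the indexing request. The SDK call also returns one result per document, which can be inspected before logging success. The following is a minimal sketch, assuming each result exposes key and succeeded attributes as the SDK's IndexingResult does. The helper names chunked and split_results are hypothetical, and the chunk size of 1000 actions per request is an assumption to verify against the current service limits.

```python
from types import SimpleNamespace
from typing import Any, Dict, Iterable, Iterator, List, Tuple


def chunked(items: List[Dict[str, Any]], size: int = 1000) -> Iterator[List[Dict[str, Any]]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def split_results(results: Iterable[Any]) -> Tuple[List[str], List[str]]:
    """Return (succeeded_keys, failed_keys) from per-document results."""
    succeeded: List[str] = []
    failed: List[str] = []
    for result in results:
        (succeeded if result.succeeded else failed).append(result.key)
    return succeeded, failed


# Stand-ins for SDK results so the sketch runs without a search service.
fake_results = [
    SimpleNamespace(key="a1", succeeded=True),
    SimpleNamespace(key="b2", succeeded=False),
]
ok_keys, failed_keys = split_results(fake_results)
print(ok_keys, failed_keys)  # ['a1'] ['b2']

# 2500 key-only documents split into request-sized chunks.
documents = [{"documentId": str(i)} for i in range(2500)]
chunk_sizes = [len(chunk) for chunk in chunked(documents)]
print(chunk_sizes)  # [1000, 1000, 500]
```

Inside the function, each chunk could be passed to search_client.delete_documents(documents=chunk) and the returned list handed to split_results, so that failed keys are logged instead of reporting success unconditionally.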
Upvotes: 0