pez
pez

Reputation: 83

Azure AI Search: Filtering DateTimeOffset field by month

Say I have an Azure AI Search index definition representing a blog post like below. (simplified for consiceness)

id: String
title: String
posted_at: DateTimeOffset

How can I filter the documents that are posted in the same month last year and beyond? For example, if it's May 2024, I want to get documents which are posted in May 2023, May 2022, ...etc

I was looking for something similar to month() function described in this page( https://learn.microsoft.com/en-us/odata/webapi/date-and-timeofday-support#query-examples ) but it's not available in AI Search.

azure.core.exceptions.HttpResponseError: () Invalid expression: Function 'month' is not supported.

I'm currently using Python SDK search API.

https://learn.microsoft.com/en-us/python/api/azure-search-documents/azure.search.documents.searchclient?view=azure-python#azure-search-documents-searchclient-search

Upvotes: 0

Views: 645

Answers (1)

Suresh Chikkam
Suresh Chikkam

Reputation: 3448

How can I filter the documents that are posted in the same month last year and beyond? For example, if it's May 2024, I want to get documents which are posted in May 2023, May 2022, ...etc

Use a combination of greater than or equal and less than operators along with specific date values that represent the start and end of the desired month.

  1. Determine the start and end dates for the desired month.
  2. Construct a filter expression that compares the DateTimeOffset field against these dates.

I was looking for something similar to month() function described in this page( ) but it's not available in AI Search.

Azure AI Search doesn't provide a built-in function like month() to extract the month component directly.

  • To get this programmatically using the Azure Python SDK check the below code.
from azure.search.documents import SearchClient
from datetime import datetime, timedelta

# Define your Azure Search endpoint and key
endpoint = "YOUR_SEARCH_ENDPOINT"
key = "YOUR_SEARCH_KEY"
index_name = "YOUR_INDEX_NAME"

# Create a SearchClient
client = SearchClient(endpoint=endpoint, index_name=index_name, credential=key)

# Define the current month and year
current_month = datetime.now().month
current_year = datetime.now().year

# Calculate start and end dates for the desired month
start_date = datetime(current_year, current_month, 1)
end_date = start_date + timedelta(days=32)  # Adding 32 days to cover the entire month

# Format dates as strings
start_date_str = start_date.strftime('%Y-%m-%dT00:00:00Z')
end_date_str = end_date.strftime('%Y-%m-%dT00:00:00Z')

# Construct filter expression
filter_expression = f"posted_at ge {start_date_str} and posted_at lt {end_date_str}"

# Execute the search query with the constructed filter expression
results = client.search(search_text="*", filter=filter_expression)

# Process and display search results
for result in results:
    print(result)
  • Use the filter expression to retrieve documents posted in May 2023.
filter=posted_at ge 2023-05-01T00:00:00Z and posted_at lt 2023-06-01T00:00:00Z

posted_at ge 2023-05-01T00:00:00Z checks that the documents are posted on or after May 1, 2023.

posted_at lt 2023-06-01T00:00:00Z checks that the documents are posted before June 1, 2023, effectively including all documents posted in May 2023.

Upvotes: 0

Related Questions