Reputation: 1704
We are using an Azure Search index to power one of our APIs. The data in the index is populated by Azure Functions that pull data from the database. We can see that the number of records in the database and in the Search service differs. Is there any way to get the list of keys in the Search service so that we can compare it with the database and see which keys are missing?
Regards,
John
Upvotes: 1
Views: 1376
Reputation: 2685
You can search for "*" and use orderby together with filter to page through all the data, as in the following example. I use the metadata_storage_last_modified field as the filter.
offset     skip      time
0          0
100,000    100,000   getLastTime
101,000    0         useLastTime
200,000    99,000    useLastTime
201,000    100,000   useLastTime & getLastTime
202,000    0         useLastTime
Because the skip limit is 100k, we can calculate skip as:
AzureSearchSkipLimit = 100k
AzureSearchTopLimit = 1k
skip = offset % (AzureSearchSkipLimit + AzureSearchTopLimit)
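To sanity-check the formula against the table above, here is a small Python snippet (the constant names mirror the pseudocode and are otherwise arbitrary):

AZURE_SEARCH_SKIP_LIMIT = 100_000
AZURE_SEARCH_TOP_LIMIT = 1_000

def skip_for(offset):
    # skip wraps around once a full window of skip-limit + top-limit rows is read
    return offset % (AZURE_SEARCH_SKIP_LIMIT + AZURE_SEARCH_TOP_LIMIT)

assert skip_for(0) == 0
assert skip_for(100_000) == 100_000  # getLastTime
assert skip_for(101_000) == 0        # useLastTime
assert skip_for(200_000) == 99_000   # useLastTime
assert skip_for(201_000) == 100_000  # useLastTime & getLastTime
assert skip_for(202_000) == 0        # useLastTime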
If the total result count is larger than AzureSearchSkipLimit, then apply:
orderby = "metadata_storage_last_modified desc"
When skip reaches AzureSearchSkipLimit, get the metadata_storage_last_modified time from the end of the data and use it as the filter for the next 100k of the search:
filter = metadata_storage_last_modified lt ${metadata_storage_last_modified}
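Putting the pieces together, here is a minimal sketch of the loop using the azure-search-documents Python SDK. The endpoint, index name, and key are placeholders, and the timestamp handling is an assumption (the field may come back as a string or a datetime, so it is normalized into an OData literal):

from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

AZURE_SEARCH_SKIP_LIMIT = 100_000
AZURE_SEARCH_TOP_LIMIT = 1_000

client = SearchClient(
    endpoint="https://<your-service>.search.windows.net",  # placeholder
    index_name="<your-index>",                             # placeholder
    credential=AzureKeyCredential("<query-key>"),          # placeholder
)

def fetch_all_documents():
    offset = 0
    last_time = None  # timestamp taken from the end of each 100k window
    while True:
        skip = offset % (AZURE_SEARCH_SKIP_LIMIT + AZURE_SEARCH_TOP_LIMIT)
        flt = (f"metadata_storage_last_modified lt {last_time}"
               if last_time is not None else None)
        page = list(client.search(
            search_text="*",
            filter=flt,
            order_by=["metadata_storage_last_modified desc"],
            top=AZURE_SEARCH_TOP_LIMIT,
            skip=skip,
        ))
        if not page:
            break
        yield from page
        if skip == AZURE_SEARCH_SKIP_LIMIT:
            # End of a 100k window: remember the last timestamp so the next
            # window can filter on it and restart skip from 0.
            last = page[-1]["metadata_storage_last_modified"]
            # The SDK may return a datetime; normalize to an OData literal.
            last_time = last if isinstance(last, str) else last.isoformat()
        offset += len(page)

One caveat: if many documents share the same metadata_storage_last_modified value at a window boundary, a strict lt filter can drop or repeat some of them, so a unique filterable and sortable field is safer if your index has one.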
Upvotes: 1
Reputation: 1681
The Azure Search query API is designed for search/filter scenarios; it doesn't offer an efficient way to traverse all documents.
That said, you can do this reasonably well by scanning the keys in order: if you have a field in your index (the key field or another one) that is both filterable and sortable, you can use $select to pull only the keys for each document, 1,000 at a time, ordered by that field. After you retrieve the first 1,000, don't use $skip (which would limit you to 100,000); instead, add a filter with a greater-than comparison against that field, using the highest value you saw in the previous response. This lets you traverse the whole set with reasonable performance, although retrieving 1,000 documents at a time will still take a while.
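As a rough sketch of that scan with the azure-search-documents Python SDK (the endpoint, index name, key, and the "id" field name are placeholders; substitute your index's actual filterable, sortable key field):

from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

client = SearchClient(
    endpoint="https://<your-service>.search.windows.net",  # placeholder
    index_name="<your-index>",                             # placeholder
    credential=AzureKeyCredential("<query-key>"),          # placeholder
)

def scan_keys(key_field="id", page_size=1000):
    last_key = None
    while True:
        # Range filter instead of $skip, so there is no 100k ceiling.
        flt = f"{key_field} gt '{last_key}'" if last_key else None
        page = list(client.search(
            search_text="*",
            filter=flt,
            order_by=[f"{key_field} asc"],
            select=[key_field],
            top=page_size,
        ))
        if not page:
            return
        for doc in page:
            yield doc[key_field]
        last_key = page[-1][key_field]

search_keys = set(scan_keys())  # compare this set against the database keys

Because the filter advances on the key itself, every page picks up exactly where the previous one ended, with no duplicates or gaps as long as the key is unique.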
Upvotes: 2