Reputation: 41
I'm trying to understand why Microsoft.Azure.Documents.Client makes multiple calls when running a query.
var option = new FeedOptions { EnableCrossPartitionQuery = true, MaxItemCount = 100 };
var myobj = cosmosClient.CreateDocumentQuery<myCosmosObj>(documentUri, option)
    .Where(x => x.ID == request.Id);
while (myobj.AsDocumentQuery().HasMoreResults)
{
    var results = await myobj.AsDocumentQuery().ExecuteNextAsync<myCosmosObj>();
    resultList.AddRange(results);
}
A Fiddler trace shows 5 calls to the Cosmos collection dbs/mycollectionname/colls/docs (the while loop above runs 5 times).
A single network hop would improve performance, so I would like to understand why it makes 5 network calls and whether there is something I need to change in the configuration to adjust this. I have already tried adjusting the ResultSize. This is roughly a 3GB collection.
Upvotes: 1
Views: 445
Reputation: 7200
David's answer is theoretically correct; however, it is missing a crucial point.
Your code is wrong. Because you create the document query inside the loop, you will always query the result of the first execution 5 times.
The code should actually be like this:
var query = cosmosClient.CreateDocumentQuery<myCosmosObj>(documentUri, option)
    .Where(x => x.ID == request.Id)
    .AsDocumentQuery();
while (query.HasMoreResults)
{
    var results = await query.ExecuteNextAsync<myCosmosObj>();
    resultList.AddRange(results);
}
This will now run your query properly, using the continuation state of the query object to read the next page on each call to ExecuteNextAsync.
Upvotes: 2
Reputation: 71035
With a partitioned collection, the most efficient way to find a document by id is to also specify the partition key, which directs your query to a single partition. Without the partition key, there's no way to know up front which partition your document resides in. That's likely why you're seeing 5 calls (you likely have 5 partitions).
The alternative, which your code shows, is to do a cross-partition query, which has to do one query per partition, to seek the document you're looking for.
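For comparison, here is a minimal sketch of the single-partition variant, assuming you already have the partition key value at hand. The names FindByIdAsync, MyCosmosObj, and partitionKeyValue are hypothetical and not from the question; FeedOptions.PartitionKey is the SDK property that scopes the query to one partition. Running it requires a real Cosmos DB account and collection.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;
using Microsoft.Azure.Documents.Linq;

// Hypothetical document type; the property name mirrors the question's x.ID.
public class MyCosmosObj
{
    public string ID { get; set; }
}

public static class SinglePartitionQuery
{
    // Assumption: partitionKeyValue is the value of the collection's partition key
    // for the document you are looking for.
    public static async Task<List<MyCosmosObj>> FindByIdAsync(
        DocumentClient client, Uri collectionUri, string id, string partitionKeyValue)
    {
        // Setting PartitionKey replaces EnableCrossPartitionQuery:
        // the query is routed to a single partition.
        var options = new FeedOptions
        {
            PartitionKey = new PartitionKey(partitionKeyValue),
            MaxItemCount = 100
        };

        // Build the query once, outside the loop, and keep the reference.
        IDocumentQuery<MyCosmosObj> query = client
            .CreateDocumentQuery<MyCosmosObj>(collectionUri, options)
            .Where(x => x.ID == id)
            .AsDocumentQuery();

        var results = new List<MyCosmosObj>();
        while (query.HasMoreResults)
        {
            // Each call fetches the next page of results.
            FeedResponse<MyCosmosObj> page = await query.ExecuteNextAsync<MyCosmosObj>();
            results.AddRange(page);
        }
        return results;
    }
}
```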
One more thing to note: a query has a higher RU cost than a point read. If you already know the partition key and id, there's no need to invoke the query engine at all, since a given partition key + id combination can only ever identify a single document.
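A point read of that kind could look roughly like the sketch below. ReadByIdAsync and its parameter names are hypothetical; ReadDocumentAsync, UriFactory.CreateDocumentUri, and RequestOptions.PartitionKey come from the Microsoft.Azure.Documents.Client SDK. Returning null when the document is missing is a design choice made for this example, not something the SDK does for you.

```csharp
using System;
using System.Net;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

public static class PointRead
{
    // Assumption: databaseId, collectionId, id, and partitionKeyValue are all known up front.
    public static async Task<T> ReadByIdAsync<T>(
        DocumentClient client, string databaseId, string collectionId,
        string id, string partitionKeyValue) where T : class
    {
        Uri documentUri = UriFactory.CreateDocumentUri(databaseId, collectionId, id);
        var options = new RequestOptions
        {
            PartitionKey = new PartitionKey(partitionKeyValue)
        };

        try
        {
            // A direct read by id and partition key; no query is executed.
            DocumentResponse<T> response = await client.ReadDocumentAsync<T>(documentUri, options);
            return response.Document;
        }
        catch (DocumentClientException ex) when (ex.StatusCode == HttpStatusCode.NotFound)
        {
            // The SDK throws on 404; this sketch maps that to null.
            return null;
        }
    }
}
```

Usage would be something like `var doc = await PointRead.ReadByIdAsync<myCosmosObj>(cosmosClient, "mydb", "docs", request.Id, partitionKeyValue);`, where the database name and partition key value are placeholders.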
Upvotes: 1