M.K. Kim

Reputation: 531

Do Cosmos DB query cost and performance depend on the number of items in the same partition?

I have simple JSON data like this: {"id": "beos033mxle-1232", "name": "John Doe", "city": "Denver", "gender": "m"} and the partition key is /city.

Suppose I run a simple query such as SELECT * FROM User u WHERE u.name = "John Doe" AND u.city = "Denver". Does the number of items in the same partition (in this case Denver) affect the cost and performance of this query? For example, finding 1 item among 10 items versus among 100,000 items.

And if I run the query without the partition key, say SELECT * FROM User u WHERE u.name = "John Doe", which queries across all partitions, does the number of items in each partition affect the cost and performance in that case as well?

Upvotes: 0

Views: 1279

Answers (1)

Mark Brown

Reputation: 8793

For sure, the more data, the more expensive. But when you run an in-partition query and filter on a property that is indexed, it's not bad, because the index is very efficient.
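As a rough sketch of what an in-partition query looks like from code, the function below runs the query from the question and passes the city as the partition key. It assumes the `azure-cosmos` Python SDK, where `container` would be a container client exposing `query_items`; the function name `find_user_in_city` is made up for illustration.

```python
from typing import Any, Dict, List


def find_user_in_city(container: Any, name: str, city: str) -> List[Dict[str, Any]]:
    """Run the question's query scoped to a single logical partition.

    `container` is assumed to behave like an azure-cosmos container client.
    """
    query = "SELECT * FROM u WHERE u.name = @name AND u.city = @city"
    parameters = [
        {"name": "@name", "value": name},
        {"name": "@city", "value": city},
    ]
    # Passing partition_key scopes the request to the partition that
    # holds `city`, so no other partitions need to be queried.
    items = container.query_items(
        query=query,
        parameters=parameters,
        partition_key=city,
    )
    return list(items)
```

With the real SDK you would pass the object returned by `database.get_container_client(...)`. Dropping `partition_key` and passing `enable_cross_partition_query=True` instead should give the cross-partition variant, though it's worth checking against the SDK version you use.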

Cross-partition queries get more expensive the more physical partitions you have. So generally yes, the more data, the more expensive. This is why a good partition key is so critical to scale: this type of data store is "scale out", so the more data you have, the more computers your data sits on. The more computers you have to query to find your data, the more expensive it is.
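To make the fan-out idea concrete, here is a toy model, not how Cosmos DB actually routes requests. It assumes a simple hash-modulo placement of partition key values onto physical partitions (the real service uses its own hashing scheme), and both function names are hypothetical. It only shows that a query with a partition key visits one physical partition, while a query without one has to visit all of them.

```python
import hashlib
from typing import List, Optional


def physical_partition_for(city: str, partition_count: int) -> int:
    """Toy placement: hash the partition key value, then take it modulo the count."""
    digest = hashlib.sha256(city.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def partitions_to_visit(partition_count: int, city: Optional[str] = None) -> List[int]:
    """Return the physical partitions a query would have to touch in this toy model."""
    if city is None:
        # No partition key in the filter: every physical partition is queried.
        return list(range(partition_count))
    # Partition key known: only the partition that owns it is queried.
    return [physical_partition_for(city, partition_count)]


# Fan-out grows with the number of physical partitions only when the key is missing.
for count in (1, 5, 50):
    with_key = len(partitions_to_visit(count, "Denver"))
    without_key = len(partitions_to_visit(count))
    print(f"{count} partitions: with key -> {with_key}, without key -> {without_key}")
```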

If you have a small database, 20 GB or less, then cross-partition queries aren't such a problem, because all your data is sitting on a single physical partition, which has a max size of 50 GB.
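The arithmetic behind that can be sketched as a storage-only lower bound, using the 50 GB per-partition limit mentioned above. It deliberately ignores provisioned throughput, which can also cause the service to use more physical partitions, so treat the result as a minimum rather than a prediction. The function name is hypothetical.

```python
import math

MAX_PHYSICAL_PARTITION_GB = 50  # per-partition storage limit quoted above


def min_physical_partitions(data_size_gb: float) -> int:
    """Lower bound on physical partitions needed for storage alone."""
    if data_size_gb <= 0:
        return 1  # an empty container still lives on at least one partition
    return math.ceil(data_size_gb / MAX_PHYSICAL_PARTITION_GB)


print(min_physical_partitions(20))   # fits on a single physical partition
print(min_physical_partitions(120))  # needs at least three
```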

Upvotes: 1
