Reputation: 337700
I am having a consistent problem with the performance of Azure Table Storage. I'm querying a table which holds user accounts. The table stores the userId in both the PartitionKey and RowKey so I can easily make point queries.
My issue is that in several cases I need to retrieve multiple users in a single query. To achieve that I have a class which builds filter strings for me. How it works is not relevant to the problem, but here is an example of its output:
(PartitionKey eq '00540de6-dd2b-469f-8730-e7800e06ccc0' and RowKey eq '00540de6-dd2b-469f-8730-e7800e06ccc0') or
(PartitionKey eq '02aa11b7-974a-4ee9-9a8e-5fc09970bb99' and RowKey eq '02aa11b7-974a-4ee9-9a8e-5fc09970bb99') or
(PartitionKey eq '040aec50-ebcd-4e5d-8f58-82aea616bd82' and RowKey eq '040aec50-ebcd-4e5d-8f58-82aea616bd82') or
// up to 22 more (25 total)
The first execution of the query is slow, taking between 2 and 5 seconds, and it returns incomplete data, which leads to errors. When run a second time, the query completes in 0.2 to 0.5 seconds and returns all of the data.
Note that I also tried supplying just the PartitionKey, however it made no difference. I had assumed that a point query would perform better.
From this presentation of the bug I can only presume it's caused by the data being 'cold' when first requested and then pulled from a 'hot' cache upon successive requests.
If this is the case, how can I change the filter string to improve performance? Alternatively, how can I change the timeout of the table storage query to give it more time to complete? Is it possible to increase the scaling of my table storage?
Upvotes: 1
Views: 144
Reputation: 6467
Please don't concatenate point-query filters with 'or': Azure Table Storage can't treat the result as multiple point queries. Instead, it performs a full table scan, which performs terribly. To improve performance, execute the 25 point queries separately.
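As a rough illustration of issuing the point queries individually, here is a minimal Python sketch that runs them concurrently on a thread pool. The question does not say which SDK or language is in use, so everything here is an assumption: `fetch_users` and the `point_query` callable are hypothetical names, and `point_query` stands in for whatever single-entity lookup your client library offers (in the Python `azure-data-tables` package that would likely be `TableClient.get_entity(partition_key=..., row_key=...)`).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Optional

# Hypothetical signature: performs ONE point query (PartitionKey, RowKey)
# and returns the entity as a dict, or None when it does not exist.
PointQuery = Callable[[str, str], Optional[dict]]


def fetch_users(
    user_ids: Iterable[str],
    point_query: PointQuery,
    max_workers: int = 8,
) -> Dict[str, Optional[dict]]:
    """Fetch each user with its own point query instead of one 'or' filter."""
    # Drop duplicate ids while keeping the caller's order.
    ids = list(dict.fromkeys(user_ids))
    if not ids:
        return {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # The table stores userId in both keys, so pass it twice.
        results = pool.map(lambda uid: point_query(uid, uid), ids)
        # pool.map yields results in input order, so zip lines them up.
        return dict(zip(ids, results))
```

Each lookup then targets exactly one partition and row, and the calls overlap in time, so the total latency should be closer to that of the slowest single lookup than to the sum of all 25. The `max_workers` value of 8 is an arbitrary starting point, not a recommendation.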
Upvotes: 3