Rory McCrossan
Rory McCrossan

Reputation: 337700

Unreliable performance in Azure Table Storage when joining point queries together

I am having a consistent problem with the performance of Azure Table Storage. I'm querying a table which holds user accounts. The table stores the userId in both the PartitionKey and RowKey so I can easily make point queries.

My issue is because in several cases I need to retrieve multiple users in a single query. To achieve that I have a class which builds filter strings for me. The manner which this works is not related to the problem, however this is an example of the output:

(PartitionKey eq '00540de6-dd2b-469f-8730-e7800e06ccc0' and RowKey eq '00540de6-dd2b-469f-8730-e7800e06ccc0') or 
(PartitionKey eq '02aa11b7-974a-4ee9-9a8e-5fc09970bb99' and RowKey eq '02aa11b7-974a-4ee9-9a8e-5fc09970bb99') or 
(PartitionKey eq '040aec50-ebcd-4e5d-8f58-82aea616bd82' and RowKey eq '040aec50-ebcd-4e5d-8f58-82aea616bd82') or 
// up to 22 more (25 total)

Upon first execution of the query it takes a long time to execute, between 2-5 seconds, and is missing data which is leading to errors. When run a second time the query takes between 0.2 and 0.5 seconds to complete and has all data contained within it.

Note that I also tried it just supplying just the PartitionKey, however it made no difference. I had assumed that a point query would perform better.

From this presentation of the bug I can only presume it's caused by the data being 'cold' when first requested and then pulled from a 'hot' cache upon successive requests.

If this is the case, how can I change the filter string to improve performance? Alternatively, how can I change the timeout of the table storage query to give it more time to complete? Is it possible to increase the scaling of my table storage?

Upvotes: 1

Views: 144

Answers (1)

Zhaoxing Lu
Zhaoxing Lu

Reputation: 6467

Please don't use point query strings concatenated with 'or', since Azure Storage Table can't treat it as multiple point queries. Instead, Azure Table will treat it as a full table scan, which is terrible in performance. You should execute 25 point queries respectively to improve performance.

Upvotes: 3

Related Questions