Alexander Shvetsov
Alexander Shvetsov

Reputation: 639

How to reduce Azure Table Storage latency?

I have a rather huge (30 mln rows, up to 5–100Kb each) Table on Azure.
Each RowKey is a Guid and PartitionKey is a first Guid part, for example:

PartitionKey = "1bbe3d4b"
RowKey = "1bbe3d4b-2230-4b4f-8f5f-fe5fe1d4d006"

Table has 600 reads and 600 writes (updates) per second with an average latency of 60ms. All queries use both PartitionKey and RowKey.
BUT, some reads take up to 3000ms (!). In average, >1% of all reads take more than 500ms and there's no correlation with entity size (100Kb row may be returned in 25ms and 10Kb one – in 1500ms).

My application is an ASP.Net MVC 4 web-site running on 4-5 Large instances.

I have read all MSDN articles regarding Azure Table Storage performance goals and already did the following:

Also I checked that:

What could be the reason of such performance issues and how to improve it?

Upvotes: 5

Views: 1546

Answers (2)

user152949
user152949

Reputation:

I use the MergeOption.NoTracking setting on the DataServiceContext.MergeOption property for extra performance if I have no intention of updating the entity anytime soon. Here is an example:

var account = CloudStorageAccount.Parse(RoleEnvironment.GetConfigurationSettingValue("DataConnectionString"));
var tableStorageServiceContext = new AzureTableStorageServiceContext(account.TableEndpoint.ToString(), account.Credentials);
tableStorageServiceContext.RetryPolicy = RetryPolicies.Retry(3, TimeSpan.FromSeconds(1));
tableStorageServiceContext.MergeOption = MergeOption.NoTracking;
tableStorageServiceContext.AddObject(AzureTableStorageServiceContext.CloudLogEntityName, newItem);
tableStorageServiceContext.SaveChangesWithRetries();

Another problem might be that you are retrieving the entire enity with all its properties even though you intend only use one or two properties - this is of course wasteful but can't be easily avoided. However, If you use Slazure then you can use query projections to only retrieve the entity properties that you are interested in from the table storage and nothing more, which would give you better query performance. Here is an example:

using SysSurge.Slazure;
using SysSurge.Slazure.Linq;
using SysSurge.Slazure.Linq.QueryParser;

namespace TableOperations
{
    public class MemberInfo
    {
        public string GetRichMembers()
        {
            // Get a reference to the table storage
            dynamic storage = new QueryableStorage<DynEntity>("UseDevelopmentStorage=true");

            // Build table query and make sure it only return members that earn more than $60k/yr
            // by using a "Where" query filter, and make sure that only the "Name" and 
            // "Salary" entity properties are retrieved from the table storage to make the
            // query quicker.   
            QueryableTable<DynEntity> membersTable = storage.WebsiteMembers;
            var memberQuery = membersTable.Where("Salary > 60000").Select("new(Name, Salary)");

            var result = "";

            // Cast the query result to a dynamic so that we can get access its dynamic properties
            foreach (dynamic member in memberQuery)
            {
                // Show some information about the member
                result += "LINQ query result: Name=" + member.Name + ", Salary=" + member.Salary + "<br>";
            }

            return result;
        }
    }
}

Full disclosure: I coded Slazure.

You could also consider pagination if you are retrieving large data sets, example:

// Retrieve 50 members but also skip the first 50 members
var memberQuery = membersTable.Where("Salary > 60000").Take(50).Skip(50);

Upvotes: 1

Typically, if a specific query requires scanning a large number of rows, that will take longer time. Is the behavior you are seeing specific a query / data? Or, are you seeing the performance varies for the same data and query?

Upvotes: 0

Related Questions