user1454265

Reputation: 878

Sudden dropoff in Azure queue performance

Short version: What reasons could there be for a sudden, dramatic, and seemingly permanent increase in the rate of Azure queue requests timing out?

It's going to be difficult to provide all of the details that could possibly be relevant here, but here's a start:

This is an Azure application (SDK v2.0) with a WCF service placing work requests on a queue (roughly 100k calls a day) and a couple of worker roles which process the queue. We've got New Relic monitoring with the latest .NET agent (3.3.38).

We've run into an issue in our latest release, deployed a few days ago. After it ran normally for about 24 hours, we suddenly started seeing a greatly increased rate of timeouts when our worker roles fetch messages from the queue, along with a catastrophic drop in throughput: our application can now barely keep up with its own queue using 40 workers, whereas it usually gets by with just 2. The timeouts have continued at the same rate ever since they started and show no signs of letting up.

A couple images from New Relic to illustrate:

[Image: New Relic chart]

[Image: New Relic chart]

While this isn't nearly enough information to provide a good answer, I'm just trying to figure out where I might start looking. I've got support tickets open with New Relic and Microsoft, but we're trying to investigate on our own as well. Could this be throttling? Some kind of resource exhaustion in my queue processor worker role? We don't see increased load on the WCF service, and we haven't changed Azure client libraries or changed much of anything in the code that processes the queue.

Upvotes: 0

Views: 154

Answers (1)

Vinay Shah - Microsoft

Reputation: 332

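I suggest you enable Storage Analytics on your storage account to determine whether the bottleneck is server-side or client-side/network-related. Specifically, compare the AverageE2ELatency and AverageServerLatency columns in the Storage Analytics metrics table. AverageServerLatency covers only the time the storage service spends processing a request, while AverageE2ELatency also includes the time needed to read the request and send the response to the client. If AverageE2ELatency is much higher than AverageServerLatency, the time is being lost on the client or the network; if AverageServerLatency itself is high, the issue is on the server side.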
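As a rough illustration of that comparison, here is a minimal Python sketch. It assumes you have already exported the two latency values (in milliseconds) from the queue metrics table by whatever means you prefer; the `MetricsRow` type, the `classify_bottleneck` function, and the 0.5 threshold are hypothetical choices for this example, not part of any Azure library.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class MetricsRow:
    """One row exported from the metrics table (hypothetical container).

    Only the two latency columns discussed above are kept, in milliseconds.
    """
    average_e2e_latency_ms: float
    average_server_latency_ms: float


def classify_bottleneck(rows: Iterable[MetricsRow],
                        server_share_threshold: float = 0.5) -> str:
    """Guess where the latency is spent.

    Computes the share of end-to-end latency accounted for by server
    latency. The threshold is an arbitrary assumption; tune it to taste.
    Rows are summed without weighting by request count, so treat the
    result as a hint rather than a measurement.
    """
    rows = list(rows)
    total_e2e = sum(r.average_e2e_latency_ms for r in rows)
    total_server = sum(r.average_server_latency_ms for r in rows)
    if total_e2e <= 0:
        return "unknown"  # no data, or nothing measurable
    share = total_server / total_e2e
    return "server" if share >= server_share_threshold else "client/network"


if __name__ == "__main__":
    # Made-up sample values, purely for demonstration.
    sample = [MetricsRow(1200.0, 15.0), MetricsRow(900.0, 12.0)]
    print(classify_bottleneck(sample))
```

If the result points at the client or the network, the worker roles themselves (thread or connection exhaustion, for example) are the more likely place to look; if it points at the server, throttling or a service-side problem becomes more plausible.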

You can learn more about Azure Storage Analytics from the links below:

Overview: http://msdn.microsoft.com/en-us/library/hh343270.aspx

How to enable in portal: http://azure.microsoft.com/en-us/documentation/articles/storage-monitor-storage-account/

Metrics table Schema: http://msdn.microsoft.com/en-us/library/hh343264.aspx

Blog post: http://blogs.msdn.com/b/windowsazurestorage/archive/2011/08/03/windows-azure-storage-analytics.aspx

Upvotes: 2
