Reputation: 236
For the purpose of this question, let's pretend I have the following table:
Transaction:
Basically, a single de-normalized table with a million plus rows in it. It is important to note that only two columns will ever by updated: Profit and NumberOfTimesPurchased. When a sale is made, the NumberOfTimesPurchased will be updated and the new profit amount will be re-calculated.
Now, I need to do some minimal reporting on this table, which consists of queries that aggregate and group. As an example:
SELECT
City, AVG(UnitCost), AVG(SellAmount),
SUM(NumberOfTimesPurchased), AVG(Profit)
FROM
Transaction
GROUP BY
City
SELECT
State, AVG(UnitCost), AVG(SellAmount), SUM(NumberOfTimesPurchased),
AVG(Profit)
FROM
Transaction
GROUP BY
State
SELECT
Country, AVG(UnitCost), AVG(SellAmount), SUM(NumberOfTimesPurchased),
AVG(Profit)
FROM
Transaction
GROUP BY
Country
SELECT
ProductId, ProductName, AVG(UnitCost), AVG(SellAmount),
SUM(NumberOfTimesPurchased), AVG(Profit)
FROM
Transaction
GROUP BY
ProductId, ProductName
These queries are quick: ~1 second. However, I've noticed that under load, performance significantly drops (from 1 second up to a minute when there are 20+ concurrent requests), and I'm guessing the reason is that each query performs a full table scan.
I've attempted to use indexed views for each query, however my update statement performance takes a beating since each view needs to be rebuilt. On the same note, I've attempted to create covering indexes for each query, but again my update statement performance is not acceptable.
Assuming full table scans are the culprit, do I have any realistic options to get the query time down while keeping update performance at acceptable levels?
Note that I cannot use column store indexes (I'm using the cheaper version of Azure SQL Database). I'd also like to stay away from any sort of roll-up implementation, as I need the data available immediately.
Finally - the example above is not a completely accurate representation of my table. I have 20 or so different columns that can be 'grouped', and 6 columns that can be updated. No inserts or deletes.
Upvotes: 1
Views: 1183
Reputation: 14379
Because there are no WHERE
clauses on your queries, the database engine can nothing but a table scan (or clustered index scan which is really the same thing). If there were covering indexes with containing all the columns from your query, then the engine would prefer those. If your real queries have WHERE
clauses, then appropriate indexing with those columns as the leading columns of the index might help.
But I think your problem lies elsewhere. As far as concurrency goes you haven't put enough money in the meter. According to the main service tiers doc, the Basic tier for Azure SQL Database is for:
... supporting typically one single active operation at a given time. Examples include databases used for development or testing, or small-scale infrequently used applications.
Therefore you might want to think about splashing out for Premium edition to support both your concurrency requirement and columnstore indexes, which are perfectly suited to this type of query. Just for fun, I created a test-rig based on AdventureWorksDW2012 to try and recreate your problem which is here. Query performance was atrocious (> 20 secs). I'd be surprised if you weren't getting DTU warnings on your portal:
An upgrade to Standard (S0-S2) did boost performance so you should experiment. You could look at scaling up for busy query times and down when not required.
This table also looks a bit like a fact table, so you might want to consider refactoring this as a fact / dimensional model then use Azure Analysis Services on top to bring that sub-second performance.
Coincidentally there is a feedback item you can vote for to bring columnstore to Standard tier:
Recent comments suggest it is "in the work queue" as at May 2017;
Upvotes: 1