Reputation: 15306
The following query returns a correct result, but how do I get the same result faster?
The goal is to output a table for tracking sellers' progress by summarizing their sales for today, this week, this month and this quarter.
SellerID    Today       ThisWeek    ThisMonth   ThisQuarter
--------    --------    --------    ---------   -----------
1           400,00      700,00      900,00      900,00
2           950,00      1850,00     2650,00     2650,00
My query:
CREATE TABLE #sales(
    [Price] MONEY,
    [Date] DATE,
    [SellerID] INT
)

INSERT INTO #sales VALUES
    (100, '2012-01-01', 1),
    (200, '2012-04-01', 1),
    (300, '2012-04-23', 1),
    (400, '2012-04-27', 1),
    (700, '2012-01-01', 2),
    (700, '2012-01-02', 2),
    (800, '2012-04-01', 2),
    (900, '2012-04-23', 2),
    (950, '2012-04-27', 2)

SELECT
    SellerID,
    SUM(CASE WHEN [Date] >= DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE()), 0) THEN [Price] END) AS Today,
    SUM(CASE WHEN [Date] >= DATEADD(WEEK, DATEDIFF(WEEK, 0, GETDATE()), 0) THEN [Price] END) AS ThisWeek,
    SUM(CASE WHEN [Date] >= DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()), 0) THEN [Price] END) AS ThisMonth,
    SUM(CASE WHEN [Date] >= DATEADD(QUARTER, DATEDIFF(QUARTER, 0, GETDATE()), 0) THEN [Price] END) AS ThisQuarter
FROM #sales
WHERE DATEPART(YEAR, [Date]) = DATEPART(YEAR, GETDATE())
GROUP BY SellerID
When executing the same query on a larger table, it gets quite slow. Just removing the CASE expressions cuts the execution time by almost 50%.
How can I achieve the same result in a faster and more efficient way?
Upvotes: 5
Views: 3864
Reputation: 142
SELECT
    SellerID,
    -- Cast GETDATE() to DATE; comparing a DATE column to the raw DATETIME
    -- would only match at exactly midnight
    SUM(CASE WHEN [Date] = CAST(GETDATE() AS DATE) THEN [Price] ELSE 0 END) AS Today,
    SUM(CASE WHEN DATEPART(WEEK, [Date]) = DATEPART(WEEK, GETDATE()) THEN [Price] ELSE 0 END) AS ThisWeek,
    SUM(CASE WHEN DATEPART(MONTH, [Date]) = DATEPART(MONTH, GETDATE()) THEN [Price] ELSE 0 END) AS ThisMonth,
    SUM(CASE WHEN DATEPART(QUARTER, [Date]) = DATEPART(QUARTER, GETDATE()) THEN [Price] ELSE 0 END) AS ThisQuarter
FROM #sales
GROUP BY SellerID
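One caveat: the DATEPART comparisons match the same week/month/quarter of *any* year, so without the question's year filter they mix years together. A sketch of a sargable version of that filter (DATEFROMPARTS requires SQL Server 2012 or later):

```sql
-- Restrict to the current year without wrapping [Date] in a function,
-- so an index on [Date] can still be used
WHERE [Date] >= DATEFROMPARTS(YEAR(GETDATE()), 1, 1)
```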
Upvotes: 0
Reputation: 69789
Since it is Friday afternoon, I thought I'd expand on my comment regarding warehousing. Even if you cannot fully explore cubes with SSAS or any other OLAP tool, you can still do your own report-specific warehousing. In your case I would set up a new database (I always call mine DW, but the world is your oyster) and create two schemas, Fact and Dim (representing facts and dimensions). It would need two tables, although you may want to add another dimension for SellerID, depending on whether that needs further reporting.
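A minimal sketch of that setup (DW is just the example name from above; pick your own):

```sql
-- One-off: create the warehouse database and the two schemas
CREATE DATABASE DW;
GO
USE DW;
GO
CREATE SCHEMA Fact;
GO
CREATE SCHEMA Dim;
GO
```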
CREATE TABLE Dim.Date
(   DateKey     DATE NOT NULL,
    DayOfWeek   VARCHAR(20) NOT NULL,
    Day         TINYINT NOT NULL,
    Week        TINYINT NOT NULL,
    Quarter     TINYINT NOT NULL,
    Month       TINYINT NOT NULL,
    Year        SMALLINT NOT NULL,
    CONSTRAINT PK_Dim_Date_DateKey PRIMARY KEY (DateKey)
)
CREATE TABLE Fact.Sales
( DateKey DATE NOT NULL,
SellerID INT NOT NULL,
Sales INT NOT NULL,
Amount MONEY NOT NULL,
CONSTRAINT PK_Fact_Sales PRIMARY KEY (DateKey, SellerID),
CONSTRAINT FK_Fact_Sales_DateKey FOREIGN KEY (DateKey) REFERENCES Dim.Date
)
Assuming the data will not get backdated, you can use a procedure like this to fill your warehouse from a scheduled job:
DECLARE @MaxDate DATE

SELECT @MaxDate = DATEADD(DAY, 1, MAX(DateKey))
FROM Fact.Sales

-- Explicit column list: Dim.Date defines Quarter before Month,
-- so the insert must name its targets to line up with the SELECT
INSERT INTO Dim.Date (DateKey, DayOfWeek, Day, Week, Month, Quarter, Year)
SELECT DATEADD(DAY, Increment, @MaxDate),
       DATENAME(WEEKDAY, DATEADD(DAY, Increment, @MaxDate)),
       DATEPART(DAY, DATEADD(DAY, Increment, @MaxDate)),
       DATEPART(WEEK, DATEADD(DAY, Increment, @MaxDate)),
       DATEPART(MONTH, DATEADD(DAY, Increment, @MaxDate)),
       DATEPART(QUARTER, DATEADD(DAY, Increment, @MaxDate)),
       DATEPART(YEAR, DATEADD(DAY, Increment, @MaxDate))
FROM ( SELECT ROW_NUMBER() OVER (ORDER BY object_id) - 1 AS Increment
       FROM sys.objects
     ) obj
WHERE NOT EXISTS
      ( SELECT 1
        FROM Dim.Date
        WHERE Date.DateKey = DATEADD(DAY, Increment, @MaxDate)
      )

INSERT INTO Fact.Sales (DateKey, SellerID, Sales, Amount)
SELECT [Date], SellerID, COUNT(*), SUM(Price)
FROM LiveDatabase..Sales
WHERE [Date] >= @MaxDate
GROUP BY [Date], SellerID
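One caveat on a cold start: @MaxDate is derived from Fact.Sales, so on an empty warehouse MAX(DateKey) is NULL and nothing is loaded. A small guard fixes that (the 2012-01-01 fallback is my own choice; use the earliest date you want warehoused):

```sql
DECLARE @MaxDate DATE

-- Fall back to a fixed start date the first time the job runs,
-- when Fact.Sales is still empty
SELECT @MaxDate = ISNULL(DATEADD(DAY, 1, MAX(DateKey)), '2012-01-01')
FROM Fact.Sales
```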
This would leave you with the following query to produce your report:
SELECT SellerID,
       SUM(CASE WHEN Today.DateKey = Date.DateKey THEN Amount ELSE 0 END) [Today],
       SUM(CASE WHEN Today.Week = Date.Week THEN Amount ELSE 0 END) [ThisWeek],
       SUM(CASE WHEN Today.Month = Date.Month THEN Amount ELSE 0 END) [ThisMonth],
       SUM(CASE WHEN Today.Quarter = Date.Quarter THEN Amount ELSE 0 END) [ThisQuarter],
       SUM(CASE WHEN Today.Year = Date.Year THEN Amount ELSE 0 END) [ThisYear]
FROM Fact.Sales
INNER JOIN Dim.Date
ON Date.DateKey = Sales.DateKey
INNER JOIN Dim.Date Today
ON Today.DateKey = CAST(GETDATE() AS DATE)
AND Today.Year = Date.Year
GROUP BY SellerID
It looks, if anything, more complicated than the original query, but the more the online database grows, the more you will see the benefit. I've done an SQL Fiddle to demonstrate the advantages: it fills the live data with 10,000 random sales records, then creates the warehouse (it may take a few seconds to build the schema). You should notice the execution time of the query on the warehouse is significantly faster (c. 20x). It may not be 20x faster on the first run, but once the query plans have been cached for both queries, the warehouse query is consistently 20x faster (it has been for me, anyway).
Upvotes: 9
Reputation:
Keep a de-normalised version of the data maybe?
e.g.: http://sqlfiddle.com/#!3/300a5/2
SELECT
    *,
    DATENAME(DAY, [Date]) AS Day,
    DATENAME(MONTH, [Date]) AS Month,
    DATENAME(YEAR, [Date]) AS Year,
    DATENAME(QUARTER, [Date]) AS Quarter
INTO deNormalised
FROM #sales
then you can run queries like:
SELECT
    Year,
    SUM(Price)
FROM deNormalised
WHERE Quarter = 1
GROUP BY Year
to get a comparison of first quarters across years.
Obviously this means you have to come up with a schedule for maintaining your de-normalised version of the data. You might do that with a trigger on insert/update, or on an hourly schedule.
You could also try adding only the latest data to the de-normalised results; that way you are only doing the slow processing on the rows that have been created today.
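A minimal sketch of the trigger option, assuming a permanent dbo.Sales table rather than the temp table above (the trigger name is my own):

```sql
-- Hypothetical trigger keeping deNormalised in step with new sales rows.
-- Only handles inserts; updates/deletes would need their own handling.
CREATE TRIGGER trg_Sales_DeNormalise
ON dbo.Sales
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO deNormalised
    SELECT i.*,
           DATENAME(DAY, i.[Date]),
           DATENAME(MONTH, i.[Date]),
           DATENAME(YEAR, i.[Date]),
           DATENAME(QUARTER, i.[Date])
    FROM inserted i;
END
```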
EDIT: I don't know if just using the DATENAME functions would improve the performance using your existing structure.
Upvotes: 2