Reputation: 4587
We have a transact sql statement that queries 4 tables with millions of rows in each.
It takes several minutes, even though it has been optimized with indexes and statistics according to TuningAdvisor.
The structure of the query is like:
SELECT E.EmployeeName , SUM(M.Amount) AS TotalAmount , SUM(B.Amount) AS BudgetAmount , SUM(T.Hours) AS TotalHours , SUM(TB.Hours) AS BudgetHours , SUM(CASE WHEN T.Type = 'Waste' THEN T.Hours ELSE 0 END) AS WastedHours FROM Employees E LEFT JOIN MoneyTransactions M ON E.EmployeeID = M.EmployeeID LEFT JOIN BudgetTransactions B ON E.EmployeeID = B.EmployeeID LEFT JOIN TimeTransactions T ON E.EmployeeID = T.EmployeeID LEFT JOIN TimeBudgetTransactions TB ON E.EmployeeID = TB.EmployeeID GROUP BY E.EmployeeName
Since each transaction table contains millions of rows, I consider splitting it up into one query per transaction table, using table variables like @real
, @budget
, and @hours
, and then joining these in a final SELECT
. But in tests it seems to not speed up.
How would you deal with that in order to speed it up?
Upvotes: 1
Views: 2248
Reputation: 611
Try this one:
SELECT E.EmployeeName, TA.TotalAmount, BA.BudgetAmount, TWH.TotalHours, BH.BudgetHours, TWH.WastedHours
FROM Employees E
LEFT JOIN
(SELECT E.EmployeeID, SUM(M.Amount) AS TotalAmount
FROM Employees E INNER JOIN MoneyTransactions M ON E.EmployeeID = M.EmployeeID GROUP BY E.EmployeeID)TA
ON E.EmployeeID = TA.EmployeeID
LEFT JOIN
(SELECT E.EmployeeID , SUM(B.Amount) AS BudgetAmount
FROM Employees E INNER JOIN BudgetTransactions B ON E.EmployeeID = B.EmployeeID GROUP BY E.EmployeeID)BA
ON E.EmployeeID = BA.EmployeeID
LEFT JOIN
(SELECT E.EmployeeID , SUM(T.Hours) AS TotalHours , SUM(CASE WHEN T.Type = 'Waste' THEN T.Hours ELSE 0 END) AS WastedHours
FROM Employees E INNER JOIN TimeTransactions T ON E.EmployeeID = T.EmployeeID GROUP BY E.EmployeeID)TWH
ON E.EmployeeID = TWH.EmployeeID
LEFT JOIN
(SELECT E.EmployeeID , SUM(TB.Hours) AS BudgetHours
FROM Employees E INNER JOIN TimeBudgetTransactions TB ON E.EmployeeID = TB.EmployeeID GROUP BY E.EmployeeID)BH
ON E.EmployeeID = BH.EmployeeID
Upvotes: 0
Reputation: 501
I don't know if you have all the indexes on your tables that will speed up things, but having big tables could have this impact on a query time. I would recommend partitioning the tables if possible. It is more work, but everything you do to speed up the query now it won't be enough after few millions new records.
Upvotes: 1
Reputation: 425401
I'm not sure the query you posted will yield the results you're expecting.
It will cross join all the dimension tables (MoneyTransactions etc.) and multiply all the results.
Try this:
SELECT E.EmployeeName,
(
SELECT SUM(amount)
FROM MoneyTransactions m
WHERE M.EmployeeID = E.EmployeeID
) AS TotalAmount,
(
SELECT SUM(amount)
FROM BudgetTransactions m
WHERE M.EmployeeID = E.EmployeeID
) AS BudgetAmount,
(
SELECT SUM(hours)
FROM TimeTransactions m
WHERE M.EmployeeID = E.EmployeeID
) AS TotalHours,
(
SELECT SUM(hours)
FROM TimeBudgetTransactions m
WHERE M.EmployeeID = E.EmployeeID
) AS BudgetHours
FROM Employees E
Upvotes: 8