Gautam G
Gautam G

Reputation: 494

select individual records based on aggregated values

Assume I have following table

EmployeeID            Salary                      Date
-----------           ----------------            -----------
37                    45000.00                    2015-03-11 
102                   36500.00                    2015-03-11 
103                   43000.00                    2015-03-11 
104                   45000.00                    2015-03-11 
105                   40000.00                    2015-03-11 
37                    45000.00                    2015-04-11 
102                   36500.00                    2015-04-11 
103                   43000.00                    2015-04-11 
104                   45000.00                    2015-04-11 
105                   40000.00                    2015-04-11 

I want to retrieve records whose "Sum Of Salary" crosses 80000, so the desired output is

EmployeeID            Salary                      Date
-----------           ----------------            -----------
37                    45000.00                    2015-03-11 
103                   43000.00                    2015-03-11 
104                   45000.00                    2015-03-11 
37                    45000.00                    2015-04-11 
103                   43000.00                    2015-04-11 
104                   45000.00                    2015-04-11 

I have achieved it in following way:

  1. Store the aggregated results into a table variable.
  2. INNER join between the original table & table variable

    DECLARE @tmpAggregatedSalaries TABLE
    (
        EmployeeID          INT,
        SumOfSalary         DECIMAL(18, 2)
    )
    
    INSERT INTO @tmpAggregatedSalaries
    SELECT sal.EmployeeID
        , SUM(sal.Salary) AS SumOfSalary
    FROM Salaries sal
    GROUP BY sal.EmployeeID
    
    SELECT sal.*
        FROM Salaries sal
        INNER JOIN @tmpAggregatedSalaries aggrSal ON sal.EmployeeID = aggrSal.EmployeeID
            AND aggrSal.SumOfSalary > 80000
    

As far as I know, storing temporary results to work upon is preferred over Inline queries, so I chose "table variable". Please suggest me more optimized versions.

Upvotes: 3

Views: 55

Answers (2)

Evaldas Buinauskas
Evaldas Buinauskas

Reputation: 14077

You also could use SUM() window function:

CREATE TABLE dbo.Salaries
(
    EmployeeID INT
    , Salary DECIMAL(10, 2)
    , [Date] DATE
    , CONSTRAINT PK_Salaries PRIMARY KEY (EmployeeID, [Date])
);

INSERT INTO dbo.Salaries (EmployeeID, Salary, [Date])
VALUES (37, 45000.00, '2015-03-11')
    , (102, 36500.00, '2015-03-11')
    , (103, 43000.00, '2015-03-11')
    , (104, 45000.00, '2015-03-11')
    , (105, 40000.00, '2015-03-11')
    , (37,  45000.00, '2015-04-11')
    , (102, 36500.00, '2015-04-11')
    , (103, 43000.00, '2015-04-11')
    , (104, 45000.00, '2015-04-11')
    , (105, 40000.00, '2015-04-11');

SELECT S.EmployeeID, S.Salary, S.[Date]
FROM (
    SELECT EmployeeID, Salary, [Date], SUM(Salary) OVER(PARTITION BY EmployeeID) AS SalarySum
    FROM dbo.Salaries
    ) AS S
WHERE S.SalarySum > 80000;

This query will calculate total salary for each employee and later one just filter out ones that earned less than 80000 in total.

Result:

EmployeeID  Salary                                  Date
----------- --------------------------------------- ----------
37          45000.00                                2015-03-11
37          45000.00                                2015-04-11
103         43000.00                                2015-03-11
103         43000.00                                2015-04-11
104         45000.00                                2015-04-11
104         45000.00                                2015-03-11

Of course you can use an inline query like Tim has suggested:

SELECT S.EmployeeID, S.Salary, S.[Date]
FROM dbo.Salaries AS S
INNER JOIN (
    SELECT EmployeeID, SUM(Salary) AS SalarySum
    FROM dbo.Salaries
    GROUP BY EmployeeID
    ) AS T
    ON T.EmployeeID = S.EmployeeID
WHERE T.SalarySum > 80000;

For such small result set, the latter one seems to perform better by looking at execution plans. You'd have to compare it on your actual data to decide which one performs better. Attaching plans for both queries:

Using window functions:

enter image description here

Classic one:

enter image description here

Upvotes: 1

Tim Biegeleisen
Tim Biegeleisen

Reputation: 521629

One option is to use a GROUP BY subquery to identify all employees whose sum of salary be greater than 80000 to filter off records from your table which you don't want to see.

SELECT t1.EmployeeID,
       t1.Salary,
       t1.Date
FROM Salaries t1
INNER JOIN
(
    SELECT EmployeeID
    FROM Salaries
    GROUP BY EmployeeID
    HAVING SUM(Salary) > 80000
) t2
    ON t1.EmployeeID = t2.EmployeeID

Upvotes: 1

Related Questions