Andrew
Andrew

Reputation: 760

SQL: Select top 1 of each group returning multiple records for each group

I have a table which looks like this:

| Id | InvestorFundId | Name | AccountKey | AsOfDate | AddedOn  |
| 1  | 11111          | Name1| Key1       | 9/5/2018 | 8/5/2018 |
| 2  | 11111          | Name2| Key1       | 9/3/2018 | 8/5/2018 |
| 3  | 22222          | Name3| Key2       | 9/2/2018 | 8/5/2018 |
| 4  | 33333          | Name4| Key3       | 9/2/2018 | 8/5/2018 |
| 5  | 33333          | Name5| Key3       | 9/4/2018 | 8/5/2018 |

I need to be able to return the most recent InvestorFundId, Name, and AccountKey for each group, based on ordered AsOfDate and AddedOn descending.

Expected result should look like this:

| InvestorFundId | Name | AccountKey |
| 11111          | Name1| Key1       | 
| 22222          | Name3| Key2       |
| 33333          | Name5| Key3       | 

I've looked at some posts but I can't correctly return the rows, here is what I have so far:

SELECT Name, AccountKey, H.InvestorFundId FROM
[Investor.Fund.History] H
CROSS APPLY(SELECT TOP 1 InvestorFundId 
FROM [Investor.Fund.History] 
WHERE  DataStatusId = 1 AND AsOfYearMonth <= 201806
ORDER BY AsOfDate DESC, AddedOn DESC) HS
ORDER BY H.InvestorFundId

But this is returning me multiple records for each group instead of the most recent.

I just want to add that the current implementation uses ROW_NUMBER but it is too slow, so I am investigating into other solutions.

Thank you

NOTE: I know that I have two extra columns in my where, I just opted to exclude those from the diagram but you should be able to get the gist

Upvotes: 3

Views: 3755

Answers (2)

S3S
S3S

Reputation: 25112

You can use WITH TIES and simply apply ROW_NUMBER in the ORDER BY

Select top 1 with ties *
From History
Where DataStatusId = 1 and AsOfYearMonth = 201806
Order by 
Row_Number() over (partition by InvestorFundID order by AsOfDate desc)

Or with a sub query

Select  *
    From (select * , RN= Row_Number() over (partition by InvestorFundID order by AsOfDate desc)
             From History
             Where DataStatusId = 1 and AsOfYearMonth = 201806) x
Where x.RN = 1

If you find this slower, then we’d need to see the execution plan to determine WHY it’s slow. A non-clustered index on InvestorFundId, AsOfDate desc would make this super fast.

Create nonclustered index indexName on
History (InvestorFundID, AsOfDate desc)

Upvotes: 1

Pedro Martins
Pedro Martins

Reputation: 874

You can use a CTE to get the data you need.

USE tempdb;
GO

DECLARE @table TABLE (Id INT IDENTITY(1, 1), InvestorFundId INT, Name VARCHAR(50), AccountKey VARCHAR(50), AsOfDate DATE, AddedOn DATE);
INSERT INTO @table VALUES (11111, 'Name1', 'Key1', '9/5/2018', '8/5/2018');
INSERT INTO @table VALUES (11111, 'Name2', 'Key1', '9/3/2018', '8/5/2018');
INSERT INTO @table VALUES (22222, 'Name3', 'Key2', '9/2/2018', '8/5/2018');
INSERT INTO @table VALUES (33333, 'Name4', 'Key3', '9/2/2018', '8/5/2018');
INSERT INTO @table VALUES (33333, 'Name5', 'Key3', '9/4/2018', '8/5/2018');

;WITH CTE AS
(
    SELECT InvestorFundId, Name, AccountKey, ROW_NUMBER() OVER (PARTITION BY InvestorFundID ORDER BY AsOfDate DESC) AS RowId FROM @table
)
SELECT InvestorFundId, Name, AccountKey
    FROM CTE
    WHERE RowId = 1;

Here is a working SQLFiddle

Hope it helps.

Upvotes: 3

Related Questions