BruceyBandit
BruceyBandit

Reputation: 4324

Alternative to using ROW_NUMBER for better performance

I have a small query below where it outputs a row number under the RowNumber column based on partitioning the 'LegKey' column and ordering by UpdateID desc. This is so the latest updated row (UpdateID) per legkey is always number 1

SELECT *
, ROW_NUMBER() OVER(PARTITION BY LegKey ORDER BY UpdateID DESC) AS RowNumber 
FROM Data.Crew

Data outputted:

UpdateID    LegKey  OriginalSourceTableID   UpdateReceived          RowNumber
7359        6641    11                     2016-08-22 16:35:27.487  1
7121        6641    11                     2016-08-15 00:00:47.220  2
8175        6642    11                     2016-08-22 16:35:27.487  1
7122        6642    11                     2016-08-15 00:00:47.220  2
8613        6643    11                     2016-08-22 16:35:27.487  1
7123        6643    11                     2016-08-15 00:00:47.220  2

The problem I have with this method is that I am getting slow performance because I assume I am using the ORDER BY.

My question is that is there an alternative way to produce a similar result but have my query run faster? I am thinking a MAX() may work but I didn't get the same output as before. Maybe I did the MAX() statement incorrectly so was wondering if this is a good alternative if somebody can provide an example on how they would write the MAX() statement for this example?

Thank you

Upvotes: 4

Views: 26430

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1269803

Presumably this is the query you want to optimize:

SELECT c.*
FROM (SELECT c.*,
             ROW_NUMBER() OVER (PARTITION BY LegKey ORDER BY UpdateID DESC) AS RowNumber 
      FROM Data.Crew c
     ) c
WHERE RowNumber = 1;

Try an index on Crew(LegKey, UpdateId).

This index will also be used if you do:

SELECT c.*
FROM Data.Crew c
WHERE c.UpdateId = (SELECT MAX(c2.UpdateId)
                    FROM Data.Crew c2
                    WHERE c2.LegKey = c.LegKey
                   );

Upvotes: 5

Ahmed Saeed
Ahmed Saeed

Reputation: 851

You can try one of the following:

declare @Table table(UpdateID int,   LegKey int,  OriginalSourceTableID int,  UpdateReceived datetime)

Here using the MAX Date in subquery.

select * from @Table as a where a.UpdateReceived = (Select MAX(UpdateReceived) from @Table as b Where b.LegKey = a.LegKey)

Here you can use it in cte with group by.

with MaxDate as( Select LegKey, Max(UpdateReceived) as MaxDate from @Table group by LegKey ) 
select * from MaxDate as a   
inner join @Table as b 
     on b.LegKey=a.LegKey 
    and b.UpdateReceived=a.MaxDate

Upvotes: 2

Related Questions