Sunday Ironfoot
Sunday Ironfoot

Reputation: 13050

SQL Server - Select Rows, Remove Duplicates But Keep row with highest date

Considering the following database table in SQL Server 2008:

ActionID (PK)    ActionType    ActionDate              UserID  ContentID
1                'Create'      '2013-05-26 18:40:00'   1       10
2                'Create'      '2013-05-26 18:30:00'   2       10
3                'Edit'        '2013-05-26 12:30:00'   5       12
4                'Edit'        '2013-05-26 12:25:00'   5       12
5                'Delete'      '2013-05-26 12:22:00'   6       12

I want to write a SQL query that groups by ContentID and ActionType but where the row with the latest ActionDate is returned and other rows ignored, even if they have different UserID or other column values.

So what it should return is:

ActionID (PK)    ActionType    ActionDate              UserID  ContentID
1                'Create'      '2013-05-26 18:40:00'   1       10
3                'Edit'        '2013-05-26 12:30:00'   5       12
5                'Delete'      '2013-05-26 12:22:00'   6       12

But I can't quite figure out how to write the query to do it.

Upvotes: 4

Views: 4819

Answers (2)

marc_s
marc_s

Reputation: 754240

One approach would be to use a CTE (Common Table Expression).

With this CTE, you can partition your data by some criteria - i.e. your ContentID and Actiontype - and have SQL Server number all your rows starting at 1 for each of those "partitions", ordered by the ActionDate.

So try something like this:

;WITH Actions AS
(
   SELECT 
       ActionID, ActionType, ActionDate, UserID, ContentID,
       RowNum = ROW_NUMBER() OVER(PARTITION BY ContentID, ActionType ORDER BY ActionDate DESC)
   FROM 
       dbo.YourTable
   WHERE
      ......
)
SELECT 
   ActionID, ActionType, ActionDate, UserID, ContentID,
FROM 
   Actions
WHERE
   RowNum = 1
ORDER BY 
   ActionDate DESC

Does that approach what you're looking for??

Upvotes: 7

select t1.*
from Table1 t1
inner join (select ContentID, ActionType, max(ActionDate) as MaxDate
            from Table1
            group by ContentID, ActionType) t2
        on t1.ContentID = t2.ContentID
       and t1.ActionType = t2.ActionType
       and t1.ActionDate = t2.MaxDate;

Any query that answers your question is liable to produce unexpected results if you have duplicate rows for a {ContentID, ActionType} pair.

Upvotes: 4

Related Questions