user3604795
user3604795

Reputation: 29

SQL query of duplicate rows filtered by max date within one table

I have a sql database, "persons', that has duplicate entries based on the IDNum column. I need to query the entries and only show the rows or duplicate entries based on the newest creation date. Here is the query:

SELECT IDNum
      ,PersonPGUID
      ,CreatedDateTime
      ,FirstName
      ,MiddleName
      ,LastName
  FROM [Persons]
  WHERE IDNum in (298928, 509520, 528203);

It returns this:

IDNum   PersonPGUID                             CreatedDateTime         FirstName   MiddleName  LastName
298928  C43DEB67-EB04-4066-A374-B8369D508CF6    2007-10-22 10:13:45.000 Jonathan    McKinley    Bennett-Tisdell
298928  75B63500-7C67-4B54-A2C1-F74EAF98B861    2007-10-22 10:15:35.000 Jonathan    McKinley    Bennett-Tisdell
509520  57A1FB38-93B5-4D7B-8A17-243EC9330766    2005-02-10 11:55:13.000 Lannon      Caleb       Morgan
509520  176667F4-318E-415B-AFFC-A80348325A9D    2004-08-24 06:26:27.000 Lannon      Caleb       Morgan
528203  36C372CB-C9AD-4CEC-8553-8147C7FEDE20    2009-06-08 09:26:43.000 Pedro       Trigueros
528203  5B71C0D5-10EB-4375-8F80-E8F01381E08A    2011-12-15 10:28:11.000 Pedro       Trigueros

I need to filter to only show on of the duplicates and it need to be the record with the MAX date for each set of duplicates.

Thanks, William

Upvotes: 2

Views: 10968

Answers (5)

russ
russ

Reputation: 624

If CreatedDateTime is an entirely unique column throughout (i.e. no two values duplicate):

SELECT * FROM PERSONS WHERE CreatedDateTime IN
(SELECT MAX(CreatedDateTime) from Persons group by IDnum having count(IDnum) > 1)

Hi William,

Thanks for the 10 points, I just thought to myself that the table may also contain rows where the IdNum value is NOT duplicated. Your question shows 3 duplicate IdNum's so I don't know whether unduplicated IdNums exist within it but I assume maybe so. If so you could use this:

SELECT * FROM PERSONS WHERE CreatedDateTime IN (SELECT MAX(CreatedDateTime)
FROM Persons GROUP BY IDnum HAVING COUNT(IDnum) > 1 OR COUNT(IDNUM) = 1)

Upvotes: 2

Hamlet Hakobyan
Hamlet Hakobyan

Reputation: 33391

You can use CTE in conjunction with windowing function to get the latest records for each IDNum

WITH CTE
AS
(
  SELECT
       IDNum
      ,PersonPGUID
      ,CreatedDateTime
      ,FirstName
      ,MiddleName
      ,LastName 
      , ROW_NUMBER() OVER(PARTITION BY IDNum ORDER BY CreatedDateTime DESC) RN
  FROM [Persons]
  WHERE IDNum in (298928, 509520, 528203)
)
SELECT IDNum
      ,PersonPGUID
      ,CreatedDateTime
      ,FirstName
      ,MiddleName
      ,LastName
FROM CTE
WHERE RN = 1

Upvotes: 1

Joseph B
Joseph B

Reputation: 5679

Try this INNER JOIN. It would give you good performance, because an inline view is used.

SELECT [Persons].*
FROM [Persons]
INNER JOIN
(
    SELECT 
          IDNum, 
          MAX(CreatedDateTime) AS CreatedDateTime
      FROM [Persons]
      WHERE IDNum in (298928, 509520, 528203)
      GROUP BY IDNum
      HAVING COUNT(*) > 1
) latest_records
ON [Persons].IDNum = latest_records.IDNum AND [Persons].CreatedDateTime = latest_records.CreatedDateTime;

Upvotes: 3

TheNorthWes
TheNorthWes

Reputation: 2739

Does this work for you? Basically selects the max time and joins on the result

select * 
from persons p inner join (
      select max(CreatedDatetime) as maxTime, IDNum as maxId from persons group by IDNum) maxTimeSet
           on maxTimeSet.maxId = p.IDNum

Upvotes: 0

Charles Bretana
Charles Bretana

Reputation: 146603

This is generally done with a correlated subquery:

SELECT IDNum ,PersonPGUID ,CreatedDateTime ,FirstName ,MiddleName ,LastName 
FROM [Persons] p
WHERE IDNum in (298928, 509520, 528203)
   and CreatedDateTime = (Select max(CreatedDateTime)
                          Where IDNum = p.IDNum)

Upvotes: 0

Related Questions