kdudeIA
kdudeIA

Reputation: 57

How to remove multiple duplicates based on date in MS Access

I have a table of names and dates. I want to remove rows that have duplicates of the names, and only keep the one with the most recent date. Sometimes there are multiple duplicate rows of the information.

My code below works, however I would like to run an automated loop that stops when no more duplicates are detected OR learn a better/more efficient way of doing this.

With the code below, my current procedure is:

Query1 Query2 Query3

Repeat until No more duplicates deleted.

Table1:

ID  Field1  Field2  Field3
3   Albert  Jacobsen 12/5/2018
5   Mia     Shaw     12/28/2018
6   Chris   Mantle   6/14/2018
7   Albert  Jacobsen 1/8/2019
8   Albert  Jacobsen 11/15/2018
9   Chris   Mantle   11/24/2018

Query 1:
SELECT Table1.Field1, Table1.Field2, Table1.Field3, Table1.ID INTO Table2
FROM Table1
GROUP BY Table1.Field1, Table1.Field2, Table1.Field3, Table1.ID
ORDER BY Table1.Field1 DESC , Table1.Field2 DESC , Table1.Field3 DESC;

Table2:
Field1  Field2  Field3  ID
Mia     Shaw     12/28/2018 5
Chris   Mantle   11/24/2018 9
Chris   Mantle   6/14/2018  6
Albert  Jacobsen 1/8/2019   7
Albert  Jacobsen 12/5/2018  3
Albert  Jacobsen 11/15/2018 8

Query 2:
SELECT Table2.Field1, Table2.Field2, Count(Table2.ID) AS CountOfID,                 
Min(Table2.ID) AS MinOfID INTO Temp_DeleteThese
FROM Table2
GROUP BY Table2.Field1, Table2.Field2
HAVING (((Count(Table2.ID))>1));

Table Temp_DeleteThese:
Field1  Field2  CountOfID MinOfID
Albert  Jacobsen 3  3
Chris   Mantle   2  6

Query 3:
DELETE DISTINCTROW Table1.*
FROM Temp_DeleteThese INNER JOIN Table1 ON Temp_DeleteThese.MinofID =             
Table1.ID;

Resulting Table1:
ID  Field1  Field2  Field3
5   Mia Shaw     12/28/2018
7   Albert  Jacobsen 1/8/2019
8   Albert  Jacobsen 11/15/2018
9   Chris   Mantle   11/24/2018

How can I loop the code until the duplicates are removed and only the most recent record is retained, or do this more efficiently?

Upvotes: 0

Views: 108

Answers (1)

Erik A
Erik A

Reputation: 32672

You could delete all duplicate records in a single query by using an EXISTS subquery to make sure a row exists with the same name and a more recent date:

DELETE *
FROM Table1 t
WHERE EXISTS(
    SELECT 1
    FROM Table1 s
    WHERE s.Field1 = t.Field1
    AND s.Field2 = t.Field2
    AND s.Field3 > t.Field3
)

This deletes all rows that should be deleted in a single go. I don't think you can get more efficient than that.

Upvotes: 1

Related Questions