Reputation: 675
I have a table like this:
Location E-Mail Name Language Creation Time
-----------------------------------------------------------------------
D [email protected] Max de 19:06:00 <-- A
D [email protected] Peter en 19:10:00
D [email protected] Lisa en 17:39:00
E [email protected] Tom fr 05:00:00 <-- B
E [email protected] Carl en 09:01:00
D [email protected] Max en 17:45:00 <-- A
D [email protected] Mike de 17:45:00 <-- A
E [email protected] Tom de 06:00:00 <-- B
I want to identify every record, where Location
AND E-Mail
are the same, delete the "duplicates" and keep only the newest record. If the creation time is the same (A), then it doesn't matter, which record is deleted.
The result-table should be:
Location E-Mail Name Language Creation Time
-----------------------------------------------------------------------
D [email protected] Peter en 19:10:00
D [email protected] Lisa en 17:39:00
E [email protected] Tom fr 05:00:00 <-- B
E [email protected] Carl en 09:01:00
D [email protected] Mike de 17:45:00 <-- A
Because the table is very long, the solution should be speedy. :-)
Thanks for every hint!
Cheers, Holger
Upvotes: 1
Views: 67
Reputation: 754468
One approach would be to use a CTE (Common Table Expression) if you're on SQL Server 2005 and newer (you aren't specific enough in that regard).
With this CTE, you can partition your data by some criteria - i.e. your (Location, EMail)
- and have SQL Server number all your rows starting at 1 for each of those "partitions", ordered by the descending CreationTime
.
So try something like this:
;WITH NewestData AS
(
SELECT
Location, EMail, Name, Language, CreationTime,
RowNum = ROW_NUMBER() OVER(PARTITION BY Location, EMail
ORDER BY CreationTime DESC)
FROM
dbo.YourTableNameHere
)
DELETE FROM NewestData
WHERE RowNum > 1
Here, I am selecting only the "first" (newest) entry for each "partition" (i.e. for each (Location, EMail)
tuple) - ordered by the CreationTime
in a descending order.
So the newest entry for each (Location, EMail)
tuple has a value of 1
for its RowNum
, any "duplicates" have values larger than 1
- so I just delete away all those "duplicates" - and you're done!
Upvotes: 6