Reputation: 4227
I would like to remove any duplicate records from a table, and let the newest record (according to date) remain. In the example below, the 1st record will be remove (hdate = 2012-07-01, id = 16).
Using Sql Server 2008
Thanks
hdate id secId pricesource price
---------- ------------ ----------- -------------------- --------------
2012-07-01 16 126 DFLT NULL
2012-07-02 16 126 DFLT NULL
2012-07-01 CAD 20 DFLT 1
2012-07-01 TWD 99 DFLT 1
Upvotes: 0
Views: 146
Reputation: 460138
With Sql-Server 2005 or greater you can use ROW_NUMBER
with an appropriate OVER
in a CTE
:
WITH CTE AS
(
SELECT hdate, id, secId, pricesource, price,
ROW_NUMBER() OVER (PARTITION BY id, secId, pricesource, price ORDER BY hdate DESC) AS RN
FROM dbo.TableName t
)
DELETE FROM CTE WHERE RN > 1
Upvotes: 2
Reputation: 495
This is not as elegant as Tim's solution, but doesn't require a CTE. It also handles nulls in a column as equivalent.
DELETE
FROM MyTable m1
WHERE EXISTS (
SELECT 1
FROM MyTable m2
WHERE
(m2.id = m1.id OR (m2.id IS NULL AND m1.id IS NULL))
AND (m2.secId = m1.secId OR (m2.secId IS NULL AND m1.secId IS NULL))
AND (m2.pricesource = m1.pricesource OR (m2.pricesource IS NULL AND m1.pricesource IS NULL))
AND (m2.price = m1.price OR (m2.price IS NULL AND m1.price IS NULL))
AND m2.hdate > m1.hdate
);
Upvotes: 0
Reputation: 13056
In case your RDBMS doesn't support CTEs, or being able to delete from them (as you haven't listed what you're using), here's a version for everything else:
DELETE FROM TableName as a
WHERE EXISTS (SELECT '1'
FROM TableName b
WHERE b.id = a.id -- Plus all other 'duplicate' columns
AND b.hdate > a.hdate);
(And Tim's modified Fiddle demo - although for some reason this doesn't work for SQL Server).
Upvotes: 0