Reputation: 347
I have a table with around 32 columns and I want to check duplicate rows but two columns are unique so they will be different for each row. So I need JOIN
condition basically self join with duplicate records where all columns are equal except two.
Upvotes: 3
Views: 14163
Reputation: 16146
To find the rows that are duplicate the query would be (col1...col30 would be excluding the two columns that are always unique for each row)
SELECT
col1,
col2,
...,
col30
FROM
your_table
GROUP BY
col1,
col2,
...,
col30
HAVING
COUNT(*)>1
EDIT: It appears from the remarks you want to delete duplicate rows. Say you have columns id1 and id2 that are unique to every row (ie a primary key) and columns col1..col4 that can be duplicate. Take the following script as an example to delete duplicate rows:
CREATE TABLE #tt(
id1 INT,
id2 INT,
col1 INT,
col2 INT,
col3 INT,
col4 INT,
PRIMARY KEY(id1,id2)
);
INSERT INTO #tt(id1,id2,col1,col2,col3,col4)
VALUES (1,1,1,1,1,1),
(1,2,1,1,1,1),
(1,3,1,1,1,1),
(2,1,1,1,1,1),
(3,1,2,2,2,2),
(4,1,2,2,2,2),
(2,2,2,2,2,2),
(3,3,3,3,3,3),
(4,2,3,3,3,3);
DELETE
t
FROM
#tt AS t
INNER JOIN (
SELECT
id1,
id2,
ROW_NUMBER() OVER (PARTITION BY col1,col2,col3,col4 ORDER BY id1,id2) AS rn
FROM
#tt
) AS trn ON
trn.id1=t.id1 AND
trn.id2=t.id2
WHERE
trn.rn>1;
SELECT*FROM #tt;
DROP TABLE #tt;
Upvotes: 4
Reputation: 16968
You can also use something like this:
SELECT *
FROM yourTable t
WHERE 1 < (
SELECT COUNT(*)
FROM yourTable ti
WHERE t.uniqueCol1 = ti.uniqueCol1 AND t.uniqueCol2 = ti.uniqueCol2
GROUP BY nonUniqueCol1, nonUniqueCol2, ..., nonUniqueColn)
or for filtering just duplicated rows without showing original rows you can use:
SELECT *
FROM (SELECT *,
ROW_NUMBER() OVER (PARTITION BY nonUniqueCol1, nonUniqueCol2, ..., nonUniqueColn
ORDER BY uniqueCol1, uniqueCol2) As seq
FROM yourTable) t
WHERE (seq > 1)
Upvotes: 0