rockstardev
rockstardev

Reputation: 13527

SQL to see all duplicates?

I have a massive table with thousands of rows. Each row has a unique path. For example:

electronics/samsung/tv/lcd-23384
electronics/philips/tv/lcd-12ger
etc...

The problem is, a recent query was executed incorrectly, resulting in some of the fields having duplicate paths. So, what i want to know is, is there a query I can execute to show ALL the duplicates? In other words, if I have this:

ID     | PATH 
1      | path_1
2      | path_2
3      | path_3
4      | path_3
5      | path_3
6      | path_4
7      | path_4
8      | path_5
9      | path_6

I want this result:

ID     | PATH 
3      | path_3
4      | path_3
5      | path_3
6      | path_4
7      | path_4

Notice that all NON-duplicates have been removed. Ie.... these ones:

ID     | PATH 
1      | path_1
2      | path_2
8      | path_5
9      | path_6

What SQL query can accomplish this?

Upvotes: 0

Views: 201

Answers (4)

eggyal
eggyal

Reputation: 125835

SELECT * FROM mytable NATURAL JOIN (
  SELECT PATH FROM mytable GROUP BY PATH HAVING COUNT(*) > 1
) dupes

See it on sqlfiddle.


To perform the update you request in the comments below:

UPDATE mytable NATURAL JOIN (
  SELECT PATH FROM mytable GROUP BY PATH HAVING COUNT(*) > 1
) dupes, (SELECT @r:=0) init
SET mytable.PATH = CONCAT(PATH, '-', @r:=@r+1);

See it on sqlfiddle.

Upvotes: 4

manurajhada
manurajhada

Reputation: 5380

It will give you all duplicate results

select column_name, count(column_name) occurrence from table_name 
group by column_name having count(column_name) > 1;

And if you want to get all non duplicate results..

select column_name, count(column_name) occurrence from table_name 
group by column_name having count(column_name) = 1;

Upvotes: 0

Sir Hally
Sir Hally

Reputation: 2358

You can see all duplicates by using such query

SELECT * FROM YourTable WHERE Path in
  (SELECT PATH
   FROM YourTable
   GROUP BY PATH
   HAVING COUNT(*)>1)

What kind of rows do you want to remove?

Upvotes: 0

Yaroslav
Yaroslav

Reputation: 6534

Try this, substitute needed table and column values:

SELECT YourColumn, COUNT(*) TotalCount
  FROM YourTable
 GROUP BY YourColumn
HAVING COUNT(*) > 1
 ORDER BY COUNT(*) DESC

Upvotes: 0

Related Questions