GiuseppeP
GiuseppeP

Reputation: 81

SQL show duplicates after join

I'm trying to select all duplicated (removing the distinct record) entry of a table, result of a join.

I'm trying to do it with:

SELECT items.id, b.title FROM b WHERE b.title IN (
SELECT b.title FROM b LIMIT 20 GROUP BY b.title HAVING COUNT(*)>1
) INNER JOIN items USING (number) 

Note: number is a Index common to 'items' and 'b' tables used for the join.

But obvious that doesn't work. I'm trying to understand how to get the duplicates and return the join result.

An example of output should be:

 id | title
----|----
001 | House
002 | House
005 | Tree
010 | Tree
006 | Car
007 | Car

Upvotes: 2

Views: 112

Answers (3)

Scarabee
Scarabee

Reputation: 5704

You can do it this way:

SELECT id, title
FROM b INNER JOIN items USING number
WHERE title IN 
    (SELECT title
     FROM b
     GROUP BY title
     HAVING COUNT(*) > 1)

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269953

Using COUNT(*) in a subquery incurs overhead. I would suggest:

SELECT i.id, b.title
FROM b JOIN
     items i
     USING (number)
WHERE EXISTS (SELECT 1
              FROM b b2
              WHERE b2.title = b.title AND b2.id <> b.id
             );

This does assume assume that b has a unique id on each row, which is used in the subquery. This query can make use of an index on b(title, id).

Upvotes: 2

Nathan M.
Nathan M.

Reputation: 286

I've accomplished something similar a few times using a query similar to this.

SELECT id
       ,title
  FROM some_table AS T1
 WHERE (SELECT COUNT(1)
          FROM some_table AS T2
         WHERE T1.title = T2.title) > 1

Upvotes: 0

Related Questions