Reputation: 81
I'm trying to select all duplicated (removing the distinct record) entry of a table, result of a join.
I'm trying to do it with:
SELECT items.id, b.title FROM b WHERE b.title IN (
SELECT b.title FROM b LIMIT 20 GROUP BY b.title HAVING COUNT(*)>1
) INNER JOIN items USING (number)
Note: number
is a Index common to 'items' and 'b' tables used for the join.
But obvious that doesn't work. I'm trying to understand how to get the duplicates and return the join result.
An example of output should be:
id | title
----|----
001 | House
002 | House
005 | Tree
010 | Tree
006 | Car
007 | Car
Upvotes: 2
Views: 112
Reputation: 5704
You can do it this way:
SELECT id, title
FROM b INNER JOIN items USING number
WHERE title IN
(SELECT title
FROM b
GROUP BY title
HAVING COUNT(*) > 1)
Upvotes: 1
Reputation: 1269953
Using COUNT(*)
in a subquery incurs overhead. I would suggest:
SELECT i.id, b.title
FROM b JOIN
items i
USING (number)
WHERE EXISTS (SELECT 1
FROM b b2
WHERE b2.title = b.title AND b2.id <> b.id
);
This does assume assume that b
has a unique id on each row, which is used in the subquery. This query can make use of an index on b(title, id)
.
Upvotes: 2
Reputation: 286
I've accomplished something similar a few times using a query similar to this.
SELECT id
,title
FROM some_table AS T1
WHERE (SELECT COUNT(1)
FROM some_table AS T2
WHERE T1.title = T2.title) > 1
Upvotes: 0