Will
Will

Reputation: 31

filter duplicates in SQL join

When using a SQL join, is it possible to keep only rows that have a single row for the left table?

For example:

select * from A, B where A.id = B.a_id;

a1 b1
a2 b1
a2 b2

In this case, I want to remove all except the first row, where a single row from A matched exactly 1 row from B.

I'm using MySQL.

Upvotes: 3

Views: 1034

Answers (4)

Donnie
Donnie

Reputation: 46923

This avoids the cost of counting matching rows, which can be expensive for large tables.

As usual, when comparing various possible solutions, benchmarking / comparing the execution plans is suggested.

select
  *
from
  A
  join B on A.id = B.a_id
where
  not exists (
    select
      1
    from
      B B2
    where 
      A.id = b2.a_id 
      and b2.id != b.id
  )

Upvotes: 0

Thomas
Thomas

Reputation: 64645

First, I would recommend using the JOIN syntax instead of the outdated syntax of separating tables by commas. Second, if A.id is the primary key of the table A, then you need only inspect table B for duplicates:

Select ...
From A
    Join B
        On B.a_id = A.id
Where Exists    (
                Select 1
                From B B2
                Where B2.a_id = A.id
                Having Count(*) = 1
                )

Upvotes: 0

nvogel
nvogel

Reputation: 25526

It helps if you specify the keys of your tables when asking a question such as this. It isn't obvious from your example what the key of B might be (assuming it has one).

Here's a possible solution assuming that ID is a candidate key of table B.

SELECT *
FROM A, B
WHERE B.id =
 (SELECT MIN(B.id)
  FROM B
  WHERE A.id = B.a_id);

Upvotes: 0

Jack Edmonds
Jack Edmonds

Reputation: 33171

This should work in MySQL:

select * from A, B where A.id = B.a_id GROUP BY A.id HAVING COUNT(*) = 1;

For those of you not using MySQL, you will need to use aggregate functions (like min() or max()) on all the columns (except A.id) so your database engine doesn't complain.

Upvotes: 4

Related Questions