D_Bester
D_Bester

Reputation: 5931

Query with Left Outer Join

I'm having trouble figuring this out.

According to Jeff Atwood A Visual Explanation of SQL Joins Left outer join produces a complete set of records from Table A, with the matching records (where available) in Table B. If there is no match, the right side will contain null.

The left table (TableA) doesn't have duplicates. The right tableB has 1 or 2 entries for each client number. The PrimaryTP designates one as primary with 1 and the other has 0.

I shouldn't have to include the line And B.PrimaryTP = 1 because TableA doesn't have duplicates. Yet if I leave it out I get duplicate client numbers. Why?

Can you help me understand how this works. It's being very confusing to me. The logic of And B.PrimaryTP = 1 escapes me. Yet it seems to work. Still, I'm scared to trust it if I don't understand it. Can you help me understand it. Or do I have a logic error hidden in the query?

SELECT A.ClientNum --returns a list with no duplicate client numbers
FROM (...<TableA>
) as A

Left Outer Join

<TableB> as B
on A.ClientNum = B.ClientNum

--eliminate mismatch of (ClientNum <> FolderNum)
Where A.ClientNum Not In
(
    Select ClientNum From <TableB>
    Where ClientNum Is Not Null
    And ClientNum <> IsNull(FolderNum, '')
)

--eliminate case where B.PrimaryTP <> 1
And B.PrimaryTP = 1 

Upvotes: 1

Views: 423

Answers (5)

Bogdan Bogdanov
Bogdan Bogdanov

Reputation: 1723

On the link that you gave the joins are explained very good. So the problem is that you have several records from table A (no matter that there are no duplicates) is that to 1 record in A there are 2 records in B (in some cases). To avoid this you can use either DISTINCT clause, either GROUP BY clause.

Upvotes: 1

Aaron Bertrand
Aaron Bertrand

Reputation: 280570

I shouldn't have to include the line And B.PrimaryTP = 1 because TableA doesn't have duplicates. Yet if I leave it out I get duplicate client numbers. Why?

Because both rows in the right table match a row in the left table. There is no way for SQL Server to output a triangular result; it must show the columns from both tables for every joined row. And this is true for INNER JOIN as well.

DECLARE @a TABLE(a INT);
DECLARE @b TABLE(b INT);

INSERT @a VALUES(1),(2);
INSERT @b VALUES(1),(1);

SELECT a.a, b.b FROM @a AS a 
LEFT OUTER JOIN @b AS b ON a.a = b.b;

SELECT a.a, b.b FROM @a AS a 
INNER JOIN @b AS b ON a.a = b.b;

Results:

a   b
--  ----
1   1
1   1
2   NULL

a   b
--  --
1   1
1   1

Upvotes: 1

Santhosh
Santhosh

Reputation: 1791

If you have one client number in A and two matches in Table B, then you will get duplicates.

Suppose you have the following data,

Table-A(client Num) Table-B(client Num)
1          2
2          2

The left Join Results

Table-A(client Num) Table-B(client Num)
1                       (null)
2                       2
2                       2

This is the cause of duplicates. So you need to take distinct values form Table B or perform Distinct on the result set.

Upvotes: 1

CompuChip
CompuChip

Reputation: 9232

The LEFT OUTER JOIN will give you all the records from A with all the matching records from B. The difference with an INNER JOIN is that if there are no matching records in B, an INNER join will omit the record from A entirely, while the LEFT join will then still include a row with the results from A.

In your case, however, you may also want to check out the DISTINCT keyword.

Upvotes: 0

Christian Specht
Christian Specht

Reputation: 36451

The difference between an INNER JOIN and a LEFT JOIN is just that the LEFT JOIN still returns the rows in Table A when there are no corresponding rows in Table B.

But it's still a JOIN, which means that if there is more than one corresponding row in Table B, it will join the row from Table A to each one of them.

So if you want to make sure that you get no more than one result for each row in Table A, you have to make sure that no more than one row from Table B is found - hence the And B.PrimaryTP = 1.

Upvotes: 1

Related Questions