user1006027
user1006027

Reputation: 11

Sql Outer Join duplicate rows

I have two two table joined by the following query.

SELECT     
   dbo.ORDERLINE.DDCVNB, 
   dbo.ORDERLINE.DDAFAB, 
   dbo.ORDERLINE.DDAITX, dbo.ORDERLINE.DDALDT, 
   dbo.ORDERLINE.DDDUVA, dbo.ORDERLINE.DDARQT, 
   dbo.ORDERSHIPH.DHCANB, dbo.ORDERSHIPH.DHGGNB, 
   dbo.ORDERLINE.DDAAGM, dbo.ORDERLINE.DDCHNB, 
   dbo.ORDERLINE.DDAAGQ
FROM         
    dbo.ORDERLINE 
LEFT OUTER JOIN
    dbo.ORDERSHIPH ON dbo.ORDERLINE.DDAFAB = dbo.ORDERSHIPH.DHAFAB AND dbo.ORDERLINE.DDCVNB = dbo.ORDERSHIPH.DHCVNB

The ORDERLINE table has 7 million rows and the ORDERSHP has 2 million rows.

My query result-set has 14 million rows.

How is this possible with a left outer join? Shouldn't the result set be equal to the number of rows in the ORDERLINE which is 7 million?

Upvotes: 1

Views: 2335

Answers (3)

Sivan
Sivan

Reputation: 13

I found this out the hard way about outer joins:

LEFT OUTER JOIN
dbo.ORDERSHIPH ON dbo.ORDERLINE.DDAFAB = dbo.ORDERSHIPH.DHAFAB AND dbo.ORDERLINE.DDCVNB = dbo.ORDERSHIPH.DHCVNB

If the ORDERSHIPH table has multiple records with the same DHAFAB and DHCVNB values (let's say two), then the select clause will return a SET of rows FOR BOTH RECORDS IN ORDERSHIPH. Which means 7 million times two. If ORDERSHIPH has lets say 3 records with those same values you'll receive 3 sets of data in return.

Upvotes: 1

Dennis Hostetler
Dennis Hostetler

Reputation: 624

Your ON clause does not have enough join criteria and is matching 2 right rows for every left row. Essentially a cross product.

Also, you may not even want a LEFT OUTER JOIN. If you know via RI or such that there will be AT LEAST one row in the right table for every one in the left you should use an INNER JOIN.

Upvotes: 2

Andrew Jansen
Andrew Jansen

Reputation: 196

Your probably missing other join criteria. Without knowing all fields in your database, its going to be difficult for us to assist. Its sounds like you also need empty records from the ordership table. If not, Go with Cory's suggestion and use an inner join.

Upvotes: 3

Related Questions