Reputation: 11
I have two two table joined by the following query.
SELECT
dbo.ORDERLINE.DDCVNB,
dbo.ORDERLINE.DDAFAB,
dbo.ORDERLINE.DDAITX, dbo.ORDERLINE.DDALDT,
dbo.ORDERLINE.DDDUVA, dbo.ORDERLINE.DDARQT,
dbo.ORDERSHIPH.DHCANB, dbo.ORDERSHIPH.DHGGNB,
dbo.ORDERLINE.DDAAGM, dbo.ORDERLINE.DDCHNB,
dbo.ORDERLINE.DDAAGQ
FROM
dbo.ORDERLINE
LEFT OUTER JOIN
dbo.ORDERSHIPH ON dbo.ORDERLINE.DDAFAB = dbo.ORDERSHIPH.DHAFAB AND dbo.ORDERLINE.DDCVNB = dbo.ORDERSHIPH.DHCVNB
The ORDERLINE
table has 7 million rows and the ORDERSHP
has 2 million rows.
My query result-set has 14 million rows.
How is this possible with a left outer join? Shouldn't the result set be equal to the number of rows in the ORDERLINE which is 7 million?
Upvotes: 1
Views: 2335
Reputation: 13
I found this out the hard way about outer joins:
LEFT OUTER JOIN
dbo.ORDERSHIPH ON dbo.ORDERLINE.DDAFAB = dbo.ORDERSHIPH.DHAFAB AND dbo.ORDERLINE.DDCVNB = dbo.ORDERSHIPH.DHCVNB
If the ORDERSHIPH table has multiple records with the same DHAFAB and DHCVNB values (let's say two), then the select clause will return a SET of rows FOR BOTH RECORDS IN ORDERSHIPH. Which means 7 million times two. If ORDERSHIPH has lets say 3 records with those same values you'll receive 3 sets of data in return.
Upvotes: 1
Reputation: 624
Your ON clause does not have enough join criteria and is matching 2 right rows for every left row. Essentially a cross product.
Also, you may not even want a LEFT OUTER JOIN. If you know via RI or such that there will be AT LEAST one row in the right table for every one in the left you should use an INNER JOIN.
Upvotes: 2
Reputation: 196
Your probably missing other join criteria. Without knowing all fields in your database, its going to be difficult for us to assist. Its sounds like you also need empty records from the ordership table. If not, Go with Cory's suggestion and use an inner join.
Upvotes: 3