Reputation: 3307
I have the following sql statement:
select
a.desc
,sum(bdd.amount)
from t_main c
left outer join t_direct bds on (bds.repid=c.id)
left outer join tm_defination def a on (a.id =bds.sId)
where c.repId=1000000134
group by a.desc;
When I run it I get the following result:
desc amount
NW 12.00
SW 10
When I try to add another left outer join to get another set of values:
select
a.desc
,sum(bdd.amount)
,sum(i.amt)
from t_main c
left outer join t_direct bds on (bds.repid=c.id)
left outer join tm_defination def a on (a.id =bdd.sId)
left outer join t_ind i on (i.id=c.id)
where c.repId=1000000134
group by a.desc;
It basically doubles the amount field like:
desc amount amt
NW 24.00 234.00
SE 20.00 234.00
While result should be:
desc amount amt
NW 12.00 234.00
SE 10.00 NULL
How do I fix this?
Upvotes: 4
Views: 90496
Reputation: 1038
Left Outer Join - Driving Table Row Count
A left outer join may return more rows than there are in the driving table if there are multiple matches on the join clause.
Using MS SQL-Server:
DECLARE @t1 TABLE ( id INT )
INSERT INTO @t1 VALUES ( 1 ),( 2 ),( 3 ),( 4 ),( 5 );
DECLARE @t2 TABLE ( id INT )
INSERT INTO @t2 VALUES ( 2 ),( 2 ),( 3 ),( 10 ),( 11 ),( 12 );
SELECT * FROM @t1 t1
LEFT OUTER JOIN @t2 t2 ON t2.id = t1.id
This gives:
1 NULL
2 2
2 2
3 3
4 NULL
5 NULL
There are 5 rows in the driving table (t1), but 6 rows are returned because there are multiple matches for id 2.
So if an aggregate function is used, eg SUM() etc, grouped by the driving table column(s), this will give the wrong results.
To fix this, use derived tables or sub-queries to calculate the aggregate values, as already stated.
Left Outer Join - Multiple Tables
Where there are left outer joins over multiple tables, or any join for that matter, the query generates a series of derived tables in the order of joins.
SELECT * FROM t1
LEFT OUTER JOIN t2 ON t2.col2 = <...>
LEFT OUTER JOIN t3 ON t3.col3 = <...>
This is equivalent to:
SELECT * FROM
(
SELECT * FROM t1
LEFT OUTER JOIN t2 ON t2.col2 = <...>
) dt1
LEFT OUTER JOIN t3 ON t3.col3 = <...>
Here, for both queries, the results of the 1st left outer join are put into a derived table (dt1) which is then left outer joined to the 3rd table (t3).
For left outer joins over multiple tables, the order of the tables in the join clauses is critical.
Upvotes: 1
Reputation: 776
If you really need to receive the data as you mentioned, your can use sub-queries to perform the needed calculations. In this case you code may looks like the following:
select x.[desc], x.amount, y.amt
from
(
select
c.[desc]
, sum (bdd.amount) as amount
, c.id
from t_main c
left outer join t_direct bds on (bds.repid=c.id)
left outer join tm_defination_def bdd on (bdd.id = bds.sId)
where c.repId=1000000134
group by c.id, c.[desc]
) x
left join
(
select t.id, sum (t.amt) as amt
from t_ind t
inner join t_main c
on t.id = c.id
where c.repID = 1000000134
group by t.id
) y
on x.id = y.id
In the first sub-select you will receive the aggregated data for the two first columns: desc
and amount
, grouped as you need.
The second select will return the needed amt
value for each id
of the first set.
Left join between those results will gives the needed result. The addition of the t_main
table to the second select was done because of performance issues.
Another solution can be the following:
select
c.[desc]
, sum (bdd.amount) as amount
, amt = (select sum (amt) from t_ind where id = c.id)
from #t_main c
left outer join t_direct bds on (bds.repid=c.id)
left outer join tm_defination_def bdd on (bdd.id = bds.sId)
where c.repId = 1000000134
group by c.id, c.[desc]
The result will be the same. Basically, instead of using of nested selects the calculating of the amt
sum is performing inline per each row of the result joins. In case of large tables the performance of the second solution will be worse that the first one.
Upvotes: 7
Reputation: 6344
Your new left outer join is forcing some rows to be returned in the result set a few times due to multiple relations most likely. Remove your SUM and just review the returned rows and work out exactly which ones you require (maybe restrict it to on certain type of t_ind record if that is applicable??), then adjust your query accordingly.
Upvotes: 6