Reputation: 58
As per subject - I am trying to replace slow SQL IN statement with an INNER or LEFT JOIN. What I am trying to get rid of:
SELECT
sum(VR.Weight)
FROM
verticalresponses VR
WHERE RespondentID IN
(
SELECT RespondentID FROM verticalstackedresponses VSR WHERE VSR.Question = 'Brand Aware'
)
The above I tried replacing with
SELECT
sum(VR.Weight)
FROM
verticalresponses VR
LEFT/INNER JOIN verticalstackedresponses VSR ON VSR.RespondentID = VR.RespondentID AND VSR.Question = 'Brand Aware'
but unfortunately I'm getting different results. Can anyone see why and if possible advise a solution that will do the job just quicker? Thanks a lot!
Upvotes: 1
Views: 1348
Reputation: 1270021
The problem may be related to duplicates. Try this:
SELECT sum(VR.Weight)
FROM verticalresponses VR inner join
(select distinct RespondentID
from verticalstackedresponses VSR
where SR.Question = 'Brand Aware'
) vsr
ON VSR.RespondentID = VR.RespondentID;
You want an inner join
because you want to filter the VR
table. A left join
will keep everything in the first table.
Upvotes: 1
Reputation: 432311
Either way, you need suitable indexes, probably
See Using 'IN' with a sub-query in SQL Statements for more
Upvotes: 1
Reputation: 6426
The subquery
SELECT RespondentID FROM verticalstackedresponses VSR WHERE VSR.Question = 'Brand Aware'
could maybe be returning multiple rows for any RespondentID, then you would get different results between join and in versions
Something along the lines of this may give the same results
SELECT
sum(VR.Weight)
FROM
verticalresponses VR
JOIN( SELECT distinct RespondentID FROM verticalstackedresponses
WHERE VSR.Question = 'Brand Aware'
) VSR
ON VSR.RespondentID = VR.RespondentID
Upvotes: 2