Reputation: 3091
My query needs to return all usage records whose pipeline rate is not 'No Usage'.
What's the difference between NOT EXISTS vs. NOT IN vs. LEFT JOIN WHERE IS NULL?
I have seen the above question and decided to use IN over EXISTS as the values in the tables are nullable. Which one of the following is better and more efficient or is there any other way which more efficient than the following two?
SELECT *
FROM usagerecords UR
WHERE UR.usagerateid NOT IN (SELECT id
FROM pipelinerate PR
WHERE PR.name = 'No Usage')
SELECT *
FROM usagerecords UR
WHERE UR.usagerateid IN (SELECT id
FROM pipelinerate PR
WHERE PR.name <> 'No Usage')
Upvotes: 2
Views: 2924
Reputation: 280644
NOT IN
is going to give you the wrong results if id
is nullable (which I hope it is not, otherwise it has a terrible name).
Why would you choose IN
over EXISTS
when it has been proven time and time again that EXISTS
is more efficient (or at least no less efficient), since it can short-circuit? IN
has to materialize the entire set.
SELECT * -- stop doing this
FROM dbo.usagerecords AS UR
WHERE EXISTS
(
SELECT 1 FROM dbo.pipelinerate AS pr
WHERE pr.id = ur.usagerateid
AND pr.name <> 'No Usage'
);
You can also express your other query like this:
SELECT * -- again, stop doing this
FROM dbo.usagerecords AS UR
WHERE NOT EXISTS
(
SELECT 1 FROM dbo.pipelinerate AS pr
WHERE pr.id = ur.usagerateid
AND pr.name = 'No Usage'
);
But I have no idea which, if either, gets the correct results. This is why we typically ask for sample data and desired results.
Your use of SELECT *
is likely to have a greater negative impact on performance than whether you use IN
or EXISTS
. FWIW.
Upvotes: 4
Reputation: 1314
"What's the difference between NOT EXISTS vs. NOT IN vs. LEFT JOIN WHERE IS NULL?"
Not exists exits as soon as it sees one match.
Not in doesn't. People get concerned about in or not in lists that contain large numbers of results, but some testing by Jeff Moden shows that they work fine up to the million item range, which is usually sufficient.
Left join where is null is set based, so it's the "classic" solution. "In" is basically turned into a giant or list. Left join where is null doesn't have any specific advantages when you're just testing for something being missed, I like to use a left self join/is null pattern when I'm looking for say the most recent event per user.
Not in is super simple, and any novice developer will understand what it does.
Not exists is almost as clear, but maybe above novice.
Left join/is null is routinely misunderstood even by mid-level developers. So personally I find not in the most maintainable.
Upvotes: 0