Ram
Ram

Reputation: 3091

Difference between NOT IN and equals vs. IN and not equals

My query needs to return all usage records whose pipeline rate is not 'No Usage'.

What's the difference between NOT EXISTS vs. NOT IN vs. LEFT JOIN WHERE IS NULL?

I have seen the above question and decided to use IN over EXISTS as the values in the tables are nullable. Which one of the following is better and more efficient or is there any other way which more efficient than the following two?

SELECT *
FROM   usagerecords UR
WHERE  UR.usagerateid NOT IN (SELECT id
                              FROM   pipelinerate PR
                              WHERE  PR.name = 'No Usage')

SELECT *
FROM   usagerecords UR
WHERE  UR.usagerateid IN (SELECT id
                          FROM   pipelinerate PR
                          WHERE  PR.name <> 'No Usage') 

Upvotes: 2

Views: 2924

Answers (2)

Aaron Bertrand
Aaron Bertrand

Reputation: 280644

NOT IN is going to give you the wrong results if id is nullable (which I hope it is not, otherwise it has a terrible name).

Why would you choose IN over EXISTS when it has been proven time and time again that EXISTS is more efficient (or at least no less efficient), since it can short-circuit? IN has to materialize the entire set.

SELECT * -- stop doing this
  FROM dbo.usagerecords AS UR
  WHERE EXISTS
  (
    SELECT 1 FROM dbo.pipelinerate AS pr
      WHERE pr.id = ur.usagerateid
      AND pr.name <> 'No Usage'
  );

You can also express your other query like this:

SELECT * -- again, stop doing this
  FROM dbo.usagerecords AS UR
  WHERE NOT EXISTS 
  (
    SELECT 1 FROM dbo.pipelinerate AS pr
      WHERE pr.id = ur.usagerateid
      AND pr.name = 'No Usage'
  );

But I have no idea which, if either, gets the correct results. This is why we typically ask for sample data and desired results.

Your use of SELECT * is likely to have a greater negative impact on performance than whether you use IN or EXISTS. FWIW.

Upvotes: 4

Brian White
Brian White

Reputation: 1314

"What's the difference between NOT EXISTS vs. NOT IN vs. LEFT JOIN WHERE IS NULL?"

Not exists exits as soon as it sees one match.

Not in doesn't. People get concerned about in or not in lists that contain large numbers of results, but some testing by Jeff Moden shows that they work fine up to the million item range, which is usually sufficient.

Left join where is null is set based, so it's the "classic" solution. "In" is basically turned into a giant or list. Left join where is null doesn't have any specific advantages when you're just testing for something being missed, I like to use a left self join/is null pattern when I'm looking for say the most recent event per user.

Not in is super simple, and any novice developer will understand what it does.

Not exists is almost as clear, but maybe above novice.

Left join/is null is routinely misunderstood even by mid-level developers. So personally I find not in the most maintainable.

Upvotes: 0

Related Questions