rheitzman
rheitzman

Reputation: 2297

SQL Not returning Expected Records

This has to be a simple error on my part. I have a table with permits (applicants have one permit) - about ~600 expired last season and ~900 the season before. I need to generate a mailing list of unique applicants that had permits in last two seasons.

SELECT COUNT(*) FROM Backyard_Burn WHERE YEAR(Expiration_Date)= 2014

SELECT COUNT(*) FROM Backyard_Burn WHERE YEAR(Expiration_Date)= 2013

SELECT COUNT(*) FROM Backyard_Burn WHERE YEAR(Expiration_Date)= 2013
AND Applicant_Mail_ID NOT IN(
  SELECT Applicant_Mail_ID 
  FROM Backyard_Burn 
  WHERE YEAR(Expiration_Date)= 2014)

Which returns : 618, 923, and 0

Why 0 and not a number somewhere near 923 - 618 assuming most are repeat applicants?

Upvotes: 2

Views: 55

Answers (2)

bobthedeveloper
bobthedeveloper

Reputation: 3783

  1. you need applicants from the last two seasons - you need to use a greater than operator
  2. its better to check on a full date instead of getting the year value with year
  3. to get the unique applicants you can use distinct

Which results in:

 select count(distinct Applicant_Mail_ID)
 from Backyard_Burn
 where Expiration_Date >= '20130101';

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1269743

NOT IN can be dangerous. The problem is probably caused because Applicant_Mail_id takes on NULL values. You can fix this readily with:

SELECT COUNT(*)
FROM Backyard_Burn
WHERE YEAR(Expiration_Date) = 2013 AND
      Applicant_Mail_ID NOT IN (SELECT Applicant_Mail_ID 
                                FROM Backyard_Burn 
                                WHERE YEAR(Expiration_Date) = 2014 AND Applicant_Mail_ID IS NOT NULL
                               );

If any of those values are NULL, then NOT IN can only return FALSE or NULL -- the condition can never allows records through.

For this reason, I think it is better to use NOT EXSTS, which has the semantics you expect when some of the values might be NULL:

SELECT COUNT(*)
FROM Backyard_Burn bb
WHERE YEAR(Expiration_Date) = 2013 AND
      NOT EXISTS (SELECT 1 
                  FROM Backyard_Burn bb2
                  WHERE YEAR(bb2.Expiration_Date) = 2014 AND
                        bb2.Applicant_Mail_ID = bb.Applicant_Mail_ID
                 );

EDIT:

By the way, an alternative way of formulating this is to use group by and having:

select Applicant_Mail_ID
from Backyard_Burn
group by Applicant_Mail_ID
having sum(case when year(Expiration_Date) = 2013 then 1 else 0 end) > 0 and
       sum(case when year(Expiration_Date) = 2014 then 1 else 0 end) > 0;

This avoids the problem with NULLs and makes it easy to add new conditions, such as applicants who did not have any records in 2012.

Upvotes: 4

Related Questions