Reputation: 45
Here's my debacle. If I run:
SELECT pnumber, count(distinct ssn) as num_emps,nvl(sum(hours),0) as thours, nvl(sum(hours*salary/2000),0) as tcost
FROM project
LEFT JOIN (works_on join (employee
LEFT JOIN Department on DNO=DNUMBER) on essn=ssn)
ON pnumber=pno
GROUP BY pnumber
ORDER BY pnumber
I get
PNUMBER NUM_EMPS THOURS TCOST
---------- ---------- ---------- ----------
1 4 67.5 1027.5
2 3 37.5 562.5
3 2 50 960
10 3 55 762.5
20 3 25 522.5
30 4 60 967.5
40 0 0 0
50 0 0 0
60 0 0 0
When I run:
SELECT pnumber, count(distinct ssn) as num_emps,nvl(sum(hours),0) as thours, nvl(sum(hours*salary/2000),0) as tcost
FROM project
LEFT JOIN (works_on join (employee
LEFT JOIN Department on DNO=DNUMBER) on essn=ssn)
ON pnumber=pno
WHERE ssn IN (select MGRSSN from DEPARTMENT)
GROUP BY pnumber
ORDER BY pnumber
I get
PNUMBER NUM_EMPS THOURS TCOST
---------- ---------- ---------- ----------
1 1 5 100
2 1 10 200
3 1 10 200
10 1 10 200
20 3 25 522.5
30 2 25 530
Good so far, lets try this:
SELECT pnumber, count(distinct ssn) as num_emps,nvl(sum(hours),0) as thours, nvl(sum(hours*salary/2000),0) as tcost
FROM project
LEFT JOIN (works_on join (employee
LEFT JOIN Department on DNO=DNUMBER) on essn=ssn)
ON pnumber=pno
WHERE ssn IN (select SUPERSSN from EMPLOYEE) AND ssn NOT IN (select MGRSSN from DEPARTMENT)
GROUP BY pnumber
ORDER BY pnumber
I get:
PNUMBER NUM_EMPS THOURS TCOST
---------- ---------- ---------- ----------
1 1 32.5 487.5
2 1 7.5 112.5
Also good, but when I try this (all i did was add another Not in the Where clause):
SELECT pnumber, count(distinct ssn) as num_emps,nvl(sum(hours),0) as thours, nvl(sum(hours*salary/2000),0) as tcost
FROM project
LEFT JOIN (works_on join (employee
LEFT JOIN Department on DNO=DNUMBER) on essn=ssn)
ON pnumber=pno
WHERE ssn NOT IN (select SUPERSSN from EMPLOYEE) AND ssn NOT IN (select MGRSSN from DEPARTMENT)
GROUP BY pnumber
ORDER BY pnumber
I get "no rows selected"! How? There should still be 2 employees that are not managers or supervisors for pnumber 1. Please help? Thanks,
EDIT
Here is the result with just one NOT IN:
SQL> SELECT pnumber, count(distinct ssn) as num_emps,nvl(sum(hours),0) as thours, nvl(sum(hours*salary/2000),0) as tcost
2 FROM project
3 LEFT JOIN (works_on join (employee
4 LEFT JOIN Department on DNO=DNUMBER) on essn=ssn)
5 ON pnumber=pno
6 WHERE ssn NOT IN (select MGRSSN from DEPARTMENT)
7 GROUP BY pnumber
8 ORDER BY pnumber;
PNUMBER NUM_EMPS THOURS TCOST
---------- ---------- ---------- ----------
1 3 62.5 927.5
2 2 27.5 362.5
3 1 40 760
10 2 45 562.5
30 2 35 437.5
Upvotes: 0
Views: 98
Reputation: 1270503
NOT IN
returns no rows where any value is NULL
. So:
WHERE ssn NOT IN (select SUPERSSN from EMPLOYEE)
will filter everything out if even one row as SUPERSSN
as NULL
. Here are two ways to fix it:
WHERE ssn NOT IN (select SUPERSSN from EMPLOYEE where SUPERSSN is not null)
or:
WHERE NOT EXISTS (select 1 from EMPLOYEE e where e.SUPERSSN = ssn)
(You might need an alias on the ssn
, but I have no idea what table it comes from.) In other words, the semantics of NOT EXISTS
and NOT IN
are different when there is a NULL
value. NOT EXISTS
(in my opinion) has the more intuitive behavior.
Now, why does this never return true if one of the values is NULL
?
WHERE ssn NOT IN (select SUPERSSN from EMPLOYEE)
If ssn
is in the list of SUPERSSN
, then it returns false. That is easy. If ssn
is not in the list and none of the values are NULL
, then it returns true. That is easy.
If ssn
is in the list and one of the values of SUPERSSN
is NULL
, then there's a conundrum. Is ssn
equal to the NULL
value or not. Well, neither. Comparisons to NULL
return NULL
. And, NULL
is treated as "not true" in WHERE
clauses. In short, if SUPERSSN
has a NULL
value, the expression can only return false or NULL
-- and everything gets filtered out.
Upvotes: 1
Reputation: 369
Try:
WHERE ssn NOT IN (select SUPERSSN from EMPLOYEE union select MGRSSN from DEPARTMENT)
Upvotes: 0