jackstraw22
jackstraw22

Reputation: 641

Issue with NULL values using NOT IN

I have a query where I'm trying to remove all records where if the combination of practice and entity only shows up once, and that one record has a status of Terminated or Withdrawn, then I want it removed. If the practice/entity has more than one record, and at least one of those records is status of other than Terminated/Withdrawn, then I want to keep those other records. I wrote a query that does this, which is shown below.

However, the issue I'm having is that the query is removing NULL records. I understand that this is because of the Exists/NOT IN clauses, where NULL <> (Terminated, Withdrawn) evaluates to UNKNOWN.

I tried writing the query in the opposite way, using NOT EXISTS/IN, but that removed thousands of other records. Does anyone know how to include the NULL values in this situation?

    declare @tab Table
    (practice  varchar(100),
    entity    varchar(100),
    assignedto varchar(100),
    statusname varchar(50)
     )
    insert into @tab values ('F&S', 'Sinai', 'AnnM', NULL);
    insert into @tab values ('F&S', 'Levin', 'AnnM', NULL);
    insert into @tab values ('F&S', 'Hopkins','AnnM', NULL);

    select *,
    ROW_NUMBER()over(partition by entity, statusname order by entity) as rn
    from @tab t1
    where exists(select *
                 from @tab t2
                 where t1.practice = t2.practice
                 and t1.entity= t2.entity
                 and t1.assignedto = t2.assignedto
                 and t2.statusname not in( 'Withdrawn', 'Terminated'));

Upvotes: 0

Views: 57

Answers (2)

Juan Carlos Oropeza
Juan Carlos Oropeza

Reputation: 48197

I rather use t2.statusname IS NULL so optimizer still can use any index.

select *,
       ROW_NUMBER()over(partition by entity, statusname order by entity) as rn
from @tab t1
where exists(select 1
             from @tab t2
             where t1.practice = t2.practice
               and t1.entity= t2.entity
               and t1.assignedto = t2.assignedto
               and (t2.statusname IS NULL OR
                    t2.statusname not in ('Withdrawn', 'Terminated')
                   )
            );

Upvotes: 2

Lukasz Szozda
Lukasz Szozda

Reputation: 175766

You could use ISNULL with dummy value:

select *,
ROW_NUMBER()over(partition by entity, statusname order by entity) as rn
from @tab t1
where exists(select *
             from @tab t2
             where t1.practice = t2.practice
             and t1.entity= t2.entity
             and t1.assignedto = t2.assignedto
             and ISNULL(t2.statusname, '@@@') 
                 not in( 'Withdrawn', 'Terminated'));

Rextester.com

Upvotes: 1

Related Questions