Remove duplicates based on a value from a different column

Question

In MS Visual Studio, I have following table 1, where I want to remove duplicate caseNum based on PrimIndicator column. If case number has a duplicate caseNum, I want to keep the row with "True" PrimIndicator, if caseNum is not duplicate then keep the CaseNum.

Table1:

ID  CaseNum PrimIndicator
1   AAA-123 TRUE
2   AAA-123 FALSE
3   BBB-546 
4   CCC-753 TRUE
5   CCC-753 
6   ABC-347 
7   DEF-546 TRUE
8   DEF-346

Want Table2:

ID  CaseNum PrimIndicator
1   AAA-123 TRUE
3   BBB-546 
4   CCC-753 TRUE
6   ABC-347 
7   DEF-546 TRUE

So far I have tried this but it seems that max() function is not working if the PrimIndicator is blank.

Select ID, distinct CaseNum, Max(PrimIndicator)
from Table1
group by CaseNum, PrimIndicator

Gordon Linoff · Accepted Answer

If you want only one row per casenum, I would suggest row_number():

select t.*
from (select t.*,
             row_number() over (partition by casenum
                                order by case when primeindicator = 'true' then 1 else 2 end
                               ) as seqnum
      from t
     ) t
where seqnum = 1;

This guarantees one row per casenum in the result set.

Given your data, you could also use where logic:

select t.*
from t
where t.primeindicator = 'true' or 
      not exists (select 1
                  from t t2
                  where t2.casenum = t.casenum and
                        t2.primeindicator = 'true'
                 );

This does not guarantee one row per casenum -- but it does work for the data you have provided.

Remove duplicates based on a value from a different column

Answers (2)

Related Questions