proc sort with nodupkey based on the value of a variable

Question

suppose to have the following dataset:

data DB;
  input ID: $20. Admission :date09. Discharge :date09. Index;
  format Admission date09. Discharge date09.;
cards;
0001 13JAN2019 13JUL2019  1 
0001 13JAN2019 13JUL2019  0 
0002 11MAY2023 12JUN2023  1
0002 01DEC2023 22DEC2023  0
0002 01DEC2023 22DEC2023  0
0003 13JAN2014 17JAN2014  1
0003 23APR2015 28APR2015  0
0003 23APR2015 28APR2015  0
run;

Is there a way to get the following output?

data DB1;
  input ID: $20. Admission :date09. Discharge :date09. Index;
  format Admission date09. Discharge date09.;
cards;
0001 13JAN2019 13JUL2019  1 
0001 13JAN2019 13JUL2019  0 
0002 11MAY2023 12JUN2023  1
0002 01DEC2023 22DEC2023  0
0003 13JAN2014 17JAN2014  1
0003 23APR2015 28APR2015  0
run;

In other words I would like to retain one of the duplicated rows but only when Index = 0. For ID 0001 since Index is 1 but also 0 for duplicated rows nothing will happen. Instead, for ID 0002 for example I would like to retain one time the row corresponding to dates: 01DEC2023-22DEC2023. I tried with proc sort nodupkey but I don't know how to specify, for example with "where", the condition on the index value.

Thank you in advance

PeterClemmensen · Accepted Answer

Try this

proc sort data = db;
   by ID Admission Discharge;
run;

data want;
   set db;
   by ID Index notsorted;
   if first.Index;
run;

Result:

ID    Admission  Discharge  Index
0001  13JAN2019  13JUL2019  1
0001  13JAN2019  13JUL2019  0
0002  11MAY2023  12JUN2023  1
0002  01DEC2023  22DEC2023  0
0003  13JAN2014  17JAN2014  1
0003  23APR2015  28APR2015  0

proc sort with nodupkey based on the value of a variable

Answers (1)

Related Questions