Reputation: 2583
suppose to have the following dataset:
data DB;
input ID: $20. Admission :date09. Discharge :date09. Index;
format Admission date09. Discharge date09.;
cards;
0001 13JAN2019 13JUL2019 1
0001 13JAN2019 13JUL2019 0
0002 11MAY2023 12JUN2023 1
0002 01DEC2023 22DEC2023 0
0002 01DEC2023 22DEC2023 0
0003 13JAN2014 17JAN2014 1
0003 23APR2015 28APR2015 0
0003 23APR2015 28APR2015 0
run;
Is there a way to get the following output?
data DB1;
input ID: $20. Admission :date09. Discharge :date09. Index;
format Admission date09. Discharge date09.;
cards;
0001 13JAN2019 13JUL2019 1
0001 13JAN2019 13JUL2019 0
0002 11MAY2023 12JUN2023 1
0002 01DEC2023 22DEC2023 0
0003 13JAN2014 17JAN2014 1
0003 23APR2015 28APR2015 0
run;
In other words I would like to retain one of the duplicated rows but only when Index = 0. For ID 0001 since Index is 1 but also 0 for duplicated rows nothing will happen. Instead, for ID 0002 for example I would like to retain one time the row corresponding to dates: 01DEC2023-22DEC2023. I tried with proc sort nodupkey but I don't know how to specify, for example with "where", the condition on the index value.
Thank you in advance
Upvotes: 1
Views: 28
Reputation: 4937
Try this
proc sort data = db;
by ID Admission Discharge;
run;
data want;
set db;
by ID Index notsorted;
if first.Index;
run;
Result:
ID Admission Discharge Index
0001 13JAN2019 13JUL2019 1
0001 13JAN2019 13JUL2019 0
0002 11MAY2023 12JUN2023 1
0002 01DEC2023 22DEC2023 0
0003 13JAN2014 17JAN2014 1
0003 23APR2015 28APR2015 0
Upvotes: 1