Reputation: 51
I have 2 columns Start_date and End_date for one particular individual there are multiple rows on start_date and end_date, I want to calculate the no of days between sTART_DATE OF the SECOND row - END_DATE OF FIRST ROW.
Person Start End
A 5/11/2019 9/11/2019 I need to get 14/11/2019-9/11/2019 = 4
A 14/11/2019 20/11/2019 Like wise 10
A 1/12/2019 10/12/2019 2
A 12/12/2019 19/12/2019 0
B 1/1/2018 5/1/2018 4
B 9/1/2018 11/1/2018 0
Upvotes: 1
Views: 51
Reputation: 1770
You can use retain
statement. The rest of the algorithm repeats @Reeza solution:
proc sort data=have;
by person descending start;
run;
data want(drop=tStart);
set have;
by person;
retain tStart;
if not first.person then do;
diff = tStart-End;
end;
else do;
diff = 0;
end;
tStart = Start;
run;
proc sort data=want;
by person start;
run;
Input:
+--------+-----------+-----------+
| Person | Start | End |
+--------+-----------+-----------+
| A | 12DEC2019 | 19DEC2019 |
| A | 01DEC2019 | 10DEC2019 |
| A | 14NOV2019 | 20NOV2019 |
| A | 05NOV2019 | 09NOV2019 |
| B | 09JAN2018 | 11JAN2018 |
| B | 01JAN2018 | 05JAN2018 |
+--------+-----------+-----------+
Output:
+--------+-----------+-----------+------+
| Person | Start | End | diff |
+--------+-----------+-----------+------+
| A | 05NOV2019 | 09NOV2019 | 5 |
| A | 14NOV2019 | 20NOV2019 | 11 |
| A | 01DEC2019 | 10DEC2019 | 2 |
| A | 12DEC2019 | 19DEC2019 | 0 |
| B | 01JAN2018 | 05JAN2018 | 4 |
| B | 09JAN2018 | 11JAN2018 | 0 |
+--------+-----------+-----------+------+
Upvotes: 1
Reputation: 21294
Calculate the difference
*create sample data for testing;
data have;
input Person $ Start : ddmmyy10. End : ddmmyy10.;
format start end date9.;
cards;
A 5/11/2019 9/11/2019
A 14/11/2019 20/11/2019
A 1/12/2019 10/12/2019
A 12/12/2019 19/12/2019
B 1/1/2018 5/1/2018
B 9/1/2018 11/1/2018
;
run;
*sort by descending date to use LAG();
proc sort data=have;
by person descending start;
run;
data want;
set have;
*analyze by person;
by person;
*calculate the previous start date;
lagStart=lag(Start);
*do the calculation if not the first record for the person;
if not first.person then
diff=lagStart - end;
else
diff=.;
format lagStart date9.;
run;
*sort back to desired order;
proc sort data=want;
by person start;
run;
Upvotes: 2