Akhilesh Yadav
Akhilesh Yadav

Reputation: 51

SAS Date counter

I have 2 columns Start_date and End_date for one particular individual there are multiple rows on start_date and end_date, I want to calculate the no of days between sTART_DATE OF the SECOND row - END_DATE OF FIRST ROW.

Person  Start   End     
A   5/11/2019   9/11/2019   I need to get 14/11/2019-9/11/2019 =    4
A   14/11/2019  20/11/2019  Like wise   10
A   1/12/2019   10/12/2019      2
A   12/12/2019  19/12/2019      0
B   1/1/2018    5/1/2018        4
B   9/1/2018    11/1/2018       0

Upvotes: 1

Views: 51

Answers (2)

Llex
Llex

Reputation: 1770

You can use retain statement. The rest of the algorithm repeats @Reeza solution:

proc sort data=have;
    by person descending start;
run;

data want(drop=tStart);
   set have;
   by person;
   retain tStart;
   if not first.person then do;
      diff = tStart-End;  
   end;
   else do;
      diff = 0;
   end;
   tStart = Start;
run;

proc sort data=want;
    by person start;
run;

Input:

+--------+-----------+-----------+
| Person |   Start   |    End    |
+--------+-----------+-----------+
| A      | 12DEC2019 | 19DEC2019 |
| A      | 01DEC2019 | 10DEC2019 |
| A      | 14NOV2019 | 20NOV2019 |
| A      | 05NOV2019 | 09NOV2019 |
| B      | 09JAN2018 | 11JAN2018 |
| B      | 01JAN2018 | 05JAN2018 |
+--------+-----------+-----------+

Output:

+--------+-----------+-----------+------+
| Person |   Start   |    End    | diff |
+--------+-----------+-----------+------+
| A      | 05NOV2019 | 09NOV2019 |    5 |
| A      | 14NOV2019 | 20NOV2019 |   11 |
| A      | 01DEC2019 | 10DEC2019 |    2 |
| A      | 12DEC2019 | 19DEC2019 |    0 |
| B      | 01JAN2018 | 05JAN2018 |    4 |
| B      | 09JAN2018 | 11JAN2018 |    0 |
+--------+-----------+-----------+------+

Upvotes: 1

Reeza
Reeza

Reputation: 21294

  1. Sort depending by date so that you can use the LAG() function
  2. Get the lagged start date
  3. Calculate the difference

    *create sample data for testing;
    
    data have;
        input Person $ Start : ddmmyy10.  End  : ddmmyy10.;
        format start end date9.;
        cards;
    A   5/11/2019   9/11/2019   
    A   14/11/2019  20/11/2019  
    A   1/12/2019   10/12/2019      
    A   12/12/2019  19/12/2019      
    B   1/1/2018    5/1/2018        
    B   9/1/2018    11/1/2018   
    ;
    run;
    
    *sort by descending date to use LAG();
    
    proc sort data=have;
        by person descending start;
    run;
    
    data want;
        set have;
        *analyze by person;
        by person;
        *calculate the previous start date;
        lagStart=lag(Start);
        *do the calculation if not the first record for the person;
    
        if not first.person then
            diff=lagStart - end;
        else
            diff=.;
        format lagStart date9.;
    run;
    
    *sort back to desired order;
    
    proc sort data=want;
        by person start;
    run;
    

Upvotes: 2

Related Questions