Reputation: 705
I have 4 columns in my SAS dataset as shown in first image below. I need to compare the dates of consecutive rows by ID. For each ID, if Date2 occurs before the next row's Date1 for the same ID, then keep the Bill amount. If Date2 occurs after the Date1 of the next row, delete the bill amount. So for each ID, only keep the bill where the Date2 is less than the next rows Date1. I have placed what the result set should look like at the bottom.
Upvotes: 0
Views: 855
Reputation: 28411
If you sort your data so that you are looking to the previous obs to compare your dates, you can use a the LAG Function in a DATA STEP.
Upvotes: 1
Reputation: 873
You'll want to create a new variable that moves the next row's DATE1 up one row to make the comparison. Assuming your date variables are in a date format, use PROC EXPAND
and make the comparison ensuring that you're not comparing the last value which will have a missing LEAD value:
DATA TEST;
INPUT ID: $3. DATE1: MMDDYY10. DATE2: MMDDYY10. BILL: 8.;
FORMAT DATE1 DATE2 MMDDYY10.;
DATALINES;
AA 07/23/2015 07/31/2015 34
AA 07/30/2015 08/10/2015 50
AA 08/12/2015 08/15/2015 18
BB 07/23/2015 07/24/2015 20
BB 07/30/2015 08/08/2015 20
BB 08/06/2015 08/08/2015 20
;
RUN;
PROC EXPAND DATA = TEST OUT=TEST1 METHOD=NONE;
BY ID;
CONVERT DATE1 = DATE1_LEAD / TRANSFORMOUT=(LEAD 1);
RUN;
DATA TEST2; SET TEST1;
IF DATE1_LEAD NE . AND DATE2 GT DATE1_LEAD THEN BILL=.;
RUN;
Upvotes: 1