Reputation: 13
In SAS, I have ID, Date1, and Date2 sorted by ascending ID, Date1, and Date2. The sort causes the missing Date2 values to be where they are, as desired. How can I calculate the Date2 difference between rows with valid dates and obtain the results displayed in D_Date2?
In words it is: BY ID, skip a missing date value in Date2, read the next valid date under it, subtract the earlier date from the later, and write the difference as D_Date2 to the row that has the valid Date2 value. Thanks.
Obs  ID  Date1     Date2     D_Date2
  1   1  20090815  20090818        .
  2   1  20090815  20090818        0
  3   1  20090816  20090820        2
  4   1  20090816         .        .
  5   1  20090816  20090820        0
  6   2  20090101         .        .
  7   2  20090105  20090105        .
  8   2  20090105         .        .
  9   2  20090105  20090106        1
 10   2  20090105  20090110        4
 11   3  20080720         .        .
 12   3  20080720  20080917        .
 13   3  20080720  20080918        1
 14   3  20081010         .        .
 15   3  20081010  20080925        7
 16   3  20081010  20080925        0
Upvotes: 0
Views: 3982
Reputation: 2762
I'm sure you could use retain, but I'd use the lag function. The key here is to understand that the lag function does not necessarily return the value from the previous row. Each lag call keeps its own queue, and that queue is only updated when the call actually executes. So if the call sits inside an if condition, lag returns the value from the last row where the condition was true.
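A minimal sketch of that queue behavior, using a made-up data set. The names demo, demo_lag, x, lag_always, and lag_cond are illustrative and not from the question:

```sas
/* Hypothetical data: x is missing on the second and third rows */
data demo;
  input x;
  datalines;
10
.
.
40
50
;
run;

data demo_lag;
  set demo;
  /* Executes on every row, so it returns x from the previous row */
  lag_always = lag(x);
  /* Executes only when x is non-missing, so its queue skips the missing rows */
  if not missing(x) then lag_cond = lag(x);
run;

proc print data=demo_lag;
run;
```

On the row where x is 40, lag_always should be missing (the previous row's x), while lag_cond should be 10, the last non-missing x that the conditional call saw.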
I like doing these things step by step for clarity. First I create a new variable ldate2 that contains the date that is to be subtracted to get the desired difference, then I perform the subtraction.
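data want;
  set have;
  if not missing(date2) then do;
    /* last non-missing date2, since lag only runs on rows with a valid date2 */
    ldate2 = lag(date2);
    /* first valid row of a new id: nothing to subtract */
    if id ne lag(id) then ldate2 = .;
  end;
  /* ldate2 is missing on rows without a valid date2, so d_date2 is missing too */
  d_date2 = date2 - ldate2;
run;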
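To try the step on the question's sample rows, something like the following could build have. It is only a sketch: it assumes Date1 and Date2 are real SAS date values, read here with the yymmdd8. informat, which the question does not actually state. If they were stored as plain numbers such as 20090818, the subtraction would give wrong answers whenever two dates fall in different months.

```sas
/* Assumption: dates are read as SAS date values, not as 8-digit numbers */
data have;
  input id date1 :yymmdd8. date2 :yymmdd8.;
  format date1 date2 yymmddn8.;
  datalines;
1 20090815 20090818
1 20090815 20090818
1 20090816 20090820
1 20090816 .
1 20090816 20090820
2 20090101 .
2 20090105 20090105
2 20090105 .
2 20090105 20090106
2 20090105 20090110
3 20080720 .
3 20080720 20080917
3 20080720 20080918
3 20081010 .
3 20081010 20080925
3 20081010 20080925
;
run;
```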
As rbet suggests, using a dif function is simpler. dif behaves like lag, except it subtracts the previous value from the current value, so there's no need to perform the subtraction separately:
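data want;
  set have;
  if not missing(date2) then do;
    /* current date2 minus the last non-missing date2 */
    d_date2 = dif(date2);
    /* first valid row of a new id: no difference to report */
    if id ne lag(id) then d_date2 = .;
  end;
run;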
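For comparison, the retain approach mentioned at the top might look roughly like this. It is a sketch rather than a tested solution: prev_date2 and want_retain are names introduced here, and it assumes have is sorted by id and that date2 holds SAS date values.

```sas
data want_retain;
  set have;
  by id;
  retain prev_date2;                  /* keep the value from row to row */
  if first.id then prev_date2 = .;    /* start each id with nothing to subtract */
  if not missing(date2) then do;
    /* missing on the first valid row of an id, because prev_date2 is missing */
    d_date2 = date2 - prev_date2;
    prev_date2 = date2;               /* remember this date for the next valid row */
  end;
  drop prev_date2;
run;
```

The by statement makes the group boundary explicit through first.id, so there is no need to compare id against a lagged value. The log will likely note that missing values were generated on the first valid row of each id, which is expected here.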
Upvotes: 1