Reputation: 3
I have a database like this. This corresponds to a single person and I have this type of data for multiple persons.
data test;
input date YYMMDD10. real_length min_length;
format date YYMMDD10.;
cards;
2000-02-23 1 7
2000-02-24 12 15
2000-03-07 15 7
2000-03-22 7 15
2000-03-29 13 7
2000-04-11 17 7
2000-04-28 . 7
run;
What I am looking for is : if the interval between 2 dates in consecutive lines (real_length) is inferior to a certain length (min_length), I want to replace the date in the next line by the previous date + min_length. So far, this is not a problem and here is the code I used to achieve it:
data test2;
set test;
format lagdate min_date YYMMDD10.;
retain lagmin lagdate;
if lag(real_length) < lag(min_length) and lag(real_length) ~= . then min_date = lagdate + lagmin;
else min_date = date;
lagdate = min_date;
lagmin = min_length;
run;
Which gives :
date min_date min_length
2000-02-23 2000-02-23 7
2000-02-24 2000-03-01 15
2000-03-07 2000-03-16 7
2000-03-22 2000-03-22 15
...
The problem is that now the interval between 2 consecutive dates could become less than the minimal length, e.g. : 2000-03-22 - 2000-03-16 = 6 days < min_length = 7. And I would like to have 2000-03-23 = 2000-03-16 + 7 (=min_length) instead of 2000-02-22 like this:
date min_date min_length
2000-02-23 2000-02-23 7
2000-02-24 2000-03-01 15
2000-03-07 2000-03-16 7
2000-03-22 2000-03-23 15
...
So I've tried this code, but it does not work... I believe the problem could be in the if condition.
data test2;
set test;
format lagdate min_date YYMMDD10.;
retain lagmin lagdate;
if (lag(real_length) < lag(min_length) and lag(real_length) ~= .) or (adjust_length < lag(min_length) and adjust_length ~=.) then min_date = lagdate + lagmin;
else min_date = date;
adjust_length = min_date - lagdate;
lagdate = min_date;
lagmin = min_length;
run;
Does anybody see why this isn't working or do you hve another way of doing this?
Thank you!
Upvotes: 0
Views: 174
Reputation: 9569
The problem is that each time you adjust one date, you have to move all the subsequent dates as well if they're bunched up together. I think you can do this by keeping a running total of how many days you've added on to all the previous rows and then adding on only what's needed after that to get to the min_length between dates:
data want;
set test;
format t_min_date min_date yymmdd10.;
if _n_ = 1 then total_adj = 0;
t_min_date = date + min_length + total_adj;
min_date = lag1(t_min_date);
total_adj + max(0,min_length - real_length);
run;
Is that what you were aiming for?
N.B. you'll need to replace the if _n_ = 1
with some first.id
and last.id
logic to make this work for multiple individuals in the same dataset.
Upvotes: 0