Reputation: 307
Suppose the dataset has two columns
Date Time Price
01-03 11:00 1
01-03 11:02 2
01-03 11:02 3
01-03 11:03 4
01-03 11:07 5
01-04 11:00 4
01-04 11:01 6
01-04 11:01 7
I need to add a new column that equals to the posted price in the most recently minutes in the same day. If this minutes has multiple price, it should select the first one. For example
Date Time Price New
01-03 11:00 1 2
01-03 11:02 2 4
01-03 11:02 3 4
01-03 11:03 4 5
01-03 11:07 5 .
01-04 11:00 4 6
01-04 11:01 6 .
01-04 11:01 7 .
I use the code (by date; if first.time;
) to solve the problem of multiple date. Since the gap between time is not fixed, I cannot use lag function in this case. So I don't know how to select the first price in next most recently minute. Could anyone give me some solutions? THX
Upvotes: 0
Views: 98
Reputation: 63424
With just base SAS, this isn't particularly hard, though this isn't the most efficient method if you have very large datasets, this should perform okay in general.
We iterate over the dataset starting at the following row, until we find a row that has either a different time OR a different date. If different time, save that price as the new price, and if different date (or EOF) clear the new price variable.
data have;
input Date :$5. Time :time5. Price;
format time time5.;
datalines;
01-03 11:00 1
01-03 11:02 2
01-03 11:02 3
01-03 11:03 4
01-03 11:07 5
01-04 11:00 4
01-04 11:01 6
01-04 11:01 7
;;;;
run;
data want;
set have nobs=nobs;
do _i = _n_ to nobs until (new_date ne date or new_time > time);
set have(rename=(date=new_date price=new_price time=new_time)) point=_i;
end;
if (date ne new_date) or (_i > nobs) then call missing(new_price); *times that it searched beyond limits;
run;
Upvotes: 1