Reputation: 33
Good morning. I've this dataset:
Appendix | Change_Serial_Number| Status | Duration | Mileage | Service
20101234 0 . 60 120000 Z
20101234 1 Proposed 48 110000 Z
20101234 2 Activated 24 90000 Z
20101234 3 Proposed 60 120000 Z
20101234 4 Proposed 50 160000 B
20101234 5 Activated 36 110000 B
Each row is a variation that could be activated or only proposed with the first row with status like blank or the previously activated variation. I need to have this table:
Appendix | Change_Serial_Number| Status | Duration | Mileage | Service |Duration_Prev| Mileage_prev |
20101234 0 . 60 120000 Z .
20101234 1 Proposed 48 110000 Z 60 120000
20101234 2 Activated 24 90000 Z 60 120000
20101234 3 Proposed 60 120000 Z 24 90000
20101234 4 Proposed 50 160000 B 24 90000
20101234 5 Activated 36 110000 B 24 90000
I need to compare the duration, mileage and service of each variation with the previously activated or with the initial condition only if there aren't variation activated.
I tried with lag function to retrieve a data of previous row, but i need to retrieve data of 3 field and retrieve data only from the last activated variation or, if there aren't, from the initial condition.
I used this code:
proc sort data=db_rdg;
by Appendix Change_Serial_Number descending Change_Serial_Number;
run;
data db_rdg2;
set db_rdg;
by Appendix;
Duration_prev=lag(Duration);
if first. Appendix then Durata_prev =.;
run;
With this code, i can retrieve a data only from the previously row (not from the previosly actived row or from the first condition) and only for a duration variable (not at the same time for duration, mileage and service).
I hope I have been clear enough :)
Thank you for your help!
Upvotes: 0
Views: 70
Reputation: 1319
The lag()
function is only really useful for working with values from a specific number of observations earlier. In this case, you don't know whether the values you want to work with are from the previous observation or from five or six observations earlier, so instead of using lag()
, you should RETAIN
the additional variables and update their values when appropriate:
data db_rdg2;
retain duration_prev .;
set db_rdg;
by Appendix;
if first.Appendix or status = 'Activated' then duration_prev = duration;
run;
The RETAIN
statement allows duration_prev
to retain its value as each new observation in read from the input, instead of being reset to missing.
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000214163.htm
Upvotes: 1
Reputation: 27508
Instead of using LAG to retrieve the duration
from the prior row, you will want to store the activate state tracking variables (for duration, mileage and serial) in a variable that is retained and updated after an explicit output.
In these two sample codes I tossed in tracking serial as you may want to know # of changes from prior activate.
data have; input
Appendix Change_Serial_Number Status $ Duration Mileage Service $;
datalines;
20101234 0 . 60 120000 Z
20101234 1 Proposed 48 110000 Z
20101234 2 Activated 24 90000 Z
20101234 3 Proposed 60 120000 Z
20101234 4 Proposed 50 160000 B
20101234 5 Activated 36 110000 B
run;
* NOTE: _APA suffix means @ prior activate;
* version 1;
* implicit loop with by group processing means ;
* explicit first. test needed in order to reset the apa tracking variables;
data want;
set have;
by appendix;
if first.appendix then do;
length csn_apa dur_apa mil_apa 8;
call missing(csn_apa, dur_apa, mil_apa);
end;
output;
if status in (' ' 'Activate') then do;
csn_apa = change_serial_number;
dur_apa = duration;
mil_apa = mileage;
end;
retain csn_apa dur_apa mil_apa;
run;
* version 2;
* DOW version;
* explicit loop over group means first. handling not explicitly needed;
* implicit loop performs tracking variable resets;
* retain not needed because output and tracking variables modified;
* within current iteration of implicit loop;
data want2;
do until (last.appendix);
set have;
by appendix;
output;
if status in (' ' 'Activate') then do;
csn_apa = change_serial_number;
dur_apa = duration;
mil_apa = mileage;
end;
end;
run;
Upvotes: 0