Astro
Astro

Reputation: 614

Need to change previous value

Lets say we have the following data set

obs    check   result
---------------------
aa     0       0
bb     0       0
cc     1       0
dd     0       0
ee     1       1 
ff     1       1 <- we are here. Due to prev. value both eq 1 the "result" variable for both of them should also be 1

What I need is to check if 2 or more cases in "check" variable in a row are equal to 1 and if so, "result" variable for both of them should also be 1. The problem is that the only way I see to set "result" to 1 for both of them is to somehow go back to the previous "observed row" and change "result" variable there to 1 also.

Upvotes: 2

Views: 970

Answers (2)

Joe
Joe

Reputation: 63424

Yick's answer is a very good general approach, but this specific problem has an interesting, simpler answer.

There is one way in SAS of looking ahead without having to read the data a second time in some fashion: The LAST.<variable> automatic variables created by using a BY statement. These involve an automatic lookahead that SAS does without forcing you to do it by hand. Thanks to NOTSORTED, this lets you see if the next value is precisely equal to the current value for a variable.

Using Yick's dataset:

data work.sample;
    input obs $ check result;
    retain x;
    output;
    x = check;
    datalines;
    aa     0       0 
    bb     0       0
    cc     1       0
    dd     0       0
    ee     1       0 
    ff     1       0
    ;
run;

data want;
set sample;
by check notsorted;
if not (first.check and last.check) and (check=1) then result=1;  
*if it is not the first record AND the last record with the same check value in a row,
 then it is one of a sequence of two or more and if check=1 then result should be 1;
put _all_;
run;

Upvotes: 2

Yick Leung
Yick Leung

Reputation: 1078

Because of the way the Data step loop is designed, I don't think we can go back to a previous observation that easily. However, we can use a retain statement and mergeing the data set with itself without a by statement to access previous or next observations:

data work.sample;
    input obs $ check result;
    retain x;
    output;
    x = check;
    datalines;
    aa     0       0 
    bb     0       0
    cc     1       0
    dd     0       0
    ee     1       0 
    ff     1       0
    ;
run;

data work.sample2;
    merge work.sample
    work.sample(firstobs = 2 keep = check 
    rename = (check = c));
    if (check = 1 and x = 1) or (check = 1 and c = 1) then result = 1; 
run;

OUTPUT would look like this:

                      Obs    obs    check    result    x    c

                       1     aa       0         0      .    0
                       2     bb       0         0      0    1
                       3     cc       1         0      0    0
                       4     dd       0         0      1    1
                       5     ee       1         1      0    1
                       6     ff       1         1      1    .

So, variable x represents the values that are retained using the retain statement and output statement. variable c represents the variable that we created when we merged the data set with itself beginning when observation = 2.

Lastly, we follow this up with a conditional statement to put the new values into result variable if there exists a consecutive occurrence of check = 1.

Upvotes: 3

Related Questions