Sarah Reinke
Sarah Reinke

Reputation: 57

Simple Do-Loop: Checking to see if the row data before is the same

I'm used to coding in MatLab, SAS is still new to me when using Do Loops. My goal is to check the row above/before the current row to see if the current and previous row numbers match. If the vector data above the first row matches the current row, I want to be able to check this.

Num (column vector) = 123,123,134,156,290,348,596,569,569

/*---- Do-Loop ----*/
data full;
set work.Test;
 count=_N_;

do i=1 to count;
    Current = Num(i);
    Prev = Num(i-1);
end;

run;

To me "Current" should be referencing the i-th cell of the vector "Num" and "Prev" should be referencing the i-th cell before the "Current" cell. Can anybody help me?

Upvotes: 2

Views: 2950

Answers (3)

Vinith Srinivas
Vinith Srinivas

Reputation: 31

Same can be achieved by lag function too.

Data Me; 
Input Pid Visit Dose;
Cards;
1 1 2
1 2 3
1 3 4
1 4 4
;
Run;

Data Me(Drop=Dose1); 
Set Me;
Dose1 = Lag(Dose);
If _n_ gt 1 and Dose ne Dose1 Then Why = "Reason";
Run;

Upvotes: 0

Vinith Srinivas
Vinith Srinivas

Reputation: 31

I am attaching a code which checks the previous row's value and checks with the present value and update a column if there is a change.

Let me know if this helps.

Data Me;     
Input Pid Visit Dose;    
Cards;    
1 1 2    
1 2 3    
1 3 4    
1 4 4    
;
Run;

Data Me;
Set Me;
Length Dose_Change_Why $20.;
Dose_Change_Why = " ";
Run;

%Macro Chg;

Proc Sql;
Select Count(Dose) into:Cnt From Me;
Quit;

Data _null_;
Cnt = "&Cnt.";
Call Symput('Cnts', Compress(Trim(Cnt)));
Run;

Proc Sql;
Select Dose into:Dos1-:Dos&Cnts. From Me;
Quit;

%Do i = 1 %To &Cnt.-1;

Data _null_;
i = &i.;
j = &i. + 1;
Call symput('j', Compress(Trim(j)));
Run;

%If "&&Dos&i." ne "&&Dos&j." %Then %Do;
%Put &&Dos&i. &&Dos&j.;
Proc Sql;
Update Me set Dose_Change_Why = "Reason for change" Where Visit = &j.;
Quit;
%End;

%End;

%Mend;

%Chg;

Proc Print;
Run;

Upvotes: 0

Joe
Joe

Reputation: 63424

SAS is significantly different than Matlab in that Matlab has vectors and matrices, and base SAS does not. Base SAS is closer to SQL than a matrix language; you manipulate columns inside of rows. If you want to treat SAS like a matrix language, see if you have SAS-IML licensed (PROC SETINIT; RUN; should show you what you have licensed, or ask your SAS administrator). That language is very very similar to Matlab.

As far as using base SAS to solve this, what you do depends on your goal. If it's solely to compare every row to the previous row [or next row, since those two goals are identical], you can do this easily with PROC COMPARE.

For example:

data class_one/view=class_one;
set sashelp.class(firstobs=2);
run;

proc compare base=sashelp.class compare=class_one out=class_compare;
run;

There are lots of options with PROC COMPARE to get it to exactly what you want, see the documentation for more details.

You can also do a match merge to the next row very easily.

data want;
merge sashelp.class(in=a) sashelp.class(in=b firstobs=2 rename=(...) keep=...);
... do stuff ...
run;

You need to rename the variables coming in from the second dataset, though, since they'll overwrite the first dataset's variables otherwise, and use the keep option to only keep the ones you're working with - and keep uses the ORIGINAL name, not the renamed name. Then in ...do stuff... you put whatever code you are using to compare the row to the next row. If 'previous row' is important (ie, you want rows [2..n] as final output, not [1..n-1]), then move the firstobs=2 to the first dataset. (That tells it to skip the first observation, so one dataset starts at 1 and one starts at 2 and then both iterate beyond that.)

If you're just checking the row number, this is a little bit easier. You can do this:

data want;
set have;
by num notsorted; *notsorted is needed unless you do expect it to be in numeric order;
if first.num then first_obs_for_rownum=1;
if last.num then last_obs_for_rownum=1;
run;

Obviously you can do other things with those then clauses, I'm just trying to be clear as to what they do. if first.num and last.num would indicate that a row is unique by rownumber (at least, in the order on the dataset - if it is unsorted, that same num could appear elsewhere, but not immediately before/after that row). first.num is a 1/0 variable that indicates whether the previous row had a different (1) or same (0) value for num; last.num is a 1/0 variable that indicates whether the next row has a different (1) or same (0) value for num. Those can be used for any variable (so first.id first.name etc. work for those variables), so long as they appear on the by statement in some fashion; and if you have a multiple variable by statement, they are reset each time any variable to the left of that variable is reset.

Upvotes: 1

Related Questions