Reputation: 31
Given two simple datasets A and B as follows
DATA A; INPUT X @@;
CARDS;
1 2 3 4
RUN;
DATA B; INPUT Y @@;
CARDS;
1 2
RUN;
I am trying to create two datasets named C and D, one using repeated SET and OUTPUT statements and another using DO loop.
DATA C;
SET B;
K=1; DO; SET A; OUTPUT; END;
K=K+1; DO; SET A; OUTPUT; END;
K=K+1;
RUN;
DATA D;
SET B;
DO K = 1 TO 2;
SET A; OUTPUT;
END;
RUN;
I thought that C and D should be the same as the DO loop is supposed to be repeating those statements as shown in the DATA step for C, but it turns out that they are different.
Dataset C:
Obs Y K X
1 1 1 1
2 1 2 1
3 2 1 2
4 2 2 2
Dataset D:
Obs Y K X
1 1 1 1
2 1 2 2
3 2 1 3
4 2 2 4
Could someone please explain this?
Upvotes: 3
Views: 1546
Reputation: 51566
The two SET A
statements in the first data step are independent. So on each iteration of the data step they will both read the same observation. So it is as if you ran this step instead.
data c;
set b;
set a;
do k=1 to 2; output; end;
run;
The SET A
statement in the second data step will execute twice on the first iteration of the data step. So it will read two observations from A
for each iteration of the data step.
If you really wanted to do a cross-join you would need to use point=
option so that you could re-read one of the data sets.
data want ;
set b ;
do p=1 to nobs ;
set a point=p nobs=nobs ;
output;
end;
run;
Upvotes: 3
Reputation: 1804
Your Table B has two obs so your code will only do two iterations:
Retain
keyword.Debugging:
Iteration 1 current view:
Obs Table X
1 A 1
Obs Table Y k
1 B 1 1
Output:
K=1; DO; SET A; OUTPUT; END;
Obs Y K X
1 1 1 1
K=K+1; DO; SET A; OUTPUT; END;
Obs Y K X
2 1 2 1
Iteration 2 current view:
Obs Table X
2 A 2
Obs Table Y k
2 B 2 1
Output:
K=1; DO; SET A; OUTPUT; END;
Obs Y K X
3 2 1 2
K=K+1; DO; SET A; OUTPUT; END;
Obs Y K X
4 2 2 2
Upvotes: 2