Reputation: 11755
Suppose I have a SAS dataset that looks like this:
id x
1 1234
2 2345
3 3456
I need a new data set that has this data set read in (say) 2 times, with a new variable indicating which "replication" this is:
id x rep
1 1234 1
2 2345 1
3 3456 1
1 1234 2
2 2345 2
3 3456 2
It is important that the data are read in this exact order -- the entire initial data set is read once, then again, etc.
Any ideas on an efficient way to do this in a data step? (In reality my data set is huge, I need to read it several times, and I want to avoid sorting.)
I tried this, but the order of the observations in the new data set is not what I want:
data foo;
set tmp; rep=1; output;
set tmp; rep=2; output;
run;
Upvotes: 3
Views: 3339
Reputation: 106
If you want to keep to data step, then this will work as you described.
data foo;
set tmp (in=INA) tmp (in=INB);
if INA then REP=1;
if INB then REP=2;
run;
Upvotes: 8
Reputation: 164
data rep;
set tmp;
do rep = 1 to 2; /* or 3, or whatever */
output;
end;
proc sort;
by rep id;
run;
That's it.
Upvotes: 1
Reputation: 2174
You could try it using a view and proc append like this:
/* create view for rep=2 */
data rep2 / view=rep2;
set tmp;
rep = 2;
run;
/* create dataset for rep=1 */
data foo;
set tmp;
rep = 1;
run;
/* append rep=2 to rep=1 dataset */
proc append base=foo data=rep2;
run;
Upvotes: 0