Reputation: 3181
I have the following two SAS datasets:
data have ;
input a b;
cards;
1 15
2 10
3 40
4 200
1 25
2 15
3 10
4 75
1 1
2 99
3 30
4 100
;
data ref ;
input x y;
cards;
1 10
2 20
3 30
4 100
;
I would like to have the following dataset:
data want ;
input a b outcome ;
cards;
1 15 0
2 10 1
3 40 0
4 200 0
1 25 0
2 15 1
3 10 1
4 75 1
1 1 1
2 99 0
3 30 1
4 100 1
;
I would like to create a variable 'outcome' using an if statement on the variables a, b, x and y: outcome = 1 when a = x and b <= y, and 0 otherwise. Because the real 'have' dataset is extremely large, I would like to avoid sorting it and merging the two datasets (where a = x).
I am trying to use macro variables with the following code:
data _null_ ;
set ref ;
call symput('listx', x) ;
call symput('listy', y) ;
run ;
data want ;
set have ;
if a=&listx and b le &listy then outcome = 1 ; else outcome = 0 ;
run ;
However, this does not produce the desired 'want' dataset shown above.
Upvotes: 0
Views: 71
Reputation: 1424
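I have redone my solution using a hash table, which avoids sorting 'have' and merging. (The macro approach fails because each call symput overwrites the previous value, so after the data _null_ step only the last row of ref, x=4 and y=100, is left in &listx and &listy.) My approach is below: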
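/* Rename x to a so the hash key matches the lookup variable in have */
data ref2(rename=(x=a));
    set ref;
run;
/* rc and y are only working variables, so drop them from the output */
data want(drop=rc y);
    /* In-memory lookup table: key a, data y */
    declare hash plan();
    rc = plan.defineKey('a');   /* x originally */
    rc = plan.defineData('a', 'y');
    rc = plan.defineDone();
    /* Load every record from ref2 into the hash table */
    do until (eof1);
        set ref2 end=eof1;
        rc = plan.add();
    end;
    /* Read have once, in its original order; no sort or merge is needed */
    do until (eof2);
        set have end=eof2;
        call missing(y);
        rc = plan.find();   /* rc = 0 when a is found; y is then filled in */
        outcome = (rc = 0 and b <= y);   /* <= matches "b le &listy" in the question */
        output;
    end;
    stop;
run;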
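As a possible shortcut, the hash object can load ref directly through its dataset: argument tag, which removes the ref2 step and the explicit loading loop. This is only a sketch, assuming your SAS release accepts data set options (here rename=) inside the dataset: string. The name plan is carried over from the code above, and drop=rc y is my own choice so that only a, b and outcome are kept.

```sas
data want(drop=rc y);
    if _n_ = 1 then do;
        /* Never executed: only adds a and y to the program data vector */
        if 0 then set ref(rename=(x=a));
        /* Load ref into memory once, with x renamed to a as the key */
        declare hash plan(dataset: 'ref(rename=(x=a))');
        rc = plan.defineKey('a');
        rc = plan.defineData('y');
        rc = plan.defineDone();
    end;
    /* Read have once, in its original order */
    set have;
    call missing(y);
    rc = plan.find();                /* rc = 0 when a exists in ref */
    outcome = (rc = 0 and b <= y);   /* 1 when matched and b <= y, else 0 */
run;
```

Either way, 'have' is read only once and is never sorted; only the small ref table has to fit in memory.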
Hope it helps.
Upvotes: 2