Reputation: 13
I am trying to fill in ID1 variable with same ID number when rdq=adq for each permco in SAS. Here is an example of my data.
permco rdq adq ID ID1
1 333 331 1 .
1 333 332 2 .
1 333 333 3 3
1 333 334 4 .
1 333 335 5 .
1 333 336 6 .
1 555 552 1 .
1 555 553 2 .
1 555 554 3 .
1 555 555 4 4
1 555 556 5 .
1 555 557 6 .
1 555 558 7 .
2 333 331 1 .
2 333 332 2 .
2 333 333 3 3
2 333 334 4 .
2 333 335 5 .
2 333 336 6 .
2 555 552 1 .
2 555 553 2 .
2 555 554 3 .
2 555 555 4 4
2 555 556 5 .
2 555 557 6 .
2 555 558 7 .
And what I desire to have is...
permco rdq adq ID ID1
1 333 331 1 3
1 333 332 2 3
1 333 333 3 3
1 333 334 4 3
1 333 335 5 3
1 333 336 6 3
1 555 552 1 4
1 555 553 2 4
1 555 554 3 4
1 555 555 4 4
1 555 556 5 4
1 555 557 6 4
1 555 558 7 4
2 333 331 1 3
2 333 332 2 3
2 333 333 3 3
2 333 334 4 3
2 333 335 5 3
2 333 336 6 3
2 555 552 1 4
2 555 553 2 4
2 555 554 3 4
2 555 555 4 4
2 555 556 5 4
2 555 557 6 4
2 555 558 7 4
I would like to fill in ID1 with ID number when rdq=adq.
Upvotes: 1
Views: 1632
Reputation: 9569
Double DoW loop solution:
data have01;
infile cards truncover expandtabs;
input permco rdq adq ID ID1 ;
cards;
1 333 331 1 .
1 333 332 2 .
1 333 333 3 3
1 333 334 4 .
1 333 333 5 5
1 333 336 6 .
1 555 552 1 .
1 555 553 2 .
1 555 554 3 .
1 555 555 4 4
1 555 556 5 .
1 555 557 6 .
1 555 558 7 .
2 333 331 1 .
2 333 332 2 .
2 333 333 3 3
2 333 334 4 .
2 333 335 5 .
2 333 336 6 .
2 555 552 1 .
2 555 553 2 .
2 555 554 3 .
2 555 555 4 .
2 555 556 5 .
2 555 557 6 .
2 555 558 7 .
;
run;
data want;
do _n_ = 1 by 1 until (last.rdq);
set have01;
by permco rdq;
if first.rdq then call missing(ID1);
if adq = rdq then t_ID1 = ID1;
drop t_ID1;
end;
do _n_ = 1 to _n_;
set have01;
ID1 = t_ID1;
output;
end;
run;
This assumes that if there are multiple matches, the last one should take precedence. If there are no matches then every row for that group gets a missing value.
Upvotes: 1
Reputation: 331
I assume you want to have the same number within the by-group defined by permco and rdq. The possibility to have two or more matches within one by group is handled with a variable matchespergroup. If no matches are found in one group id1 is a missing value.
data have01;
infile cards truncover expandtabs;
input permco rdq adq ID ID1 ;
cards;
1 333 331 1 .
1 333 332 2 .
1 333 333 3 3
1 333 334 4 .
1 333 333 5 5
1 333 336 6 .
1 555 552 1 .
1 555 553 2 .
1 555 554 3 .
1 555 555 4 4
1 555 556 5 .
1 555 557 6 .
1 555 558 7 .
2 333 331 1 .
2 333 332 2 .
2 333 333 3 3
2 333 334 4 .
2 333 335 5 .
2 333 336 6 .
2 555 552 1 .
2 555 553 2 .
2 555 554 3 .
2 555 555 4 4
2 555 556 5 .
2 555 557 6 .
2 555 558 7 .
run;
data want(drop=rv);
if 0 then set have01;
if _N_=1 then
do;
declare hash hh(dataset:"have01(where=(adq=rdq))",ordered:'A',multidata:'Y');
hh.definekey('permco','rdq');
hh.definedata('id1');
hh.definedone();
end;
do until(theend);
set have01 end=theend;
rv = hh.find();
hh.has_next(result: matchespergroup);
if rv=0 then do; matchespergroup+1; output;end;
else do; id1 = .;output;end;
end;
run;
Upvotes: 0
Reputation: 1120
I think loops are unnecessary here. All you need to do is find the value you want in each group and merge it back onto the original dataset:
proc sql;
create table temp as select distinct
permco, rdq, id
from have (where = (rdq = adq));
quit;
proc sql;
create table want as select distinct
a.*, b.id as id_filled
from have as a
left join temp as b
on a.permco = b.permco and a.rdq = b.rdq;
quit;
Upvotes: 0
Reputation: 1
If your data is like for a combination of Permco and RDQ you have unique value of ID 1 and you need to fill same value through out for for that combination :
An Alternative can be :
Merge your input data with this data on Permco and RDQ .
Data intermediate ; set input_data (keep Permco RDQ ID1); If ID1=. then delete; Run;
proc sort data=input_data out=input_data_1(drop ID1);by Permco RDQ;run;
proc sort data=intermediate;by Permco RDQ;run;
data final; merge input_data_1(in=a) intermediate(in=b); by Permco RDQ; if a; run;
Upvotes: 0