Reputation: 15
I need to interleave to SAS dataset, but only if the patient ID exist in both of them. In a merge statement I'd use "in" and "if", however, I need to stack the data. Data is equivalent in terms of variables.
Any ideas?
Upvotes: 0
Views: 120
Reputation: 63434
If you have duplicates per ID in one or both datasets, then you have a bunch of other solutions. This is the one most similar to your MERGE idea.
In a Double DoW loop, you loop through the datasets twice, once to check for your condition, then once to actually output. This lets you look at all rows for each ID, see if your condition is valid, then look at them all again to act on that condition.
data have_1;
do id = 1 to 20 by 2;
output;
output;
end;
run;
data have_2;
do id = 1 to 20 by 3;
output;
output;
end;
run;
data want;
_a=0; *initialize temporary variables;
_b=0; *they will be cleared once for each ID;
do _n_ = 1 by 1 until (last.id);
set have_1(in=a) have_2(in=b);
by id;
if a then _a=1; *save that value temporarily;
if b then _b=1; *again temporary;
end;
do _n_ = 1 by 1 until (last.id);
set have_1 have_2;
by id;
if _a and _b then output; *only output the rows that have both _a and _b;
end;
run;
Upvotes: 0
Reputation: 63434
If you have precisely one row on either dataset, this is fairly easy to do in the data step.
data have_1;
do id = 1 to 20 by 2;
output;
end;
run;
data have_2;
do id = 1 to 20 by 3;
output;
end;
run;
data want;
set have_1 have_2;
by id;
if not (first.id and last.id);
run;
Basically, you only output a row if it is not the first or not the last row for that ID - which will be true iff it is in both datasets. This doesn't work if you have more than one row in either dataset per ID.
Upvotes: 0
Reputation: 361
It's a bit of a faf work around but if the datasets are the same then you could try the below. Assuming you're matching on the variable ID.
proc sql;
select t1.*
from
TABLE_A t1
where ID in (select ID from TABLE_B)
union all
select t2.*
from
TABLE_B t2
where ID in (select ID from TABLE_A)
;quit;
Upvotes: 1