Interleaving SAS Data Sets (by common patient number)

I need to interleave to SAS dataset, but only if the patient ID exist in both of them. In a merge statement I'd use "in" and "if", however, I need to stack the data. Data is equivalent in terms of variables.

Any ideas?

Upvotes: 0

Views: 120

Answers (3)

Joe
Joe

Reputation: 63434

If you have duplicates per ID in one or both datasets, then you have a bunch of other solutions. This is the one most similar to your MERGE idea.

In a Double DoW loop, you loop through the datasets twice, once to check for your condition, then once to actually output. This lets you look at all rows for each ID, see if your condition is valid, then look at them all again to act on that condition.

data have_1;
  do id = 1 to 20 by 2;
    output;
    output;
  end;
run;

data have_2;
  do id = 1 to 20 by 3;
    output;
    output;
  end;
run;



data want;
  _a=0;  *initialize temporary variables;
  _b=0;  *they will be cleared once for each ID;
  do _n_ = 1 by 1 until (last.id);
    set have_1(in=a) have_2(in=b);
    by id;
    if a then _a=1;  *save that value temporarily;
    if b then _b=1;  *again temporary;
  end;
  do _n_ = 1 by 1 until (last.id);
    set have_1 have_2;
    by id;
    if _a and _b then output;  *only output the rows that have both _a and _b;
  end;
run;

Upvotes: 0

Joe
Joe

Reputation: 63434

If you have precisely one row on either dataset, this is fairly easy to do in the data step.

data have_1;
  do id = 1 to 20 by 2;
    output;
  end;
run;

data have_2;
  do id = 1 to 20 by 3;
    output;
  end;
run;

data want;
  set have_1 have_2;
  by id;
  if not (first.id and last.id);
run;

Basically, you only output a row if it is not the first or not the last row for that ID - which will be true iff it is in both datasets. This doesn't work if you have more than one row in either dataset per ID.

Upvotes: 0

Amit Patel
Amit Patel

Reputation: 361

It's a bit of a faf work around but if the datasets are the same then you could try the below. Assuming you're matching on the variable ID.

proc sql;
select t1.*
from
  TABLE_A t1
where ID in (select ID from TABLE_B)
union all
select t2.*
from
  TABLE_B t2  
where ID in (select ID from TABLE_A)
;quit;

Upvotes: 1

Related Questions