Rohit
Rohit

Reputation: 139

How to compare 2 tables for matching values of a column in SAS?

I have a source table with ~40,000,000 unique/distinct entries and a new table with ~80,000,000 unique/distinct entries, both one columns each.

I need to create a flag column in the new table, which displays 1 if the corresponding observation's unique entry is also present(has matches) in the source table, and 0 if it has no matches.

How do I go about it most efficiently?

Is using proc sql inevitable, or can it be solved using SAS alone?

Upvotes: 1

Views: 1888

Answers (2)

Snot
Snot

Reputation: 79

Here's my solution for your reference:

/*step1: create a new dataset containing unique values of your new table*/
data new1;
    set you_new_table(keep=var1 var2 var3...varx);
proc sort nodup;
    by var1 var2 var3...varx;
run;

/*step2: create a new dataset containing unique values of your old table*/
data old1;
    set you_old_table(keep=var1 var2 var3...varx);
proc sort nodup;
    by var1 var2 var3...varx;
run;

/*step3: merge the two table that just created (inner join)*/
data compare1;
    merge old1(in=a) new1(in=b);
    by var1 var2 var3...varx;
    if a and b;
proc sort;
    by var1 var2 var3...varx;
run;

/*step4: merge the inner join dataset with the new source table and generate the flag*/
data new2;
    set you_new_table;
proc sort;
    by var1 var2 var3...varx;
run;

data new3;
    /*merge new2(in=a) new1(in=b); sry, a mistake, shame.*/
    merge new2(in=a) compare1(in=b);
    by var1 var2 var3...varx;
    if a;
    if b then flag=1;
    else flag=0;
run;

Upvotes: 2

Tom
Tom

Reputation: 51566

To add a new variable to a dataset you need to make a new dataset. Assuming the data is already sorted on the key variable just do a MERGE. You can use the IN= dataset option to check which datasets contribute observations to the merge.

data want ;
  merge new(in=innew) old(in=inold);
  by key_var;
  if innew;
  new_var = inold;
run;

Upvotes: 3

Related Questions