Yicun Zhen
Yicun Zhen

Reputation: 139

SAS: how to check if the same variable from two datasets are different?

I have two datasets DA and DB, both contains the variable ssn and are sorted by ssn. How do I check whether DA contains some ssn that is not listed in DB and vice versa? I considered proc compare but it seems that the compare procedure can only compare the observations at the same position. For example, if

data DA;
    input ssn &;
    datalines;
    100-00-0001
    100-00-0003
    100-00-0004
;


data DB;
    input ssn &;
    datalines;
    100-00-0001
    100-00-0002
    100-00-0003
;

Then the results of compare will list 100-00-0003 and 100-00-0004 although both datasets contain 100-00-0003. How to solve this issue?

Upvotes: 0

Views: 1663

Answers (3)

Shenglin Chen
Shenglin Chen

Reputation: 4554

Use proc sql:

proc sql;
    select * from DA where ssn not in(select ssn from DB);
    select * from DB where ssn not in(select ssn from DA);
quit;

Upvotes: 0

Quentin
Quentin

Reputation: 6378

You can use an ID statement in PROC COMPARE. Untested:

proc compare base=da compare=db listobs ;
  id SSN ;
run ;

There are options to output to a dataset as well.

Upvotes: 0

user667489
user667489

Reputation: 9569

Merge the two datasets together by SSN and output to different datasets, e.g.

data A_only B_only;
  merge DA(in = a keep = SSN) 
        DB(in = b keep = SSN);
  by SSN;
  if a and not(b) then output a_only;
  if b and not(a) then output b_only;
run;

Upvotes: 1

Related Questions