Rogue258
Rogue258

Reputation: 135

Compare data from different data sets SAS

I have two tables table_A and table_B.

table_A

Queue_nm Cust_nm
ABCD VERI
PQRS RSTQ

table_B

Queue_nm Act_Cust_nm
PQRS TMOB

I want to compare the two datasets and if they don't match, then replace the data in table_B in table_A using SAS

data compare_test_2;
set table_A;
set table_B;

if table_A.Queue_nm = table_B.Queue_nm
and tableA.Cust_nm != table_B.Act_Cust_name
then do;
tableA.Cust_nm = table_B.Act_Cust_name
where table_A.Queue_nm = table_B.Queue_nm;
run;

I want the following output after the comparison and the data step

table_A

Queue_nm Cust_nm
ABCD VERI
PQRS TMOB

I get the following error

ERROR: DATA STEP Component Object failure.  Aborted during the COMPILATION phase.
ERROR 557-185: Variable table_A is not an object.

Upvotes: 1

Views: 736

Answers (3)

Richard
Richard

Reputation: 27498

You can use SQL to update a table with values from a second table via a correlated subquery using a key variable.

Example:

proc sql;
  update table_a as outer
  set cust_nm = (select act_cust_nm from table_b as inner where inner.Queue_nm = outer.Queue_nm)
  where exists (select * from table_b as inner where inner.Queue_nm = outer.Queue_nm)
  ;

Another way to update a table in place is the MODIFY statement.

proc datasets nolist lib=work;
  modify table_a;
  create index Queue_nm;
quit;

data table_a;
  set table_b;
  modify table_a key=Queue_nm;
  cust_num = act_cust_num;
run;

Upvotes: 1

Tom
Tom

Reputation: 51566

Just merge. Make sure that both variables use the same name in both datasets. If you need to you can use dataset option to rename variables before they make it into the data step.

 data want;
   merge tablea tableb(rename=(Act_Cust_nm=Cust_nm));
   by queue_nm;
 run;

If you do not want to keep observations that only came from TABLEB then use the IN= dataset option to create a variable that will indicate if TABLEA is contributing to the observation.

 data want;
   merge tablea(in=ina) tableb;
   by queue_nm;
   if ina;
 run;

If you have multiple observations per QUEUE_NM in either dataset then using MERGE will not work.

Upvotes: 1

Stu Sztukowski
Stu Sztukowski

Reputation: 12849

You have not written valid SAS data step code. Your code is more SQL-like. The corrected code is below, but it can get even simpler than this.

data want;
    merge table_a(in=a)
          table_b(in=b)
    ;
    by queue_nm;

    if(a = b AND cust_nm NE act_cust_nm) then cust_nm = act_cust_nm;

    drop act_cust_nm;
run;

In the data step, the in= option is equivalent to a.queue_nm in SQL since we are merging by queue_nm. Think of it like an implied version of that.

We can simplify this even further by renaming act_cust_nm to cust_nm in table_b and overwriting any instance of cust_nm in table_a with the value from table_b. Since table_b is second in the merge statement, it will overwrite matching values of queue_nm in table_a.

data want;
    merge table_a
          table_b(rename=(act_cust_nm = cust_nm) )
    ;
    by queue_nm;
run;

The rules of how this works gets a little more complex when you have a one-to-many or many-to-many merge. I highly recommend reading more about how that works:

https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/lestmtsref/n1i8w2bwu1fn5kn1gpxj18xttbb0.htm

Upvotes: 1

Related Questions