Left join table on multiple tables in SAS

Question

I've got multiple master tables in the same format with the same variables. I now want to left join another variable but I can't combine the master tables due to limited storage on my computer. Is there a way that I can left join a variable onto multiple master tables within one PROC SQL? Maybe with the help of a macro?

The LEFT JOIN code looks like this for one join but I'm looking for an alternative than to copy and paste this 5 times:

PROC SQL;
CREATE TABLE New AS
    SELECT a.*, b.Value
        FROM Old a LEFT JOIN Additional b
            ON a.ID = b.ID;
QUIT;

Joe · Accepted Answer

You can't do it in one create table statement, as it only creates one table at a time. But you can do a few things, depending on what your actual limiting factor is (you mention a few).

If you simply want to avoid writing the same code five times, but otherwise don't care how it executes, then just write the code in a macro, as you reference.

%macro update_table(old=, new=);
PROC SQL;
CREATE TABLE &new. AS
    SELECT a.*, b.Value
        FROM &old. a LEFT JOIN Additional b
            ON a.ID = b.ID;
QUIT;
%mend update_table;
%update_table(old=old1, new=new1)
%update_table(old=old2, new=new2)
%update_table(old=old3, new=new3)

Of course, if the names of the five tables are in a pattern, you can perhaps automate this further based on that pattern, but you don't give sufficient information to figure that out.

If you on the other hand need to do this more efficiently in terms of processing than running the SQL query five times, it can be done a number of ways, depending on the specifics of your additional table and your specific limitations. It looks to me that you have a good use case for a format lookup here, for example; see for example Jenine Eason's paper, Proc Format, a Speedy Alternative to Sort/Merge. If you're just merging on the ID, this is very easy.

data for_format;
  set additional;
  start = ID;
  label = value;
  fmtname='AdditionalF'; *or '$AdditionalF' if ID is character-valued;
  output;
  if _n_=1 then do;  *creating an "other" option so it returns missing if not found;
    hlo='o';
    label = ' ';
    output;
  end;
run;

And then you just have five data steps with a PUT statement adding the value, or even you could simply format the ID variable with that format and it would have that value whenever you did most PROCs (if this is something like a classifier that you don't truly need "in" the data).

Left join table on multiple tables in SAS

Answers (2)

Related Questions