colbyjackson
colbyjackson

Reputation: 175

sas: how to set up if-then statement for multiple variables

Say I have a dataset that consists of schoolID SAT_code and student_name. What I want to classify is that for those schoolID with 'ABC' would have same SAT_code of 'East'. While those with schoolID with 'XYZ' would have same SAT_code of 'Midwest'.

For a dataset like this:

schoolID    SAT_code    student_name
ABC                     Jasmine Smith
ABC                     Michael Jordan
ABC                     Madison Trump
XYZ                     Sarah Potter
XYZ                     Jim Fowler
XYZ                     Jack Black
 .                          .
 .                          .
 .                          .

There are more than 30 schoolID besides that.

The easiest, but notorious one I could think of was to use if-then for 30 times.

data stateSAT;
 set statestats;
 if schoolID eq 'ABC' then SAT_code 'East';
 else if schoolID eq 'XYZ' then SAT_code 'Midwest';
 else if schoolID eq 'MNO' then SAT_code 'East';
         and so forth.....
 run;

Are there more efficient way, possibly having some kind of for-loop to resolve this issue?

Thanks.

Upvotes: 1

Views: 742

Answers (2)

Richard
Richard

Reputation: 27518

Your long sequence of if/then statements can be more cleanly stated with a SELECT statement. It is similar to SQL CASE or C switch

length SAT_code $20;

select (schoolID);
  when ("ABC") SAT_code = 'East';
  when ("XYZ") SAT_code = 'Midwest';
  when ("MNO") SAT_code = 'East';
  …
  otherwise SAT_code = '???';
end;

However, with 30+ schoolIDs, you would be better off following @mjsqu advice of moving those schoolID mappings to another data structure.

Upvotes: 1

mjsqu
mjsqu

Reputation: 5452

Lookup table

This is best done using a lookup table:

Create a table with schoolID and SAT_code and then perform a merge with your source table:

data schoolsat;
  infile datalines delimiter=',';
  input schoolID $3 SAT_code $25;
  datalines;
ABC,East
XYZ,Midwest
MNO,East
;
run;

This creates a table which maps the schoolID values to SAT_code values. Add all of the required combinations to this table.

Once you've done that, there are two ways of merging data in the SAS world. These are both examples of 'LEFT JOINs', which will keep all records from your statestats table, regardless of whether there's a corresponding row in your new schoolsat mapping table created above. If there's no corresponding schoolID value in schoolsat, SAS will leave a missing value for SAT_code in the resulting table.

SQL

proc sql;
  create table stateSAT as
  select a.*,
  b.SAT_code
  from statestats a
  left join schoolsat b
  on a.schoolID = b.schoolID;
quit;

DATA Step

proc sort data=schoolsat;
  by schoolID;
run;

proc sort data=statestats;
  by schoolID;
run;

data stateSAT;
  merge statestats (in=a)
        schoolsat;
  by schoolID;
  if a;
run;

Upvotes: 2

Related Questions