nick_dataFE
nick_dataFE

Reputation: 57

Apply SAS Program to several dataset with macro

I am very new to SAS so I apologize in advance. I am using the SAS university edition.

I have 20 datasets each from a certain year (1997-2017), all containing information captured in 30 variables. Now, I want to apply the same code to all of the datasets, however some code chunks only to variables of certain years. Therefore, I wanted to use a macro that ranges from 1997-2017 doing something like...

LIBNAME IN '/folders/myfolders/fake_data';

%let j= 1997 to 2017; 

    data fake_&j;
     set fake_data;

proc import out= fake_&j datafile = "/folders/myfolders/fake_data/mz_&j.dta" replace

* Year;
year = j;

to access the dataset fake_1997.dta, create a year variable that takes on the value of the dataset's name (1997) apply the code (see below) to it, then do the same with mz_1998.dta and so on.

An example of the code that I want to apply to all of the data would be

* Weights;
    if (j GE 1997 AND j LE 2004) then 
      shrf = x; 
    else if (j GE 2005 AND j LE 2017) then
      shrf = y;

Thank you so much in advance!

Upvotes: 0

Views: 973

Answers (2)

AdamO
AdamO

Reputation: 4910

An approach that doesn't involve nesting macros is the magical data null and call execute. I use this all the time. It's most helpful if the datasets are already in a SAS format.

libname HAVELIB "path-to-sas-datasets";

data _null_;
  set sashelp.vtable(where=(libname="HAVELIB"));
  call execute("%mymacro(HAVELIB." || strip(memname) || ");");
run;

Creating a loop that imports a bunch of .dta files as sas7bdat is just as easy, create a dataset based on the output of infile pipedir and do a similar loop using call execute.

More info here:

https://www.lexjansen.com/phuse/2014/cc/CC06.pdf

Upvotes: 0

Richard
Richard

Reputation: 27498

Macro code, in part, is code that writes code. The writing is not so much an active process like a 'write' or 'print' or 'echo', but more akin to a boilerplate or template system.

The macro %DO loop can not exist in 'open code', so it must be coded inside a macro definition. The macro is 'invoked' in order to have it write (or generate) the code. You might sometimes see the term 'gencode' to mean generated code produced by invoking a macro.

Proc IMPORT is great for reading consistently data on a regular basis, or for dealing with first explorations. IMPORT does not do any data transformations or allow you to add new variables during the import. You will need a second step, a DATA step, to perform those actions.

Name you macros according to their purpose. Any macro variables used inside the macro should be declared as %LOCAL to prevent unwanted interaction with global macro variables.

Example:

%macro getData(fromYear=, toYear=);

  %local year;

  %DO year = &fromYear %to &toYear;

    * step 1;
    * get initial data set from raw data file;
    * double dot needed because &<NAME>. is a token specifying macro variable resolution;

    proc import 
      datafile = "/folders/myfolders/fake_data/mz_&YEAR..dta" /* double dot */
      replace
      out= import_fake_&YEAR.
    ;

    * step 2;

    data fake_&YEAR;
      set import_fake_&YEAR.;
      year = &YEAR;

      %* macro %if codegens a data step statement specific to year;

      %if &YEAR GE 1997 AND &YEAR LE 2004 %then %do;
        /* anything that is not consumed by macro processing is emitted as a codegen */
        /* so here the macro is emitting a data step assignment statement */
        shrf = x;   
      %end; 
      %else
      %if &YEAR GE 2005 AND &YEAR LE 2017 %then %o
        shrf = y;
      %end;
      %else %do;
        shrf = 1; * uniform weighting ;
      %end;
    run;
  %END;
%mend;

%* invoke;
%getData(fromYear=1997, toYear=2017);

%* this point your might want to combine (stack) all the data sets together
%* so that other Procs can use the 'all' data and utilize CLASS, BY and WHERE
%* statements that are so effective in SAS;

data fake_duodecade;
  set fake_1997-fake2017;  %* special data set name list construct;
run;

Specify macro parameters in your macro definition to make it more useful and reuseable. Don't write macros the reproduce the capabilities of existing Procedures. Don't write macros when you don't need to. Don't gencode that you can't write yourself. Don't mix (misunderstand) the scope of macro variables and know how they are not the same as DATA step variables.

Upvotes: 0

Related Questions